3 Data Structures
3.1 Table relationships
This database contains several tables. The ERD for hmetal (Figure
1) shows the logical structure1 of the database and it's entities
(each entity is implemented as a database table) and relationships
between these tables and tables in other databases. This schema is
valid regardless of the database system chosen, and it can remain
correct even if the Database Management System (DBMS) is changed.
Each table represents an object, event, or concept in the real world
that is selected to be represented in the database. Each attribute of
a table is a defining property or quality of the table.
All of the table's attributes are shown in the ERD. The underlined
attributes represent the table's primary key [A primary key is an
attribute or a combination of attributes that contains an unique
value to identify that record ].
Note that Figure 1 shows the main tables only. Note that most
tables contain foreign keys [A foreign key is an attribute or a
combination of attributes that is a primary key in another table.].
These foreign keys define the relationships between the tables in
hmetal.
The hmetal database is implemented as a relational database; i.e.,
each table is a special case of the mathematical construct known as a
relation and hence elementary relation theory is used to deal with
the data within tables and the relationships between them. There are
three types of relationships possible between tables, but only one
exists in hmetal: one-to-many [A one-to-many relationship is where
one record (the parent) in a table relates to one or many records
(the child) in another table; e.g., one site in sites can have many
fish samples in fish_samples but one fish sample can only come].
These relationships can be seen in ERDs by connecting a single line
(indicating "many") from the child table; e.g.,
fish_samples, to the parent table; e.g., sites, with an arrowhead
(indicating "one") pointing to the parent. Note that the
word "many" applies to the possible number of records in
one table that one record in another table is associated with. For a
given instance, there might be zero, one, two, or more associated
records, but if it is ever possible to have more than one, we use the
word "many" to describe the association.
Every relationship has a mandatory or optional aspect to it. If a
relationship is mandatory, then it has to occur at least once, while
an optional relationship might not occur at all. For example, in
Figure 1, consider that relationship between the table sites and it's
child table fish_samples. The symbol "O" by the child
fish_samples means that a catch site can have zero or many sample
records, while the bar by the parent sites means that for every
sample there must be a matching catch site record.
These links are enforced by referential constraints [Also known as
integrity checks.]. Constraints do not allow orphans to exist in any
table; i.e., where a child record exists without a related parent
record. This may happen when: a parent record is deleted; the parent
record is altered so the relationship is lost; or a child record is
entered without a parent record 1 Also known as a database schema.
Constraints are shown in the table listings by the following
format:
|
Referential:
|
constraint name (attribute[, attribute])
|
|INSERT|
|DELETE|
|
|
|
parent table (attribute[, attribute])
|
|
Note that the typographical convention for the above format is
that square brackets [] may contain more than one item or none at
all. Items stacked between vertical lines | | are options of which
one must be chosen.
For example, consider the following constraint found in the table
fish_samples:
|
Referential:
|
Invalid sample location (samp_location)
|
INSERT
|
|
|
sites (samp_location)
|
|
This means that the value of the attribute samp_location in the
current record must already exist in the parent table sites or the
record will be rejected and the following message will be displayed:
*** User Error: insert constraint "Invalid sample location"
violation
For tables residing in external databases, the parent table name
will be prefixed by the name of the database.
Section 5 lists all the hmetal tables as implemented by the
Empress RDBMS. As can be seen in the listing of the tables, a table's
primary key has an unique index on it. Primary keys are generally
listed using the following format:
Indices: UNIQUE index_name ON (attribute[, attribute])
where attribute(s) make up the primary key and the index name is
the primary key name. These prevent records with duplicate keys from
being inserted into the tables; e.g., a record with an existing area
code.
The database listing (Tables 1-5) show that the tables also have
indices on many attributes. That is, attributes that are most likely
to be used as a searching key have like values linked together to
speed up searches. These indices are listed using the following
format:
Indices: NORMAL (2, 15) index_name ON ( attribute[, attribute])
Note that indices may be simple, pointing to one attribute or
composite pointing to more than one attribute. The numbers "...(2,
15)..." in the syntax are Empress RDBMS default values relating
to the amount of space allocated for the index.
3.2 Database design
The hmetal database is constructed around two main entities: a
sample site and a fish sample. A third entity has been added to
record a spatial component to these data.
The first table sites (Table 1) contains area codes and location
details of sites from which fish samples were taken for analysis.
In order to allow spatial analysis of
the data, a third table of positions areas (Table 2) was created.
This table assigns each of the sample sites with a latitude and
longitude based on the value of the area attribute.
Individual fish samples are recorded in the fish_samples table
(Table 3). Details recorded include sample date and location, species
code, fish length, weight, sex, and amounts of various organochlorine
insecticides and heavy metals such as DDT, zinc, arsenic, lead, and
mercury. All amounts were recorded in as total mg per kg-1 muscle
tissue. Fish samples are linked to the sites table by the attribute
samp_location.
A view v_main (Figure 2) allows users to perform a denormalised
multi-table join on all these tables to select complete records,
which include details fish samples, their sample sites, and
positions.