3 Data Structures

3.1 Table relationships

This database contains several tables. The ERD for squ_ce (Figure 1) shows the logical structure1 of the database and it's entities (each entity is implemented as a database table )and relationships between these tables and tables in other databases. This schema is valid regardless of the database system chosen, and it can remain correct even if the Database Management System (DBMS) is changed. Each table represents an object, event, or concept in the real world that is selected to be represented in the database. Each attribute of a table is a defining property or quality of the table. All of the table's attributes are shown in the ERD. The underlined attributes represent the table's primary key2.

Note that Figure 1 shows the main tables only. Note that most tables contain foreign keys3. These foreign keys define the relationships between the tables in squ_ce.

The squ_ce database is implemented as a relational database; i.e., each table is a special case of the mathematical construct known as a relation and hence elementary relation theory is used to deal with the data within tables and the relationships between them. There are three types of relationships possible between tables, but only one exists in squ_ce: one-to-many4. These relationships can be seen in ERDs by connecting a single line (indicating "many") from the child table; e.g., t_events, to the parent table; e.g., t_vessels, with an arrowhead (indicating "one")pointing to the parent.

Every relationship has a mandatory or optional aspect to it. If a relationship is mandatory, then it has to occur at least once, while an optional relationship might not occur at all. For example, in Figure 1, consider that relationship between the table t_vessels and it's child table t_events. The symbol "O" by the child t_events means that a vessel can have zero or many events records, while the bar by the parent t_events means that for every event there must be a matching vessel record.


Figure 1: Entity Relationship Diagram (ERD) for the squ_ce database
(Click to enlarge)


These links are enforced by referential constraints5. Constraints do not allow orphans to exist in any table; i.e., where a child record exists without a related parent record. This may happen when: a parent record is deleted; the parent record is altered so the relationship is lost; or a child record is entered without a parent record.

Constraints are shown in the table listings by the following format:

Referential: constraint name (attribute[, attribute])|INSERT|
|DELETE|
 parent table (attribute[, attribute]) 

Note that the typographical convention for the above format is that square brackets "[ ]" may contain more than one item or none at all. Items stacked between vertical lines | | are options of which one must be chosen.

For example, consider the following constraint found in the table t_events:

Referential: Invalid vessel key (vess_key) INSERT t_vessels (vess_key)

This means that the value of the attribute samp_location in the current record must already exist in the parent table sites or the record will be rejected and the following message will be displayed:

*** User Error:insert constraint "Invalid sample location"violation

For tables residing in external databases, the parent table name will be prefixed by the name of the database.

Section 5 lists all the squ_ce tables as implemented by the Empress RDBMS. As can be seen in the listing of the tables, a table's primary key has an unique index on it. Primary keys are generally listed using the following format:

Indices:

UNIQUE index_name ON (attribute[, attribute])

where attribute(s) make up the primary key and the index name is the primary key name. These prevent records with duplicate keys from being inserted into the tables; e.g., a record with an existing event key.

The database listing (Tables 1-5) show that the tables also have indices on many attributes. That is, attributes that are most likely to be used as a searching key have like values linked together to speed up searches. These indices are listed using the following format:

Indices:

NORMAL (2, 15) index_name ON (attribute[,attribute])

Note that indices may be simple, pointing to one attribute or composite pointing to more than one attribute. The numbers "…(2, 15)…" in the syntax are Empress RDBMS default values relating to the amount of space allocated for the index.

3.2 Database design

The squ_ce database is built around the premise that all fishing trips are based on events. Where an event is something (anything) that occurs at the particular position on earth and at a certain time onboard a vessel. In the instance of the MFish Catch and Effort system, there are three types of events:a fishing event (when a trawl or jig is made); a processing event(when an amount of fish is processed over a certain time period);and an environmental event (a weather or sea condition measurement). These three event type are specialisations of the generalise entity "events". This is modeled in the squ_cedatabase as a "GENSPEC" structure. Generalisation and specialisation are pictured in Figure 2 using a triangle containing the words "IS A TYPE OF" to connect the components to each other and to the higher-level entity. In a GENSPEC structure, attribute in common with all events are stored in the table t_events (Table 1), such as event start and finish date and time, start and finish position, vessel key, and statistical area. Each event, regardless of event type, is identified by an event_key attribute.


Figure 2: GENSPEC diagram for the "events" entity

Attributes specific to a fishing event are stored in the table t_fish_evnts (Table 2). The event_key attribute in t_fish_evnts can be used to link to the t_events table to get the appropriate time and position details. Details stored within t_fish_evnts include: target species code, fishing method code, and bottom and fishing gear depth.

The catch of fish as a result of a fishing event is stored in the table t_catches (Table 3). Each record in t_catches stored the catch weight for one species.

Fishing processing events specific attributes are stored in the table t_proc_evnts (Table 4). The event_key attribute in t_proc_evnts can be used to link to the t_events table to get the appropriate time and position details for the processing event.Fish processing events might relate to the processing of a catch from one trawl but is more likely to be the details for fish processing over a 24-hour period. Details recorded in t_proc_evnts include species code, processed state,number of units of processed fish, green weight, and conversion factors (from green weight to processed weight). Descriptions of the processed state codes can be found in the table t_proc_stat (Table 5).


Figure 3: ERD of the specialisations of the entity "events".
(Click to enlarge)

Environmental specific attributes are stored in the table t_environ (Table 6). The event_key attribute in t_environ can be used to link to the t_events table to get the appropriate time and position details. Such environmental variables recorded include wind speed and sea-surface temperature. For squid trawlers, these variables are recorded for the first trawl of the day; for squid jiggers, these variables are recorded at 0100 each morning.

A fundamental factor of all data derived from the MFish Catch and Effort databases is data versioning. Each record can have multiple versions ranging from the literal version to modified to incorporate error fixes or interpretations. Each record has a sequential version number associated with it, stored in the attribute data_ver. This number is incremented as each change is made to the record. The most current version of Each record is therefore the one with the highest data version number.The current data version numbers for each event record are stored in the table t_versions (Table 7).

Individual vessels are identified in the t_events table by a vessel key. The table t_vessels (Table 8) stores the individual vessel details, such as vessel name, registration number, radio call sign,and gross tonnage.


Figure 4: ERD of the squid jig view incorporating the v_fish and v_proc views.
(Click to enlarge)

Vessels in the arrow squid fishery are registered with MFish as either a domestic, foreign licensed, or charter vessel. Over time, the registration status of a vessel may change from one to any of the other registration types. The history of vessel registration types is stored in the table t_vess_reg (Table 9). The registration date is usually the date that the vessel was entered in to the MFish registration system; however, this date may not always be known. As the registration date is part of the primary key, a dummy date of1st January 1900 is entered for those vessels of unknown registration date. The type of vessel (i.e., whether the vessel is a trawler or a jigger) is stored as 3character attribute vess_type. The full descriptions of the vessel type codes are stored in the table t_vess_type (Table 10).

In additional to the core tables in squ_ce, there are numerous single-and multi-table views to simplify database queries. The simplest of these views allow users to always query the most current version of the records in the main data tables t_events, t_fish_evnts, t_catches,t_proc_evnts, t_environ with the views v_events (View 1), v_fish_evnts (View2), v_catches (View 3), v_proc_evnts(View 4), and v_environ (View 5) respectively.

The next three views allow users to utilise the GENSPEC structure efficiently. The views v_fish (View6), v_proc (View 7), and v_env (View 8)join the specialised entities of t_fish_evnts,t_proc_evnts,and t_environ to the generalised entity t_events (Figure 3).

The last view v_jigs (View 9)joins all the tables together to allow users to easily access all data from any night's jigging from any single vessel (Figure4).


1 Also known as a database schema.
2
A primary key is an attribute or a combination of attributes that contains an unique value to identify that record.
3 A foreign key is an attribute or a combination of attributes that is a primary key in another table.
4 A one-to-many relationship is where one record (the parent) in a table relates to one or many records (the child) in another table; e.g., one vessel in t_vessels can have many events in t_events but one event can only come from one vessel.
5 Also known as integrity checks.

Updated : 16 November 2007