Skip Navigation LinksHome > Research Opportunities > Research Database Documentation > rocklob > Data structures: rocklob database

3 Data structures

3.1 Table relationships

This database contains several tables. The ERD for rocklob (Figure 1) shows the physical data model structure1 of the database and its entities (each entity is implemented as a database table) and relationships between these tables. Each table represents an object, event, or concept in the real world that has been represented in the database. Each attribute of a table is a defining property or quality of the table.

 


Figure 1: Entity Relationship Diagram (ERD) showing the relationships between the tables that hold data from puerulus settlement collectors.

All of the table’s attributes are shown in the ERD. The underlined attributes represent the table’s primary key2. This schema is valid regardless of the database system chosen, and it can remain correct even if the Database Management System (DBMS) is changed. Most of the tables in the rocklob database also contain special attributes, called foreign keys3.

Section 5 shows a listing of all the rocklob tables as implemented by the EMPRESS RDBMS. As can be seen in the listing of the tables, a table’s primary key has an unique index on it. Primary keys are generally listed using the format:

Indices:	UNIQUE index_name ON (attribute [, attributes ])

where the attribute(s) make up the primary key (the key attributes) and the index name is the primary key name. Note that the typographical convention for the above format is that square brackets [] may contain more than one item or none at all.

The unique index prevents records with duplicate key values from being inserted into the table; e.g., a new survey being inserted with an existing survey number, and hence ensures that every record can be uniquely identified.

The rocklob database is implemented as a relational database. That is, each table is a special case of the mathematical construct known as a relation and hence elementary relation theory is used to deal with the data within tables and the relationships between them. All relationships in rocklob are of the type one-to-many4. This is shown in the ERD by connecting a single line (indicating ‘many’) from the child table (e.g., t_station) to the parent table (e.g., t_trip) with an arrowhead (indicating ‘one’) pointing to the parent. For example, consider the relationship between the tables t_trip (the parent table) and t_station (the child table). Any one trip in t_trip can have one or more stations in t_station, but any one station can only be a part of one trip.

Every relationship has a mandatory or optional aspect to it. That is, if a relationship is mandatory, then it has to occur and least once, while an optional relationship might not occur at all. For example, in Figure 1, consider that relationship between the table t_station and it’s child table t_phy. The symbol “O” across the relationship line by the child t_phy means that a station record can have zero or many phyllosoma catch records, while the bar across the relationship line by the parent t_station means that for every phyllosoma catch record there must be a matching station record.

These relationships are enforced in the database by the use of referential constraints5. Constraints do not allow orphans to exist in any table; i.e., where a child record exists without a related parent record. This may happen when: a parent record is deleted; the parent record is altered so the relationship is lost; or a child record is entered without a parent record. All constraints in rocklob prevent the latter from occurring. Constraints are shown in the table listings by the following format:

Referential:	error message (attribute[, attribute]) INSERT parent table (attribute[, attribute])

For example, consider the following constraint found in the table t_station:

Referential:	No such survey (survey) INSERT t_survey (survey)

This means that the value of the attribute survey in the current record must already exist in the parent table t_survey or the record will be rejected and the error message “No such survey” will be displayed. All tables are indexed. That is, attributes that are most likely to be used for searching, such as survey, have like values linked together to optimise search times.

Such indices are shown in the table listings (Section 5) by the following syntax:

Indices:	NORMAL (2, 15) index_name ON (attribute{, attributes})

Note that indices may be simple, pointing to just one attribute, or composite, pointing to more than one attribute. The numbers “…(2, 15)…” are EMPRESS default values relating to the amount of space allocated to index storage.

 

3.2 Database design

The rocklob database has been designed around two distinct sources of data: puerulus collector monitoring and phyllosoma trawl surveys.

3.2.1 Puerulus collector monitoring

Pueruli, post-pueruli, and young juveniles of J. edwardsii occur most abundantly in shallow waters. Around New Zealand, pueruli occur intertidally in crevices, holes, and indentations under boulders. Artificial crevice collectors, deployed by various methods (shore, sea floor, and surface/midwater), are used in locations around the country as a means of measuring abundance of settlement. Location and the method of collector deployment are stored in the table t_locations (Table 1).

Periodically (usually monthly) these collectors and ‘natural’ settlement (such as intertidal beaches, natural crevices and indentations in boulders) are checked for pueruli settlement. The details of such checks, including numbers of pueruli caught, are stored the table t_catch (Table 2). Each check is uniquely identified by the attributes: location, a 3-character code of the geographic location; date_checked; and collector_no, a unique sequential number given to each collector at a location. The total number of all lobsters caught for each collector during the check in stored in no_catch. At the time of the check, lobsters are not staged so there is no distinction made been pueruli, post-pueruli, and young juveniles until each lobster is measured and staged at a later date.

Historically, all lobsters caught by the collectors at certain locations were measured by carapace length. These length frequency data are stored in the table t_puer_lfreq (Table 3). Each record in this table stores the number of lobsters per millimetre length class. Lobsters were also recorded as being either pueruli or post-pueruli. So, for each millimetre length class, t_puer_lfreq records the numbers of pueruli (no_p) and the numbers of poet-pueruli (no_pp). The smaller length classes having only pueruli, the larger lengths having only post-pueruli, and in some uncommon cases, a length class may have counts for both.

Currently, all pueruli and post-pueruli lobsters are staged based on their life history. These staging data are stored in the table t_puer_stage (Table 4). Staging only takes place on lobsters up to 14.5mm carapace length as it is generally accepted that this is the maximum size that lobsters can grow in one month (the time between collector checks). Lobsters below this size are staged to a four-point scale. The number of all lobsters over this size is also recorded as a measure of post-settlement migration.

3.2.2 Phyllosoma trawl surveys

The phyllosoma trawl survey data model is an adaptation of the trawl survey data model (Mackay 1998).

Several trawl surveys have been carried out since the 1970s to define the extent of occurrence and to index abundance of advanced stage phyllosomas. Details of such trips are stored in the table t_trip (Table 5), including start/finish dates, areas surveyed, and parameters of trawl gear used.

Surveys may or may not have included strata as part of the methodology, but where they were involved no details of strata are recorded in rocklob. Unlike other stratified trawl surveys, phyllosoma surveys do not involve random stations nor are catches scaled up to any area.

These surveys all involve transects, with a number of stations (where trawl gear was deployed) occurring along the transect path. Station details are stored in the table t_station (Table 6). Catch details from each station are included within t_station; e.g., total volume caught, and volume sampled. Only 4 items within each catch are recorded so each has it’s own attribute within t_station for catch numbers. These attributes are: phy for Jasus sp. phyllosomas; puer for pueruli; ibacus for Ibacus sp. phyllosoma; and scyllarus for Scyllarus sp. phyllosoma.

Comments on stations are stored in a separate table t_stat_comm (Table 7). Having the comments in their own tables means that each station can have none or as many comments as a user wishes.

Measurements taken from individual phyllosoma are stored in t_phy_stage (Table 8). Currently, only life history stages are recorded for individual specimens, although, this table can be expanded to include other measurements if required.

Two views of this table (v_scyllarus and v_jasus) exist to provide phyllosoma stage data for Scyllarus sp. and Jasus sp. respectively. A view is a table that does not have any existence in its own right but is instead a snapshot of data derived from one or more base tables.

One of the aims of phyllosoma trawl surveys is to compare abundance data with changes in ocean climate. To this end, on RV Tangaroa surveys, physical oceanographic data such as surface temperature, salinity, and sea currents are logged real-time from the vessel’s sensors by the Data Acquisition System (DAS). DAS controls data logging automatically by monitoring the trawl gear warp length starting logging when the trawl gear is let out, stopping when the trawl gear is hauled. Data are recorded into discrete envelopes of time and each envelope is roughly comparable with a station. Each envelope is given a sequential envelope number, with summary information such as matching station number and start/finish time stored in t_evpsumm (Table 9).

Actual data recorded by the DAS for an envelope are stored in t_dasdata (Table 10) with each record having a timestamp (timestamp) to say when it was recorded, an identifier (tally_id) to say which sensor is being measured, and an observed value (obs) from the sensor.

Explanations for all phyllosoma and puerulus developments stages are stored in the table t_stage_codes (Table 11). From this table, there are two views, which show explanations of development stage codes for puerulus (v_puer_stage_codes) and phyllosoma (v_phy_stage_codes) exclusively.


1Also known as a database schema.
2A primary key is an attribute or a combination of attributes that contains an unique value to identify that record.
3 A foreign key is any attribute, or a combination of attributes, in a table that is a primary key of another table. Tables are linked together through foreign keys.
4 A one-to-many relationship is where one record (the parent) in a table relates to one or many records (the child) in another table; e.g., one survey in t_trip can have many stations in t_station but any one station can only come from one survey.
5Also known as integrity checks.

Updated : 16 November 2007