Appendix 2 - Data entry, error checking, and loading

The data in nonfish_bycatch have come from the fishing industry.

This section outlines the flow of paper-recorded data from collection through to its availability to researchers for analysis, and defines the separate tasks involved.

In summary, the nonfish_bycatch data are recorded on handwritten paper forms. Each trip is identified by a unique trip_key, each tow or set by a unique station_key, and each capture by a unique catch_key.

1. Pre-key entry, visual checking and batching:

The data are forwarded via Mfish to a project team member, who checks the forms and forwards them to key entry.

2. Key entry of data:

At this point, trained data entry operators key the data from the collated forms into a fixed-format ASCII file. NIWA uses the KEYS Data Emulator for data entry.
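As an illustration of what a fixed-format ASCII record implies for later processing, the sketch below slices one line into named fields. The column positions, the species_code field, and the sample values are assumptions made for this example only; the actual record layout is not given in this appendix.

```python
# Hypothetical layout: (field name, start column, end column),
# 0-based with the end column exclusive. Not the real nonfish_bycatch layout.
LAYOUT = [
    ("trip_key", 0, 6),
    ("station_key", 6, 12),
    ("catch_key", 12, 18),
    ("species_code", 18, 21),
]


def parse_record(line):
    """Slice one fixed-format line into a dict of stripped field values."""
    return {name: line[start:end].strip() for name, start, end in LAYOUT}


# Build a sample line with three right-aligned 6-character keys
# followed by a 3-character code (all values invented).
sample_line = "{:>6}{:>6}{:>6}{:<3}".format(101, 12, 3, "XAL")
record = parse_record(sample_line)
print(record)
```

Because every field sits at a fixed column, a single miskeyed character can shift no other field, which makes line-by-line comparison of two keyings straightforward.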

All data entry is verified, that is, each page of data is keyed in twice and the two results are cross-checked for mismatches. Any data entry operator errors are corrected at this point.
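The double-keying check described above can be sketched as a line-by-line comparison of the two keyed versions of a page. This is a minimal illustration only and is not how the KEYS Data Emulator works internally; the function name and the sample lines are hypothetical.

```python
def find_mismatches(first_pass, second_pass):
    """Compare two keyings of the same page line by line.

    Returns a list of (line_number, first_text, second_text) for every
    disagreement. A line missing from the shorter keying is shown as None.
    """
    mismatches = []
    for i in range(max(len(first_pass), len(second_pass))):
        a = first_pass[i] if i < len(first_pass) else None
        b = second_pass[i] if i < len(second_pass) else None
        if a != b:
            mismatches.append((i + 1, a, b))
    return mismatches


# Invented sample data: the second keying has a typing error on line 2.
first = ["101 12 3 XAL", "101 12 4 XWM"]
second = ["101 12 3 XAL", "101 12 4 XVM"]
mismatches = find_mismatches(first, second)
for line_no, a, b in mismatches:
    print(f"line {line_no}: {a!r} != {b!r}")
```

Each reported mismatch would be resolved against the paper form before the page is accepted.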

The electronic data files are transferred for error checking along with the original raw data file. The data are now ready for the error checking and formatting routines.

3. Data error checking, validation, and grooming:

Data files are put through a number of computer error checking (validation) routines that look for inaccuracies and inconsistencies within trips. Any errors detected are corrected. The data are passed through these error checking routines repeatedly until they reach a standard that allows them to be inserted into the appropriate database tables.

The data are then inserted into "working tables". This allows further checks of data integrity, taking advantage of a relational database's ability to manipulate, match, and compare related sets of data.
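The check, correct, and re-check cycle can be sketched as a set of small routines that each return a list of error messages, with the data re-run until the list is empty. The two checks shown (blank key fields, and a station_key appearing under more than one trip_key) are assumptions chosen for illustration; the actual validation rules are not listed in this appendix.

```python
KEY_FIELDS = ("trip_key", "station_key", "catch_key")


def check_required(record):
    """Report key fields that are missing or blank in one record."""
    return [f"missing {name}" for name in KEY_FIELDS if not record.get(name)]


def check_trip_consistency(records):
    """Report any station_key that appears under more than one trip_key."""
    trips_by_station = {}
    for rec in records:
        station, trip = rec.get("station_key"), rec.get("trip_key")
        if station and trip:
            trips_by_station.setdefault(station, set()).add(trip)
    return [
        f"station {station} appears in trips {sorted(trips)}"
        for station, trips in trips_by_station.items()
        if len(trips) > 1
    ]


def validate(records):
    """Run every check and return all error messages found."""
    errors = []
    for n, rec in enumerate(records, start=1):
        errors += [f"record {n}: {msg}" for msg in check_required(rec)]
    errors += check_trip_consistency(records)
    return errors


# Invented sample data containing two errors.
records = [
    {"trip_key": "101", "station_key": "12", "catch_key": "3"},
    {"trip_key": "102", "station_key": "12", "catch_key": "4"},
    {"trip_key": "101", "station_key": "13", "catch_key": ""},
]
first_run = validate(records)

# Correct the errors against the paper forms, then re-run the checks.
records[1]["trip_key"] = "101"
records[2]["catch_key"] = "5"
second_run = validate(records)
```

In this sketch the first run reports both problems and the second run returns an empty list, which corresponds to the point at which the data would be considered ready for insertion.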
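One kind of integrity check that a relational database makes easy is finding "orphan" rows, for example a capture whose station_key matches no tow or set. The sketch below uses Python's built-in sqlite3 module with hypothetical working-table names (work_trip, work_station, work_catch) and invented rows; the actual working tables and database engine are not described in this appendix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical working tables holding only the key columns.
    CREATE TABLE work_trip    (trip_key INTEGER PRIMARY KEY);
    CREATE TABLE work_station (station_key INTEGER PRIMARY KEY, trip_key INTEGER);
    CREATE TABLE work_catch   (catch_key INTEGER PRIMARY KEY, station_key INTEGER);

    -- Invented rows: station 13 points at a missing trip,
    -- and catch 4 points at a missing station.
    INSERT INTO work_trip VALUES (101);
    INSERT INTO work_station VALUES (12, 101), (13, 999);
    INSERT INTO work_catch VALUES (3, 12), (4, 14);
""")

# Stations whose trip_key has no matching row in work_trip.
orphan_stations = conn.execute("""
    SELECT s.station_key
    FROM work_station AS s
    LEFT JOIN work_trip AS t ON t.trip_key = s.trip_key
    WHERE t.trip_key IS NULL
""").fetchall()

# Captures whose station_key has no matching row in work_station.
orphan_catches = conn.execute("""
    SELECT c.catch_key
    FROM work_catch AS c
    LEFT JOIN work_station AS s ON s.station_key = c.station_key
    WHERE s.station_key IS NULL
""").fetchall()

print(orphan_stations, orphan_catches)
```

Checks of this kind are awkward to run on flat ASCII files but reduce to a single join once the data are in tables.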

4. Loading of "groomed", validated data to the database:

The clean, groomed, and validated data are inserted into the appropriate database (in this case nonfish_bycatch) and become available for extraction and analysis.
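The final load can be sketched as copying rows from a working table into the destination table inside a single transaction, so that a failure part-way through leaves the destination unchanged. The table names work_catch and t_catch, the use of sqlite3, and the sample rows are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical working and destination tables.
    CREATE TABLE work_catch (catch_key INTEGER PRIMARY KEY, station_key INTEGER);
    CREATE TABLE t_catch    (catch_key INTEGER PRIMARY KEY, station_key INTEGER);
    INSERT INTO work_catch VALUES (3, 12), (4, 12);
""")

# Using the connection as a context manager commits if the block
# succeeds and rolls back if it raises an exception.
with conn:
    conn.execute(
        "INSERT INTO t_catch (catch_key, station_key) "
        "SELECT catch_key, station_key FROM work_catch"
    )

loaded = conn.execute("SELECT COUNT(*) FROM t_catch").fetchone()[0]
print(f"{loaded} rows loaded")
```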

The clean electronic data files and raw paper data are then archived for safekeeping.



Updated: 16 November 2007