National Solar Radiation Data Base User's Manual (1961-1990)
Our software testing strategy employed conventional techniques, such as functional testing and logic analysis. Much of the initial processing in the data base production involved relatively simple data transfer or units conversion. These processes also involved decoding data from the source format and encoding for our data structures. Because of the complexity of some source data formats, automation of testing was difficult.
During functional testing, verification of the process was accomplished by comparing the output with that predicted from the input. The completeness of the verification, however, hinged on the proper selection of input. Our test data sets were usually divided into two categories:
2. Subsets of real data. These subsets were chosen to test the processing at critical boundaries, such as month or year boundaries, and site or file changes.
Producing a 30-gigabyte data base from numerous processes and a variety of input data sources required some method of tracking production progress. We used four tracking mechanisms: (1) Project Status Log, (2) Production Status (ProStat) Data Base, (3) File Status Log, and (4) Process ID File.
Figure 9-1 gives an overview of these process and control mechanisms. The value of these process control procedures lies primarily in the detailed permanent record they provide of the data base production procedures and processes. Should questions arise regarding any of the data in the data base, these files will allow an exact determination of the processes used to arrive at specific data for any location and time. Furthermore, this information will be invaluable as a starting point for future upgrades of the data base.
The Project Status Log aided the production team in tracking the data as they were processed. The values in this log indicated the number of hours that each element spent in each of several predefined production states. A production state might indicate, for example, that the datum was missing or had been interpolated.
The state of each element was stored in the same record as the data, and the Project Status Log was automatically updated by library routines that were invoked each time data were written to a data base file. Because the production states followed a clear path from beginning to end, this log allowed the production team to verify that all data had moved from state to state as expected.
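The bookkeeping described above can be illustrated with a minimal Python sketch. It is not the original production software: the class `ProjectStatusLog`, the routine `write_hour`, the dictionary standing in for a data base file, and the element name `"temperature"` are all hypothetical, and only the two example states named in the text are used.

```python
from collections import Counter

# Hypothetical state labels; the manual names only "missing" and
# "interpolated" as examples of production states.
MISSING = "missing"
INTERPOLATED = "interpolated"


class ProjectStatusLog:
    """Counts how many hours of each element currently sit in each state."""

    def __init__(self):
        self.hours = Counter()  # (element, state) -> number of hours

    def record(self, element, old_state, new_state):
        # Move one hour from its previous state (if any) to the new state.
        if old_state is not None:
            self.hours[(element, old_state)] -= 1
        self.hours[(element, new_state)] += 1


def write_hour(data_file, log, hour, element, value, state):
    """Hypothetical library routine: store the value and its state in the
    same record, then update the log as a side effect of the write."""
    previous = data_file.get((hour, element))
    old_state = previous[1] if previous is not None else None
    data_file[(hour, element)] = (value, state)
    log.record(element, old_state, state)


data_file = {}  # stands in for a data base file (assumption)
log = ProjectStatusLog()
for hour in range(3):
    write_hour(data_file, log, hour, "temperature", None, MISSING)
write_hour(data_file, log, 1, "temperature", 12.5, INTERPOLATED)
```

Because every write passes through one routine, the per-state hour counts always sum to the number of hours written, which is the property that lets a team confirm that no hours were left behind in an earlier state.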
The Production Status (ProStat) Data Base, which was formed from the Project Status Log, held information about the status of each site/year/element that helped project leaders make management decisions. Several status codes were devised to indicate the progress of production. A code was assigned only if all hours for that site/year/element met the criteria for that code. This data base reduced the probability of overlooking a few hours of data that could not be processed using normal procedures.
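The all-hours rule for assigning a status code can be sketched as follows. The manual does not list the actual ProStat codes or their criteria, so the codes `ALL_MISSING` and `NO_GAPS`, the state label `"measured"`, and the function `status_code` are assumptions made purely for illustration.

```python
# Hypothetical codes and criteria; the actual ProStat codes are not
# given in the text.
CRITERIA = [
    ("ALL_MISSING", lambda state: state == "missing"),
    ("NO_GAPS", lambda state: state != "missing"),
]


def status_code(hour_states):
    """Return the first code whose criterion holds for every hour of a
    site/year/element, or None if no code applies to all hours."""
    if not hour_states:
        return None  # nothing to classify
    for code, meets in CRITERIA:
        if all(meets(state) for state in hour_states):
            return code
    return None


complete = status_code(["measured", "interpolated", "measured"])
partial = status_code(["measured", "missing", "measured"])
```

In this sketch a single unprocessed hour withholds the code from the whole site/year/element, so a few stray hours show up as an unassigned status rather than being hidden inside an otherwise finished year.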
The File Status Log aided the production team in keeping track of which processes had been applied to which files and where the files were located. Media transfer software supervised the movement of data between on-line disk and laser disk or tape and recorded each movement in the File Status Log via temporary File Status Update files. The File Status Log held the date on which each file was most recently modified and the file's current location, i.e., a tape or disk number.
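One way to picture how temporary update records could be folded into such a log is the short sketch below. The record layout (the `file`, `modified`, and `location` keys), the file name, and the location labels are hypothetical; the text states only that the log held the most recent modification date and the current location of each file.

```python
from datetime import date


def apply_updates(file_status_log, updates):
    """Fold update records into the log, keeping for each file only the
    most recent modification date and the location recorded with it."""
    for update in updates:
        current = file_status_log.get(update["file"])
        if current is None or update["modified"] >= current["modified"]:
            file_status_log[update["file"]] = {
                "modified": update["modified"],
                "location": update["location"],
            }
    return file_status_log


# Hypothetical update records, deliberately out of date order.
updates = [
    {"file": "site_A_1961", "modified": date(1991, 5, 2), "location": "tape 17"},
    {"file": "site_A_1961", "modified": date(1991, 3, 9), "location": "disk 4"},
]
file_status_log = apply_updates({}, updates)
```

Comparing dates rather than relying on arrival order keeps the log correct even if update files are processed out of sequence.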
A critical part of production control was the Process ID File. This file held an enumerated list of all processes authorized by the production team. The library routine that wrote to a binary file required the calling program to supply a process authorization number; a binary file could be written only if that number resided in the Process ID File. Each time a process modified a binary file, the process ID number was written to a File Status Update file for that site-year. The File Status Log was updated from these update files nightly. Note that if a process was changed in any way, such as for a bug fix or enhancement, it was considered a new process and received a new process ID and description in the Process ID File.
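The authorization check can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the original library routine: the names `write_binary` and `UnauthorizedProcess`, the process IDs and descriptions, and the in-memory dictionary and list standing in for binary files and File Status Update files are all hypothetical.

```python
class UnauthorizedProcess(Exception):
    """Raised when a process ID is not listed in the Process ID File."""


def write_binary(binary_files, update_records, process_ids,
                 process_id, site_year, payload):
    """Write payload for a site-year only if process_id is authorized,
    then note the process ID in an update record for that site-year."""
    if process_id not in process_ids:
        raise UnauthorizedProcess(f"process {process_id} is not authorized")
    binary_files[site_year] = payload
    update_records.append((site_year, process_id))


# Hypothetical Process ID File contents: ID -> description. A bug fix
# is registered as a new process with its own ID.
process_ids = {101: "units conversion", 102: "units conversion, bug fix"}
binary_files, update_records = {}, []

write_binary(binary_files, update_records, process_ids,
             102, ("site_A", 1961), b"\x00\x01")

try:
    write_binary(binary_files, update_records, process_ids,
                 999, ("site_A", 1961), b"\xff")
    rejected = False
except UnauthorizedProcess:
    rejected = True
```

Placing the check inside the single routine that performs writes means no program can modify a file without leaving a process ID behind, which is what makes the permanent record of applied processes complete.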