National Solar Radiation Data Base User's Manual (1961-1990)

Part 2: How the Data Base was Produced

9.0 The Production Process

NREL has maintained a detailed record of the processes used in producing the data base. This record will be documented in a permanent archive of the entire data base production. Here we only mention those aspects of production that affected the quality of the products.

9.1 Verification and Validation

Approximately half of the data base software development time was devoted to software testing. We used two fundamental concepts while testing: verification and validation. Verification determined whether the software met its specifications; validation determined whether the specifications were correct. For the most part, validation was performed during the R&D part of the project and is reported in Volume 2 of the data base documentation.

Our software testing strategy employed conventional techniques, such as functional testing and logic analysis. Much of the initial processing in the data base production involved relatively simple data transfer or units conversion. These processes also involved decoding data from the source format and encoding for our data structures. Because of the complexity of some source data formats, automation of testing was difficult.

During functional testing, verification of the process was accomplished by comparing the output with that predicted from the input. The completeness of the verification, however, hinged on the proper selection of input. Our test data sets were usually divided into two categories:

Most of the initial inspection of input and output values was done by hand from printouts of test files. However, when problems were found and changes to the software were made, we verified the change by creating a new file of output values from the same input values and then running an automated file comparison. This comparison (the "DIFFERENCE" command in VAX/VMS) outputs only the differences between the two files. This helped us verify not only that the desired change was made but also that no unwanted side effects occurred.
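The comparison step described above can be sketched in a few lines. This is only an illustration, not the VAX/VMS tooling used in production: it relies on Python's standard difflib module, and the function name changed_lines and the sample records are hypothetical.

```python
import difflib


def changed_lines(before, after):
    """Return only the lines that differ between two output listings.

    Lines present only in `before` are prefixed with '- ', lines present
    only in `after` with '+ '. Unchanged lines are omitted, much as a file
    comparison utility reports only the differences.
    """
    diff = difflib.ndiff(before, after)
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical output records (station, hour, value) before and after a fix.
old_output = ["STN01 01 0", "STN01 02 115", "STN01 03 9999"]
new_output = ["STN01 01 0", "STN01 02 115", "STN01 03 342"]

for line in changed_lines(old_output, new_output):
    print(line)
```

An empty result means the change had no effect on the output. A result containing lines other than the ones the fix was meant to alter points to a side effect.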

9.2 Production Control and Monitoring

Producing a data base of 30 gigabytes, with numerous processes and a variety of input data sources, required a method for tracking production progress. We used four tracking mechanisms: (1) the Project Status Log, (2) the Production Status (ProStat) Data Base, (3) the File Status Log, and (4) the Process ID File.

Figure 9-1 gives an overview of these process and control mechanisms. The value of these process control procedures lies primarily in the detailed permanent record they provide of the data base production procedures and processes. Should questions arise regarding any of the data in the data base, these files will allow an exact determination of the processes used to arrive at specific data for any location and time. Furthermore, this information will be invaluable as a starting point for future upgrades of the data base.

9.2.1 The Project Status Log

This log aided the production team in tracking the data as they were processed. The values in this log indicated the number of hours for which each element existed in each of a number of predefined production states. A production state might indicate, for example, that a datum was missing or had been interpolated.

The state of each element was stored in the same record as the data, and the Project Status Log was automatically updated by library routines that were invoked each time data were written to a data base file. Because the production states followed a clear path from beginning to end, this log allowed the production team to verify that all data had been moved from state to state as expected.
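The bookkeeping described above can be pictured as a counter that is adjusted inside the write routine itself. The following is a minimal sketch under stated assumptions, not the production library: the class ProjectStatusLog, the routine write_hour, the in-memory dictionary standing in for a data base file, and the element name are all hypothetical. Only the state names "missing" and "interpolated" come from the text.

```python
from collections import Counter


class ProjectStatusLog:
    """Counts, per element, how many hourly values are in each production state."""

    def __init__(self):
        self.hours = {}  # element name -> Counter mapping state -> number of hours

    def record(self, element, old_state, new_state):
        counts = self.hours.setdefault(element, Counter())
        if old_state is not None:
            counts[old_state] -= 1  # the hour leaves its previous state
        counts[new_state] += 1      # and enters the new one


def write_hour(data_file, log, element, hour, value, state):
    """Hypothetical library routine: store value and state together, then update the log."""
    previous = data_file.get((element, hour))
    old_state = previous[1] if previous is not None else None
    data_file[(element, hour)] = (value, state)
    log.record(element, old_state, state)


# Three hours of a hypothetical element are first written as missing,
# then one of them is filled by interpolation.
data_file = {}
log = ProjectStatusLog()
for hour in range(3):
    write_hour(data_file, log, "global_horizontal", hour, None, "missing")
write_hour(data_file, log, "global_horizontal", 1, 250.0, "interpolated")

print(dict(log.hours["global_horizontal"]))
```

Because the log is updated only from inside the write routine, the per-state hour counts cannot drift from the data: two hours remain missing and one is interpolated after the writes shown.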

9.2.2 The Production Status (ProStat) Data Base

This data base, which was formed from the Project Status Log, held information about the status of each site/year/element that helped project leaders make management decisions. Several status codes were devised to indicate the progress of production. A code was assigned only if all hours for that site/year/element met the criteria for that code. This data base reduced the probability of overlooking a few hours of data that could not be processed using normal procedures.
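The all-hours rule for assigning a status code can be illustrated as follows. This is a sketch only: the actual ProStat codes and their criteria are not given here, so the code names, state names, and the function site_year_code are hypothetical.

```python
def site_year_code(hour_states, criteria):
    """Return the first code whose criterion holds for every hour, else None.

    `criteria` is an ordered list of (code, accepted_states) pairs. A code
    is assigned only if every hourly state is in its accepted set.
    """
    if not hour_states:
        return None  # no hours, so no code can be justified
    for code, accepted in criteria:
        if all(state in accepted for state in hour_states):
            return code
    return None


# Hypothetical codes, listed from strictest to most permissive.
CRITERIA = [
    ("ALL_FINAL", {"final"}),
    ("ALL_FILLED", {"final", "interpolated"}),
]

complete_year = ["final"] * 8760
filled_year = ["final"] * 8750 + ["interpolated"] * 10
partial_year = ["final"] * 8759 + ["missing"]

print(site_year_code(complete_year, CRITERIA))
print(site_year_code(filled_year, CRITERIA))
print(site_year_code(partial_year, CRITERIA))
```

A single hour that fails every criterion, as in partial_year, leaves the site/year/element without a code, which is how a few unprocessed hours remain visible instead of being overlooked.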

9.2.3 The File Status Log

This log aided the production team in keeping track of which processes had been applied to which files and where the files were located. Media transfer software supervised the movement of data between on-line disk and laser disk or tape and recorded each movement in the File Status Log via temporary File Status Update files. The File Status Log held the date on which each file was most recently modified and the file's current location (i.e., a tape or disk number).

9.2.4 Process ID File

A critical part of production control was the Process ID File. This file held an enumerated list of all processes authorized by the production team. The library routine that wrote to a binary file required the calling program to supply a process authorization number; a binary file could be written only if that number resided in the Process ID File. Each time a binary file was modified by a process, the process ID number was written to a File Status Update file for that site-year. The File Status Log was updated from these update files nightly. Note that whenever a process was changed, such as for a bug fix or enhancement, it was treated as a new process with its own process ID and description in the Process ID File.
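The authorization check can be sketched as a guard at the top of the write routine. This is an illustrative sketch, not the production library: the function write_binary, the exception UnauthorizedProcessError, the process numbers and descriptions, and the file name are hypothetical, and an in-memory list stands in for the File Status Update file.

```python
import os
import tempfile


class UnauthorizedProcessError(Exception):
    """Raised when a caller presents a process ID that is not on the authorized list."""


def write_binary(path, payload, process_id, authorized_ids, status_updates):
    """Append payload to a binary file only if process_id is authorized.

    Each successful write is noted in status_updates, standing in for the
    File Status Update file of the site-year.
    """
    if process_id not in authorized_ids:
        raise UnauthorizedProcessError(f"process {process_id} is not authorized")
    with open(path, "ab") as handle:
        handle.write(payload)
    status_updates.append((os.path.basename(path), process_id))


# Hypothetical process list; a bug fix to process 12 is registered as new process 13.
authorized_ids = {12: "units conversion", 13: "units conversion, corrected"}
status_updates = []

with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "site_year.bin")
    write_binary(target, b"\x00\x01", 13, authorized_ids, status_updates)
    try:
        write_binary(target, b"\x02", 99, authorized_ids, status_updates)
        rejected = False
    except UnauthorizedProcessError:
        rejected = True
    size_after = os.path.getsize(target)

print(status_updates, rejected, size_after)
```

Placing the check inside the only routine that can write a binary file means no program can modify the data without leaving a record of which authorized process did so.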
