
Figure 1
High-level view of the code-data dependencies in the gcamdata package. This plot of the system architecture shows nodes (“chunks”, units of code charged with processing data and producing specific outputs) and edges (data flows between chunks). Nodes are colored by discipline, e.g., agriculture and land use-related code is black, energy system code is blue, etc. For clarity, neither the initial data inputs nor the final XML outputs (i.e., the GCAM input files) are shown; as a result, seemingly isolated nodes or groups of nodes actually contribute data directly to the model.

Figure 2
An example of tracing data flow. Here the user has requested a data trace on a particular data object “L100.FAO_ag_Exp_t” (FAO agricultural exports by country, item, and year). The package prints detailed information about this object and its upstream and downstream dependencies, and graphs these relationships to show data flow (arrows). Raw data inputs are at the top, and the final XML product that flows into the GCAM model is at the bottom. Explanatory notes describe each step.
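The upstream and downstream lookup behind such a trace can be sketched as a walk over a dependency graph. This is a minimal Python illustration, not the gcamdata implementation (which is written in R). Only the object name “L100.FAO_ag_Exp_t” comes from the caption; the other node names, the `DOWNSTREAM` table, and the `trace` helper are hypothetical.

```python
from collections import deque

# Hypothetical edges: each key is a data object, each value lists the objects
# produced directly from it. Only "L100.FAO_ag_Exp_t" appears in the caption;
# every other name is invented for illustration.
DOWNSTREAM = {
    "raw_FAO_exports.csv": ["L100.FAO_ag_Exp_t"],
    "L100.FAO_ag_Exp_t": ["L109.example_balance"],
    "L109.example_balance": ["example_output.xml"],
    "example_output.xml": [],
}


def invert(graph):
    """Reverse every edge so that consumers point back to their producers."""
    reversed_graph = {node: [] for node in graph}
    for source, targets in graph.items():
        for target in targets:
            reversed_graph.setdefault(target, []).append(source)
    return reversed_graph


def reachable(graph, start):
    """Breadth-first search: every node reachable from start, in visit order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def trace(name):
    """Return the upstream and downstream dependencies of one data object."""
    return {
        "upstream": reachable(invert(DOWNSTREAM), name),
        "downstream": reachable(DOWNSTREAM, name),
    }
```

Calling `trace("L100.FAO_ag_Exp_t")` on this toy graph returns the raw file as the only upstream dependency and the two later objects as downstream dependencies, mirroring the top-to-bottom layout described in the caption.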
Table 1
Automatic package-level checks performed on the gcamdata data-handling functions (termed “chunks”) and their outputs.
| Category | Test |
|---|---|
| Behavior | Chunk responds to required messages from driver (DECLARE_INPUTS, DECLARE_OUTPUTS, MAKE) |
| | Chunk does not make forbidden calls (e.g., slow or deprecated R routines) |
| | Chunk handles changes in model time settings |
| | Chunk (package-level) constants are correctly formatted |
| Data | Chunk declares a (possibly empty) list of inputs that can all be found, either as the product of another chunk or as a file input |
| | Chunk declares a valid list of outputs |
| | Chunk uses only its declared inputs |
| | Chunk produces exactly its declared outputs |
| | All file inputs have metadata headers and are correctly encoded (e.g., use standard line endings) |
| | All chunk outputs have title, description, units, comments, and precursor information attached |
| | All declared precursors are in the chunk input list, and each chunk input is the precursor of at least one output |
| | Chunk outputs match known-good output set |
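
A few of the checks in Table 1 can be illustrated with a short sketch. The Python below is a simplified stand-in, not the package's actual test code (gcamdata is written in R). The message names DECLARE_INPUTS, DECLARE_OUTPUTS, and MAKE come from the table; the chunk function, the data names, the return shapes, and the `check_chunk` helper are assumptions made for illustration.

```python
# Hypothetical chunk: a function that answers the three driver messages named
# in Table 1. Its name, data names, and return shapes are assumptions.
def example_chunk(command, data=None):
    if command == "DECLARE_INPUTS":
        return ["raw_input.csv"]
    if command == "DECLARE_OUTPUTS":
        return ["L100.example"]
    if command == "MAKE":
        rows = data["raw_input.csv"]
        return {"L100.example": [value * 2 for value in rows]}
    raise ValueError(f"unknown command: {command}")


class RecordingInputs(dict):
    """Dictionary that records which keys are read with square brackets."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.accessed = set()

    def __getitem__(self, key):
        self.accessed.add(key)
        return super().__getitem__(key)


def check_chunk(chunk, available):
    """Return a list of failed checks (empty if the chunk passes all of them)."""
    problems = []
    inputs = chunk("DECLARE_INPUTS")
    outputs = chunk("DECLARE_OUTPUTS")

    # Every declared input must be available somewhere.
    missing = [name for name in inputs if name not in available]
    if missing:
        problems.append(f"inputs not found: {missing}")
        return problems  # MAKE cannot run without its inputs

    # The output list must be non-empty and free of duplicates.
    if not outputs or len(set(outputs)) != len(outputs):
        problems.append("output list is empty or has duplicates")

    supplied = RecordingInputs(available)
    produced = chunk("MAKE", supplied)

    # The chunk may read only what it declared.
    undeclared = supplied.accessed - set(inputs)
    if undeclared:
        problems.append(f"undeclared inputs used: {sorted(undeclared)}")

    # The chunk must produce exactly what it declared.
    if set(produced) != set(outputs):
        problems.append("produced outputs differ from declared outputs")
    return problems
```

For example, `check_chunk(example_chunk, {"raw_input.csv": [1, 2, 3]})` returns an empty list, while a chunk that reads an undeclared object or emits an extra output is reported. The sketch only detects reads made with square-bracket indexing; a real test harness would need a stricter mechanism.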
