Toward an Interactive Experiment-Model Approach
As models represent ecosystem processes more realistically, they become more complex and require more parameters to be constrained. Developing an operational forecasting system is an interactive way to integrate experiments and models to meet this need. Such a system assimilates various data streams into models to improve model predictions. Forecasting outcomes in turn provide feedback to experiments, identifying the data sets needed to further improve model predictions. The new data sets can then be fed back into the models, further constraining parameters and improving predictions.
In the SPRUCE project, we are developing a data assimilation and operational ecological forecasting system called ECOlogical Platform for Assimilation of Data (ECOPAD). Pretreatment data sets from different field campaigns have been compiled, including large-collar in situ CO2 flux measurements across 4 years, aboveground NPP and carbon pool sizes from sampled vegetation, phenological data derived from PhenoCam imagery, and peat carbon from core samples. The data sets are then assimilated into the Terrestrial ECOsystem (TECO) model using a Markov chain Monte Carlo technique to constrain parameters. The TECO model is used because it simulates canopy photosynthesis, plant growth, carbon transfer among compartments, and soil water dynamics. Unlike Earth system models, the TECO model is simple enough that computational cost is not prohibitive. With data assimilation, we can quantify how much forecast uncertainty is reduced as more data become available. The relative contributions of external forcing and model parameters to forecast uncertainty can also be estimated. Carbon cycle projections will be compared with future data streams to refine model structure, update model parameters, generate new scientific questions, and test competing hypotheses. The new data sets are then assimilated to enable new projections. This workflow should be run regularly and automated in an operational system such as ECOPAD.
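The idea of constraining parameters by Markov chain Monte Carlo, and of watching uncertainty shrink as more data arrive, can be illustrated with a minimal sketch. This is not the TECO model or the ECOPAD code: it assumes a hypothetical one-pool carbon model with a single turnover rate `k`, synthetic observations with known Gaussian noise, a uniform prior on `k`, and a plain Metropolis sampler. All function names and numerical values below are illustrative assumptions.

```python
import math
import random
import statistics


def simulate_pool(k, n_steps, c0=100.0, u=5.0):
    """Toy one-pool carbon model (an assumption, not TECO):
    C[t+1] = C[t] + u - k * C[t], with input u and turnover rate k."""
    c = c0
    trajectory = []
    for _ in range(n_steps):
        c = c + u - k * c
        trajectory.append(c)
    return trajectory


def log_likelihood(k, obs, sigma):
    """Gaussian log-likelihood (up to a constant) of observed pool sizes."""
    pred = simulate_pool(k, len(obs))
    return -0.5 * sum(((o - p) / sigma) ** 2 for o, p in zip(obs, pred))


def metropolis(obs, sigma, n_iter=5000, step=0.005,
               k_min=0.0, k_max=1.0, seed=0):
    """Metropolis sampler for k with a uniform prior on [k_min, k_max]."""
    rng = random.Random(seed)
    k = 0.5 * (k_min + k_max)          # start at the prior midpoint
    ll = log_likelihood(k, obs, sigma)
    samples = []
    for _ in range(n_iter):
        k_new = k + rng.gauss(0.0, step)   # symmetric random-walk proposal
        if k_min <= k_new <= k_max:        # proposals outside the prior are rejected
            ll_new = log_likelihood(k_new, obs, sigma)
            # 1 - random() lies in (0, 1], so the logarithm is always defined
            if math.log(1.0 - rng.random()) < ll_new - ll:
                k, ll = k_new, ll_new
        samples.append(k)
    return samples[n_iter // 2:]           # discard the first half as burn-in


# Synthetic "observations" generated from a known turnover rate.
data_rng = random.Random(42)
true_k, sigma = 0.1, 5.0
obs = [c + data_rng.gauss(0.0, sigma) for c in simulate_pool(true_k, 40)]

# Assimilate a short record, then the full record.
post_few = metropolis(obs[:5], sigma)
post_many = metropolis(obs, sigma)

sd_few = statistics.pstdev(post_few)
sd_many = statistics.pstdev(post_many)
mean_many = statistics.mean(post_many)

print(f"posterior sd of k with  5 observations: {sd_few:.5f}")
print(f"posterior sd of k with 40 observations: {sd_many:.5f}")
print(f"posterior mean of k with 40 observations: {mean_many:.4f}")
```

Comparing the posterior spread of `k` for the short and long records gives a simple measure of how much parameter uncertainty is reduced by additional data. In an operational setting the same comparison would be made on forecast quantities rather than on a single parameter.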
Although researchers have collected enormous amounts of data to understand various ecosystem processes over past decades, a major challenge is to combine the understanding of multiple processes into a complete picture of how ecosystems will respond as a whole. Usually, empirical data from observations and manipulative experiments are scattered among individual teams, and a significant proportion is not published in a timely manner or is not available to modelers even after publication. In the SPRUCE experiment, while individual teams collect data to answer questions related to specific ecosystem processes, they also work as a large group to confront models with data. ORNL is developing and deploying the data and information management and integration capabilities required for the collection, storage, processing, discovery, access, and delivery of data, including experimental data and model outputs. These capabilities and systems are designed to facilitate the characterization and quantification of associated uncertainty. The systems will also be developed for assimilation of available measurements, synthetic analysis of results, model forcing and boundary condition data sets, and model results. Such an information system facilitates data-model integration and provides accessibility to model output, benchmarking analysis, visualization, and synthesis activities.