Marginal emissions data are playing an increasingly important role in addressing climate change. Companies use them to guide clean energy investments, policymakers rely on them to shape carbon reduction strategies, and researchers apply them to assess the environmental impact of electricity use. As the influence of these data grows, so does the need for independent, empirical validation to ensure they are as accurate and reliable as possible.
VERACI-T is a new working group that was launched to investigate the accuracy of electricity sector marginal emissions datasets in standardized ways through peer-reviewed research. This taskforce of energy experts, researchers, and industry leaders applies rigorous, proven validation techniques to test the accuracy of different marginal emissions factor (MEF) datasets, and makes all results free and open for public use.
By using real-world data and transparent methodologies, VERACI-T is helping to investigate what level of confidence is appropriate in emissions modeling, and to ensure that decisions based on this data are backed by the strongest possible evidence.
A key challenge in validating marginal emissions data is that it requires comparing the impact of taking an action with the impact of not taking it (the counterfactual). In science, this is formally known as “causal inference.” In most fields that rely on causal inference, testing whether a model is accurate through randomized controlled trials or natural experiments is standard procedure. Clinical trials verify whether a new drug is effective before it’s prescribed to patients. Economists use controlled experiments to understand the effects of policy decisions. Financial analysts stress-test models before putting millions of dollars at risk.
Yet when it comes to electricity emissions data, that same scientific rigor is often missing. Sometimes groups simply assert without evidence that a certain dataset is or is not accurate, or even assume that it’s somehow impossible to apply causal inference to emissions data, even though doing so is accepted practice in other fields.
VERACI-T is changing that. Instead of picking sides in the debate over which dataset is best, the working group is building an open, standardized framework for testing any marginal emissions model. VERACI-T’s techniques draw on established causal inference science widely used in other fields. Using natural experiments and empirical tests, VERACI-T researchers are making it possible for anyone to objectively observe whether any given model holds up based on real-world evidence, not assumptions.
VERACI-T has already conducted and published three major studies, each testing different aspects of model accuracy. The first paper has been peer reviewed and accepted by Renewable and Sustainable Energy Reviews, and the other two papers are currently in peer review.
The key to empirically validating any model is to identify the claims it makes about something observable in the real world, then gauge how well the real-world evidence matches those claims. VERACI-T’s first study started by working out some of the simplest possible claims that different models make, explicitly or implicitly, about something observable in the real world. Using real-world data on actual power plant behavior from the US Clean Air Markets Program, the study describes a rubric of tests that can be applied to any marginal operating emission rate model.
The tests are divided into four categories:
The paper describes the tests and then demonstrates each on a variety of MEF models, presenting possible explanations when contraventions occur. All tested models behaved as expected in being temporally correlated with peak net demand and with the percentage of dispatched coal, but some displayed anomalies when compared against the carbon intensities of the dirtiest active power plants. Only two of the studied MEF models accurately predicted curtailment periods.
The extensible test suite designed in this research provides valuable insight into MEF accuracy, as well as a framework for understanding model behavior.
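To make the idea of such checks concrete, here is a minimal Python sketch of two of them: a temporal correlation between a model's MEF series and net demand, and a flag for hours where the modeled MEF exceeds the emission rate of the dirtiest active plant. This is an illustration under stated assumptions, not the paper's actual test code. The function names, the "exceeds the dirtiest plant" criterion, and all numbers are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def hours_exceeding_dirtiest_plant(model_mef, dirtiest_plant_rate):
    """Indices of hours where the modeled MEF is above the emission rate
    of the dirtiest plant running in that hour (assumed anomaly criterion)."""
    return [
        hour
        for hour, (mef, cap) in enumerate(zip(model_mef, dirtiest_plant_rate))
        if mef > cap
    ]


# Made-up hourly data for illustration only.
net_demand_gw = [30.0, 28.0, 35.0, 42.0, 40.0]
model_mef = [0.45, 0.42, 0.50, 0.62, 0.58]          # tonnes CO2 per MWh
dirtiest_plant_rate = [1.0, 1.0, 1.0, 0.6, 1.0]     # tonnes CO2 per MWh

print("correlation with net demand:", round(pearson(net_demand_gw, model_mef), 3))
print("anomalous hours:", hours_exceeding_dirtiest_plant(model_mef, dirtiest_plant_rate))
```

In practice the inputs would come from a model's published MEF series and from plant-level records such as those of the US Clean Air Markets Program, rather than from hand-typed lists.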
The gold standard in causal inference is a randomized controlled trial (RCT). In the ideal RCT in marginal emissions, companies would randomly increase or decrease their net power consumption by large amounts repeatedly, so one could compare the resulting change in emissions. Of course, literally doing so would be a big ask for a power grid. A common technique in causal inference in such situations is to look for cases where a “natural experiment” causes what essentially is an unplanned RCT to happen by accident.
VERACI-T’s second study leveraged this technique using nuclear power plant outages. Unplanned nuclear power plant outages change the power grid’s net load by hundreds to thousands of megawatts while they last. And because these outages occur randomly, their effect is essentially the same as causing random large variations in load. This natural experiment allows the marginal emissions factors to be estimated in a manner that isolates the change in emissions due to a change in load, rather than due to unrelated factors like weather or temperature.
This technique was used to directly measure the actual short-run MEFs in six balancing authorities across the continental US. Anyone wondering if a marginal emissions model is accurate can apply it to this convenient list of nuclear outages to see if that model can correctly predict how real-world total emissions must have changed before, during, and after these outages.
The empirically measured MEFs produced by this study — available now for public use — can therefore serve as a benchmark for testing the accuracy of any marginal emissions model.
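The benchmarking idea can be sketched in a few lines of Python. The example below uses a deliberately naive before/during comparison: the change in mean hourly grid emissions divided by the generation lost in the outage. This is not the study's estimation method, which is not described here, and all names and numbers are hypothetical.

```python
from statistics import mean


def outage_mef(emissions_before, emissions_during, outage_mw):
    """Naive MEF estimate in tonnes CO2 per MWh.

    emissions_before / emissions_during: hourly grid emissions (tonnes CO2 per hour)
    outage_mw: nuclear output lost during the outage (MW), which other
               generators must replace
    """
    if outage_mw <= 0:
        raise ValueError("outage_mw must be positive")
    return (mean(emissions_during) - mean(emissions_before)) / outage_mw


def relative_error(model_mef, measured_mef):
    """Signed relative error of a model's MEF against the measured value."""
    if measured_mef == 0:
        raise ValueError("measured_mef must be non-zero")
    return (model_mef - measured_mef) / measured_mef


# Made-up numbers for illustration only.
measured = outage_mef(
    emissions_before=[1000.0, 1000.0],
    emissions_during=[1500.0, 1500.0],
    outage_mw=1000.0,
)
print("measured MEF:", measured)
print("model error:", relative_error(model_mef=0.6, measured_mef=measured))
```

A real analysis would also need to control for whatever else changed between the two periods, which is exactly why the randomness of unplanned outages matters.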
In Paper 3, VERACI-T researchers looked at the changes in emissions that occurred in response to changes in wind generation in ERCOT (a US balancing authority). Because changes in wind generation occur randomly throughout the day, and independently of changes in demand for electricity, they act as a different type of natural experiment (similar to RCTs and the nuclear outages described above).
While wind variation is smaller than nuclear outage variation, it’s also much more common, allowing researchers to test the robustness of marginal emissions models in a different way. Any model that makes accurate implicit claims, accurately predicts the effect of nuclear outages, and accurately predicts the effects of random fluctuations in wind has therefore demonstrated three different types of accuracy. This method was also used to measure MEFs in small clusters of pricing nodes, a much smaller geographic boundary than had previously been studied using causal methods.
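One simple way to picture the wind-based measurement is a regression of hour-to-hour changes in emissions on hour-to-hour changes in wind generation. The Python sketch below fits an ordinary least squares slope and reports its negative as the MEF, since each extra MWh of wind is assumed to displace one MWh of marginal generation. The paper's actual specification may differ, and the function names and data are hypothetical.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("x has no variation, so the slope is undefined")
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov / var_x


def mef_from_wind(delta_wind_mw, delta_emissions_tph):
    """MEF in tonnes CO2 per MWh, taken as the negative regression slope.

    delta_wind_mw: hourly change in wind generation (MW)
    delta_emissions_tph: hourly change in grid emissions (tonnes CO2 per hour)
    """
    return -ols_slope(delta_wind_mw, delta_emissions_tph)


# Made-up changes for illustration only: emissions fall as wind rises.
delta_wind = [10.0, -20.0, 30.0, -5.0]
delta_emissions = [-5.0, 10.0, -15.0, 2.5]
print("estimated MEF:", mef_from_wind(delta_wind, delta_emissions))
```

Because wind fluctuations are frequent, a regression like this can draw on many more observations than an outage-based comparison, at the cost of smaller changes per observation.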
Five different signals were tested for their ability to predict when and where emissions would be the lowest:
The average emissions rate showed no predictive capability. The other signals were all able to effectively identify times and regions where marginal emission rates are lower, and can be used to reduce total real-world emissions.
In addition, this study provided the first known empirical validation that a nodal-level emissions algorithm (REsurety’s) can accurately predict changes in emissions at a very local level. This also may be the first time that statistical regression-based models have been directly compared to economic/engineering dispatch-based models.
The study found that the two marginal model types, while using very different methods, made similar predictions about real-world changes in emissions, and that both matched the actual change in emissions. This strongly suggests that the models are responding to actual causal effects rather than model bias.
VERACI-T’s work doesn’t stop here. The working group is currently turning its attention to build margin models, which attempt to predict how both operational (e.g. timing of load) and structural (e.g. new renewable energy) changes to electric grids influence long-term grid emissions via structural grid changes (e.g. building new power plants).
The upcoming build margin research will:
Looking further ahead, VERACI-T plans to explore the use of real randomized controlled trials, leverage non-public grid data to expand validation datasets to different regions, and continue to refine standardized frameworks for evaluating marginal emissions datasets.
Marginal emissions data are having a growing impact on climate action, shaping decisions that drive clean energy investments and emissions reductions. But ensuring that these data are accurate is essential for their effectiveness. VERACI-T invites researchers, energy experts, funders, and organizations with access to relevant data to contribute to this effort.
All research findings are freely available to the public at veraci-t.org. For more information or to get involved, contact the program coordinator at: contact@veraci-t.org
WattTime’s Pierre Christian, Nat Steinsultz, and Sam Koebrich also contributed to this article.