As decarbonization efforts pick up speed and more renewables are connected to the grid, the importance of correctly tracking and accounting for emissions has never been greater.
A research paper published by Greg Miller, the Research and Policy Lead at Singularity Energy, and a team of collaborators sheds light on how available emissions databases can be used more efficiently by the EPA and many other stakeholders in the energy sector.
We sat down with him to dig deeper into the importance of hourly tracking of emissions, interregional power flows, state decarbonization policies and the underrepresentation of biomass emissions.
This interview has been edited for clarity and brevity.
How did your research paper about evaluating hourly emissions in the U.S. come about?
It started as a part of my PhD dissertation research. Back in 2020 I was working in a corporate sustainability role for calculating greenhouse gas inventories. I was becoming familiar with the data and starting to realize that with all these new renewables going online on the grid, there’s going to be differences in emissions during different times of day. I was wondering how that should be tracked and accounted for.
One of the barriers I realized existed was that there weren’t good datasets on what the carbon intensity of the electric grid was on an hour-by-hour basis. This research paper started out as a proposal for an EPA data challenge.
I had submitted a proposal about the data set they have, which has been around for a quarter of a century and is the foundation on which a lot of this [emissions] accounting is based. But we probably need to start tracking things on an hourly basis.
It then outgrew the original data challenge and became a larger project. Later on, Singularity Energy became aware of it, thought it was important, and decided to adopt and fund the research, which was published last fall.
Does the research paper have a practical application for Singularity Energy?
Yes, the Open Grid Emissions dataset is open source, transparent and free. We want people to use it as a central repository for research on these issues.
But of course, it does have a lot of practical implications.
The data set is measured and validated with very accurate, high quality data, but that input data takes a long time to collect and report. It’s mostly based on data from the EPA and the EIA. Data for January 2023 won’t become completely available until probably October 2024, so there’s a one- to two-year lag.
People want to be able to make real-time or near-real-time decisions and be informed of what’s going on. At Singularity Energy we have a real-time data API for grid emissions, but it’s based on more estimated data and uses approximations. They are pretty good, but they’re still approximations.
One of the ideas for this data set is to use it to validate and improve these real-time approximations, to make them very high quality, high enough that they could be used for carbon accounting or in regulations. I think a huge opportunity is to make high quality data available in near real time for decision making on climate.
One of the key themes in the paper is hourly emissions. How do you go about estimating them?
There are two separate parts of this estimation problem. One is just estimating the total emissions coming out of power plants. And then there’s the allocation of those emissions to end consumers of electricity by tracing interregional power flows and other components.
In the U.S. we’re lucky to have the data set of continuous emissions monitoring. For a large portion of power generators in the U.S. it tracks emissions, fuel consumption and generation on an hour by hour basis.



Image credit: Evaluating the hourly emissions intensity of the US electricity system, Gregory J Miller, Gailin Pease, Wenbo Shi and Alan Jenn
Published 4 April 2023 • © 2023 The Author(s). Published by IOP Publishing Ltd
Environmental Research Letters, Volume 18, Number 4
Citation Gregory J Miller et al 2023 Environ. Res. Lett. 18 044020
DOI 10.1088/1748-9326/acc119
This data set has been widely used for a while for academic research and other purposes. But it’s incomplete: it doesn’t track all power plants in the U.S.
This is where the new methodology comes in. There is another data set from the EIA that tracks data from all power plants in the U.S., but at a monthly resolution. The real challenge was figuring out how to take the monthly data for the plants and translate that into an hourly value.
We were able to do that with a new data set from the EIA that tracks hourly power generation at the regional and fleet level. For example, it could tell you the hourly generation value for the entire coal or natural gas fleet, or the entire solar fleet in California.
By crosswalking all of these datasets, we’re able to identify the hourly profile of the generators that don’t already report data to the [EPA] continuous emissions data set (CEMS). That was our approach to estimating what the profile of these generators would be on an hour-by-hour basis.
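To make the idea of turning monthly totals into hourly values concrete, here is a minimal Python sketch. It is not the paper's actual method: it assumes that the hourly shape of non-reporting plants can be approximated by the fleet-level hourly profile minus whatever the CEMS-reporting plants generated, and that a plant's monthly total can be spread across hours in proportion to that residual shape. The function names and all numbers are hypothetical.

```python
def residual_profile(fleet_hourly_mwh, reported_hourly_mwh):
    """Hourly generation left over after subtracting plants that report hourly.

    Assumption for illustration: negative residuals are clipped to zero.
    """
    return [max(f - r, 0.0) for f, r in zip(fleet_hourly_mwh, reported_hourly_mwh)]


def shape_monthly_to_hourly(monthly_total, hourly_profile):
    """Spread one monthly total across hours in proportion to a profile."""
    profile_total = sum(hourly_profile)
    if profile_total <= 0:
        # No usable shape: fall back to a flat profile.
        return [monthly_total / len(hourly_profile)] * len(hourly_profile)
    return [monthly_total * h / profile_total for h in hourly_profile]


# Hypothetical four-hour "month" for one fuel type in one region (MWh).
fleet_hourly = [100.0, 300.0, 400.0, 200.0]    # fleet-level hourly data
reported_hourly = [60.0, 200.0, 250.0, 90.0]   # plants that report hourly

profile = residual_profile(fleet_hourly, reported_hourly)

# A plant that only reports a monthly total of 80 MWh.
plant_hourly = shape_monthly_to_hourly(80.0, profile)
print(plant_hourly)
```

The hourly values always sum back to the monthly total, so the monthly data stays the source of truth and the fleet-level data only supplies the shape.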
Has the EPA been involved during the research?
Yes, throughout this process we regularly engaged with folks at the EPA and EIA who publish these data sets. We discussed how to use and interpret the data and different issues in the data set itself. When we noticed a bug or an error in the dataset, I brought this to the attention of the EPA, and they were able to fix that in the next release. It was a collaborative process, as everyone is interested in having the best available data.
We also worked very closely with another partner, the Catalyst Cooperative, a data science cooperative. They are the creators and maintainers of the Public Utility Data Liberation (PUDL) project. They take data from different government sources, like the EPA and EIA, and clean and standardize it. They did a lot of the work to actually make the data usable, so that we can readily compare and crosswalk it from multiple sources.
Can you tell us a bit more about interregional power flows, how they are calculated and why they are important?
Generally, a lot of datasets focus on the generation side of the power system. They’ll tell you the average emissions of the generation fleet or the average emissions of electricity produced within a region. But if you think about this from the perspective of end users of electricity, they want to know the carbon intensity of the electricity that they actually consume.



That’s not always the same as the carbon intensity of electricity that’s generated nearby. The reason is that there can be large amounts of imports or exports of electricity between regions. So the electricity that I consume here might come partially from power generators in my region, but some of it might also be imported from power generators in another region.
Is there any way to track emissions for consumers in specific regions?
Yes, we also incorporated methodologies in the realm of carbon flow tracing, or power flow tracing. [With them] we virtually trace the flow of carbon emissions through the power grid and say what’s the carbon intensity of the imports that you’re receiving from another region.
The hourly imputation of the data on the generation side also has data about the exchanges of electricity between regions on an hourly basis. So, we’re able to use that data to trace and say: in this hour 80% of your electricity comes from local generation, but 20% was being imported from this other region, and the carbon intensity of that is x. And that allows us to come up with that consumption-based emissions number that is useful for carbon accounting and end consumers of electricity.
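The 80/20 example above amounts to a weighted average, which can be sketched in a few lines of Python. This is a simplified, single-region illustration, not the carbon flow tracing method used in the paper: it ignores exports and treats the carbon intensity of each import as already known, whereas a full tracing approach has to resolve all regions together. The function name, units and numbers are hypothetical.

```python
def consumed_intensity(local_mwh, local_intensity, imports):
    """Consumption-based carbon intensity for one region in one hour.

    imports is a list of (mwh, intensity) pairs, one per exporting region.
    Intensities are in kg CO2 per MWh.
    """
    total_mwh = local_mwh + sum(mwh for mwh, _ in imports)
    total_kg = local_mwh * local_intensity + sum(
        mwh * intensity for mwh, intensity in imports
    )
    return total_kg / total_mwh


# Hypothetical hour: 80 MWh generated locally at 200 kg/MWh,
# 20 MWh imported from a neighboring region at 600 kg/MWh.
intensity = consumed_intensity(80.0, 200.0, [(20.0, 600.0)])
print(intensity)  # 280.0
```

In this made-up hour, the generation-based intensity of the region is 200 kg/MWh, while the electricity actually consumed carries 280 kg/MWh because of the more carbon-intensive imports.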
Can individual states calculate their emissions to create and modify their decarbonization targets and policies?
If their goal is to decarbonize their power generation fleet, that’s different than if their goal is to decarbonize their energy consumption. If they’re just trying to decarbonize their generation fleet, they don’t necessarily care about imports or exports; they just care about the generators that exist within their boundaries. But if they actually want to make sure that everyone in their state is consuming carbon-free electricity, then they need to account for those flows of electricity.
One other nuance here is that a lot of these state regulations use renewable portfolio standards or similar policies, and those are based on tracking renewable energy attribute certificates. But they are more of a market-based mechanism for tracking electricity. A lot of the work that we’ve done in this paper specifically focuses on tracking physical flows of electricity, rather than the carbon intensity of the electricity you are procuring.



In the context of state policies, a lot of times they’re using these certificates to track the flows, but it’s still not perfect. There’s a lot of attention on this topic right now, especially in the western U.S. about accounting for imports and exports and these flows of electricity.
There is an assumption that biomass is a carbon neutral energy source but you argue that is not correct. What is the current state of biomass emission accounting and how can it be improved?
There are two somewhat separate issues here. One is how it’s accounted for in accounting standards, and another is how it’s actually accounted for in these datasets. We found that a lot of these existing data sets treat biomass emissions differently than they treat emissions from all other combustion sources. By biomass I mean everything from actual wood biomass to biogas and landfill gas, all of these types of biogenic emissions.
What’s happening in these datasets is that they were treating biomass emissions as if they had zero carbon emissions. I think this is based on the argument that on a net basis, in the long term, there’s no net emissions to the atmosphere. If you’re growing a tree, it absorbs the carbon and then you’re burning the tree and putting that carbon back. But a lot of research has found that biomass combustion is not always carbon neutral. There can be net emissions to the atmosphere, depending on a lot of factors.
On the long timescale, it’s carbon neutral, but on the timescales that we care about, in the next 20 to 30 years, you’re still putting carbon dioxide into the atmosphere at a time when we don’t want to be doing that.
How much of U.S. emissions are attributed to biomass?
We found that in the U.S., treating biomass as carbon neutral would underestimate direct CO2 emissions by 3%. Nationally that doesn’t sound like much, but on a regional level the number can be much higher. In New England it would undercount emissions by 20%. In California, ignoring them would underestimate emissions by 12%. That is substantial if you think about the fact that people are being regulated on the [different types] of emissions they have, like building or vehicle emissions.
We argue that biomass emissions might be carbon neutral in some cases, but it has to be accounted for on a case-by-case basis. You can’t just make a blanket assumption that all biomass combustion is carbon neutral. The way that the existing data sets were treating this was essentially trying to do a lifecycle accounting for biomass, but not doing a lifecycle accounting for any other combustion sources, for example by counting methane leakage from natural gas.
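The percentages above can be read as the share of direct CO2 that disappears from the total when biomass is counted as zero. The short Python sketch below shows that arithmetic. The tonnages are hypothetical, chosen only so that the results match the 3% and 20% figures quoted in the interview; they are not values from the paper, and the function name is illustrative.

```python
def undercount_share(non_biomass_co2, biomass_co2):
    """Fraction of direct CO2 that is missed if biomass is counted as zero."""
    return biomass_co2 / (non_biomass_co2 + biomass_co2)


# Hypothetical tonnages (arbitrary units), not data from the paper.
national = undercount_share(non_biomass_co2=970.0, biomass_co2=30.0)
regional = undercount_share(non_biomass_co2=80.0, biomass_co2=20.0)

print(f"national undercount: {national:.0%}")
print(f"regional undercount: {regional:.0%}")
```

The same absolute amount of biomass emissions weighs far more in a region with a cleaner grid, which is one way a small national share can turn into a large regional one.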
What do you hope to achieve with the paper?
I believe that, especially going forward in the future, tracking emissions on an hourly basis is going to be critical for informing policy on decarbonization, whether that’s at the building level, for electric vehicles or just for electricity consumption in general.
One of the big barriers to wider adoption of hourly accounting was the absence of high quality data on hourly emissions from the power grid. With the publication of this peer reviewed data set, my big hope is that it reduces one of those barriers for the wider adoption of hourly accounting, which I believe can help inform more effective and targeted decarbonization policies.
I think it would be helpful for any agencies publishing data to learn about how we can continually improve the data that we put out about power sector emissions [and] make sure we’re representing the complete picture. And I think it’s also important for academic researchers working on climate impacts or the local health impacts of power generation.
Will all the different stakeholders benefit from the research being open source?
Yes, we release this as an open source data set, because we want this to serve as a central repository for the best knowledge and state of research on these topics. [It can] also help avoid duplication of efforts on this research in order to move as quickly as possible on decarbonization.
We want to have a readily usable data set that people can use, so that researchers don’t waste time recreating these datasets. And we also recognize that accounting for these emissions is incredibly complex. And we don’t pretend that the data set as it exists today is perfect. We’re hoping that having this open source project can be inviting for researchers from around the world to contribute to this and make sure everyone is able to benefit from any improvements that are made.