Meta, the social metaverse company, and the Georgia Institute of Technology (Georgia Tech), one of the top public research universities in the US, announced on November 2nd the release of a new dataset (OpenDAC23) and associated AI models to help accelerate research on direct air capture (DAC).
The OpenDAC23 dataset is the largest dataset of metal-organic frameworks (MOFs) characterized by their ability to adsorb CO2 in the presence of water, and it is larger than any pre-existing dataset at quantum-level precision.
The two organizations also announced the release of new state-of-the-art versions of the ML models they developed for the Open Catalyst project, now trained on the OpenDAC23 dataset, along with a set of more than 200 promising MOF sorbents. The accompanying paper, "The Open DAC 2023 Dataset and Challenges for Sorbent Discovery in Direct Air Capture," is available online.
The initiative aims to advance direct air capture by reducing the cost of the technology and accelerating research across the sector.
New methods for carbon dioxide removal are urgently needed to address the historical emissions already in the atmosphere. Metal-organic frameworks (MOFs) have been widely studied as potentially customizable adsorbents for DAC. However, discovering promising MOF sorbents has been challenging because the chemical space to explore is vast and because materials must be understood as functions of humidity and temperature.
The dataset, named Open DAC 2023 (ODAC23), consists of more than 38 million density functional theory (DFT) calculations on more than 8,800 MOF materials containing adsorbed CO2 and/or H2O. In addition to probing the properties of adsorbed molecules, the dataset is a rich source of information on the structural relaxation of MOFs, which will be useful in many contexts beyond DAC.
A large number of MOFs with promising properties for DAC have been identified. Meta and Georgia Tech also trained state-of-the-art ML models on this dataset to approximate calculations at the DFT level. The open-source dataset and the initial ML models will provide an important baseline for future efforts to identify MOFs for a wide range of applications, including DAC.