MRS Meetings and Events

 

DS04.04.02 2023 MRS Fall Meeting

Predicting Synthesizability using Positive-Unlabeled Learning

When and Where

Nov 28, 2023
9:00am - 9:15am

Sheraton, Second Floor, Back Bay B

Presenter

Co-Author(s)

Geun Ho Gu¹

¹Korea Institute of Energy Technology

Abstract

High-throughput screening of crystal databases has accelerated the discovery of new materials. To expand the screening scope, density functional theory (DFT) optimization is combined with generative models, evolutionary algorithms, and element substitution to create “virtual” crystals that have not been synthesized but are predicted in silico to be stable. However, the synthesizability of these virtual candidates remains a critical concern because no synthesis route is known for them. DFT-calculated energies, such as the formation energy and the energy above the hull (Ehull), are often used as intuitive screening criteria for synthesizability. While some thermodynamically stable virtual crystals are indeed synthesizable and functional, many virtual candidates are unsynthesizable, which makes the virtual crystal search impractical. These results are not unexpected, as DFT-based energy metrics account for neither the chemical potentials at the synthesis conditions nor the kinetics of the complex synthesis process. Quantifying synthesizability is therefore crucial to improving the reliability and practicality of virtual crystal exploration and to advancing materials discovery.

Here, we present a probabilistic approach, positive-unlabeled learning, that predicts the synthesizability of virtual crystals and thereby improves the quality of virtual crystal screening. The framework implicitly considers the structural similarities between the virtual candidates and previously synthesized crystals by training an ensemble of binary classification models, with previously synthesized data as positives and randomly selected unlabeled data as negatives. The average of the ensemble's predictions gives a synthesizability score that we call the crystal-likeness score.
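The ensemble idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the two-dimensional feature vectors are made up, the base classifier is a toy nearest-centroid model rather than a model operating on crystal structures, and the function names `fit_centroid_classifier`, `train_pu_ensemble`, and `crystal_likeness` are hypothetical.

```python
import math
import random

def fit_centroid_classifier(positives, negatives):
    """Fit a toy binary classifier (assumed stand-in): one centroid per class."""
    dim = len(positives[0])
    c_pos = [sum(x[i] for x in positives) / len(positives) for i in range(dim)]
    c_neg = [sum(x[i] for x in negatives) / len(negatives) for i in range(dim)]

    def predict_proba(x):
        # Closer to the positive centroid than to the negative one -> score above 0.5.
        return 1.0 / (1.0 + math.exp(math.dist(x, c_pos) - math.dist(x, c_neg)))

    return predict_proba

def train_pu_ensemble(positives, unlabeled, n_models=25, seed=0):
    """Train each model on all positives plus a random draw of unlabeled points
    that are treated as negatives for that model only."""
    rng = random.Random(seed)
    k = min(len(positives), len(unlabeled))
    models = []
    for _ in range(n_models):
        pseudo_negatives = rng.sample(unlabeled, k)
        models.append(fit_centroid_classifier(positives, pseudo_negatives))
    return models

def crystal_likeness(models, x):
    """Average the ensemble predictions into a single score between 0 and 1."""
    return sum(model(x) for model in models) / len(models)

# Made-up feature vectors: "synthesized" points cluster near (1, 1).
positives = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0)]
# The unlabeled pool mixes positive-like points with points near (-1, -1).
unlabeled = [(1.0, 1.1), (0.9, 0.9), (-1.0, -1.0), (-1.1, -0.9),
             (-0.9, -1.2), (-1.2, -1.0), (-1.0, -0.8), (-0.8, -1.1)]

models = train_pu_ensemble(positives, unlabeled)
score_like_positive = crystal_likeness(models, (1.0, 1.0))
score_unlike_positive = crystal_likeness(models, (-1.0, -1.0))
```

Because each model sees a different random draw of pseudo-negatives, positive-like points hidden in the unlabeled pool are mislabeled in only some of the models, and averaging dampens their effect on the final score.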
We train our model on Materials Project (MP) data and obtain an 87.4% true positive rate (TPR) on a test set of experimentally reported cases (9356 materials). We further validate the model by training on the database as of the end of 2014 and predicting the synthesizability of experimental materials newly reported in the following five years (2015–2019), which yields a TPR of 86.2%. Our analysis shows that the model captures structural motifs relevant to synthesizability beyond what Ehull alone can capture. We find that 71 of the top 100 highest-scoring virtual materials have in fact already been synthesized according to the literature.

We further explore applications by performing transfer learning to inorganic perovskite crystals, where the model shows a 95.7% TPR. As further validation, 179 virtual crystals that are predicted to be synthesizable have already been synthesized in the literature, whereas those with the lowest synthesizability scores have not been reported. These numbers are comparable to the 943 perovskite crystals in OQMD, MP, and AFLOW that are registered as previously synthesized. While previous methods focused on metal oxides, our model applies to other classes of perovskites, including chalcogenide, halide, and hydride perovskites, as well as anti-perovskites. For comparison, screening based on the Goldschmidt tolerance factor was applicable to only 388 of the 943 registered perovskites, and it gave a TPR of 86.3% for those 388 crystals. We apply the method to identify synthesizable perovskite candidates for metal halide optical materials. With the proposed data-driven crystal-likeness score, high-throughput virtual screening can benefit significantly by reducing and prioritizing the candidates for experimental testing.
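For readers unfamiliar with the Goldschmidt comparison mentioned above, the tolerance factor for an ABX3 perovskite is t = (r_A + r_X) / (sqrt(2) (r_B + r_X)), where r_A, r_B, and r_X are ionic radii. The sketch below is only an illustration: the radii are approximate Shannon-type values assumed here, the lookup is keyed by element for simplicity (real radii depend on oxidation state and coordination), and `tolerance_factor` is a hypothetical helper. Returning `None` when a radius is missing mirrors one plausible reason such a screen covers only part of a dataset; the abstract itself does not state why only 388 crystals were covered.

```python
import math

# Illustrative ionic radii in angstroms (approximate values, assumed for this sketch).
RADII = {"Sr": 1.44, "Ti": 0.605, "O": 1.40}

def tolerance_factor(a, b, x, radii=RADII):
    """Return t = (r_A + r_X) / (sqrt(2) * (r_B + r_X)), or None if a radius is unknown."""
    try:
        r_a, r_b, r_x = radii[a], radii[b], radii[x]
    except KeyError:
        # No tabulated radius: the tolerance-factor screen cannot be applied.
        return None
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))

t_srtio3 = tolerance_factor("Sr", "Ti", "O")   # close to 1 for these radii
t_unknown = tolerance_factor("Sr", "Ti", "H")  # None: no radius for "H" in the table
```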

Keywords

inorganic

Symposium Organizers

Andrew Detor, GE Research
Jason Hattrick-Simpers, University of Toronto
Yangang Liang, Pacific Northwest National Laboratory
Doris Segets, University of Duisburg-Essen

Symposium Support

Bronze
Cohere
