John Gregoire1, Joel Haber1, Dan Guevarra1, Lan Zhou1, Di Chen2, Shufeng Kong2, Lusann Yang3, Francesco Ricci4, Jeffrey Neaton4, Carla Gomes2
California Institute of Technology1, Cornell University2, Google Research3, Lawrence Berkeley National Laboratory4
As materials discovery efforts increasingly expand into high-order composition spaces and/or far-from-equilibrium syntheses, efficient exploration requires both automation of experiments and advances in data science to interpret and plan those experiments. We will discuss recent advances in high-throughput workflows and artificial intelligence, ranging from statistical inference of the presence of new phases to physics-integrated machine learning. Active or sequential learning holds great promise for accelerating materials experiments, although fully realizing this promise relies on the ability of machine learning models to be predictive in new composition spaces where no training data are available, or at least to generate priors that help guide the search in those spaces. Expert scientists make such predictions based on their materials chemistry knowledge, i.e., how the individual elements and the interactions among them give rise to properties in composition spaces that have not been measured. Additionally, experts transfer knowledge about materials chemistry from other domains. To emulate these aspects of expert extrapolative prediction, we introduce the Hierarchical Correlation Learning for Multi-property Prediction (H-CLMP) framework. In a complementary effort, we develop ultra-high-throughput workflows that enable statistical inferences on the presence of new phases, automatically identifying materials of interest for other property measurements. A primary bottleneck of high-throughput discovery of solid-state materials is the limited ability to automate phase mapping, that is, the generation of the composition map of crystalline phases from a set of x-ray diffraction patterns. Mixed-phase x-ray diffraction patterns must be de-mixed, which can be very challenging when candidate structures contain overlapping features and the diffraction patterns of prototype structures vary due to, for example, alloying.
In such cases, experts analyze the data by jointly considering a set of related mixed-phase diffraction patterns and invoking thermodynamic rules that govern phase mixtures in the given materials space. Previous methods have used such rules to “fix” phase mapping solutions obtained from de-mixing patterns, but when dozens of phase mixtures exist within a dataset and the materials space has multiple degrees of freedom that require hundreds of diffraction patterns to characterize, the data complexity exceeds the capabilities of both human experts and state-of-the-art algorithms. In these regimes, the prior knowledge about phase prototypes and thermodynamic rules must be integrated into the de-mixing process itself, which has been achieved with Deep Reasoning Networks (DRNets) that seamlessly integrate constraint reasoning and deep learning.
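To make the idea of constraint-aware de-mixing concrete, the following is a minimal toy sketch and not the DRNets method. It assumes that a measured pattern is a non-negative linear combination of known prototype patterns, and it uses the Gibbs phase rule (at most as many coexisting phases as components, at fixed temperature and pressure) as one illustrative thermodynamic rule. The function names, the `max_phases` parameter, and the toy data are all hypothetical.

```python
def nnls_weights(pattern, prototypes, active, iters=500, eps=1e-12):
    """Fit pattern ~ sum_j w_j * prototypes[j] with w_j >= 0.

    Uses multiplicative updates, which keep weights non-negative as long as
    the pattern and the prototypes are themselves non-negative.
    """
    n_points = len(pattern)
    w = {j: 1.0 for j in active}
    for _ in range(iters):
        # Current model pattern from the active prototypes.
        model = [sum(w[j] * prototypes[j][i] for j in active)
                 for i in range(n_points)]
        for j in active:
            num = sum(prototypes[j][i] * pattern[i] for i in range(n_points))
            den = sum(prototypes[j][i] * model[i] for i in range(n_points)) + eps
            w[j] *= num / den
    # Prototypes outside the active set get weight zero.
    return [w.get(j, 0.0) for j in range(len(prototypes))]


def demix(pattern, prototypes, max_phases, tol=1e-9):
    """De-mix one pattern, allowing at most max_phases phases.

    Assumption: max_phases stands in for the Gibbs phase rule limit.
    If the unconstrained fit uses too many phases, keep the largest
    contributors and refit with only those.
    """
    weights = nnls_weights(pattern, prototypes, range(len(prototypes)))
    present = [j for j, wj in enumerate(weights) if wj > tol]
    if len(present) > max_phases:
        keep = sorted(present, key=lambda j: weights[j], reverse=True)[:max_phases]
        weights = nnls_weights(pattern, prototypes, keep)
    return weights


# Toy prototypes with one overlapping feature (index 1) between p0 and p1.
p0 = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
p1 = [0.0, 0.5, 1.0, 0.0, 0.0, 0.0]
p2 = [0.0, 0.0, 0.0, 1.0, 1.0, 0.0]
prototypes = [p0, p1, p2]

# Synthetic mixed-phase pattern: 70% p0 plus 30% p1.
pattern = [0.7 * a + 0.3 * b for a, b in zip(p0, p1)]

two_phase = demix(pattern, prototypes, max_phases=2)  # approximately [0.7, 0.3, 0.0]
one_phase = demix(pattern, prototypes, max_phases=1)  # only the dominant phase kept
```

Pruning after the fit, as done here, is the kind of after-the-fact "fix" the abstract describes as insufficient for complex datasets; the point of the sketch is only to show what a phase-count rule looks like when applied to a single pattern, not how it is integrated into learning.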