DS04.04.01 2023 MRS Fall Meeting

Designing High Throughput Workflows: From Experiment Automation to Data Management

When and Where

Nov 28, 2023
8:30am - 9:00am

Sheraton, Second Floor, Back Bay B

Co-Author(s)

John Gregoire¹, Joel Haber¹, Dan Guevarra¹, Lan Zhou¹, Kevin Kan¹, Ryan Jones¹, Yungchieh Lai¹, Ja'Nya Breeden¹, Michael Statt², Brian Rohr², Santosh Suram³

¹California Institute of Technology, ²Modelyst LLC, ³Toyota Research Institute

Abstract

In the quest to accelerate materials discovery via experiment automation and artificial intelligence, we recognize the challenges in emulating human capabilities with respect to contextualizing data and rapidly adapting experiments based on real-time data streams. In the development of infrastructure for next-generation workflows, these aspects of traditional research are most tightly connected to instrument control software and the management of experimental data. We will describe the evolution of these capabilities at Caltech, from automated workflows focused on throughput and consistency to workflows that embrace modularity and responsiveness to new knowledge; techniques for the latter approach are being developed collaboratively with Modelyst LLC and Toyota Research Institute. The lessons learned with respect to data management may be the most generalizable to the materials chemistry community, especially our development of Event-Sourced Architecture for Materials Provenance Management (ESAMP) and the Materials Experiment Knowledge Graph (MekG), which addresses the hierarchical nature of materials data. Regarding representation of materials data, high-level descriptors can be provided by the chemical elements, crystal structure motifs, and types of materials properties, but ultimately a given piece of data must be considered in the context of its acquisition. Detailed descriptors of a piece of experimental data include not only the metadata for the experiment that generated it, but also the prior history of synthesis and metrology experiments. Graph databases offer an opportunity to represent such hierarchical relationships among data, organizing semantic relationships into a knowledge graph. Initial reports of knowledge graphs in materials science highlight the breadth of approaches for their development.
We describe a knowledge graph of materials experiments whose construction encodes the complete provenance of each material sample and its associated experimental data and metadata. Additional relationships among materials and experiments further encode knowledge and facilitate data exploration. MekG is sufficiently large and complex to demonstrate a path toward a global materials knowledge graph. We characterize the scalability of this approach, especially with respect to executing queries, illustrating the value that modern graph databases can provide to the enterprise of data-driven materials science.
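
The idea of tracing a piece of data back through the prior history of its sample can be illustrated with a small sketch. This is not the ESAMP or MekG implementation; the class names (`ProcessEvent`, `ProvenanceLog`), the sample identifiers, and the metadata fields are all hypothetical, and an in-memory list stands in for a graph database. It assumes an append-only event log in which a derived sample inherits only the events its parent had experienced before the derivation.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ProcessEvent:
    """One process applied to a sample (hypothetical schema)."""
    sample_id: str
    kind: str                      # e.g. "synthesis", "anneal", "measurement"
    metadata: dict = field(default_factory=dict)
    seq: int = -1                  # global order, assigned by the log


class ProvenanceLog:
    """Append-only event log with parent/child links between samples."""

    def __init__(self):
        self._seq = count()
        self._events = []          # events are only ever appended
        self._parent = {}          # child id -> (parent id, seq at derivation)

    def record(self, sample_id, kind, **metadata):
        event = ProcessEvent(sample_id, kind, metadata, next(self._seq))
        self._events.append(event)
        return event

    def derive(self, child_id, parent_id):
        """Declare that child_id was derived from parent_id at this point."""
        self._parent[child_id] = (parent_id, next(self._seq))

    def provenance(self, sample_id, before=None):
        """Return events from the oldest ancestor down to sample_id, in order."""
        own = [
            e for e in self._events
            if e.sample_id == sample_id and (before is None or e.seq < before)
        ]
        if sample_id not in self._parent:
            return own
        parent_id, derived_at = self._parent[sample_id]
        cutoff = derived_at if before is None else min(before, derived_at)
        return self.provenance(parent_id, before=cutoff) + own


log = ProvenanceLog()
log.record("plate-1", "synthesis", elements=["Ni", "Fe"])
log.record("plate-1", "anneal", temp_C=400)
log.derive("plate-1/s3", "plate-1")          # sample cut from the plate
log.record("plate-1", "anneal", temp_C=600)  # happens after the cut
log.record("plate-1/s3", "measurement", technique="XRD")

history = log.provenance("plate-1/s3")
kinds = [e.kind for e in history]
anneal_temps = [e.metadata["temp_C"] for e in history if e.kind == "anneal"]
```

Here `kinds` lists the synthesis, the first anneal, and the measurement, while the second anneal is excluded because it was applied to the parent after the child sample was derived. In a graph database the same question would be a path query over sample and process nodes rather than a recursive list filter.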

Keywords

autonomous research | combinatorial

Symposium Organizers

Andrew Detor, GE Research
Jason Hattrick-Simpers, University of Toronto
Yangang Liang, Pacific Northwest National Laboratory
Doris Segets, University of Duisburg-Essen

Symposium Support

Bronze
Cohere
