DS06.11.07

Efficient Generation of Materials Data for Machine Learning

When and Where

Dec 1, 2023
10:15am - 10:45am

Hynes, Level 2, Room 203

Presenter

Co-Author(s)

Shyue Ping Ong¹, Ji Qi¹, Tsz Wai Ko¹

¹University of California, San Diego

Abstract

The biggest bottleneck to machine learning (ML) for materials science is the generation of training data. In this talk, I will discuss various approaches to efficiently generate and use materials data to develop ML models. For instance, I will demonstrate the use of universal interatomic potentials to pre-generate a large configuration space of structures, followed by a DImensionality-Reduced Encoded Clusters with sTratified (DIRECT) sampling approach that selects a robust training set for an ML interatomic potential (MLIP) from that space. I will also discuss the application of multi-fidelity techniques to maximize the return on scarce, high-quality data. While a major focus of this talk will be on MLIP development, I will also highlight the applicability of these techniques to other ML-enabled applications.
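The steps named in the DIRECT acronym (encode, reduce dimensionality, cluster, then sample across clusters) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the encoder, the reduction method, or the clustering algorithm, so PCA and plain k-means are stand-in assumptions here, the random feature matrix is a placeholder for real structure encodings, and every function name is hypothetical.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project features onto their leading principal components via SVD."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def assign_labels(points, centers):
    """Return the index of the nearest center for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(points, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for the clustering step)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(points), size=n_clusters, replace=False)
    centers = points[start].astype(float)
    for _ in range(n_iter):
        labels = assign_labels(points, centers)
        for k in range(n_clusters):
            members = points[labels == k]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[k] = members.mean(axis=0)
    return assign_labels(points, centers)


def stratified_sample(points, labels, per_cluster=1):
    """Pick the `per_cluster` points closest to each cluster centroid."""
    selected = []
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        centroid = points[idx].mean(axis=0)
        order = np.argsort(np.linalg.norm(points[idx] - centroid, axis=1))
        selected.extend(idx[order[:per_cluster]].tolist())
    return sorted(selected)


def select_training_set(features, n_components=2, n_clusters=5, per_cluster=1):
    """Reduce, cluster, then sample evenly across clusters."""
    reduced = pca_reduce(features, n_components)
    labels = kmeans(reduced, n_clusters)
    return stratified_sample(reduced, labels, per_cluster)


# Placeholder data: 200 "structures", each with a 16-dimensional encoding.
rng = np.random.default_rng(42)
features = rng.normal(size=(200, 16))
picked = select_training_set(features, n_components=3, n_clusters=10, per_cluster=2)
print(len(picked), "structures selected for labeling")
```

Sampling a fixed number of structures from every cluster, rather than uniformly from the whole pool, is what keeps sparsely populated regions of configuration space represented in the training set.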

Symposium Organizers

Mathieu Bauchy, University of California, Los Angeles
Ekin Dogus Cubuk, Google
Grace Gu, University of California, Berkeley
N M Anoop Krishnan, Indian Institute of Technology Delhi

Symposium Support

Bronze
Patterns and Matter | Cell Press


 
