Ha-Kyung Kwon2, Tian Xie1, Daniel Schweigert2, Sheng Gong1, Arthur France-Lanord3, Arash Khajeh2, Emily Crabb1, Michael Puzon2, Chris Fajardo2, Will Powelson2, Yang Shao-Horn1, Jeffrey Grossman1
Massachusetts Institute of Technology1, Toyota Research Institute2, Sorbonne Université3
Open material databases that store hundreds of thousands of material structures and their corresponding properties have become a cornerstone of modern computational materials science. Yet the raw outputs of the underlying simulations, such as trajectories from molecular dynamics simulations and charge densities from density functional theory calculations, are generally not shared because of their large size. In this work, we provide a cloud-based platform that facilitates sharing of raw data and enables fast post-processing in the cloud to extract new, user-defined properties. As an initial demonstration, our current database includes 6286 molecular dynamics trajectories for amorphous polymer electrolytes, totaling 5.7 terabytes of data. We provide a public analysis library at https://github.com/TRI-AMDD/htp_md that extracts multiple properties from the raw data using both expert-designed functions and machine learning models. The analysis runs automatically in the cloud, and the results populate a publicly accessible database. The platform encourages users to contribute both new trajectory data and new analysis functions through public interfaces; newly analyzed properties are then incorporated into the database. Finally, we provide a front-end user interface at https://www.htpmd.matr.io for browsing and visualizing the data. We hope the platform offers the computational materials science community a new way to share raw data.
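To illustrate what an expert-designed analysis function applied to a raw trajectory might look like, the following is a minimal sketch that estimates a diffusion coefficient from the mean squared displacement (MSD) via the Einstein relation. It is not taken from the htp_md library: the function names, the `(n_frames, n_atoms, 3)` array layout, the assumption of unwrapped coordinates, and the choice to fit only the second half of the MSD curve are all assumptions made for this example.

```python
import numpy as np


def mean_squared_displacement(positions):
    """MSD relative to the first frame, averaged over atoms.

    positions: array of shape (n_frames, n_atoms, 3) holding unwrapped
    coordinates (assumed layout; periodic wrapping must already be undone).
    Returns an array of shape (n_frames,).
    """
    displacement = positions - positions[0]
    return (displacement ** 2).sum(axis=2).mean(axis=1)


def diffusivity(positions, dt):
    """Estimate D from the Einstein relation MSD(t) ~ 6 D t in three dimensions.

    dt is the time between stored frames. Only the second half of the MSD
    curve is fitted, a simple (assumed) way to skip the short-time regime.
    """
    msd = mean_squared_displacement(positions)
    time = np.arange(len(msd)) * dt
    start = len(msd) // 2
    slope, _intercept = np.polyfit(time[start:], msd[start:], 1)
    return slope / 6.0


if __name__ == "__main__":
    # Synthetic random walk standing in for a real trajectory.
    rng = np.random.default_rng(0)
    steps = rng.normal(scale=0.1, size=(1000, 50, 3))
    trajectory = np.cumsum(steps, axis=0)
    print("Estimated D (arbitrary units):", diffusivity(trajectory, dt=1.0))
```

A function of this shape takes only raw coordinates and a time step as input, so in principle it could be rerun over stored trajectories whenever a new property is defined, which is the kind of post-processing the platform is designed to support.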