Elyssa Hofgard1, Vanessa Oklejas2, David Mittan-Moreau2, Aria Mansouri Tehrani1, Daniel Paley2, Aaron Brewster2, Tess Smidt1
Massachusetts Institute of Technology1, Lawrence Berkeley National Laboratory2
We investigate machine learning (ML) methods for determining unit cell parameters and classifying Bravais lattices from powder X-ray diffraction (XRD) data. Traditional powder indexing algorithms are error-prone when applied to noisy experimental data and to patterns with overlapping peaks from multiple phases. They are also susceptible to the dominant zone problem, which arises when differences in the magnitudes of the unit cell axes cause certain crystal planes to significantly overshadow the others in the diffraction pattern. This reduces the accuracy of peak identification, peak intensity measurement, and subsequent crystal structure determination [1-3]. ML methods have the potential to surpass time-consuming conventional techniques because they can learn complex patterns from diverse datasets and provide fast inference. <br/>Previous research has primarily addressed lattice type classification with ML models, whose accuracy decreases for lower-symmetry crystal structures [4-7]. A recent publication on unit cell parameter regression showed that 1D convolutional neural networks (CNNs) could predict unit cell lengths across all crystal systems but struggled to predict unit cell angles for monoclinic or triclinic systems [8]. However, CNNs assume translational invariance in the input data, which is absent in powder XRD data.<br/>Therefore, we explore the effectiveness of ML models designed specifically for identifying 1D patterns in sequential data. We train and test these models on established crystallographic databases and assess their performance on Bravais lattice classification and unit cell parameter regression. We also consider other essential factors: the choice of regression targets, spectrum complexity, data augmentation techniques to improve performance with large unit cells or dominant zones, and the use of physically meaningful loss functions.
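To make the regression targets concrete, the following minimal sketch shows the standard forward relationship between unit cell parameters and powder peak positions: the general (triclinic) interplanar spacing formula followed by Bragg's law. It is an illustration only, not the models or code used in this work. The function names `d_spacing` and `two_theta` are hypothetical, and the default wavelength of 1.5406 Å (Cu Kα1) is an assumption.

```python
import math


def d_spacing(hkl, a, b, c, alpha, beta, gamma):
    """Interplanar spacing d for reflection (h, k, l).

    a, b, c are cell lengths (d is returned in the same unit);
    alpha, beta, gamma are cell angles in degrees.
    Uses the general triclinic expression for 1/d^2.
    """
    h, k, l = hkl
    if (h, k, l) == (0, 0, 0):
        raise ValueError("(0, 0, 0) is not a valid reflection")
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    # Unit cell volume
    vol = a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)
    # Coefficients of the quadratic form in (h, k, l)
    s11 = (b * c) ** 2 * (1 - ca**2)
    s22 = (a * c) ** 2 * (1 - cb**2)
    s33 = (a * b) ** 2 * (1 - cg**2)
    s12 = a * b * c**2 * (ca * cb - cg)
    s23 = a**2 * b * c * (cb * cg - ca)
    s13 = a * b**2 * c * (cg * ca - cb)
    inv_d2 = (
        s11 * h**2 + s22 * k**2 + s33 * l**2
        + 2 * s12 * h * k + 2 * s23 * k * l + 2 * s13 * h * l
    ) / vol**2
    return 1.0 / math.sqrt(inv_d2)


def two_theta(d, wavelength=1.5406):
    """Bragg angle 2-theta in degrees, or None if the reflection is unreachable.

    The default wavelength (Cu K-alpha1, in angstroms) is an assumption.
    """
    s = wavelength / (2.0 * d)
    if s > 1.0:
        return None
    return 2.0 * math.degrees(math.asin(s))


if __name__ == "__main__":
    # Hypothetical cubic cell with a = 4 angstroms
    for hkl in [(1, 0, 0), (1, 1, 0), (1, 1, 1)]:
        d = d_spacing(hkl, 4.0, 4.0, 4.0, 90.0, 90.0, 90.0)
        print(hkl, round(d, 4), two_theta(d))
```

Because 2θ depends on the cell parameters through an inverse sine of the reciprocal spacing, changing a cell length moves each peak by a different amount rather than shifting the whole pattern rigidly. A differentiable version of this mapping is also one possible ingredient for a physically motivated loss, although that use is an assumption here and not a description of the authors' approach.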
Ultimately, our goal is to integrate these models with powder XRD analysis software such as GSAS-II or TOPAS to accelerate crystal structure determination for complex or challenging systems.<br/><br/>[1] Esmaeili et al., J. Appl. Cryst. (2017) 50, 651-659<br/>[2] Lutterotti et al., J. Appl. Cryst. (2019) 52, 587-598<br/>[3] Coelho, J. Appl. Cryst. (2017) 50, 1323-1330<br/>[4] Oviedo et al., npj Comput. Mater. (2019) 5, 60<br/>[5] Corriero et al., J. Appl. Cryst. (2023) 56, 409-419<br/>[6] Lolla et al., J. Appl. Cryst. (2022) 55, 882-889<br/>[7] Suzuki et al., Sci. Rep. (2020) 10, 21790<br/>[8] Chitturi et al., J. Appl. Cryst. (2021) 54, 1799-1810