Claudia Draxl¹, Santiago Rigamonti¹, Adrian Stroth¹, Manish Kumar¹, Mao Yang¹,², Peter Benner²
¹Humboldt-Universität zu Berlin, ²Max Planck Institute for Dynamics of Complex Technical Systems
Despite the success of the cluster expansion (CE) approach, demonstrated by numerous applications to alloys, CE may perform poorly in two cases that are common in materials science: properties that depart from linearity as a function of concentration, and properties that split into sub-domains in configuration space, each with distinct physical behavior. Examples include kinks in the energy of formation [1] and metal-to-insulator transitions in complex alloys [2]. In this work, we combine CE with machine-learning techniques for non-linear modeling and classification to overcome these problems. First, the input space is significantly augmented by adding mathematically complex but potentially more descriptive features built upon cluster correlations. Second, state-of-the-art techniques, including compressed sensing and support vector machines, are employed to find sparse models that generalize well. Compared to standard CE approaches, the larger input space of the introduced methodology leads to steeper learning curves, which generally means higher requirements on the amount and variety of training data. Since producing DFT data is costly, we use an iterative approach that reaches fast convergence by automatically detecting and adding data from regions in configuration space that are either unexplored or more relevant for the property to be modeled. This approach is especially important for systems with large parent cells [1], where the combinatorial explosion cannot be restrained by limiting the supercell size. All developments are implemented in the CE Python package CELL [3]. We demonstrate our approach by addressing the stability and electronic structure of complex materials, including intermetallic thermoelectric clathrates that are of interest for waste-heat recovery and oxide perovskites that are promising candidates for optoelectronic applications.
The modeled properties include the energy of formation, the conducting behavior (metal vs. semiconductor), as well as concentration- and temperature-dependent band gaps.

[1] M. Troppenz, S. Rigamonti, and C. Draxl, Chem. Mater. 29, 2414 (2017).
[2] M. Troppenz, S. Rigamonti, J. O. Sofo, and C. Draxl; https://arxiv.org/abs/2009.11137v1.
[3] S. Rigamonti et al., CELL: a Python package for cluster expansion with a focus on complex alloys, preprint; https://sol.physik.hu-berlin.de/cell/
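
The two modeling steps described in the abstract (augmenting the input space with non-linear features built from cluster correlations, then selecting a sparse model) can be illustrated with a minimal, self-contained sketch. It does not use CELL or reproduce the authors' implementation: the helper names `augment` and `lasso_cd`, the particular feature set (squares, absolute values, pairwise products), the toy target with a kink, and the regularization strength `lam` are all assumptions chosen for illustration, and a plain coordinate-descent LASSO stands in for the compressed-sensing step.

```python
import random

def augment(corr):
    """Build candidate features from a list of cluster correlations (assumed feature set)."""
    n = len(corr)
    feats = list(corr)                      # original, linear correlations
    feats += [c * c for c in corr]          # squares
    feats += [abs(c) for c in corr]         # absolute values, able to describe a kink
    feats += [corr[i] * corr[j] for i in range(n) for j in range(i + 1, n)]  # products
    return feats

def soft_threshold(z, t):
    """Shrink z toward zero by t; this is what sets small coefficients exactly to zero."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=300):
    """Minimize (1/2n)*||y - b0 - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    b0 = sum(y) / n
    resid = [yi - b0 for yi in y]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        # Unpenalized intercept: absorb the mean of the current residual.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes its own contribution.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
    return b0, b

# Toy data (made up): two correlations per structure, target with a kink in the first one.
rng = random.Random(0)
raw = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(80)]
y = [abs(c[0]) - 0.5 * c[1] for c in raw]

X = [augment(c) for c in raw]
b0, b = lasso_cd(X, y, lam=0.005)

pred = [b0 + sum(bj * xj for bj, xj in zip(b, row)) for row in X]
rmse = (sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)) ** 0.5

names = ["x1", "x2", "x1^2", "x2^2", "|x1|", "|x2|", "x1*x2"]
for name, coef in zip(names, b):
    print(f"{name:6s} {coef:+.3f}")
print(f"rmse = {rmse:.4f}")
```

In this toy setting the linear correlations alone cannot describe the kink, while the augmented set contains a feature that can, and the L1 penalty is what keeps the larger feature set from being used indiscriminately. The classification step (metal vs. semiconductor) and the iterative selection of new training structures are not covered by this sketch.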