TY - JOUR
T1 - Committee machine that votes for similarity between materials
AU - Nguyen, Duong Nguyen
AU - Pham, Tien Lam
AU - Nguyen, Viet Cuong
AU - Ho, Tuan Dung
AU - Tran, Truyen
AU - Takahashi, Keisuke
AU - Dam, Hieu Chi
N1 - Funding Information:
This work was partly supported by PRESTO and by the Materials Research by Information Integration Initiative (MI2I) project of the Support Program for Start-Up Innovation Hub, from the Japan Science and Technology Agency (JST), and by JSPS KAKENHI Grant-in-Aid for Young Scientists (B) (grant No. JP17K14803), Japan.
Publisher Copyright:
© 2018 Duong-Nguyen Nguyen et al.
PY - 2018
Y1 - 2018
N2 - A method has been developed to measure the similarity between materials, focusing on specific physical properties. The information obtained can be utilized to understand the underlying mechanisms and support the prediction of the physical properties of materials. The method consists of three steps: variable evaluation based on nonlinear regression, regression-based clustering, and similarity measurement with a committee machine constructed from the clustering results. Three data sets of well-characterized crystalline materials represented by critical atomic predicting variables are used as test beds. Herein, the focus is on the formation energy, lattice parameter and Curie temperature of the examined materials. Based on the information obtained on the similarities between the materials, a hierarchical clustering technique is applied to learn the cluster structures of the materials that facilitate interpretation of the mechanism, and an improvement in the regression models is introduced to predict the physical properties of the materials. The experiments show that rational and meaningful group structures can be obtained and that the prediction accuracy of the materials' physical properties can be significantly increased, confirming the rationality of the proposed similarity measure.
AB - A method has been developed to measure the similarity between materials, focusing on specific physical properties. The information obtained can be utilized to understand the underlying mechanisms and support the prediction of the physical properties of materials. The method consists of three steps: variable evaluation based on nonlinear regression, regression-based clustering, and similarity measurement with a committee machine constructed from the clustering results. Three data sets of well-characterized crystalline materials represented by critical atomic predicting variables are used as test beds. Herein, the focus is on the formation energy, lattice parameter and Curie temperature of the examined materials. Based on the information obtained on the similarities between the materials, a hierarchical clustering technique is applied to learn the cluster structures of the materials that facilitate interpretation of the mechanism, and an improvement in the regression models is introduced to predict the physical properties of the materials. The experiments show that rational and meaningful group structures can be obtained and that the prediction accuracy of the materials' physical properties can be significantly increased, confirming the rationality of the proposed similarity measure.
KW - data mining
KW - first-principles calculations
KW - machine learning
KW - materials informatics
KW - physical properties of materials
KW - similarity
UR - http://www.scopus.com/inward/record.url?scp=85056170777&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85056170777&partnerID=8YFLogxK
U2 - 10.1107/S2052252518013519
DO - 10.1107/S2052252518013519
M3 - Article
AN - SCOPUS:85056170777
SN - 2052-2525
VL - 5
SP - 830
EP - 840
JO - IUCrJ
JF - IUCrJ
ER -