Informatics Aided Design of Crystal Chemistry


INFORMATICS AIDED DESIGN OF CRYSTAL CHEMISTRY

By Changwon Suh

A Thesis Submitted to the Graduate Faculty of Rensselaer Polytechnic Institute in Partial Fulfillment of the Requirements for the Degree of DOCTOR OF PHILOSOPHY

Major Subject: Engineering Science

Approved by the Examining Committee:

Krishna Rajan, Thesis Advisor

Kevin C. Craig, Member

Mutsuhiro Shima, Member

Luciano Castillo, Member

Rensselaer Polytechnic Institute
Troy, New York
April 2005 (For Graduation May 2005)


INFORMATICS AIDED DESIGN OF CRYSTAL CHEMISTRY

By Changwon Suh

An Abstract of a Thesis Submitted to the Graduate Faculty of Rensselaer Polytechnic Institute in Partial Fulfillment of the Requirements for the Degree of DOCTOR OF PHILOSOPHY

Major Subject: Engineering Science

The original of the complete thesis is on file in the Rensselaer Polytechnic Institute Library

Examining Committee:
Krishna Rajan, Thesis Advisor
Kevin C. Craig, Member
Mutsuhiro Shima, Member
Luciano Castillo, Member

Rensselaer Polytechnic Institute
Troy, New York
April 2005 (For Graduation May 2005)


© Copyright 2005 by Changwon Suh
All Rights Reserved



CONTENTS

LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGMENT
ABSTRACT

1. INTRODUCTION
   1.1 Objectives
   1.2 Research Background
   1.3 Silicon Nitrides
   1.4 Structural Descriptors for Spinel Structures
   1.5 Descriptors of Structure and Property in Spinels
   1.6 Summary

2. RATIONALE OF RESEARCH
   2.1 Introduction
   2.2 Traditional Approaches of Materials Design
       2.2.1 Structure Maps
             2.2.1.1 Binary Structure Maps
             2.2.1.2 Ternary Structure Maps
       2.2.2 Traditional Mixing Strategy in Materials Design
   2.3 New Mixing Strategy in Materials Design
   2.4 Summary

3. BULK MODULUS AND DESCRIPTORS RELATIONSHIPS
   3.1 Introduction
   3.2 Quantitative Methodology
   3.3 Results and Discussion
       3.3.1 Spinel Database and Selection of Descriptors for Modeling
       3.3.2 Identification of Outliers
       3.3.3 PLS Analysis with Different Models
       3.3.4 Virtual Libraries
       3.3.5 Semiempirical Approaches for Bulk Modulus
   3.4 Summary

4. HIGH THROUGHPUT SCREENING OF PHASE STABILITY
   4.1 Introduction
   4.2 Data and Computational Details
       4.2.1 Addition of New Descriptors
   4.3 Results and Discussion
       4.3.1 Outlier Detection
       4.3.2 Data Analysis and Partial Least Squares
   4.4 Summary

5. REVISITING THE STRUCTURE MAPS FOR CRYSTAL CHEMISTRY OF AB2X4
   5.1 Introduction
   5.2 Structure Maps for Spinel Nitrides
       5.2.1 Hill’s Approach
       5.2.2 Haeuseler’s Approach
   5.3 Summary

6. DIMENSIONALLY REDUCED STRUCTURE MAPS
   6.1 Introduction
   6.2 Data and Computational Details
   6.3 Results and Discussion
       6.3.1 Cluster Analysis with KNN
       6.3.2 Principal Component Analysis
             6.3.2.1 PC1 versus PC2 Configuration
             6.3.2.2 PC1 versus PC3 Configuration
       6.3.3 Physical Interpretation of Data Mining Results
   6.4 Summary

7. CONCLUSIONS AND FUTURE WORK
   7.1 Conclusion
   7.2 Future Work

LITERATURE CITED

APPENDICES

A. CLUSTER ANALYSIS BASED ON K-NEAREST NEIGHBOR METHOD
   A.1 Introduction
   A.2 Numerical Example

B. PRINCIPAL COMPONENT ANALYSIS
   B.1 Introduction
   B.2 Projection and Maximum Variance
   B.3 Covariance Matrix
   B.4 Derivation of Principal Components
       B.4.1 1st Principal Component Calculation
       B.4.2 2nd Principal Component Calculation
   B.5 The Graphical Representations of Eigenvalues
   B.6 The Graphical Representations of Eigenvectors
   B.7 The Eigenvalue Problem and PCA
   B.8 Singular Value Decomposition
   B.9 Eigenvalue Decomposition
   B.10 SVD, Eigenvalue Decomposition, and Principal Components
   B.11 Optimal Number of Principal Components

C. PARTIAL LEAST SQUARES
   C.1 Introduction
   C.2 Ordinary Least Squares
   C.3 Principal Components Regression
   C.4 Partial Least Squares
   C.5 Algorithms for Partial Least Squares
       C.5.1 Non-Iterative Partial Least Squares
       C.5.2 SIMPLS

D. QUANTUM MECHANIC DESCRIPTORS
   D.1 Bulk Modulus
   D.2 Effective Charges


LIST OF TABLES

1.1 Lattice sites along the cubic unit cell body diagonal in the spinel, AB2X4
2.1 Possible chemistries in the periodic table
2.2 Summary of AB structure maps
2.3 Summary of ternary structure maps
2.4 Villars’ and Miedema’s mixing rules versus new mixing rule based on PCA/PLS
3.1 List of single and double nitrides which are used in this thesis
3.2 Training and test sets for model I and III in combinatorial arrays of AB2N4
3.3 Statistical parameters of each model
3.4 Semiempirical rules between lattice d-spacing and bulk modulus
4.1 List of single and double nitrides which are used in this thesis
4.2 Training and test sets for all models in combinatorial arrays of AB2N4
4.3 Statistical parameters of each model
6.1 List of single and double nitrides which are used in this thesis
A.1 Example of dataset for KNN
A.2 Autoscaled dataset for KNN
A.3 Distance matrix for KNN


LIST OF FIGURES

1.1 A flow chart for “Materials by design” strategy based on materials informatics
1.2 Combinatorial selection of stoichiometries used in this thesis based on the work described by Ching et al.
1.3 (a) The unit cell of the spinel structure formula (b) Occupied ions in other octants of the unit cell
1.4 A tetrahedral and octahedral shape in the two octants of the unit cell in the Figure 1.3 (a)
1.5 The nearest neighbors of an anion in the spinel structure
1.6 The role of structure-property relationships in materials design
2.1 The Mooser-Pearson map for octet AB compounds
2.2 Mendeleev sequencing number after Pettifor
2.3 The Pettifor map for AB2 compounds
2.4 (a) The Villars map (or quantum structural diagram) for 67 high Tc (>10 K) binary/ternary superconductors (b) Matthias profiles
2.5 Flow charts of Villars’ and Miedema’s approaches versus PCA/PLS approach in mixing strategies from elemental properties of element A and B to binary properties of AxBy
3.1 A flowchart of PLS modeling
3.2 Studentized residual versus leverage for 16 compounds whose bulk moduli are known
3.3 Cross-validated (CV) Y residual versus Y calibration residual for 16 compounds whose bulk moduli are known
3.4 Predicted versus ab initio derived bulk modulus for the PLS model I
3.5 Predicted versus ab initio derived bulk modulus for the PLS model II
3.6 Predicted versus ab initio derived bulk modulus for six external test set using PLS model I
3.7 (a) “Structure-property” map combining theoretical data and PLS derived data (b) Traditional structure-property map from database
4.1 Studentized residual versus leverage for 32 compounds whose phase stabilization energies are known
4.2 Cross-validated (CV) Y residual versus Y calibration residual for 32 compounds whose phase stabilization energies are known
4.3 Predicted versus ab initio derived stabilization energy for the PLS model III
4.4 Predicted stabilization energy for new external test set using PLS model III
4.5 A-N bond length versus B-N bond length with ab initio derived stabilization energy
4.6 A-N bond length versus B-N bond length with PLS derived stabilization energy
5.1 Modified spinel (oxides, sulfides, and nitrides) structure map of Hill et al.
5.2 Scatter diagram of lattice parameter and anion positional parameter for only spinel nitrides
5.3 Structure map for spinels (oxides, sulfides, and nitrides) by Haeuseler’s method
5.4 Structure map for spinel nitrides by Haeuseler’s method
6.1 A flow chart of PCA and CA
6.2 A dendrogram of 39 spinel nitrides with 18 variables
6.3 The PC1-PC2 score plots for the complete data using principal components (a) by tetrahedral site (b) by octahedral site (c) with sample ID (d) loading plot of PC1-PC2
6.4 The PC1-PC3 score plots for the complete data using principal components (a) by tetrahedral site (b) by octahedral site (c) Score plot with sample ID. The relationships between each of the principal components and the loadings of variables can be seen in loading plot (d)
6.5 (a) The PC1-PC2 score plot with direct band gap (Eg) (b) Loading plot for PC1-PC2
6.6 (a) The PC1-PC3 score plot with direct band gap (Eg) (b) Loading plot for PC1-PC3
A.1 1-NN and 7-NN classification of unknown object
A.2 Steps for generating a dendrogram
B.1 A graphical depiction of the PCA method with the assumption of a ring shaped dataset
B.2 Two dimensional data representation for six samples with two variables, y1 and y2, where 0 is a centroid of six samples after the mean centering. The reduction of dimensionality of space from two to one is shown in (b). The line is created by the projection from the two dimensional space
B.3 A graphical representation of the data points and their eigenvalues
B.4 A graphical representation of the data points and their eigenvectors
B.5 A covariance matrix, S, calculated from a given dataset (filled circles)
B.6 Determination of two principal components (PC1 and PC2) in a new scaled coordinate, x1 and x2
C.1 The matrix description of PLS method
C.2 Geometrical representation of PLS method


I believe that imagination is stronger than knowledge. That myth is more potent than history. That dreams are more powerful than facts. That hope always triumphs over experience. That laughter is the only cure for grief. And I believe that love is stronger than death.

The Storyteller's Creed from "All I Really Need to Know I Learned in Kindergarten: Uncommon Thoughts on Common Things" by Robert Fulghum, 1986, Ballantine Books



ACKNOWLEDGMENT

First, I would like to acknowledge my thesis advisor, Dr. Krishna Rajan, for his patient guidance of my research and his personal advice. He introduced me to the multidisciplinary study program, Engineering Science, and I have enjoyed various fields of study at Rensselaer: Mathematics, Computer Science, Chemistry, Physics, and Materials Science. I truly appreciate the full confidence he had in my ability and the continuing care and encouragement he gave me at difficult times. He has always known how to advise me through all the hurdles in this long journey and how to motivate me to do my best work.

Next, I would like to express my gratitude to one of my committee professors, Dr. Kevin C. Craig. He has always been at my side whenever I was in trouble. I also thank the other committee members, Prof. Luciano Castillo and Prof. Mutsuhiro Shima, for their thorough feedback, warm support, and frequent encouragement. In addition, I thank Dr. Vicky Karen of NIST for taking good care of me when I worked as a guest researcher at NIST for a short time. I would like to acknowledge Prof. Göran Grimvall of the Royal Institute of Technology in Stockholm, Sweden, and Prof. Shuichi Iwata of the University of Tokyo, Japan, for their valuable comments on this thesis.

My interest in the field of Materials Science started burgeoning during the time I studied and worked in Korea. I thank Prof. Ok-Kyoung Kim, Prof. Kun-Sang Lee, Prof. Young-Dae Jung, and the late Prof. Chang-Hyo Lee of Hanyang University, and Dr. Gimo Yang of Samsung Corning Co., since they led me in an exploration of this exciting territory and encouraged me to continue with it. Many thanks are also due to my research labmates, Dr. Arun Rajagopalan, Michael Stukowski, and James LeBeau, for our fruitful discussions and collaboration. In particular, I deeply thank Dr. Arun Rajagopalan for the substantial feedback he gave me and the numerous stimulating conversations.

I would like to thank the Materials Research Center staff, especially Nancy Beatty, Sheila Zador, and Raymond Dove, for easing my passage with their timely and gracious help. I owe much to my friends, to name but a few, Yoonyoung Choi, Sang Ryoun Ryu, Taegyun Kim, Akio Koike, and Mihir Roy. They all helped to make Troy more of a home for me, and I hope that we keep in touch in the future.

The completion of this thesis signifies the end of my graduate school career and the beginning of my life. The potential of a bright future gave me the energy to continue working late into the night on many occasions. Finally, I owe a great deal to my family. In particular, my wife and my daughter, who is still playing inside her mother, have always encouraged me to achieve my goals. They are responsible for the person I am today. Without their love and support, I would not have attempted and completed such an arduous task. Words simply cannot express my gratitude. I dedicate this thesis to them.



ABSTRACT

The search for and design of new materials can be significantly aided by combinatorial experiments. However, the key to minimizing the search process in combinatorial experiments is to identify the combinations that achieve the desired functionality in the class of materials being studied. The concept of virtual combinatorial experiments for materials selection and design strategy is useful to show how one may design combinatorial libraries a priori by integrating data mining techniques with physically robust multivariate data. In this thesis, using the crystal chemistry of spinel nitrides as a framework for materials design, the methodology for integrating derived variables using newly proposed mixing rules based on statistical tools such as Partial Least Squares (PLS) and Principal Component Analysis (PCA) is described. Strategically selected quantum mechanical and crystallographic data are used to predict and identify new alloy chemistries, modulus properties, and phase stabilities. This approach is unique in materials design because it overcomes problems of length scale by connecting microscopic phenomena and macroscopic engineering properties. The integration of the physics based predictions with data mining predictions is used to propose new virtual compounds, especially with higher order or multicomponent chemistries. With the target properties (bulk moduli and phase stabilities) predicted by PLS, correlations between all variables in a created library of binary and ternary spinel nitrides are visualized in the dimensionally reduced structure maps created by PCA. Through these activities, materials informatics plays an important role in guiding the choice of the most promising chemistries that exhibit the desired functionality in the virtual combinatorial libraries of hypothetical materials.



CHAPTER 1 INTRODUCTION

The search for new materials, whether through experiment or simulation, has been a slow and arduous task, punctuated by infrequent and often unexpected discoveries. Each of these findings prompts a flurry of studies to better understand the underlying science governing the behavior of these materials. The few systematic efforts that have been undertaken to analyze trends in the data as a basis for prediction have been inconclusive. This is due to the lack of large amounts of organized data and, more importantly, the challenge of sifting through them in a timely and efficient manner. Experimental strategies for making new alloys are considered a mature field, but they are prohibitively slow. The value of computationally aided design of alloys has been recognized by a number of workers as a far more rapid approach. However, even here there are limits to the number of combinatorial arrays of chemistries that can be explored [1-8]. This thesis addresses this informatics challenge by integrating large-scale databases with advanced computational materials science and machine learning tools.

1.1 Objectives

Using alloy development as a test bed for informatics research, the main objective of this thesis is the discovery of new chemistries of binary and multicomponent alloys with tailored modulus characteristics and phase stability. Modulus is a property that is influenced by a wide array of fundamental parameters describing atomistic, thermodynamic, and crystal structure characteristics [9, 10]. Furthermore, modulus impacts all aspects of materials design, ranging from its importance in understanding fundamental aspects of materials physics to its value in engineering design. Modulus, as derived from crystal structure scale information, can be used as a critical descriptor in designing new materials, whether the target is new lightweight structural materials, ultra high temperature alloys, or even physical properties such as superconductivity. In this thesis, the term “descriptor” generally represents physical parameters or properties. Descriptors can be generated from experiments or computation [11].

Given the large combinatorial space of chemistries defined by even a small portion of the periodic table, searching for new materials with a tailored modulus is clearly a prohibitive task [12]. Hence, the search for new materials for new applications is limited to educated guesses. Data that do exist are often limited to small regions of the compositional space. Experimental data are dispersed in the literature, and computationally derived data are limited to a few systems for which reliable quantum mechanical information exists. Even with recent advances in high speed computing, there are limits to how many new alloys can be calculated. This poses both a challenge and an opportunity: to deal with extremely large, disparate databases and large scale computation. It is here that knowledge discovery in databases, also known as data mining, an interdisciplinary field merging ideas from statistics, machine learning, and databases, provides a unique tool to integrate scientific information and theory for materials discovery.

Data mining has been engendered by the phenomenal growth of data in all spheres of human endeavor, and by the economic and scientific need to extract useful information from the collected data. The key issue of data mining is learning through the extraction of useful information from a given set of data [13]. It takes the form of discovering new patterns or building models from a given dataset. Therefore, the challenge is to take advantage of recent advances in data mining and apply them to state-of-the-art computational approaches for calculating modulus and phase stability.

In this chapter, the structure of spinels is briefly discussed. The rationale of the research is discussed in chapter 2. New variables of spinel nitrides are calculated based on Villars’/Miedema’s rules in chapters 3 and 4.
In these chapters, using crystal chemistry as a framework for materials design, the methodology for integrating derived variables, using newly proposed mixing rules based on PCA/PLS together with quantum mechanical and crystallographic data, is developed to predict and identify new alloy chemistries, modulus properties, and phase stabilities. This approach is unique in materials design because it overcomes a length scale problem by connecting microscopic phenomena and macroscopic engineering properties [14]. The integration of the physics based predictions with data mining predictions is used to propose new virtual compounds, especially with higher order or multicomponent chemistries. Based on the prediction model, existing traditional structure maps of AB2N4 compounds are filled with spinel nitrides in chapter 5. Correlations between variables (or samples) in the resulting larger library of binary and ternary compounds are visualized in the dimensionally reduced space in chapter 6. Through these activities, materials informatics will guide the choice of the most promising chemistries and identify correlations between variables. Chapter 7 presents the conclusions and future work. The appendices give more detailed information about the main machine learning techniques: cluster analysis, principal component analysis, and partial least squares. The sequence of the ‘Materials by design’ strategy based on materials informatics is summarized in Figure 1.1.
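The size of the combinatorial space discussed in this section can be made concrete with a short enumeration of AB2N4 candidates. This is a minimal sketch under stated assumptions, not the library used in the thesis: the element list GROUP_IV and the helper names formula and enumerate_spinel_nitrides are hypothetical and chosen only for illustration.

```python
from itertools import product

# Assumption: an illustrative element set, not the thesis's actual selection.
GROUP_IV = ["C", "Si", "Ge", "Sn"]

def formula(a, b):
    """Format the AB2N4 stoichiometry; A == B collapses to the single nitride A3N4."""
    return f"{a}3N4" if a == b else f"{a}{b}2N4"

def enumerate_spinel_nitrides(elements):
    """Return (A, B, formula) for every ordered pair of cation species A and B."""
    return [(a, b, formula(a, b)) for a, b in product(elements, repeat=2)]

library = enumerate_spinel_nitrides(GROUP_IV)
single = [f for a, b, f in library if a == b]   # A3N4
double = [f for a, b, f in library if a != b]   # AB2N4 with A != B

print(len(library), "candidates:", len(single), "single and", len(double), "double nitrides")
```

With n candidate cations the number of ordered (A, B) pairs is n squared, so even this four-element list yields 16 stoichiometries, and each added element or lattice site enlarges the space that would otherwise have to be computed or synthesized one compound at a time.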

1.2 Research Background

In the materials science community, crystallographic and thermochemical databases have historically been two of the most well established databases. The former serves as the foundation for interpreting crystal structure data of metals, alloys, and inorganic materials. The latter involves the compilation of fundamental thermochemical information in terms of heat capacity and calorimetric data. While crystallographic databases are primarily used as reference sources, thermodynamic databases were actually some of the earliest examples of informatics, since these databases were integrated into thermochemical computations to map phase stability in binary and ternary alloys. This led to the development of computationally derived phase diagrams, a classic example of integrating information from databases and data models. The two databases have evolved independently although, in terms of their scientific value, they are extraordinarily intertwined. Phase diagrams map out regimes of crystal structure in temperature-composition space or temperature-pressure space. Yet crystal structure databases were developed completely independently. At present the community has to work with each database separately, information searches are cumbersome, and data analysis and interpretation involving both are very difficult.


Figure 1.1: A flow chart for “Materials by design” strategy based on materials informatics

[Flow chart not reproduced. Box labels recoverable from the figure: elemental database; compound database; decide the target property (bulk moduli and phase stabilities); choose appropriate descriptors; calculation of compound descriptors from the elemental database (based on Villars'/Miedema's rules); perform PLS (prediction); good prediction?; add new descriptors; validation by experiments and/or theory; perform PCA (classification); expanded compound database; identify relationships between samples/variables.]


5 Researchers only integrate such information on their own for a very specific system at a time based on their individual interests. Hence there is at present no unified way to explore patterns of behavior across both databases, which are also scientifically related. Even with recent reports of computationally designed materials, knowing why those approaches worked is critical to understand if we are to build on that knowledge base for future predictions. As noted by Ledbetter and Kim [15], the bulk modulus, or reciprocal compressibility, represents a solid’s most basic elastic constant. It shows a material’s resistance to interatomic distance changes caused by hydrostatic pressure. Bulk modulus, atomic volume, and cohesive energy comprise the three fundamental cohesive properties, which, in turn, relate to a wide spectrum of properties ranging from hardness to superconductivity. In this thesis, such an approach is used for nitride spinels, an extremely important class of materials for a broad range of engineering applications, ranging from structural materials to microelectronics and photonics. These compounds form a fascinating example of the synergy between structure and properties that correlates fundamental issues ranging from electronic structure to crystal chemistry. Based on an analysis of a database of spinel nitrides, Partial Least Squares (PLS) is used to predict bulk modulus and phase stability. Principal Component Analysis (PCA) and cluster analysis based on K-nearest neighbor (KNN) are used to identify the role of crystal site occupancy on phase stability for nitride spinels. The research strategy for this thesis is to establish a methodology for the investigation and discovery of new compounds with tailored modulus and stability characteristics by applying new and existing data mining methods to heterogeneous databases. 
For the purposes of this study, we attempt to link phase stability to modulus as related to materials characteristics defined at the crystal unit cell length scale. Phase stability is evaluated in an unusually integrated fashion using both crystallographic and quantum mechanical descriptions of solids.

1.3 Silicon Nitrides

Silicon nitride (Si3N4) is an important structural ceramic for high temperature applications. Most Si3N4 based ceramics have high fracture toughness and hardness [16] and are also used in the microelectronics industry [17]. Because of these excellent mechanical, electrical, and thermal properties, the structure and physical behavior of the two stable polymorphs, α-Si3N4 and β-Si3N4, have been extensively studied [18-20]. These two crystals have a hexagonal structure [21, 22], and all Si atoms are tetrahedrally bonded to N atoms with strong covalent bonding [21]. After the new spinel phase of cubic Si3N4 (c-Si3N4) was found in 1999 at a pressure of about 15 GPa and temperatures exceeding 2000 K [16], nitride based spinels have become fascinating examples of the synergy between structure and properties that correlate fundamental issues ranging from electronic structure to crystal chemistry. As a result of many efforts, c-Si3N4, c-Ge3N4, and c-Sn3N4 from the IVA based nitride spinels have been synthesized in the lab [1, 23-25], and cubic SiC2N4 has also been suggested as a double nitride [26]. Following the discovery of these spinel structures, Ching et al. investigated whether spinel structures form with other group IV elements and calculated their physical properties [1]. In this thesis, their combinatorial selection of stoichiometries for spinel nitrides is used as the main template (Figure 1.2).

The silicon atoms in the cubic spinel structure have two types of coordination to the nitrogen atoms, four-fold and six-fold, in a 1:2 ratio [27, 28]. The hardness of c-Si3N4 is about 35.3 GPa, which is greater than that of α-Si3N4 and β-Si3N4, and the phase is stable against oxidation up to 1673 K [29]. The bulk modulus and its first pressure derivative for c-Si3N4 are B0 = 290 GPa and B'0 = 4.9, respectively [30]. Therefore, c-Si3N4 is a promising superhard material. In addition, since c-Si3N4 theoretically has a direct band gap of 3.5 eV, it lies in the wide band gap semiconductor region, although its gap is still narrower than that of other phases such as α-Si3N4 or β-Si3N4. Therefore c-Si3N4 is a potential structural ceramic that also belongs to the field of semiconductors [31].


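Rows give the A element and columns give the B element; C, Si, Ge, and Sn belong to group IVA, and Ti, Zr, and Hf belong to group IVB.

| A \ B | C | Si | Ge | Sn | Ti | Zr | Hf |
|---|---|---|---|---|---|---|---|
| C | C3N4 | CSi2N4 | CGe2N4 | CSn2N4 | CTi2N4 | CZr2N4 | No available data |
| Si | SiC2N4 | Si3N4 | SiGe2N4 | SiSn2N4 | SiTi2N4 | SiZr2N4 | No available data |
| Ge | GeC2N4 | GeSi2N4 | Ge3N4 | GeSn2N4 | GeTi2N4 | GeZr2N4 | No available data |
| Sn | SnC2N4 | SnSi2N4 | SnGe2N4 | Sn3N4 | SnTi2N4 | SnZr2N4 | No available data |
| Ti | TiC2N4 | TiSi2N4 | TiGe2N4 | TiSn2N4 | Ti3N4 | TiZr2N4 | No available data |
| Zr | ZrC2N4 | ZrSi2N4 | ZrGe2N4 | ZrSn2N4 | ZrTi2N4 | Zr3N4 | ZrHf2N4 |
| Hf | No available data | No available data | No available data | No available data | No available data | HfZr2N4 | Hf3N4 |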

Figure 1.2: Combinatorial selection of stoichiometries used in this thesis based on the work described by Ching et al. [1]

1.4 Structural Descriptors for Spinel Structures

Spinel structures are interesting to study because many of their crystallographic parameters can be changed by anion dilation from the ideal cubic close-packed positions [32]. There have been many electronic and structural studies of the cubic spinel phase, and most of them have treated the structure and the properties somewhat separately [21, 22, 33, 34]. The cubic unit cell of spinel is shown in Figure 1.3. The spinel structure with the general formula AB2X4 consists of cubic close-packed anions and tetrahedral/octahedral interstitial cations. In this formula, A and B represent cations and X is an anion (N, O, S, Se, Te). Spinels can be divided into two groups by the extreme distributions of cations: the normal spinels A[B2]X4 and the inverse spinels B[AB]X4. The brackets denote the octahedral site. Intermediate phases can be represented as (A1-xBx)[AxB2-x]X4 with an inversion parameter x (x=0: normal, x=1: inverse). The space group of the spinel structure is $Fd\bar{3}m$ ($O_h^7$). An FCC cubic cell of A atoms is divided into eight octants with an edge length of a/2, as shown in Figure 1.3 (a). Four of the octants are occupied by AX4 and the other four are occupied by B4X4 (Figure 1.3 (b)). The anion positions in the spinel are determined by the anion displacement parameter, u. A change of u represents the adjustment of the structure to the difference in effective radii of the cations in the tetrahedral and octahedral sites [35]. When the unit cell origin is taken at the $\bar{4}3m$ point on an A-site cation, the anions are arranged in ideal cubic close packing at u=0.375.

The lattice sites along the unit cell body diagonal for various choices of origin are shown in Table 1.1. If u increases above 0.375, the tetrahedron becomes larger while the octahedron becomes smaller, because the anions move away from the nearest tetrahedral cation in a [111] direction [35]. The nearest neighbors of an anion in the spinel structure are shown in Figure 1.5. Using δ = u - 0.375 [36], the bond length between the tetrahedral cation and the anion ($BL_{A-X}$) and the bond length between the octahedral cation and the anion ($BL_{B-X}$) are expressed as

$$BL_{A-X} = a\sqrt{3}\left(\delta + \tfrac{1}{8}\right) \quad \text{and} \quad BL_{B-X} = a\sqrt{\tfrac{1}{16} - \tfrac{\delta}{2} + 3\delta^{2}} \qquad (1.1)$$

Tetrahedral and octahedral volumes can be also calculated. All the relationships between crystallographic parameters can be found in many sources [32, 35-38].
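As a small numerical illustration of Eq. (1.1), the sketch below evaluates both bond lengths for a given lattice constant a and anion parameter u. The function name `spinel_bond_lengths` and the sample values of a and u are hypothetical and chosen only for illustration; they are not data from this thesis.

```python
import math


def spinel_bond_lengths(a, u):
    """Return (BL_AX, BL_BX) from Eq. (1.1) for lattice constant a and anion parameter u."""
    delta = u - 0.375  # deviation from ideal cubic close packing
    # Tetrahedral cation to anion distance
    bl_ax = a * math.sqrt(3.0) * (delta + 1.0 / 8.0)
    # Octahedral cation to anion distance
    bl_bx = a * math.sqrt(1.0 / 16.0 - delta / 2.0 + 3.0 * delta ** 2)
    return bl_ax, bl_bx


# Illustrative values only (a = 1 in arbitrary length units)
ideal = spinel_bond_lengths(1.0, 0.375)
dilated = spinel_bond_lengths(1.0, 0.385)
print("u = 0.375:", ideal)
print("u = 0.385:", dilated)
```

At u = 0.375 the two distances reduce to (√3/8)a and a/4, and for u above 0.375 the tetrahedral bond lengthens while the octahedral bond shortens, in line with the trend described above.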



Figure 1.3: (a) The unit cell of the spinel structure. Only two octants of the cell are shown for clarity. Large circles: anions, small circles: tetrahedral cations, gray circles: octahedral cations. (b) Occupied ions in the other octants of the unit cell: white octants for AX4 and shaded octants for B4X4. The number of formula units per cubic unit cell is Z=8 [32, 39].

Figure 1.4: A tetrahedral and an octahedral shape in the two octants of the unit cell in Figure 1.3 (a)


| Fractional coordinates (*) | Equipoint (**) | Origin on tetrahedral cation site | Origin on tetrahedral vacancy |
|---|---|---|---|
| 0,0,0 | 8a | A-site cation | Tetrahedral vacancy |
| 1/8,1/8,1/8 | 16c | Octahedral vacancy | B-site cation |
| 1/4,1/4,1/4 | 8a | A-site cation | Tetrahedral vacancy |
| 3/8,3/8,3/8 | 32e | Anion X | Anion X |
| 1/2,1/2,1/2 | 8b | Tetrahedral vacancy | A-site cation |
| 5/8,5/8,5/8 | 16d | B-site cation | Octahedral vacancy |
| 3/4,3/4,3/4 | 8b | Tetrahedral vacancy | A-site cation |
| 7/8,7/8,7/8 | 32e | Anion X | Anion X |

Both origin choices are at $\bar{4}3m$ point symmetry. (* along body diagonal of unit cell, ** Wyckoff notation)

Table 1.1: Lattice sites along the cubic unit cell body diagonal in the spinel, AB2X4 [32]

[Figure 1.5 labels: anion X with neighboring cations A and B; marked distances a/4 and (√3/8)a]

Figure 1.5: The nearest neighbors of an anion in the spinel structure [36, 39]


1.5 Descriptors of Structure and Property in Spinels

To estimate the bulk moduli of solids by semiempirical methods, there have been many attempts to find simple correlations with key parameters such as (polyhedral) volumes, nearest neighbor distances, cohesive energy, and average electron density [15, 40-76]. In this thesis, the multiple parameters used in the theoretical calculations are used to quantitatively assess the statistical relationship of the various descriptors (lattice constants, electronic bonding, and lattice stability) of each compound studied that may influence structure-property relationships in the spinel structure. It is therefore crucial to choose appropriate descriptors to obtain robust relationships, as shown in chapters 3 and 4. The structure-property relationship plays an important role as the linkage between processing and performance in materials design, as shown in Figure 1.6. All the structural descriptors are briefly explained in section 1.4, and the quantum mechanical descriptors are explained in appendix D.

[Figure 1.6 labels: Processing, Structure, Property, Performance]

Figure 1.6: The role of structure-property relationships in materials design [77]

1.6 Summary

In this chapter, the limitations of both traditional and computational approaches to the search for new materials are discussed. A new method using databases, data mining, and machine learning tools for materials discovery is proposed in this thesis, and it provides a unique tool to integrate scientific information and theory for materials discovery. Spinel nitrides, the main testbed of this thesis, are also discussed.


CHAPTER 2
RATIONALE OF RESEARCH

2.1 Introduction

The traditional approach to designing materials is the Edisonian approach (i.e. a trial and error method), in which only a few experiments are performed. Therefore opportunities to find possible material candidates are dramatically decreased. As a result of this traditional approach, a number of potential compounds are still undiscovered. This situation is shown clearly in Table 2.1. Today, however, the development of new materials has become more formidable because of increasing demand for diverse functionalities and precise control of each relevant parameter in different areas of modern technology. More recently, many of these problems are being addressed by using combinatorial synthesis techniques. Thus it is possible to accelerate materials discovery and screening by combinatorially designed high throughput experimentation. However, combinatorial approaches in materials science have generally focused on experimental aspects such as the optimization of a process. In order to search for new materials through a number of candidate samples (i.e. a library) in an accelerated way, it is indispensable to integrate the experimental aspects of combinatorial synthesis with the computational aspects of information based design of materials [78]. To this end, in biology and organic chemistry, quantitative structure-activity relationships (QSARs) have become a major tool [79].

There are two fields which are closely related to optimal materials design. One is the field of informatics, which is generally used to study sequences of protein or DNA by searching information obtained from large biological databases, especially in biology or drug design. Informatics involves the management of various types of information with the aid of computational and statistical techniques. The other field is chemometrics. In this field the chemical state of a system can be investigated via the application of statistical methods, and optimal experiments can be designed from multivariate data.


| Systems | Experimentally known | Maximum possible number | Percentage of known |
|---|---|---|---|
| Unaries | ~ 100 | 100 | 100% |
| Binaries | ~ 4,000 | 4,950 | 81% |
| Ternaries | ~ 8,000 | 161,700 | 5% |
| Quaternaries | ~ 1,000 | 3,921,225 | <1% |

Table 2.1: Possible chemistries in the periodic table [14]

Chemometrics is widely used in the fields of organic chemistry and chemical engineering. The applications of chemometrics in both these fields are highly relevant to materials science, as materials informatics deals with computational chemistry (or physics), which includes mathematical models to predict material properties, and also handles huge multivariate datasets for optimal experiment design. In addition, since quantitative structure-property relationships (QSPRs) are more useful for physical interpretation than QSARs [79], we need an exploration tool for the huge number of combinations of structure-property relations in materials science. This approach is significantly different from previous attempts at discovery. While the traditional approach is based on different types of structure maps of empirical pairwise correlations between observed behaviors, all aspects of our new exploration tool depend on prior existing data and the knowledge derived from it.

The main purpose of this thesis is to apply statistical machine learning techniques, previously used in chemometrics, to the materials informatics approach for materials design and discovery. In this thesis, multivariate statistical techniques are the exploration tools for the vast number of structure-property relations. In this manner, different types of information are integrated and different length scales are linked in materials science. To this end, the strategy is first to build a large materials database from which information can be retrieved whenever needed, and to use the data for finding structure-property relations. Furthermore, it is crucial to develop a scheme that extends elemental properties to compound properties (Table 2.1). Through these activities, good guidelines for materials selection in discovery processes are created.
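The "maximum possible number" column of Table 2.1 is consistent with counting unordered combinations of elements. The short sketch below reproduces those counts under the assumption, made here only for illustration, that about 100 elements are available; the variable names are hypothetical, and the "experimentally known" numbers are the approximate values quoted in the table.

```python
from math import comb

N_ELEMENTS = 100  # assumption: roughly 100 elements available for combination

# Number of distinct element combinations for each system size
max_systems = {
    "Unaries": comb(N_ELEMENTS, 1),
    "Binaries": comb(N_ELEMENTS, 2),
    "Ternaries": comb(N_ELEMENTS, 3),
    "Quaternaries": comb(N_ELEMENTS, 4),
}

# Approximate numbers of experimentally known systems, as quoted in Table 2.1
known = {"Unaries": 100, "Binaries": 4000, "Ternaries": 8000, "Quaternaries": 1000}

for name, maximum in max_systems.items():
    percent = 100.0 * known[name] / maximum
    print(f"{name:13s} max = {maximum:>9,d}  known = {percent:.2f}%")
```

The rapid growth of the combination count with the number of components is what makes an exhaustive experimental search impractical beyond binaries.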

2.2 Traditional Approaches of Materials Design

2.2.1 Structure Maps

The traditional way to rapidly handle structure types and compositions (or a given stoichiometry) of compounds depends on empirical rules that identify particular structures using a few variables [12]. The mapped empirical pairwise correlations between observed behaviors are called structure maps. These explain the structure types as functions of the factors governing crystal structure. Structure maps are based on a priori theories, and they connect electronic and atomistic information to crystal chemistry.

2.2.1.1 Binary Structure Maps

Mooser-Pearson Map: For binary structures, the first successful structure maps were made by Mooser and Pearson for valence compounds (Figure 2.1) [80]. They used the difference in electronegativity and the average principal quantum number as the coordinates of the structure map. The average principal quantum number is a measure of the directionality of the bonds. The difference in electronegativity is a measure of the degree of ionicity. Therefore, more directionally bonded phases lie in the region where both coordinates are small, as shown in Figure 2.1. Small values on both axes are associated with the four-fold coordinated zinc blende structure, while highly ionic bonded phases adopt the eight-fold coordinated CsCl structure.
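The two Mooser-Pearson coordinates can be computed directly from elemental data. The sketch below is a minimal illustration, assuming standard Pauling electronegativities and valence-shell principal quantum numbers for a handful of elements; the lookup table `ELEMENTS` and the function `mooser_pearson` are hypothetical helpers, not part of the original work.

```python
# Assumed elemental data: (valence-shell principal quantum number, Pauling electronegativity)
ELEMENTS = {
    "Zn": (4, 1.65),
    "S": (3, 2.58),
    "Na": (3, 0.93),
    "Cl": (3, 3.16),
    "Cs": (6, 0.79),
}


def mooser_pearson(a, b):
    """Return (average principal quantum number, electronegativity difference) for compound AB."""
    n_a, chi_a = ELEMENTS[a]
    n_b, chi_b = ELEMENTS[b]
    n_avg = (n_a + n_b) / 2.0
    delta_chi = abs(chi_a - chi_b)
    return n_avg, delta_chi


for pair in [("Zn", "S"), ("Na", "Cl"), ("Cs", "Cl")]:
    n_avg, delta_chi = mooser_pearson(*pair)
    print(f"{pair[0]}{pair[1]}: n_avg = {n_avg:.1f}, delta_chi = {delta_chi:.2f}")
```

With these inputs ZnS falls at smaller values of both coordinates than CsCl, which matches the qualitative separation between zinc blende and CsCl type phases described above.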

Pettifor Map: Instead of using physical coordinates, Pettifor used a phenomenological coordinate to create the structure maps [81]. This new coordinate is the Mendeleev sequencing number ($MN^P$), assigned along a string through the periodic table as shown in Figure 2.2. Figure 2.3 shows a clear structural separation for 84 different AB2 structure types. For clarity, only 28 structure types are marked in the legend of Figure 2.3.



Figure 2.1: The Mooser-Pearson map for octet AB compounds [80]

[Figure 2.2 content: modified periodic table with the Mendeleev number of each element. In order along the string:]

1 He, 2 Ne, 3 Ar, 4 Kr, 5 Xe, 6 Rn, 7 Fr, 8 Cs, 9 Rb, 10 K, 11 Na, 12 Li, 13 Ra, 14 Ba, 15 Sr, 16 Ca, 17 Yb, 18 Eu, 19 Sc, 20 Lu, 21 Tm, 22 Er, 23 Ho, 24 Dy, 25 Y, 26 Tb, 27 Gd, 28 Sm, 29 Pm, 30 Nd, 31 Pr, 32 Ce, 33 La, 34 Lr, 35 No, 36 Md, 37 Fm, 38 Es, 39 Cf, 40 Bk, 41 Cm, 42 Am, 43 Pu, 44 Np, 45 U, 46 Pa, 47 Th, 48 Ac, 49 Zr, 50 Hf, 51 Ti, 52 Ta, 53 Nb, 54 V, 55 W, 56 Mo, 57 Cr, 58 Re, 59 Tc, 60 Mn, 61 Fe, 62 Ru, 63 Os, 64 Co, 65 Rh, 66 Ir, 67 Ni, 68 Pt, 69 Pd, 70 Au, 71 Ag, 72 Cu, 73 Mg, 74 Hg, 75 Cd, 76 Zn, 77 Be, 78 Tl, 79 In, 80 Al, 81 Ga, 82 Pb, 83 Sn, 84 Ge, 85 Si, 86 B, 87 Bi, 88 Sb, 89 As, 90 P, 91 Po, 92 Te, 93 Se, 94 S, 95 C, 96 At, 97 I, 98 Br, 99 Cl, 100 N, 101 O, 102 F, 103 H

Figure 2.2: Mendeleev sequencing number after Pettifor [81]. The string through modified periodic table connects all elements.


Structure maps of AB compounds are summarized in Table 2.2. Most binary structure maps in Table 2.2 use the following factors, as noted by Villars [82]:

• Size factor: radii
• Electrochemical factor: electronegativities and s-p parameters
• Valence-electron factor: number of electron vacancies per atom, group number, number of valence s+d electrons, and Mendeleev numbers after Pettifor.



Figure 2.3: The Pettifor map for AB2 compounds [81]


| Structure maps and references | Class of compound* | Coordinates* |
|---|---|---|
| Mooser-Pearson (1959) [80] | AB normal valence compounds | $\bar{Q}_N$ vs. $\Delta X^{Pauling}$ |
| Phillips-Van Vechten (1969) [83] | AB octet compounds | Average covalent vs. ionic energy gaps |
| St. John-Bloch (1974) [84] | $X^NY^{8-N}$ compounds | $R^{J\&B}_{p+s,X} - R^{J\&B}_{p+s,Y}$ vs. $R^{J\&B}_{p-s,X} + R^{J\&B}_{p-s,Y}$ |
| Machlin et al. (1977) [85] | Fractionally bounded sub-octet $X^NY^{P-N}$ compounds (3≤P≤6) | $R^{J\&B}_{p+s,X} - R^{J\&B}_{p+s,Y}$ vs. $R^{J\&B}_{p-s,X} + R^{J\&B}_{p-s,Y}$ |
| Watson-Bennett (1978) [86] | TU compounds | $\Delta X^{W\&B}$ vs. $N_v$ |
| Watson-Bennett (1978) [87] | Non-octet $X^{N-D}Y^D$ compounds (N<8) | $\Delta X^{W\&B}$ vs. $S_X + S_Y$ |
| Watson-Bennett (1978) [87] | XU compounds | $\Delta X^{W\&B}$ vs. $S_X$ |
| Andreoni et al. (1979) [88] | XY normal valence compounds | $R^{A}_{p+s,X} - R^{A}_{p+s,Y}$ vs. $R^{A}_{p-s,X} + R^{A}_{p-s,Y}$ |
| Bloch-Schatteman (1981) [89] | $X^NY^{8-N}$ octet compounds | weighted combination of $R_0^{B\&S}$, $R_1^{B\&S}$, and $R_2^{B\&S}$ for X minus that for Y vs. $(R_1^{B\&S} - R_0^{B\&S})_X + (R_1^{B\&S} - R_0^{B\&S})_Y$ |
| Zunger (1981) [90] | $X^NY^{8-N}$ octet compounds | $R^{Z}_{p+s,X} - R^{Z}_{p+s,Y}$ vs. $R^{Z}_{p-s,X} + R^{Z}_{p-s,Y}$ |
| Zunger (1981) [90] | MN non-octet compounds | $R^{Z}_{p+s,M} - R^{Z}_{p+s,N}$ vs. $R^{Z}_{p-s,M} + R^{Z}_{p-s,N}$ |
| Villars (1983) [91] | AB compounds excluding hP4 AsNi type compounds | $\sum V_{AB}$ vs. $\Delta X^{M\&B}_{AB}$ vs. $\Delta R^{Z}_{s+p,AB}$ |
| Burdett-McLaran (1984) [92] | TU transition-transition metal compounds | $\bar{N}_{TU}$ vs. $\Delta E^{H\&S}_{TU}$ |
| Pettifor (1988) [93] | AB compounds | $MN^P_A$ vs. $MN^P_B$ |
| Walzer (1990) [94] | XT simple metal-transition metal compounds | $R^{W}_{sp21} + R^{W}_{sp22}$ vs. $R^{W}_{sp01} + R^{W}_{sp02}$ |
| Harada et al. (1997) [95] | Aluminides, silicides, and transition metal based compounds | d-orbital energy level of transition elements vs. bond order |

* Used abbreviations

| Symbol | Meaning |
|---|---|
| X, Y | non-transition elements |
| Z | A group cations |
| T, U | transition elements |
| M, N | s, p, and d elements |
| A, B | s, p, d, and f elements |
| $\bar{Q}_N$ | average principal quantum number |
| $X^{Pauling}$, $X^{W\&B}$, $X^{M\&B}$ | electronegativity of Pauling, Watson and Bennett, and Martynov and Batsanov |
| $R_{p-s}$, $R_{p+s}$, $R^{J\&B,A,Z}_{s+p}$ | pseudopotential radius of St John and Bloch, Andreoni et al., and Zunger |
| $R^{W}_{sp\,01,02,21,22}$ | pseudopotential radius after Walzer |
| $R^{B\&S}_{0,1,2}$ | renormalized orbital radius after Bloch and Schatteman |
| $N_v$ | number of electron vacancies per atom |
| $S$ | s-p factor |
| $\bar{N}$ | average number of valence s+d electrons per atom |
| $V$ | number of valence electrons (group number) |
| $E^{H\&S}$ | d-orbital energy after Herman and Skillman |
| $MN^P$ | Mendeleev number of Pettifor |

Table 2.2: Summary of AB structure maps. Note that most of the above structure maps were done for AB2, AB3, A2B3, and A3B5. (modified from [81])

2.2.1.2 Ternary Structure Maps

Villars Map: A good example of a ternary structure map is the Villars map. In 1983, Villars achieved excellent results in separating the crystal structures of AB intermetallic binary compounds by using three key descriptors (electronegativity differences, orbital radii differences, and averaged valence-electron numbers) [91]. To generalize from elements to ternary systems, Villars defined generalized coordinates [96]. For the binary system AxBy with x≤y and x+y=1, the atomic parameters, built from elemental variables, are

• Average valence-electron number

$$\bar{N}_v = x(N_v)_A + y(N_v)_B \qquad (2.1)$$

• Weighted electronegativity differences

$$\Delta X = 2x(X_A - X_B) \qquad (2.2)$$

• Weighted differences of Zunger’s pseudopotential radii sums

$$\Delta R = 2x\{(r_s^A + r_p^A) - (r_s^B + r_p^B)\} \qquad (2.3)$$

For ternary compounds of the form AxByCz, with x≤y≤z and x+y+z=1, the corresponding parameters are

• Average valence-electron number

$$\bar{N}_v = x(N_v)_A + y(N_v)_B + z(N_v)_C \qquad (2.4)$$

• Weighted electronegativity differences

$$\Delta X = 2x(X_A - X_B) + 2x(X_A - X_C) + 2y(X_B - X_C) \qquad (2.5)$$

• Weighted differences of Zunger’s pseudopotential radii sums

$$\Delta R = 2x\{(r_s^A + r_p^A) - (r_s^B + r_p^B)\} + 2x\{(r_s^A + r_p^A) - (r_s^C + r_p^C)\} + 2y\{(r_s^B + r_p^B) - (r_s^C + r_p^C)\} \qquad (2.6)$$

Villars’ linear weighting scheme has been effectively used by researchers for creating structure maps to separate ternary structures [95-97]. For example, the above three descriptors were used to make golden coordinates (quantum structural diagram: QSD, Figure 2.4 (a)). Villars’ QSD and the Matthias profile ($N_v$ vs. $T_c$) allow us to identify regions where high $T_c$ superconductors are located [97]. When high temperature superconductors are defined by $T_c$>10 K, the Villars map separates three domains A, B, and C, as shown in Figure 2.4. Domain A is dominated by the A15 (or Cr3Si) family, and domain B by the NbN family. The Chevrel sulfides and perovskites sit in domain C. Villars structure maps help us accelerate the search for functionality and structure from spatial configurations.

Extended Pettifor Map: Pettifor has extended the Mendeleev sequencing number for structure maps to pseudobinary and ternary phases. To this end, he introduced the concept of an average Mendeleev number $\overline{MN}$ for pseudobinaries. For quaternary phases with constituent atoms A-B-C-D, it is assumed that the added elements C and D occupy the A and B sites, respectively. For example, the alloy (AxC1-x)(ByD1-y)3 is treated as the pseudobinary AB3. The average Mendeleev numbers can be expressed as

$$\overline{MN}_A = x\,MN_A + (1-x)\,MN_C \quad \text{and} \quad \overline{MN}_B = y\,MN_B + (1-y)\,MN_D \qquad (2.7)$$

Pettifor applied this method to ternary borides, sulfides, selenides, and tellurides to create structure maps such as $A_lB_{m>l}(C=S,Se,Te)_n$ or $A_lB_{m>1}(C=\text{boron})_n$. A summary of ternary structure maps is shown in Table 2.3. The spinel related structure maps in Table 2.3 are described in chapter 5.
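Villars' linear weighting of Eqs. (2.1)-(2.6) follows one pattern for binaries and ternaries: constituents are ordered by increasing fraction, and each pairwise difference is weighted by twice the smaller fraction. The sketch below implements that pattern under this reading of the equations. The function name `villars_coordinates` and the elemental input values are hypothetical placeholders, not tabulated data.

```python
def villars_coordinates(constituents):
    """Villars-type coordinates from elemental values.

    constituents: list of (amount, N_v, X, r_s_plus_r_p) per element.
    Returns (average N_v, weighted delta X, weighted delta R).
    """
    total = sum(c[0] for c in constituents)
    # Normalize amounts to fractions and order them so that x <= y <= z
    parts = sorted(
        ((amt / total, nv, x, r) for amt, nv, x, r in constituents),
        key=lambda c: c[0],
    )
    n_v = sum(frac * nv for frac, nv, _, _ in parts)
    delta_x = 0.0
    delta_r = 0.0
    for i in range(len(parts)):
        for j in range(i + 1, len(parts)):
            weight = 2.0 * parts[i][0]  # twice the smaller fraction of the pair
            delta_x += weight * (parts[i][2] - parts[j][2])
            delta_r += weight * (parts[i][3] - parts[j][3])
    return n_v, delta_x, delta_r


# Hypothetical elemental values for a binary AB3 and a ternary AB2C4 compound
binary = villars_coordinates([(1, 3, 1.5, 2.0), (3, 5, 2.5, 1.0)])
ternary = villars_coordinates([(1, 4, 1.9, 1.4), (2, 4, 2.0, 1.2), (4, 5, 3.0, 0.5)])
print("binary :", binary)
print("ternary:", ternary)
```

For the binary case the loop reduces to Eqs. (2.1)-(2.3), and for three constituents it yields the three pairwise terms of Eqs. (2.5) and (2.6).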


Figure 2.4: (a) The Villars map (or quantum structural diagram) for 67 high Tc (>10K) binary/ternary superconductors (b) Matthias profiles [97]

2.2.2 Traditional Mixing Strategy in Materials Design

Although the structure map approach is convenient for identifying the structure for a given stoichiometry, it is highly empirical and only describes pairwise correlations between variables. However, a material property is a relational concept, because it is a relationship between a change of environment and the material’s response to this external environmental change [98]. The change of an environment is described by intensive variables, while the extensive parameters represent the response of the material. Therefore, a property is well described only by considering all of the variables governing it in an equilibrium system. Most materials modeling approaches, including structure maps, are based on the use of databases of crystal structure, electronic structure, and thermochemistry. Each of these databases by itself provides information on hundreds of binaries, ternaries, and other multicomponent systems. Furthermore, a database can be enlarged to permit a wide array of simulations for thousands of combinations of material chemistries. In principle, mixing and control of related elemental parameters in the periodic table is the main theme of materials design. This is due to the lack of information on multicomponent systems, as shown in Table 2.1. The degree of complexity in materials design increases with the number of components.


| Structure maps and references | Class of compound* | Coordinates* |
|---|---|---|
| Kugimiya-Steinfink (1968) [99] | AB2C4 (C=S, Se, Te) compounds | $K_{AB}$ vs. $R^{A}_{i,A} / R^{A}_{i,B}$ |
| Hill et al. (1979) [35] | AB2C4 (C=S, Se, Te) spinels | Ionic radii, lattice constant, and internal anion parameter |
| Burdett et al. (1981, 1982) [100, 101] | AB2C4 spinels | $r_\sigma^A = r_s^A + r_p^A$ vs. $r_\sigma^B = r_s^B + r_p^B$ |
| Villars (1988) [97] | Superconducting compounds | $\Delta X^{M\&B}$ vs. $\Delta R$ |
| Hovestreydt (1988) [102] | ABC compounds where A: rare-earth, B: transition-metal, C: silicides, germanides, and gallides | $R^{T,G\&W}_{m,C} - R^{T,G\&W}_{m,A}$ vs. $V_B + V_C + PN_C$ vs. $X_B^{M\&B}$ |
| Haeuseler (1990) [103] | AxByC4 (C=O, S) spinels | $R_\sigma = (xR_\sigma^{AX} + yR_\sigma^{BX})/(x+y)$ vs. $R_\pi = (xR_\pi^{AX} + yR_\pi^{BX})/(x+y)$ |

* Used abbreviations

| Symbol | Meaning |
|---|---|
| $K_{AB}$ | $X_A^G X_B^G / (R^{A}_{i,e})^2$, where $X_A^G$ and $X_B^G$ are electronegativities of Gordy |
| $(R^{A}_{i,e})^2$ | $(R^{A}_{i,d} + R^{A}_{i,c})^2 + (R^{A}_{i,B} + R^{A}_{i,c})^2 + 1.155(R^{A}_{i,A} + R^{A}_{i,c})$ |
| $R^{A}_{i,A}$, $R^{A}_{i,B}$, $R^{A}_{i,C}$ | Ahrens’s ionic radii of the elements A, B, and C |
| $R_{s+p}$ | the crossing points of the non-local pseudopotentials |
| $R^{T,G\&W}_{m,C} - R^{T,G\&W}_{m,A}$ | difference in metallic radii of C and A atoms as defined by Teatum et al. |
| $V_B + V_C + PN_C$ | sum of group number V and periodic number PN of B and C atoms |
| $X_B^{M\&B}$ | Martynov and Batsanov’s electronegativity of the atom B |
| $R_\sigma^{AB} = (r_s^A + r_p^A) - (r_s^B + r_p^B)$ | the difference between the total effective core radii of atoms A and B |
| $R_\pi^{AB} = r_s^A - r_p^A + r_s^B - r_p^B$ | the sum of the orbital nonlocality of the s and p electrons on each site |

Table 2.3: Summary of ternary structure maps

Therefore, a design tool for handling a wide variety of chemistries from elements to multicomponent systems is indispensable. As discussed in the section above, Villars’ approach [96], which is based on a linear weighting scheme, has been proposed to extend elemental properties to multicomponent systems. The advantages of these linear mixing approaches are their simplicity and their good compatibility between binary and ternary systems. Another good example is Miedema’s model for linear mixing of elemental properties to describe the properties of compounds. By Miedema’s rule, the heat of mixing in a binary alloy system consists of a negative contribution from the electronegativity difference between the two elemental constituents and a positive contribution from their difference in electron densities [104]. For a binary alloy, the energy of mixing is given by

$$\Delta H^{0}_{i\,\mathrm{in}\,j} = f(\text{electronegativity, electron density of each constituent}) \qquad (2.8)$$

For a ternary alloy, the extended Miedema model is expressed as

$$\Delta H_{ABC} = \Delta H_{AB} + \Delta H_{BC} + \Delta H_{AC} \quad \text{and} \quad \Delta H_{ij} = x_i x_j \left( x_j \Delta H^{0}_{i\,\mathrm{in}\,j} + x_i \Delta H^{0}_{j\,\mathrm{in}\,i} \right) \qquad (2.9)$$

where $x_i$ and $x_j$ are the atomic compositions of constituents i and j. The extended Miedema model works well and has been validated in many studies [104-108]. Consequently, Villars’ method and Miedema’s rule decompose multicomponent systems into binary problems by linear weight functions of each constituent element. After each property of the compounds is calculated, it is used in structure maps in a bivariate way for the classification of structure types (e.g. Figure 2.4), and the results are compared with experimental data on compounds.
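The pairwise decomposition in Eq. (2.9) can be written out in a few lines. The sketch below is only an illustration of the bookkeeping: the binary terms $\Delta H^0_{i\,\mathrm{in}\,j}$ of Eq. (2.8) are supplied as inputs rather than computed from electronegativities and electron densities, and all numerical values and function names are hypothetical.

```python
from itertools import combinations


def pair_enthalpy(x_i, x_j, h_i_in_j, h_j_in_i):
    """Binary contribution of Eq. (2.9): x_i x_j (x_j dH0(i in j) + x_i dH0(j in i))."""
    return x_i * x_j * (x_j * h_i_in_j + x_i * h_j_in_i)


def multicomponent_enthalpy(x, h0):
    """Sum the binary contributions over all unordered pairs of constituents.

    x:  dict of atomic compositions, e.g. {"A": 0.5, ...}
    h0: dict mapping ordered pairs (i, j) to dH0 of i in j
    """
    return sum(
        pair_enthalpy(x[i], x[j], h0[(i, j)], h0[(j, i)])
        for i, j in combinations(sorted(x), 2)
    )


# Hypothetical compositions and binary terms (arbitrary energy units)
x = {"A": 0.5, "B": 0.3, "C": 0.2}
h0 = {
    ("A", "B"): -10.0, ("B", "A"): -20.0,
    ("A", "C"): 5.0, ("C", "A"): 10.0,
    ("B", "C"): 0.0, ("C", "B"): 0.0,
}
dh_abc = multicomponent_enthalpy(x, h0)
print(f"dH_ABC = {dh_abc:.3f}")
```

Because only binary terms enter, the ternary estimate inherits both the simplicity and the limits of the underlying binary model.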

2.3 New Mixing Strategy in Materials Design

In this thesis, new types of mixing rules developed using PCA and PLS are proposed. PCA provides a classification tool for the properties of compounds, while PLS is used to predict the properties of compounds. The new mixing strategy builds on the linear mixing scheme of Villars’ or Miedema’s rules. While Villars’ and Miedema’s rules simply depend on the stoichiometric ratio of each constituent, the weights of the newly defined variables in PCA/PLS are determined simultaneously by maximizing variance and covariance, respectively, after Villars’ or Miedema’s parameterizations. Detailed explanations of the PCA and PLS techniques are given in appendices B and C. The new mixing approach used in this thesis is contrasted with the traditional approach in Table 2.4.


| Rule of mixing | Villars / Miedema | New mixing rules based on PCA/PLS |
|---|---|---|
| Element to multicomponent ($P_{com}$) | 1) Elemental properties ($P_{ele}$) of A and B. 2) Property ($P_{bin}$) of AxBy: $P_{bin} = xP_{ele}^{A} + yP_{ele}^{B}$. 3) Property ($P_{ter}$) of AxByCz: $P_{ter} = xP_{ele}^{A} + yP_{ele}^{B} + zP_{ele}^{C}$, and so on. | 1) Given n $P_{com}$s from Villars’ / Miedema’s mixing rules. 2) Define new variables $V_1 = \alpha_1 P_{com}^{1} + \alpha_2 P_{com}^{2} + \alpha_3 P_{com}^{3} + \cdots + \alpha_n P_{com}^{n}$, $V_2 = \beta_1 P_{com}^{1} + \beta_2 P_{com}^{2} + \beta_3 P_{com}^{3} + \cdots + \beta_n P_{com}^{n}$, and so on. Note: each coefficient is determined by maximizing variance and covariance in PCA and PLS, respectively. |
| Classification | Bi-variate structure maps (e.g. axes will be $P_{com}^{n1}$ vs. $P_{com}^{n2}$) | Multi-variate structure maps by PCA (e.g. axes will be $V_{n1}$ vs. $V_{n2}$) |
| Prediction | Limited in structure maps | Target property: QSAR type prediction model by PLS. Note: the predicted target property can be used in PCA for future structure maps. |

Table 2.4: Villars’ and Miedema’s mixing rules versus new mixing rule based on PCA/PLS

The flow charts for mixing strategies are shown in Figure 2.5. The advantages of a new mixing rule based on PCA/PLS over Villars’ and Miedema’s rules include its handling of multivariate and length scale problems, statistically robust approach, and ability to capture the nonlinearity by using higher order interaction terms (i.e. minute adjustment of property prediction).
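As a rough illustration of how PCA defines the new variables $V_1$, $V_2$, ... as linear combinations of compound properties, the sketch below autoscales a matrix of compound properties and extracts principal components by singular value decomposition. It assumes NumPy is available; the matrix `P_com` holds synthetic random numbers rather than thesis data, and the function name `pca_scores` is hypothetical.

```python
import numpy as np


def pca_scores(P_com):
    """PCA of a (compounds x properties) matrix after autoscaling each column."""
    Z = (P_com - P_com.mean(axis=0)) / P_com.std(axis=0, ddof=1)
    # Rows of Vt hold the weights (loadings) of each property in V1, V2, ...
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U * s                      # values of V1, V2, ... for each compound
    explained = s ** 2 / np.sum(s ** 2)  # fraction of total variance per component
    return scores, Vt, explained


# Synthetic example: 10 hypothetical compounds, 4 properties, one nearly collinear
rng = np.random.default_rng(0)
P_com = rng.normal(size=(10, 4))
P_com[:, 3] = P_com[:, 0] + 0.1 * rng.normal(size=10)

scores, loadings, explained = pca_scores(P_com)
print("explained variance ratios:", np.round(explained, 3))
print("V1 weights:", np.round(loadings[0], 3))
```

Plotting the first two columns of `scores` against each other would give a multivariate analogue of a bivariate structure map, with axes $V_1$ and $V_2$ instead of two individual compound properties.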


[Figure 2.5 flow chart. Step 1 (both approaches): elemental properties are converted to compound properties by Villars’ / Miedema’s approach. Villars’ / Miedema’s approach, Step 2: classification by structure map, with compound properties as axes (bi-variate problem). New mixing approach based on PCA/PLS, Step 2: linear combinations of each compound property; Step 3: PCA classification by structure map, with linear combinations of each compound property as axes (multi-variate problem); Step 4: PLS prediction by QSAR formulation, with the target property expressed through linear combinations of each compound property (multi-variate problem).]

Figure 2.5: Flow charts of Villars’ and Miedema’s approaches versus PCA/PLS approach in mixing strategies from elemental properties of element A and B to binary properties of AxBy


2.4 Summary

In this chapter, a new mixing strategy is proposed. The new mixing rule, based on PCA/PLS, is used to create new variables which are linear combinations of compound variables. Each compound variable is calculated from elemental properties weighted by Villars’ or Miedema’s scheme. This new mixing strategy in materials design has the advantages of interpreting the influences of all variables simultaneously and of predicting target properties more precisely by considering all aspects of multiple variables.


CHAPTER 3
BULK MODULUS AND DESCRIPTORS RELATIONSHIPS

3.1 Introduction

The search for and design of new materials can be significantly aided by combinatorial experiments. However, the key to minimizing the search process in combinatorial experiments is to identify the key combinations that achieve the desired functionality in the class of materials being studied. The concept of virtual combinatorial experiments [109] for materials selection and design strategy is useful to show how one may design combinatorial libraries a priori by integrating data mining techniques with statistically robust multivariate data. This involves a process of strategically selecting appropriate physical parameters that can be analyzed in a multivariate manner. The analysis can lead to the identification and development of new material chemistries that show promise for the desired functionality. In this chapter, explicit quantitative relationships are developed which identify the relative contributions of different data descriptors and the resulting relationships between all these descriptors, as a linear combination, and the final property (i.e. QSPR/QSAR). QSPR/QSAR studies are based on regression methods that are widely used in machine learning algorithms [11]. With the identified structure-property relationship, any other compounds, even unsynthesized ones, can be quickly screened for desired structures and properties [110].

The foundation of this study is a combinatorial analysis of first principles derived data on crystal chemistry and bulk modulus. The engineering motivation of such a study is that the use of combinatorial and informatics based methods can help in rapid screening and identification of new materials chemistries for hard materials. As noted by McMillan [111], the search for alternative superhard materials, some of which may even be harder than diamond, has stimulated high-pressure research for over half a century. The approaches have been of two kinds: developing computational predictions from first principles, or using experimental means that take advantage of new advances in instrumentation. Despite this progress, our knowledge base on possible new material chemistries is still limited to a relatively small number of systems. In this chapter, the use of informatics techniques to accelerate this screening process is described. While there is a major concern in the combinatorial experimentation community with integrating data mining tools into high throughput experimentation data, we are developing a parallel strategy of creating a computational informatics infrastructure.

The question that we can address via the QSAR formulation is how rapidly can we identify and test new materials without the need for a prohibitive array of complex and expensive experiments or computations? The traditional approaches to a “materials-by-design” strategy would call for a general survey of likely materials and a detailed set of experiments or computations on a few of them to assess their efficacy for a specific application. The basic tenet of the “materials-by-design” strategy is to identify combinations of mixed metal sequences for a given stoichiometry of nitrides that can serve as promising candidates for the desired properties. To experimentally synthesize and characterize a vast array of possible compositions is of course not realistic. Similarly, despite the advances and sophistication of first principles predictions, it is equally prohibitive to computationally predict properties for a large combinatorial array of crystal chemistries. Our approach is unique in that it blends data derived from both experiments and computation to serve as a “training” data set and integrates it into a large elemental database. This database contains a vast array of property information that is relevant to issues of interest, such as thermal expansion, compressibility, and modulus.

3.2 Quantitative Methodology

The fundamental strategy of our work is to apply multivariate analysis to the multiple parameters used in the theoretical calculations to quantitatively assess the statistical relationship of each of the numerous descriptors of each compound studied. We wish to explore the individual correlations of the specific variables (i.e. latent variables: LV) used in the ab initio calculations, which take into account their relative impact on final properties.


[Figure 3.1 flowchart. Boxes: Existing DATABASE; Choose samples and variables for modeling; Normalization; Outlier detection by preliminary test of PLS or PCA; Want to keep the samples and variables? (YES/NO); Assign training set and test set; Perform PLS for training set; Choose the optimum number of latent variables; Calculations of model parameters; Statistically appropriate parameters? (YES/NO); Apply prediction model to test set; Test prediction model for external test set (unknown samples); Validation by experiments and/or theory... Reasonable? (YES/NO); Expanded DATABASE.]

Figure 3.1: A flowchart of PLS modeling

The Partial Least Squares (PLS) regression method is particularly appropriate for QSAR formulations because it is used to predict properties based on variables (even some which may have only indirect impact) which collectively relate to properties of interest. In addition, model parameters in PLS can be more accurately calculated with an increasing number of relevant variables and observations [112]. PLS also has an advantage over multiple linear regression because it can handle collinearity and missing data [112]. To cope with collinearity (for example, Mulliken’s effective charges, Q*oct


and Q*N, in Table 3.1) in a few variables in this chapter, PLS regression is chosen as the modeling approach for developing a QSAR type formulation. The procedures for PLS analysis are shown in Figure 3.1, and detailed geometrical and mathematical descriptions of PLS can be found in appendix C. The PLS analysis results in an equation that is a linear combination of the descriptors, obtained through the latent variables. All descriptors were autoscaled because they have different ranges and scales; each column of the autoscaled data matrix has a mean of zero and unit variance. Cross validation is used to choose the optimum number of LVs of the calibration model. In this thesis, the NIPALS algorithm was used and the appropriate number of LVs was chosen by the leave-one-out (LOO) cross validation method. In LOO, a PLS model is fitted to n-1 samples and used to predict the y value of the omitted sample ($\hat{y}_i$) [113]. This calculation is repeated for every sample in the data matrix. In this process, a few model parameters are calculated: RMSEC (root mean square error of calibration), RMSECV (root mean square error of cross validation), R2 (coefficient of determination), and Q2 (the cross validated coefficient of determination). With n denoting the number of calibration samples, the model parameters are defined as follows [114, 115]. The model fit by cross validation is discussed in the work of Hawkins et al. [116].

• PRESS: a measure of the predictive ability of the prediction model

$$\mathrm{PRESS} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{3.1}$$

• RMSECV

$$\mathrm{RMSECV} = \sqrt{\frac{\mathrm{PRESS}}{n}} \tag{3.2}$$

• RMSEC: a measure of how well the model fits the data

$$\mathrm{RMSEC} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}} \tag{3.3}$$

where, in equation (3.3), $\hat{y}_i$ are the values of the predicted variable when all samples are used in the model [117].

• R2: a quantitative measure of the goodness of fit

$$R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{SSX}_{tot.corr.}} \tag{3.4}$$

where RSS is the residual sum of squares and SSXtot.corr. represents the total variation in the data matrix after mean centering.

• Q2: a measure of the goodness of prediction

$$Q^2 = 1 - \frac{\mathrm{PRESS}}{\mathrm{SSX}_{tot.corr.}} \tag{3.5}$$

3.3 Results and Discussion

3.3.1 Spinel Database and Selection of Descriptors for Modeling

The database foundation for our study was taken from the work of Ching et al. [1], who investigated the influence of lattice constants, bulk moduli, band structures, and electronic bonding on the lattice stability of thirty-nine single and double nitrides. A portion of the variables is shown in Table 3.1. Ching et al. explored a portion of the combinatorial space involving cubic A3N4 and cubic AB2N4 compounds, where A and B can be either Group IVA or Group IVB elements (Figure 1.2). A series of studies has systematically explored up to 39 of the 49 possible compounds [1, 33, 34]. While one may in principle repeat these complex first principles calculations, the issue we wish to address in this chapter is whether one can develop quantitative structure-property relationships (QSAR) to predict properties without arduously repeating first principles calculations and fitting the energy-volume curve to an equation of state. Furthermore, we describe how we have used an informatics approach to gain further insight into this theoretically derived data set and hence provide a means of rapidly enhancing critical data for structure-property relationships in materials. In addition to the descriptors described by Ching et al. [1], electronegativity was added to our analysis because it is a good indicator of the ionic and covalent character of bonds.


ID | Crystal (Spinel) | EN | lc (Å) | u | BLA-N | BLB-N | Q*tet | Q*oct | Q*N | B (GPa)
1 | C3N4 | -0.4114 | 6.8952 | 0.3832 | 1.584 | 1.673 | 3.70 | 3.63 | 5.27 | 377.6
2 | Si3N4 | -0.7457 | 7.8374 | 0.3844 | 1.831 | 1.885 | 2.65 | 2.58 | 6.05 | 280.1
3 | Ge3N4 | -0.7371 | 8.2112 | 0.3841 | 1.907 | 1.982 | 2.81 | 2.80 | 5.90 | 268.6
4 | Sn3N4 | -0.8314 | 8.9651 | 0.3845 | 2.092 | 2.165 | 2.71 | 2.70 | 5.97 | 203.6
5 | Ti3N4 | -0.8486 | 8.4460 | 0.3832 | 1.949 | 2.045 | 3.09 | 3.20 | 5.62 | 265.6
6 | Zr3N4 | -0.9857 | 9.1215 | 0.3830 | 2.109 | 2.206 | 3.06 | 3.17 | 5.65 | 225.3
7 | Hf3N4 | -0.9600 | 8.7038 | 0.3815 | 1.982 | 2.121 | 3.17 | 2.97 | 5.72 | 328.6
8 | CSi2N4 | -0.5229 | 7.5209 | 0.3811 | 1.714 | 1.832 | 4.14 | 4.44 | 4.75 | 309.5
9 | CGe2N4 | -0.5200 | 7.7429 | 0.3700 | 1.616 | 1.970 | 3.67 | 2.79 | 5.68 | 266.0
10 | SiGe2N4 | -0.7429 | 8.0873 | 0.3772 | 1.790 | 2.000 | 3.10 | 2.91 | 5.77 | 277.1
11 | CTi2N4 | -0.5571 | 7.8351 | 0.3637 | 1.550 | 2.046 | 3.71 | 3.23 | 5.46 | 300.3
12 | SiTi2N4 | -0.7800 | 8.2168 | 0.3753 | 1.785 | 2.051 | 2.51 | 3.31 | 5.72 | 274.5
13 | GeTi2N4 | -0.7743 | 8.4002 | 0.3829 | 1.940 | 2.032 | 2.90 | 3.18 | 5.68 | 253.2
14 | ZrTi2N4 | -0.9400 | 8.6806 | 0.3868 | 2.056 | 2.072 | 4.10 | 1.14 | 6.40 | 179.0
15 | SnTi2N4 | -0.8371 | 8.6340 | 0.3888 | 2.087 | 2.045 | 2.75 | 3.22 | 5.70 | 167.6
16 | ZrHf2N4 | -0.9771 | 8.9223 | 0.3853 | 2.091 | 2.143 | 2.82 | 3.09 | 5.75 | 258.3
17 | SiC2N4 | -0.6343 | 7.2867 | 0.3885 | 1.754 | 1.725 | 2.45 | 3.71 | 5.53 | -
18 | GeC2N4 | -0.6286 | 7.4289 | 0.3942 | 1.863 | 1.723 | 2.85 | 3.67 | 5.45 | -
19 | GeSi2N4 | -0.7400 | 8.0008 | 0.3900 | 1.946 | 1.885 | 3.02 | 2.53 | 5.98 | -
20 | TiC2N4 | -0.7029 | 7.5400 | 0.3937 | 1.883 | 1.753 | 2.97 | 3.77 | 5.37 | -
21 | TiSi2N4 | -0.8143 | 8.0470 | 0.3898 | 1.956 | 1.896 | 2.62 | 3.65 | 5.56 | -
22 | TiGe2N4 | -0.8114 | 8.3159 | 0.3836 | 1.932 | 2.006 | 3.08 | 2.91 | 5.77 | -
23 | TiZr2N4 | -0.8943 | 8.9276 | 0.3800 | 2.017 | 2.163 | 3.14 | 3.14 | 5.64 | -
24 | CSn2N4 | -0.5514 | 8.3600 | 0.3636 | 1.650 | 2.187 | 3.65 | 2.76 | 5.71 | -
25 | SnC2N4 | -0.6914 | 7.7625 | 0.3988 | 2.007 | 1.772 | 2.70 | 3.71 | 5.47 | -
26 | CZr2N4 | -0.6029 | 8.5091 | 0.3674 | 1.738 | 2.189 | 3.78 | 3.16 | 5.48 | -
27 | ZrC2N4 | -0.7943 | 7.8186 | 0.3965 | 1.991 | 1.799 | 2.55 | 4.43 | 5.15 | -
28 | SiSn2N4 | -0.7743 | 8.6279 | 0.3715 | 1.816 | 2.188 | 4.62 | 3.65 | 5.02 | -
29 | SnSi2N4 | -0.8029 | 8.2479 | 0.3948 | 2.076 | 1.909 | 2.96 | 3.33 | 5.60 | -
30 | SiZr2N4 | -0.8257 | 8.7484 | 0.3753 | 1.878 | 2.197 | 3.45 | 3.02 | 5.63 | -
31 | ZrSi2N4 | -0.9057 | 8.2912 | 0.3928 | 2.051 | 1.937 | 2.62 | 3.59 | 5.55 | -
32 | GeSn2N4 | -0.7685 | 8.7583 | 0.3795 | 1.965 | 2.151 | 2.88 | 2.66 | 5.95 | -
33 | SnGe2N4 | -0.8000 | 8.4615 | 0.3895 | 2.053 | 1.997 | 2.68 | 2.83 | 5.92 | -
34 | GeZr2N4 | -0.8200 | 8.9505 | 0.3807 | 2.028 | 2.188 | 3.00 | 3.12 | 5.69 | -
35 | ZrGe2N4 | -0.9029 | 8.5689 | 0.3872 | 2.040 | 2.029 | 2.99 | 2.96 | 5.77 | -
36 | TiSn2N4 | -0.8429 | 8.8175 | 0.3780 | 1.953 | 2.176 | 3.21 | 2.76 | 5.82 | -
37 | SnZr2N4 | -0.8829 | 9.1425 | 0.3859 | 2.152 | 2.191 | 2.80 | 3.17 | 5.72 | -
38 | ZrSn2N4 | -0.9343 | 9.0475 | 0.3814 | 2.059 | 2.206 | 3.08 | 2.80 | 5.83 | -
39 | HfZr2N4 | -0.9686 | 8.9922 | 0.3815 | 2.057 | 2.193 | 3.23 | 2.83 | 5.78 | -

Table 3.1: List of single and double nitrides which are used in this thesis, taken from Ching et al. [1-4] (EN: weighted electronegativity difference; lc, u, BLA-N, and BLB-N: crystallographic parameters; Q*tet, Q*oct, and Q*N: effective charges; B: bulk modulus, listed for the 16 compounds for which it is known)


Since the Pauling electronegativity scale is not well suited to bulk materials [118], we chose the Martynov-Batsanov electronegativity, which has a fully quantum mechanical basis [119]. The parameterization of electronegativity for ternary compounds is based on a linear weighting model that was originally proposed by Villars et al. [96, 109]. As shown in equation (2.5), for ternary compounds such as AxByCz, if x≤y≤z and x+y+z=1, then the weighted electronegativity difference is:

$$EN = \Delta X = 2x(X_A - X_B) + 2x(X_A - X_C) + 2y(X_B - X_C) \tag{3.6}$$

We also used crystallographic parameters and effective charges to predict the bulk modulus (see Table 3.1). However, since the lattice constant (lc) correlates with the cation radii and with the B-N bond length (BLB-N), the lattice constant column was excluded. Each individual crystallographic parameter is described in chapter 1.
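As an illustration of equation (3.6), the following minimal sketch evaluates the weighted electronegativity difference for AB2N4 spinels. The electronegativity values in `X_MB` are an assumption: they are not listed in this chapter and were chosen because they reproduce the EN column of Table 3.1 (for example, -0.5229 for CSi2N4). The function name `weighted_en` is hypothetical.

```python
# Assumed Martynov-Batsanov electronegativities (not given in this chapter);
# they were checked only against the EN column of Table 3.1.
X_MB = {"C": 2.37, "Si": 1.98, "N": 2.85}


def weighted_en(a, b, c="N", n_a=1, n_b=2, n_c=4):
    """Weighted electronegativity difference of equation (3.6) for A(n_a)B(n_b)C(n_c).

    The stoichiometric counts are normalised so that x + y + z = 1,
    with x <= y <= z as required by the equation.
    """
    total = n_a + n_b + n_c
    x, y = n_a / total, n_b / total
    xa, xb, xc = X_MB[a], X_MB[b], X_MB[c]
    return 2 * x * (xa - xb) + 2 * x * (xa - xc) + 2 * y * (xb - xc)


en_csi2n4 = weighted_en("C", "Si")   # double nitride CSi2N4
en_c3n4 = weighted_en("C", "C")      # single nitride C3N4 (A = B)
print(round(en_csi2n4, 4), round(en_c3n4, 4))
```

For a single nitride the first term vanishes and the expression reduces to (6/7)(X_A - X_N), which is why the single nitrides in Table 3.1 scale directly with the cation electronegativity.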

3.3.2 Identification of Outliers

Because outliers degrade the predictive power of a model, detecting them in a given database is critical. Moreover, outliers provide good opportunities to interpret their unique physical behavior. PLS can be used to identify outliers. With the 16 compounds whose bulk moduli are known (Table 3.1), a preliminary PLS analysis was used to detect outliers in the dataset. Outliers can be identified from a plot of leverage versus studentized residuals. Here, leverage is defined as the influence that a given sample has on a model. The studentized (autoscaled) residual measures the lack of fit of the y-value of a sample [117]. c-ZrTi2N4 (#14) has high leverage in the residual versus leverage plot (Figure 3.2) and also deviates from the slope 1 line shown in Figure 3.3. Therefore c-ZrTi2N4 was identified as an outlier and was not included in the PLS analysis.
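The two diagnostics described above can be sketched for a generic linear model. This is only an ordinary least squares analogue written under stated assumptions, not the PLS software used in this thesis: leverage is taken as the diagonal of the hat matrix built from a score (or descriptor) matrix `T`, and the residual is internally studentized. The function name and the synthetic data are hypothetical.

```python
import numpy as np


def leverage_and_studentized(T, y):
    """Return (leverage, studentized residual) for each sample.

    T : (n, k) matrix of scores or descriptors, y : (n,) response.
    An intercept column is added, so the leverages sum to k + 1.
    """
    n, k = T.shape
    T1 = np.column_stack([np.ones(n), T])
    H = T1 @ np.linalg.pinv(T1)            # hat (projection) matrix
    h = np.diag(H)                         # leverage of each sample
    resid = y - H @ y                      # calibration residuals
    s = np.sqrt(resid @ resid / (n - k - 1))
    return h, resid / (s * np.sqrt(1.0 - h))


# Synthetic demonstration: sample 3 is given a deliberately wrong y value.
rng = np.random.default_rng(1)
T_demo = rng.normal(size=(16, 2))
y_demo = T_demo @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=16)
y_demo[3] += 10.0
h_demo, r_demo = leverage_and_studentized(T_demo, y_demo)
print(int(np.argmax(np.abs(r_demo))))
```

Samples that combine a large studentized residual with a large leverage are the ones that most distort a fitted model, which is the reasoning behind plots such as Figure 3.2.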


[Figure 3.2 plot: Y studentized residual versus leverage; points are labeled with the compound IDs of Table 3.1.]

Figure 3.2: Studentized residuals versus leverage for 16 compounds whose bulk moduli are known. All numbers in the figure correspond to the spinels in Table 3.1.

[Figure 3.3 plot: Y CV residual versus Y residual, with the slope 1 line; points are labeled with the compound IDs of Table 3.1.]

Figure 3.3: Cross-validated (CV) Y residual versus Y calibration residual for 16 compounds whose bulk moduli are known. The solid line represents the slope 1 line.



3.3.3 PLS Analysis with Different Models

As shown in Table 3.2, the training and test sets for models I and III were carefully chosen. To determine the number of latent variables that avoids overfitting the data, we tested different models (I, II, and III) based on the selection of latent variables (the results are summarized in Table 3.3). For the PLS calculation, the NIPALS algorithm was used and the appropriate number of LVs was chosen by the leave-one-out (LOO) cross validation method. In model I, four LVs were chosen, which explained 99.8% of the variance in the descriptors. The calculated model parameters were RMSECV = 15.84 GPa, RMSEC = 6.17 GPa, R2 = 0.96, and Q2 = 0.76. After LOO was used as an internal validation, the test set was used as an external validation. Figure 3.4 shows predicted versus ab initio derived bulk modulus for model I with the resulting QSAR formulation:

$$\text{Bulk Modulus} = -1.00096\,EN - 0.35682\,u - 0.77228\,BL_{A\text{-}N} - 0.83367\,BL_{B\text{-}N} + 0.03296\,Q^*_{tet} + 0.18484\,Q^*_{oct} - 0.13503\,Q^*_{N} \tag{3.7}$$

where EN is the weighted electronegativity difference, u is the internal anion parameter, BLA-N is the A-N bond length, BLB-N is the B-N bond length, and Q*tet, Q*oct, and Q*N are the Mulliken effective charges for the tetrahedral site ion, the octahedral site ion, and the N ion, respectively.

A \ B | C | Si | Ge | Sn | Ti | Zr | Hf
C | C3N4 (te) | CSi2N4 (tr) | CGe2N4 (te) | CSn2N4 | CTi2N4 (tr) | CZr2N4 |
Si | SiC2N4 | Si3N4 (tr) | SiGe2N4 (tr) | SiSn2N4 | SiTi2N4 (te) | SiZr2N4 |
Ge | GeC2N4 | GeSi2N4 | Ge3N4 (te) | GeSn2N4 | GeTi2N4 (tr) | GeZr2N4 |
Sn | SnC2N4 | SnSi2N4 | SnGe2N4 | Sn3N4 (tr) | SnTi2N4 (te) | SnZr2N4 |
Ti | TiC2N4 | TiSi2N4 | TiGe2N4 | TiSn2N4 | Ti3N4 (tr) | TiZr2N4 |
Zr | ZrC2N4 | ZrSi2N4 | ZrGe2N4 | ZrSn2N4 | ZrTi2N4 (outlier) | Zr3N4 (tr) | ZrHf2N4 (tr)
Hf | | | | | | HfZr2N4 | Hf3N4 (te)

Table 3.2: Training and test sets for model I and III in combinatorial arrays of AB2N4 (rows: A cation, columns: B cation): (tr)-training set, (te)-test set, and bold- new external test set


[Figure 3.4 plot: PLS derived bulk modulus (GPa) versus ab initio derived bulk modulus (GPa), both axes from 150 to 400, for the training set and the test set. Labeled points: c-C3N4, c-Zr3N4, c-Hf3N4, c-ZrHf2N4, c-SnTi2N4.]

Figure 3.4: Predicted versus ab initio derived bulk modulus for the PLS model I

Model | R2 | Q2 | RMSECV (GPa) | RMSEC (GPa)
I (4 LV model) | 0.96 | 0.76 | 15.84 | 6.17
II (4 LV model) | 0.87 | 0.56 | 36.86 | 19.39
III (3 LV model) | 0.87 | 0.34 | 39.46 | 11.59

Table 3.3: Statistical parameters of each model

In model II, c-C3N4 and c-SnTi2N4, which have extreme values of bulk modulus as shown in Figure 3.4, were included in the training set (Table 3.2). The calculated model parameters for model II are shown in Table 3.3 and the results are shown in Figure 3.5.
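The statistics reported in Table 3.3 follow equations (3.1) to (3.5). The sketch below shows one way such numbers could be computed: a compact PLS1 regression with NIPALS-style deflation, autoscaling inside each fit, and leave-one-out cross validation. It is a minimal illustration on synthetic data, not the software used in this thesis. All function names are hypothetical, and the total corrected sum of squares of y is used as the denominator of R2 and Q2, which is an assumption about how the SSXtot.corr. term is applied to a single response.

```python
import numpy as np


def pls1_nipals(X, y, n_lv):
    """Regression vector of a PLS1 model for scaled X and centred y."""
    X, y = X.copy(), y.copy()
    n_var = X.shape[1]
    W = np.zeros((n_var, n_lv))
    P = np.zeros((n_var, n_lv))
    q = np.zeros(n_lv)
    for a in range(n_lv):
        w = X.T @ y
        w /= np.linalg.norm(w)           # weight vector
        t = X @ w                        # score vector
        tt = t @ t
        P[:, a] = X.T @ t / tt           # X loading
        q[a] = (y @ t) / tt              # y loading
        W[:, a] = w
        X -= np.outer(t, P[:, a])        # deflate X
        y -= q[a] * t                    # deflate y
    return W @ np.linalg.solve(P.T @ W, q)


def fit_predict(X_tr, y_tr, X_te, n_lv):
    """Autoscale with training statistics, fit PLS1, predict X_te."""
    mu, sd = X_tr.mean(axis=0), X_tr.std(axis=0, ddof=1)
    y_mu = y_tr.mean()
    b = pls1_nipals((X_tr - mu) / sd, y_tr - y_mu, n_lv)
    return ((X_te - mu) / sd) @ b + y_mu


def validation_stats(X, y, n_lv):
    """PRESS, RMSECV, RMSEC, R2 and Q2 as in equations (3.1) to (3.5)."""
    n = len(y)
    y_fit = fit_predict(X, y, X, n_lv)               # all samples in the model
    y_cv = np.array([
        fit_predict(np.delete(X, i, axis=0), np.delete(y, i), X[i:i + 1], n_lv)[0]
        for i in range(n)                            # leave one sample out
    ])
    press = np.sum((y - y_cv) ** 2)
    rss = np.sum((y - y_fit) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return {"PRESS": press, "RMSECV": np.sqrt(press / n),
            "RMSEC": np.sqrt(rss / n),
            "R2": 1 - rss / ss_tot, "Q2": 1 - press / ss_tot}


# Synthetic demonstration data (not the spinel nitride data set).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 4))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5, 1.5]) + 0.1 * rng.normal(size=20)
stats = validation_stats(X_demo, y_demo, n_lv=3)
print({k: round(float(v), 3) for k, v in stats.items()})
```

Repeating `validation_stats` for several values of `n_lv` and comparing R2 with Q2 mirrors the model selection summarized in Table 3.3, where a large gap between the two signals a less reliable model.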


[Figure 3.5 plot: PLS derived bulk modulus (GPa) versus ab initio derived bulk modulus (GPa), both axes from 150 to 400, for the training set and the test set.]

Figure 3.5: Predicted versus ab initio derived bulk modulus for the PLS model II

Since the difference between R2 and Q2 is quite large (~0.31), some relevant terms were not included in model II [115]. To see the effect of the number of LVs used (in this case 3 LVs, model III), we used the same training and test sets as in model I. Note the very low value of Q2 shown in Table 3.3. Since a balance of R2 and Q2 represents a desirable model [115], we chose model I as the final model. The PLS regression equation (3.7) provides the QSAR as a linear combination of all variables, and thus a means of computationally engineering the modulus of spinel nitrides. It should be noted that the apparent discrepancies of the outlying points do not necessarily detract from the statistical analysis but are actually of more value in identifying the limitations of the computational parameters associated with these compounds. Consider Figure 3.4: c-ZrHf2N4 and c-Hf3N4 have similar band structures, and c-SnTi2N4 and c-ZrHf2N4 are semimetals, while the others in the given data are metals or semiconductors [1]. c-C3N4 is in fact a system that has been of great computational controversy, primarily due to the complexity of the electronic structure associated with this system.

Based on our QSAR formulation, the role of the effective charge (Q*) in enhancing the modulus is particularly notable. This is consistent with theoretical studies [1, 118], which showed that the effective charge parameter helps to define the degree of charge transfer and the level of covalency associated with the specific site occupancy of a given species. This effective charge from ab initio calculations can be used as a major screening parameter in identifying promising crystal chemistries for promoting the modulus. Hence using PLS to develop a QSAR formulation, combined with an interpretation of the physics governing these materials, can indeed be valuable. Our predictions fit well with systems of similar electronic structure and allow for the clear identification of outliers based on these quantum mechanical calculations. As noted in section 3.3.2, PLS can be used to detect outliers as well.

Using the above QSAR formulation for model I, equation (3.7), the bulk moduli of a new external test set were predicted. For six samples, we compared the PLS derived bulk moduli with those of ab initio calculations (Figure 3.6). It should be noted that the comparison to theoretical calculations is the primary source of validation, as experimental studies in this field are extremely limited. Hence property prediction is essentially dependent on difficult and expensive computational approaches, and our QSAR strategy offers a “high throughput” computational tool to aid in designing crystal chemistry.


[Figure 3.6 plot: PLS derived bulk modulus (GPa) versus ab initio derived bulk modulus (GPa), both axes from 220 to 340. Labeled points: c-SiC2N4, c-TiC2N4, c-GeC2N4, c-TiGe2N4, c-TiZr2N4, c-GeSi2N4.]

Figure 3.6: Predicted versus ab initio derived bulk modulus for the six samples of the external test set using PLS model I. The ab initio derived values are taken from the literature [120].

3.3.4 Virtual Libraries

An informatics-derived database can be expanded from an initial set of theoretically-derived data. This forms a larger virtual library of structure-modulus relationships based on a smaller set of known data for a few compounds. This new structure map allows one to quickly examine data clustering and other trends, including apparent outliers, which otherwise would not be possible with the original computationally derived data set. Since the bulk moduli of spinel nitrides were calculated by PLS in this chapter, the expanded structure-modulus relationships can be demonstrated. Figure 3.7 (b) is one example of a structure-property map from an existing database. With the aid of PLS, modulus-bond length relationships of spinel nitrides were mapped out for data from both the original training/test set and those predicted by the PLS regression (Figure 3.7 (a)). Before adding new informatics-derived data into the database, all data should be validated with experiments or theories if applicable. In Figure 3.7 (a), c-C3N4 (#1), c-Hf3N4 (#7), and c-SnTi2N4 (#15) were already discussed in regard to Figure 3.4. The reasons for the behavior of these samples are discussed in chapter 6. c-ZrTi2N4 (#14) was identified as an outlier in the initial step of PLS.

3.3.5 Semiempirical Approaches for Bulk Modulus

The search for new materials can be made more efficient by using a semiempirical approach. There are two approaches to describing elastic properties: one links elastic quantities with thermodynamics, and the other expresses elastic properties in terms of crystallographic parameters. From thermodynamic analysis, Raju et al. proposed a linear relationship between the logarithm of the bulk modulus and the relative enthalpy at constant pressure [121]. This relationship is expressed as

$$\ln(B_s) = \ln(B_0) + k\,\Delta H \tag{3.8}$$

where Bs is the adiabatic bulk modulus at temperature T, B0 is the corresponding bulk modulus at the reference temperature T0, k is a material dependent parameter that does not depend on temperature, and ΔH is the relative enthalpy at constant pressure. Raju et al. also sought a relationship between ln(Bs) and molar volume, as shown in equation (3.9).

$$V_T = V_0 + b \ln\!\left(\frac{B_s}{B_0}\right) \tag{3.9}$$

where V is the molar volume and b is a thermoelastic parameter (temperature independent). This is the isobaric representation of the Grover-Getting-Kennedy (GGK) relation, which describes the pressure dependence of the isothermal bulk modulus [121]. The bulk modulus can be easily estimated from such simple scaling relations.


[Figure 3.7 plots. (a) Bulk modulus (GPa) versus weighted average of bond length, d, showing the training/test set by ab initio, the new external test set by PLS, and the fitted curve B = 812.9 d^-1.68; points are labeled with the compound IDs of Table 3.1. (b) Bulk modulus (GPa) versus nearest neighbor distance (Å); legend entries as extracted: BCC, FCC, HCP, NaCl, ZnS, CsCl, Diamond, Rare-earth, Oxide; Os and C are labeled and "??" marks the unknown region.]

Figure 3.7: (a) “Structure-property” map combining theoretical data and PLS derived data (b) Traditional structure-property map from database ([1, 47, 53, 75, 122-126]). The question marks in (b) denote an unknown region, which was identified as a spinel nitride region as shown in (a).


On the other hand, a few semiempirical relationships between crystallographic parameters and bulk modulus already exist. For example, if B and d (the nearest neighbor distance) are in units of GPa and Å respectively, B = 550 d^-3 for I-VII rocksalt compounds (completely ionic) [44], and B = (Nc/4)(1972 - 220λ) d^-3.5 or B = (3000 - 100λ)(a/2)^-3.5 for tetrahedral semiconductors [67, 76], where Nc is the average coordination number, a is the lattice constant, and λ = 0, 1, and 2 for group IV, III-V, and II-VI respectively. More detailed information is presented in section 6.3.3. Sung et al. [123] also proposed another type of relationship for 24 diamond-like semiconductors, B = 9.75 P^-0.0448 d^-0.423 C^0.0462, where P is the average period number of the constituent elements, C = 4 - ∆P/2, and ∆P is the difference in the number of p-orbital electrons between bonding atoms (B is in units of kb). For elements, there is the power law of Makino and Miyake [53], B = C d^m, with the negative exponents m listed in Table 3.4. Semiempirical relationships are summarized in Table 3.4.

From Figure 3.7 (a), a regression equation for the bulk modulus and average bond length was identified. For the case of spinel nitrides, a power law was appropriate to describe the relationship between B and d. While Ching et al. proposed the simple equation B = 919 d^-1.86 for 18 spinel nitrides [120], the d-B relationship for the 39 spinel nitrides in this thesis is expressed as:

$$B = 812.9\, d^{-1.68} \tag{3.10}$$

As shown in Table 3.4, the prefactor in the power laws ranges from 550 to 163120 while the exponent ranges from -7.81 to -1.68 for each class of materials. Since the 39 spinel nitrides have the smallest exponent in magnitude (-1.68), the bulk moduli of spinel nitrides vary comparatively little with bond length. To obtain a logarithmic scaling relation for B similar to equation (3.8), equation (3.10) is rewritten as follows.

$$\ln(B) = 6.7 - 1.68 \ln(d) \tag{3.11}$$

By using PLS methods with data generated by informatics, we have developed a linear logarithmic scaling relationship between bulk modulus and the nearest neighbor distance, which is consistent with similar logarithmic relationships such as the isobaric representation of the GGK relation. Hence materials informatics has produced a scaling relationship in a rapid and robust manner.
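The step from equation (3.10) to equation (3.11) can be checked numerically. The sketch below evaluates the power law on a few bond lengths and recovers the prefactor and exponent by a straight-line fit in log-log space. The function name and the chosen range of d are illustrative assumptions; the constants 812.9 and -1.68 are those of equation (3.10).

```python
import numpy as np


def bulk_modulus_power_law(d, prefactor=812.9, exponent=-1.68):
    """Bulk modulus (GPa) from bond length d (Angstrom), equation (3.10)."""
    return prefactor * d ** exponent


# Illustrative bond lengths spanning roughly the range plotted in Figure 3.7 (a).
d_values = np.linspace(1.6, 2.2, 7)
b_values = bulk_modulus_power_law(d_values)

# A power law is a straight line in log-log space: ln(B) = ln(prefactor) + exponent * ln(d).
slope, intercept = np.polyfit(np.log(d_values), np.log(b_values), 1)
print(round(float(slope), 2), round(float(intercept), 1))
```

The intercept is ln(812.9), which rounds to the value 6.7 quoted in equation (3.11). The same log-log fit applied to tabulated (d, B) pairs is one way a power law of this form could be obtained.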

Approach | Semiempirical rule | Class of compound | Relationship
Traditional | Cohen [44] | I-VII rocksalt compounds | B = 550 d^-3 (*)
Traditional | Cohen [67] | Tetrahedral semiconductors | B = (Nc/4)(1972 - 220λ) d^-3.5 (*)
Traditional | Al-Douri et al. [76] | Tetrahedral semiconductors | B = (3000 - 100λ)(a/2)^-3.5 (*)
Traditional | Sung et al. [123] | Diamond like semiconductors | B = 9.75 P^-0.0448 C^0.0462 d^-0.423 (**)
Traditional | Makino et al. [53] | Elements, sp3 bonding | B = C d^m (*), C = 2062, m = -3.57
Traditional | Makino et al. [53] | Elements, spd bonding | B = C d^m (*), C = 3702.6, m = -4.33
Traditional | Makino et al. [53] | Elements, 3d bonding | B = C d^m (*), C = 23012, m = -5.27
Traditional | Makino et al. [53] | Elements, 4d bonding | B = C d^m (*), C = 163120, m = -6.64
Traditional | Makino et al. [53] | Elements, 5d/4f bonding | B = C d^m (*), C = 73170, m = -7.81
Present | Ching et al. [120] | 18 spinel nitrides | B = 919 d^-1.86 (*)
Present | Suh | 39 spinel nitrides | B = 812.9 d^-1.68 or ln(B) = 6.7 - 1.68 ln(d) (*)

Table 3.4: Semiempirical rules between lattice d-spacing and bulk modulus. (*) B and d are in units of GPa and Å respectively. (**) B and d are in units of kb and Å respectively. Note that all the values of Makino’s power law were recalculated to unify the unit of d (from nm to Å).

3.4 Summary

QSAR relationships provide the basis for conducting computationally based combinatorial studies. When combined with a database of the appropriate latent variables as identified in the QSAR formulation, one can develop a “virtual” screening tool for identifying potential new chemistries for targeted properties.

In this chapter, we have developed a methodology by which we incorporate our training data into statistical learning / data mining tools to quantitatively assess how the properties of individual elements influence the collective property of the compound. In this manner we can rapidly develop predictions for vast new arrays of chemistries. Based on these predictions we can accelerate materials design by focusing on promising candidate chemistries. Those selected can then be subjected to further analysis via experimentation and computational methods to validate crystal structure level properties. The data generated by these selective experiments and computations also serve to refine the next generation of “training” data for another iterative round of data mining, which permits a further refinement of high throughput predictions. The existing phenomenological relationship between bulk modulus and lattice constant for 39 spinel nitrides was refined in an accelerated and robust manner by using a materials informatics approach.


CHAPTER 4
HIGH THROUGHPUT SCREENING OF PHASE STABILITY

4.1 Introduction

The phase stability of crystals is a fundamental issue in relation to the bulk modulus. Phase stability, as derived from thermodynamics and ab initio calculations, can be used as a critical descriptor when designing new materials. As discussed in chapter 3, one can learn from the data set in a heuristic manner and then calculate further properties without repeating the initial first principles calculations. In this chapter, the same approach as used in chapter 3 is examined for combining theoretical data with PLS derived phase stability.

4.2 Data and Computational Details

In this chapter, the spinel nitride dataset shown in Table 3.1 is also used for the assessment of phase stability. The stabilization energy is used as defined in the literature [1]. The stabilization energy per formula unit is expressed as

$$(\Delta E)_{\text{double nitride}} = E_t(AB_2N_4) - \left[ \tfrac{1}{3} E_t(A_3N_4) + \tfrac{2}{3} E_t(B_3N_4) \right] \tag{4.1}$$

While a negative value of ΔE represents the likelihood of a stable phase, a positive value indicates, at most, a metastable phase. It should also be noted that equation (4.1) is based on the assumption that single nitrides are stable and have equilibrium geometry at zero temperature and zero pressure as described in the literature [1].
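Equation (4.1) is a simple weighted difference of total energies, sketched below. The function name is hypothetical and the total energies in the example are made-up numbers chosen only to show the sign convention; they are not values from Ching et al. [1].

```python
def stabilization_energy(e_ab2n4, e_a3n4, e_b3n4):
    """Stabilization energy per formula unit, equation (4.1).

    All arguments are total energies per formula unit in the same unit (e.g. eV).
    """
    return e_ab2n4 - (e_a3n4 / 3.0 + 2.0 * e_b3n4 / 3.0)


# Hypothetical total energies (eV per formula unit), for illustration only.
delta_e_stable = stabilization_energy(-12.0, -9.0, -12.0)      # negative: likely stable
delta_e_metastable = stabilization_energy(-10.0, -9.0, -12.0)  # positive: at most metastable
print(delta_e_stable, delta_e_metastable)
```

The weights 1/3 and 2/3 reflect the A:B cation ratio of 1:2 in AB2N4, so the reference state is the corresponding mixture of the two single nitrides.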

4.2.1 Addition of New Descriptors

As shown in Table 3.1, Ching et al. calculated various parameters of spinel nitrides to examine the influence of each variable. For modeling the stabilization energy, several additional parameters are used in this chapter (Table 4.1).

ID | Crystal | ∆E (eV) | Rσ | Rπ | rσA | rσB | RσAX | RσBX | RπAX | RπBX
1 | c-C3N4 | - | 0.100 | 0.260 | 0.640 | 0.640 | 0.100 | 0.100 | 0.260 | 0.260
2 | c-Si3N4 | - | 0.880 | 0.180 | 1.420 | 1.420 | 0.880 | 0.880 | 0.180 | 0.180
3 | c-Ge3N4 | - | 1.020 | 0.240 | 1.560 | 1.560 | 1.020 | 1.020 | 0.240 | 0.240
4 | c-Sn3N4 | - | 1.340 | 0.240 | 1.880 | 1.880 | 1.340 | 1.340 | 0.240 | 0.240
5 | c-Ti3N4 | - | 2.040 | 0.400 | 2.580 | 2.580 | 2.040 | 2.040 | 0.400 | 0.400
6 | c-Zr3N4 | - | 2.285 | 0.415 | 2.825 | 2.825 | 2.285 | 2.285 | 0.415 | 0.415
7 | c-Hf3N4 | - | 2.370 | 0.430 | 2.910 | 2.910 | 2.370 | 2.370 | 0.430 | 0.430
8 | c-CSi2N4 | -0.650 | 0.620 | 0.207 | 0.640 | 1.420 | 0.100 | 0.880 | 0.260 | 0.180
9 | c-CGe2N4 | 0.000 | 0.713 | 0.247 | 0.640 | 1.560 | 0.100 | 1.020 | 0.260 | 0.240
10 | c-SiGe2N4 | -0.260 | 0.973 | 0.220 | 1.420 | 1.560 | 0.880 | 1.020 | 0.180 | 0.240
11 | c-CTi2N4 | -1.950 | 1.393 | 0.353 | 0.640 | 2.580 | 0.100 | 2.040 | 0.260 | 0.400
12 | c-SiTi2N4 | -1.430 | 1.653 | 0.327 | 1.420 | 2.580 | 0.880 | 2.040 | 0.180 | 0.400
13 | c-GeTi2N4 | -0.440 | 1.700 | 0.347 | 1.560 | 2.580 | 1.020 | 2.040 | 0.240 | 0.400
14 | c-ZrTi2N4 | -0.130 | 2.122 | 0.405 | 2.825 | 2.580 | 2.285 | 2.040 | 0.415 | 0.400
15 | c-SnTi2N4 | -0.200 | 1.807 | 0.347 | 1.880 | 2.580 | 1.340 | 2.040 | 0.240 | 0.400
16 | c-ZrHf2N4 | -1.523 | 2.342 | 0.425 | 2.825 | 2.910 | 2.285 | 2.370 | 0.415 | 0.430
17 | c-SiC2N4 | 3.080 | 0.360 | 0.233 | 1.420 | 0.640 | 0.880 | 0.100 | 0.180 | 0.260
18 | c-GeC2N4 | 3.840 | 0.407 | 0.253 | 1.560 | 0.640 | 1.020 | 0.100 | 0.240 | 0.260
19 | c-GeSi2N4 | 0.440 | 0.927 | 0.200 | 1.560 | 1.420 | 1.020 | 0.880 | 0.240 | 0.180
20 | c-TiC2N4 | 4.510 | 0.747 | 0.307 | 2.580 | 0.640 | 2.040 | 0.100 | 0.400 | 0.260
21 | c-TiSi2N4 | 1.080 | 1.267 | 0.253 | 2.580 | 1.420 | 2.040 | 0.880 | 0.400 | 0.180
22 | c-TiGe2N4 | 0.910 | 1.360 | 0.293 | 2.580 | 1.560 | 2.040 | 1.020 | 0.400 | 0.240
23 | c-TiZr2N4 | 0.950 | 2.203 | 0.410 | 2.580 | 2.825 | 2.040 | 2.285 | 0.400 | 0.415
24 | c-CSn2N4 | 1.790 | 0.927 | 0.247 | 0.640 | 1.880 | 0.100 | 1.340 | 0.260 | 0.240
25 | c-SnC2N4 | 5.650 | 0.513 | 0.253 | 1.880 | 0.640 | 1.340 | 0.100 | 0.240 | 0.260
26 | c-CZr2N4 | 1.750 | 1.557 | 0.363 | 0.640 | 2.825 | 0.100 | 2.285 | 0.260 | 0.415
27 | c-ZrC2N4 | 6.510 | 0.828 | 0.312 | 2.825 | 0.640 | 2.285 | 0.100 | 0.415 | 0.260
28 | c-SiSn2N4 | 0.290 | 1.187 | 0.220 | 1.420 | 1.880 | 0.880 | 1.340 | 0.180 | 0.240
29 | c-SnSi2N4 | 1.150 | 1.033 | 0.200 | 1.880 | 1.420 | 1.340 | 0.880 | 0.240 | 0.180
30 | c-SiZr2N4 | 0.250 | 1.817 | 0.337 | 1.420 | 2.825 | 0.880 | 2.285 | 0.180 | 0.415
31 | c-ZrSi2N4 | 1.710 | 1.348 | 0.258 | 2.825 | 1.420 | 2.285 | 0.880 | 0.415 | 0.180
32 | c-GeSn2N4 | 0.410 | 1.233 | 0.240 | 1.560 | 1.880 | 1.020 | 1.340 | 0.240 | 0.240
33 | c-SnGe2N4 | 0.060 | 1.127 | 0.240 | 1.880 | 1.560 | 1.340 | 1.020 | 0.240 | 0.240
34 | c-GeZr2N4 | 0.730 | 1.863 | 0.357 | 1.560 | 2.825 | 1.020 | 2.285 | 0.240 | 0.415
35 | c-ZrGe2N4 | 0.960 | 1.442 | 0.298 | 2.825 | 1.560 | 2.285 | 1.020 | 0.415 | 0.240
36 | c-TiSn2N4 | 1.060 | 1.573 | 0.293 | 2.580 | 1.880 | 2.040 | 1.340 | 0.400 | 0.240
37 | c-SnZr2N4 | 0.520 | 1.970 | 0.357 | 1.880 | 2.825 | 1.340 | 2.285 | 0.240 | 0.415
38 | c-ZrSn2N4 | 0.080 | 1.655 | 0.298 | 2.825 | 1.880 | 2.285 | 1.340 | 0.415 | 0.240
39 | c-HfZr2N4 | 0.035 | 2.313 | 0.420 | 2.910 | 2.825 | 2.370 | 2.285 | 0.430 | 0.415

Table 4.1: List of single and double nitrides which are used in this thesis, taken from Ching et al. and Zunger [1, 90] (∆E (eV): Stabilization energy per formula unit; Rσ and Rπ: mean values of equation (4.3))


The development of new variables is indispensable for building QSAR type prediction models, because appropriate variables must be chosen to obtain more robust models. Finding appropriate descriptors for the phase stability of spinel nitrides depends heavily on physical intuition. For example, since the atomic coordinations of the building blocks are closely related to the stability of the structures [127], the atomic coordination can be tested in the modeling process. To choose appropriate descriptors for spinel nitrides, traditional structure maps and QSAR formulations are connected, since structure maps contain many of the factors governing crystal structure as well as information on stability by structure type. Some of the most widely used key descriptors in structure maps for various structures are Zunger’s pseudopotential orbital radii [91, 96, 97, 100, 101, 103, 128, 129]. These orbital radii have been used in the structural sorting map for AB2X4 chalcogenide spinels [101, 103]. For example, Burdett et al. used the following two parameters in structure field maps to separate normal and inverse spinels of AB2X4 [100, 101].

$$r_\sigma^A = r_s^A + r_p^A \quad \text{and} \quad r_\sigma^B = r_s^B + r_p^B \tag{4.2}$$

The sums of the orbital radii in equation (4.2) represent the total size of the effective core of the atoms [101]. For applications to ternary compounds, Haeuseler introduced another parameterization scheme based on mean values of the pseudopotential radii [103].

$$R_\sigma = \frac{x R_\sigma^{AX} + y R_\sigma^{BX}}{x + y}, \qquad R_\pi = \frac{x R_\pi^{AX} + y R_\pi^{BX}}{x + y} \tag{4.3}$$

where x and y are the stoichiometric factors for A and B, respectively, and the binary radii terms follow Zunger’s parameterization scheme [90] for structure maps of AB compounds. Thus

$$R_\sigma^{AB} = \left| (r_s^A + r_p^A) - (r_s^B + r_p^B) \right|, \qquad R_\pi^{AB} = \left| r_s^A - r_p^A \right| + \left| r_s^B - r_p^B \right| \tag{4.4}$$

where the vertical bars denote absolute values, RσAB is a measure of the difference between the total effective core radii of atoms A and B, and RπAB represents the sum of the orbital nonlocality of the s and p electrons on each site [90]. RπAB can also be defined as the degree of s-p hybridization for the two atoms [103]. In this chapter, both Burdett’s and Haeuseler’s parameters are used to predict phase stability, together with crystallographic parameters and effective charges based on the calculations by Ching et al. [1]. The values of Burdett’s and Haeuseler’s descriptors are shown in Table 4.1.
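The descriptors of equations (4.2) to (4.4) can be sketched as follows. The orbital radii in `ORBITAL_RADII` are an assumption: they are commonly tabulated Zunger values that are not listed in this chapter and were checked only against the entries of Table 4.1 (for example, Rσ = 0.620 and Rπ = 0.207 for c-CSi2N4). All function names are hypothetical.

```python
# Assumed Zunger pseudopotential orbital radii (r_s, r_p); consistent with Table 4.1.
ORBITAL_RADII = {"C": (0.39, 0.25), "Si": (0.68, 0.74), "N": (0.33, 0.21)}


def r_sigma(element):
    """Burdett parameter of equation (4.2): r_s + r_p."""
    r_s, r_p = ORBITAL_RADII[element]
    return r_s + r_p


def binary_radii(a, x):
    """Binary radii (R_sigma, R_pi) of equation (4.4) for the pair A-X."""
    rs_a, rp_a = ORBITAL_RADII[a]
    rs_x, rp_x = ORBITAL_RADII[x]
    big_r_sigma = abs((rs_a + rp_a) - (rs_x + rp_x))
    big_r_pi = abs(rs_a - rp_a) + abs(rs_x - rp_x)
    return big_r_sigma, big_r_pi


def haeuseler_radii(a, b, anion="N", x=1, y=2):
    """Mean radii (R_sigma, R_pi) of equation (4.3) for A(x)B(y)X4."""
    rs_ax, rp_ax = binary_radii(a, anion)
    rs_bx, rp_bx = binary_radii(b, anion)
    return ((x * rs_ax + y * rs_bx) / (x + y),
            (x * rp_ax + y * rp_bx) / (x + y))


mean_sigma, mean_pi = haeuseler_radii("C", "Si")   # c-CSi2N4
print(round(mean_sigma, 3), round(mean_pi, 3), round(r_sigma("Si"), 2))
```

With x = 1 and y = 2 the mean values weight the B-N pair twice as strongly as the A-N pair, which mirrors the cation ratio of the AB2N4 spinel.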

4.3 Results and Discussion

4.3.1 Outlier Detection

With the 32 compounds whose stabilization energies are listed in Table 4.1, a preliminary PLS analysis was performed to detect outliers in the dataset. c-CSi2N4 (#8) and c-ZrTi2N4 (#14) have high values of studentized residuals and high leverage as shown in Figure 4.1. c-SiSn2N4 (#28) also deviates from the slope 1 line on the cross-validated residual versus residual plot (Figure 4.2). Therefore these three compounds were identified as outliers and were removed from the PLS analysis.


Figure 4.1: Studentized residuals versus leverage for 32 compounds whose phase stabilization energies are known. All numbers in the figure correspond to the spinels in Table 4.1. (Axes: leverage versus Y studentized residual.)

Figure 4.2: Cross-validated Y residual versus Y calibration residual for 32 compounds whose phase stabilization energies are known. The solid line is the slope-1 line. (Axes: Y residual versus Y cross-validated residual.)


A \ B | C | Si | Ge | Sn | Ti | Zr | Hf
C | C3N4 (ext) | CSi2N4 (outlier) | CGe2N4 (tr) | CSn2N4 (te) | CTi2N4 (tr) | CZr2N4 (tr) | -
Si | SiC2N4 (te) | Si3N4 (ext) | SiGe2N4 (te) | SiSn2N4 (outlier) | SiTi2N4 (tr) | SiZr2N4 (tr) | -
Ge | GeC2N4 (tr) | GeSi2N4 (te) | Ge3N4 (ext) | GeSn2N4 (tr) | GeTi2N4 (tr) | GeZr2N4 (te) | -
Sn | SnC2N4 (tr) | SnSi2N4 (tr) | SnGe2N4 (tr) | Sn3N4 (ext) | SnTi2N4 (te) | SnZr2N4 (tr) | -
Ti | TiC2N4 (te) | TiSi2N4 (tr) | TiGe2N4 (te) | TiSn2N4 (tr) | Ti3N4 (ext) | TiZr2N4 (tr) | -
Zr | ZrC2N4 (tr) | ZrSi2N4 (tr) | ZrGe2N4 (te) | ZrSn2N4 (tr) | ZrTi2N4 (outlier) | Zr3N4 (ext) | ZrHf2N4 (tr)
Hf | - | - | - | - | - | HfZr2N4 (tr) | Hf3N4 (ext)

Table 4.2: Training and test sets for all models in combinatorial arrays of AB2N4 (rows: A, columns: B): (tr)-training set, (te)-test set, and (ext)-new external test set (the single nitrides, set in bold in the original)

Model | Descriptors | Explained variance in the descriptors | R2 | Q2 | RMSECV (eV) | RMSEC (eV)
I (3-LV model) | x, BLA-N, BLB-N, Q*tet, Q*oct, Q*N | 0.937 | 0.694 | 0.423 | 1.62 | 1.17
II (4-LV model) | x, BLA-N, BLB-N, Q*tet, Q*oct, Q*N, EN | 0.983 | 0.743 | 0.493 | 1.53 | 1.07
III (6-LV model) | x, BLA-N, BLB-N, Q*tet, Q*oct, Q*N, EN, Rσ, Rπ | 0.986 | 0.908 | 0.751 | 1.08 | 0.64
IV (6-LV model) | x, BLA-N, BLB-N, Q*tet, Q*oct, Q*N, EN, rσA, rσB | 0.998 | 0.881 | 0.715 | 1.16 | 0.73

Table 4.3: Statistical parameters of each model


4.3.2 Data Analysis and Partial Least Squares

Training and test sets for each model were assigned as shown in Table 4.2. To assess the optimum number of latent variables (LVs), different models were tested based on the selection of LVs (Table 4.3). As in chapter 3, the NIPALS algorithm was used, and the appropriate number of LVs was chosen using leave-one-out (LOO) cross-validation in the PLS analysis. Based on the balance of R2 and Q2, model III was chosen as the final model. In model III, six LVs were chosen, which explained 98.6% of the variance of the descriptors. The calculated model parameters were as follows: RMSECV: 1.08 eV, RMSEC: 0.64 eV, R2: 0.908, and Q2: 0.751. After LOO was used as an internal validation, the test set was used as an external validation. Figure 4.3 shows predicted versus ab initio derived stabilization energy values for the model, with the resulting QSAR formulation:

$$ \text{Stabilization energy} = -0.20504\,\mathrm{EN} + 0.20064\,u + 0.4316\,\mathrm{BL}_{A\text{-}N} + 0.89129\,\mathrm{BL}_{B\text{-}N} + 0.03936\,Q^*_{tet} + 0.24921\,Q^*_{oct} - 0.41697\,Q^*_{N} - 1.69317\,R_\sigma + 0.53483\,R_\pi \tag{4.5} $$

where EN: weighted electronegativity difference, u: internal anion parameter, BLA-N: A-N bond length, BLB-N: B-N bond length, Q*tet: Mulliken effective charge for the tetrahedral site ion, Q*oct: Mulliken effective charge for the octahedral site ion, Q*N: Mulliken effective charge for the N ion, Rσ and Rπ: mean values of pseudopotential radii. In Figure 4.3, most of the spinels were well predicted except c-GeSi2N4 (#19), which was predicted as a stable phase. Statistically, this is because c-GeSi2N4 has the smallest values of Q*oct and Rπ in the training and test sets. The contribution of each attribute (variable) to the stable prediction for c-GeSi2N4 (#19) is discussed in chapter 6.
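The model statistics quoted above (R2, Q2, RMSEC, and RMSECV) can be made concrete with a minimal Python sketch of PLS1 fitted by NIPALS with leave-one-out cross-validation. This is an illustration under stated assumptions, not the software used for this chapter: the descriptors and responses are hypothetical, the data are only mean-centred (no autoscaling), the function names are invented for the sketch, and both root-mean-square errors divide by the number of samples.

```python
import math

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def column(M, j):
    return [row[j] for row in M]

def pls1_fit(X, y, n_lv):
    """NIPALS PLS1 on mean-centred data; returns the pieces needed to predict."""
    n, m = len(X), len(X[0])
    x_mean = [sum(column(X, j)) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residuals
    f = [yi - y_mean for yi in y]                              # y residuals
    components = []
    for _ in range(n_lv):
        w = [dot(column(E, j), f) for j in range(m)]           # weights ~ E'f
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:                                       # nothing left to model
            break
        w = [wj / norm for wj in w]
        t = [dot(row, w) for row in E]                         # scores
        tt = dot(t, t)
        p = [dot(column(E, j), t) / tt for j in range(m)]      # X loadings
        q = dot(f, t) / tt                                     # y loading
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict y for one sample by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat

def fit_statistics(X, y, n_lv):
    """R2 and RMSEC from calibration; Q2 and RMSECV from leave-one-out."""
    n = len(y)
    y_mean = sum(y) / n
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    model = pls1_fit(X, y, n_lv)
    ss_cal = sum((yi - pls1_predict(model, xi)) ** 2 for xi, yi in zip(X, y))
    press = 0.0
    for i in range(n):
        loo_model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_lv)
        press += (y[i] - pls1_predict(loo_model, X[i])) ** 2
    return {"R2": 1 - ss_cal / ss_tot, "Q2": 1 - press / ss_tot,
            "RMSEC": math.sqrt(ss_cal / n), "RMSECV": math.sqrt(press / n)}

# Hypothetical descriptors (rows = compounds) and responses; not thesis data.
X = [[1.0, 2.0, 0.5], [2.0, 1.0, 1.5], [3.0, 5.0, 1.0], [4.0, 3.0, 2.5],
     [5.0, 7.0, 2.0], [6.0, 4.0, 3.5], [7.0, 8.0, 3.0], [8.0, 6.0, 4.0]]
y = [0.2, 1.3, 0.6, 2.8, 1.4, 4.1, 2.7, 5.2]

for n_lv in (1, 2, 3):
    print(n_lv, fit_statistics(X, y, n_lv))
```

R2 can only increase as LVs are added, whereas Q2 eventually falls once the extra LVs start fitting noise; this is why the balance of the two is used to select the model.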


Figure 4.3: Predicted versus ab initio derived stabilization energy for PLS model III, showing training set and test set samples. Sample ID corresponds to Table 4.1. (Axes: ab initio derived stabilization energy (eV) versus PLS derived stabilization energy (eV).)

Based on the above QSAR formulation, Zunger’s pseudopotential radii (Rσ and Rπ) and the B-N bond length (BLB-N) play a particularly important role in phase stability. In other words, the phase becomes more stable as the difference between the total effective core radii of atoms A (or B) and N becomes larger, and as the sum of the orbital nonlocality of the s and p electrons on each site becomes smaller. In addition, a shorter B-N bond length favors a stable phase, but the A-N bond length, the counterpart of the B-N bond length, should also be considered. Relationships between all of the factors used will be discussed in the next chapter. Using the QSAR formulation of model III, the stabilization energy is predicted for a new external test set consisting of the single nitrides shown in Table 4.2. The predicted values for the single nitrides are shown in Figure 4.4.
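The ranking of descriptors implied by equation (4.5) can be checked with a few lines of Python. The coefficients are copied from equation (4.5); the dictionary keys are plain-text labels chosen for this sketch, and comparing coefficient magnitudes assumes that all descriptors were scaled to a common variance before regression, which is an assumption made here.

```python
# Coefficients copied from equation (4.5); the keys are labels chosen for this sketch.
coefficients = {
    "EN": -0.20504, "u": 0.20064, "BL_A-N": 0.4316, "BL_B-N": 0.89129,
    "Q*_tet": 0.03936, "Q*_oct": 0.24921, "Q*_N": -0.41697,
    "R_sigma": -1.69317, "R_pi": 0.53483,
}

def stabilization_energy(descriptors):
    """Evaluate equation (4.5) for a dict of (assumed scaled) descriptor values."""
    return sum(coefficients[name] * value for name, value in descriptors.items())

# Rank descriptors by the magnitude of their regression coefficient.
ranked = sorted(coefficients, key=lambda name: abs(coefficients[name]), reverse=True)
for name in ranked:
    print(f"{name:8s} {coefficients[name]:+.5f}")
```

The three largest magnitudes belong to Rσ, BLB-N, and Rπ, and their signs match the interpretation in the text: a larger Rσ lowers the stabilization energy, while a larger Rπ or a longer B-N bond raises it.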


Figure 4.4: Predicted stabilization energy for the new external test set (c-C3N4, c-Si3N4, c-Ge3N4, c-Sn3N4, c-Ti3N4, c-Zr3N4, and c-Hf3N4) using PLS model III. (Axes: sample ID versus predicted stabilization energy (eV).)

As Ching et al. discussed [1], c-C3N4 (#1) is a highly metastable phase in Figure 4.4. Statistically, this is related to c-C3N4 having the smallest value of Rσ within the dataset. Among the stable phases (below 0 eV), c-Si3N4 [16, 130] and c-Ge3N4 have recently been synthesized [131, 132]. Moreover, the existence of c-Ti3N4 was predicted by Ching et al. [34]. By the same reasoning, c-Hf3N4 may exist because of its low predicted stabilization energy. The prediction model therefore appears reasonable, with one exception: c-Sn3N4 is predicted to be metastable although it has been synthesized by a reaction of SnBr2 and KNH2 or by a high-pressure solid-state metathesis reaction [24, 133]. In the PLS formulation, the metastable prediction for c-Sn3N4 is generally related to its larger bond lengths and small value of Rπ. More detailed work is needed to investigate the behavior of c-Sn3N4.

As mentioned in the work of Ching et al. [1], smaller cations generally favor tetrahedral sites and larger cations favor octahedral sites (Figure 4.5). Figure 4.6 is the same plot as Figure 4.5 but uses the PLS derived stabilization energy as the color code. Most stable phases sit in the upper left region of the slope-1 line, except at extreme values of the B-N bond length (above 2.15 Å or below 1.70 Å). c-C3N4 (#1) in Figure 4.6 has the smallest B-N bond length. In the upper right region, c-TiZr2N4 (#23) and c-HfZr2N4 (#39) sit with metastable compounds. Among these compounds, c-ZrHf2N4 (#16) and its reverse case, c-HfZr2N4 (#39), have almost the same stabilization energy. Similarly, c-ZrTi2N4 (#14) and c-TiZr2N4 (#23) have similar stabilization energies. Since site preference has no effect on these compounds, they need to be explored in detail through quantum theory, as suggested by Ching et al. [1].

In the lower right part of the slope-1 line in Figure 4.6, c-GeSi2N4 (#19) is the only stable phase. The stable phase predicted for c-GeSi2N4 by PLS (see Figure 4.3) is not consistent with the ab initio calculations by Ching et al. [1]. However, this PLS result agrees with the experimental result [134] and with calculations based on total energy density functional theory and statistical pseudo-binary alloy theory [25]. Dong et al. explained the site preference of c-GeSi2N4 using the local anion environment, which is related to the symmetry of the A-N and B-N bond lengths within the Ge3N4-Si3N4 system [25]. Consequently, B-N bond lengths larger than the A-N bond lengths are desirable for obtaining stable spinel nitrides. It should also be noted that extreme values of the B-N bond length lead to metastable phases.


Figure 4.5: A-N bond length versus B-N bond length with ab initio derived stabilization energy (color code) [1]. Note: blue triangles of single nitrides were assumed to be stable [1] and the dotted line is a slope-1 line. (Axes: BLA-N versus BLB-N; color scale: ΔE; markers: single nitrides and double nitrides.)

Figure 4.6: A-N bond length versus B-N bond length with PLS derived stabilization energy. Note: the dotted line is a slope-1 line. (Axes: BLA-N versus BLB-N; color scale: ΔE.)


4.4 Summary

In this chapter, a QSAR-type prediction model was developed for phase stability in the spinel nitride system. The use of appropriate descriptors, such as Zunger’s pseudopotential radii, is crucial for obtaining a more robust prediction model. With carefully chosen descriptors, the predictions fit the spinel nitride system well.


CHAPTER 5 REVISITING THE STRUCTURE MAPS FOR CRYSTAL CHEMISTRY OF AB2X4

5.1 Introduction

The search for stable compound structures based only on information about the constituent elements is a classic crystal chemistry problem, as discussed in chapter 2. Structure mapping has played an important role as a useful a priori guide for finding stable phases [81]. The coordinates of a structure map are the physical factors governing stable crystal structures. With carefully chosen physical factors, each compound can be spatially identified by its structure type. The choice of appropriate indices (coordinates) is therefore critical for obtaining good resolution when separating structures in a structure map (for example, Tables 2.2 and 2.3). For AB2X4 (X = O, S, Se, Te) stoichiometries, crystal structures were identified by the bond stretching force constants and the ionic radius ratio of the cations [99, 135]. Hill et al. [35] exploited the ionic radius, lattice constant, and internal anion parameter for spinel structures. In 1987, a concentration weighted linear parameterization scheme by Villars and Hulliger made it possible to extend binary structure maps to ternary structure maps, as discussed in chapter 2 [96]. In the approach of Burdett et al., the AB2X4 spinel structures were divided into two groups, normal and inverse spinels [100]. Haeuseler’s work was more oriented to ternary compounds through a new parameterization of Zunger’s pseudopotential radii, as shown in equation (4.3) [103].

In this chapter, the structure maps of Hill et al. [35] and Haeuseler [103] for the spinel structure are mainly discussed. Spinel nitride data (Tables 3.1 and 4.1), together with predicted bulk moduli and phase stabilities, are mapped onto the traditional spinel structure maps. From the pairwise correlations in the newly populated spinel structure maps, the spatial configurations of spinel nitrides and their physical behaviors are identified.

5.2 Structure Maps for Spinel Nitrides

5.2.1 Hill’s Approach

The first example of structure maps in this chapter is the approach of Hill et al. [35]. They clearly separated sulfides and oxides in spinel systems by choosing the lattice constant and anion parameter as axes, investigated the internal consistency of spinel structural data, and aimed to predict the parameters of unknown compounds. When the data for nitride spinels are included (Table 3.1), the resulting structure map is as shown in Figure 5.1. Most of the stable nitrides sit together with the oxides, while the metastable nitrides are located near the tetrahedral bond length (BLA-N) lines of 1.6 and 2.15 Å and the octahedral bond length (BLB-N) lines of 1.8 and 2.2 Å. This suggests that the phase stabilities of spinel nitrides are more closely related to the lattice constant than to the internal anion parameter.

The spinel nitride data were plotted in Figure 5.2 together with best fit lines for each tetrahedral and octahedral site and the phase stabilities predicted in chapter 4. According to the literature [35], if AB2X4 sits away from the intersection of the tetrahedral site (Atet) and octahedral site (Boct) lines, the cations of the spinel will be partially disordered over the A and B sites. Although c-Si3N4 lies away from the Sitet and Sioct lines in Figure 5.2, it cannot be considered a disordered spinel because it is a single (binary) nitride. All nitrides with C or Sn atoms at octahedral sites (Coct or Snoct) and Sn atoms at tetrahedral sites (Sntet) are metastable. Most metastable nitrides sit in the marginal region. The location of the metastable phase c-TiSi2N4 is somewhat different from that of the other metastable spinels; for this compound, other factors should be investigated further. The positions of c-ZrHf2N4, c-ZrTi2N4, c-HfZr2N4, and c-TiZr2N4 are also abnormal with respect to the other site occupancies. However, c-ZrHf2N4 and c-HfZr2N4 have similar stabilization energies, as do c-ZrTi2N4 and c-TiZr2N4. Site preference therefore has no effect on the stability of these compounds. The influence of variables on stability is discussed in chapter 6.
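The bond length lines in a map of lattice constant versus anion parameter follow from the geometry of the spinel cell. A minimal Python sketch is given here, assuming the standard cubic spinel relations for an anion parameter u referred to the -43m origin (ideal value 0.375). These relations are textbook crystallography and are not quoted from Hill et al. [35]; the function name and the lattice constant are hypothetical choices for illustration.

```python
import math

def spinel_bond_lengths(a, u):
    """Tetrahedral (A-X) and octahedral (B-X) bond lengths of a cubic spinel.

    a: lattice constant; u: anion parameter with the unit cell origin at -43m.
    """
    bl_tet = a * math.sqrt(3.0) * (u - 0.25)
    bl_oct = a * math.sqrt(3.0 * u * u - 2.75 * u + 43.0 / 64.0)
    return bl_tet, bl_oct

a = 8.0  # hypothetical lattice constant in angstrom
for u in (0.375, 0.385):
    bl_tet, bl_oct = spinel_bond_lengths(a, u)
    print(f"u = {u:.3f}: BL_tet = {bl_tet:.3f} A, BL_oct = {bl_oct:.3f} A")
```

At the ideal anion position the two bond lengths reduce to a·√3/8 and a/4. Increasing u at fixed lattice constant lengthens the tetrahedral bond and shortens the octahedral bond, which is why lines of constant BLA-N and BLB-N cross the map with opposite slopes.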


Figure 5.1: Modified spinel (oxides, sulfides, and nitrides) structure map of Hill et al. [35], showing oxides, sulfides, stable nitrides, and metastable nitrides together with lines of constant BLA-N and BLB-N. (Axes: lattice constant (Å) versus anion parameter (unit cell origin at -43m).)

Figure 5.2: Scatter diagram of lattice parameter and anion positional parameter for only spinel nitrides. Each line represents the site occupancy for tetrahedral (Atet) and octahedral (Boct) interstice. Note: stable phase (■) and metastable phase (□). (Axes: lattice parameter (Å) versus anion parameter (unit cell origin at -43m).)


5.2.2 Haeuseler’s Approach

The structure map of Haeuseler including spinel nitrides is shown in Figure 5.3. The sulfide and oxide data used in Figure 5.3 were taken from the literature [35, 103]. As mentioned by Haeuseler, stable spinels sit in a small diagonal band in the plot, while six metastable nitrides are located outside this band. The balance of Rσ and Rπ is therefore important for the formation of stable phases. Spinel oxides generally have larger values of Rσ and Rπ than sulfides. In other words, the difference between the total effective core radii of cations and anions is larger in oxides than in sulfides, while the degree of s-p hybridization is smaller in oxides than in sulfides, as discussed in chapter 4. In Figure 5.4, the spinel nitrides are plotted with their identification numbers from Table 4.1.

5.3 Summary

In this chapter, spinel nitride data, including PLS derived properties, were mapped onto the traditional AB2X4 structure maps. Although a few trends of spinel nitrides were easily found from their locations in the structure maps, the influence of the variables on structure and properties is still not clear. In the next chapter, structure-property relationships in terms of the variances of variables in spinel nitrides are identified with the aid of PCA.


Figure 5.3: Structure map for spinels (oxides, sulfides, and nitrides) by Haeuseler’s method, showing sulfides, oxides, stable nitrides, and metastable nitrides.

Figure 5.4: Structure map for spinel nitrides (stable and metastable) by Haeuseler’s method. All numbers in the figure correspond to the spinels in Table 4.1.


CHAPTER 6 DIMENSIONALLY REDUCED STRUCTURE MAPS

6.1 Introduction

The prediction of the bulk modulus (B) and phase stability (∆E) of spinel nitrides using PLS and the development of QSAR-type regression equations were described in chapters 3 and 4. With these QSAR equations, we can explore the variables used in the ab initio calculations and their individual correlations, which take into account their relative impact on the final properties (i.e. bulk modulus and phase stability). However, an efficient way of showing these correlations among the various variables for all the spinels still needs to be found. Since we are dealing with a vast array of structure and property variables associated with a single set of compounds or chemistries in materials design, we need tools to explore these associations. This effort leads to new types of multivariate structure maps. Most traditional structure maps have been created by bivariate methods, as discussed in chapter 2. It is therefore difficult to classify the behavior of each compound, and the traditional approach is only partially successful.

In this chapter, it is shown how these problems can be handled using multivariate analytical tools such as Principal Component Analysis (PCA), one of the best known techniques in the field of chemometrics. The aim of PCA is to reduce the dimensionality of the structure-property-chemistry space. The main purpose of this chapter is to apply PCA to the multiple parameters used in the theoretical calculations in order to quantitatively assess the statistical relationships among the numerous descriptors (for example, crystallographic and electronic bonding parameters, band structures, and phase stability) of each spinel nitride that may influence structure-property relationships. In addition, new types of multivariate structure maps generated by PCA are used to accelerate the discovery process.

6.2 Data and Computational Details

Previously derived values of bulk modulus and phase stabilization energy obtained using PLS for 39 hypothetical spinel nitrides are used as the main descriptors. The data are tabulated in Tables 3.1, 4.1, and 6.1. In Table 6.1, bond order (BO) is a measure of the bond strength, and the values of bond order and crystal bond order (BOcry) are taken from the literature [1]. Since PCA reduces multi-dimensional problems to two-dimensional (or sometimes three-dimensional) problems, it is easy to explore the relationships between the samples (or variables) in the compressed space. Each calculated principal component (PC) is expressed as a linear combination of descriptors, each with an associated loading. To clarify the sample classification, the PCA results are compared with those of cluster analysis (CA). The PCA and CA sequences are shown in Figure 6.1.

Figure 6.1: A flow chart of PCA and CA. (Boxes, PCA branch: expanded database by PLS; choose samples and variables for PCA; normalization; perform PCA for dataset; choose the optimum number of principal components; outlier detection; "Need more PC?" with YES and NO branches; interpretation. Boxes, CA branch: normalization; decide appropriate distance for clustering; perform CA for dataset; interpretation; compare with the results of PCA.)


ID | Crystal | BOcry | BOA-N | BOB-N | B (GPa) | ΔE (eV)
1 | c-C3N4 | 8.647 | 0.358 | 0.241 | 319.099 | 3.448
2 | c-Si3N4 | 8.670 | 0.362 | 0.241 | 281.368 | -3.102
3 | c-Ge3N4 | 7.900 | 0.327 | 0.220 | 252.511 | -0.337
4 | c-Sn3N4 | 6.958 | 0.284 | 0.195 | 202.680 | 0.765
5 | c-Ti3N4 | 8.474 | 0.353 | 0.236 | 266.241 | -0.948
6 | c-Zr3N4 | 8.609 | 0.356 | 0.240 | 236.340 | 0.904
7 | c-Hf3N4 | 4.718 | 0.220 | 0.123 | 267.209 | -1.901
8 | c-CSi2N4 | 11.231 | 0.299 | 0.368 | 309.292 | -0.650
9 | c-CGe2N4 | 8.284 | 0.361 | 0.225 | 278.692 | -0.188
10 | c-SiGe2N4 | 9.999 | 0.564 | 0.229 | 281.306 | -0.301
11 | c-CTi2N4 | 9.005 | 0.383 | 0.248 | 297.322 | -0.756
12 | c-SiTi2N4 | 9.075 | 0.366 | 0.256 | 285.547 | -1.632
13 | c-GeTi2N4 | 8.657 | 0.323 | 0.253 | 253.944 | -0.487
14 | c-ZrTi2N4 | 7.261 | 0.452 | 0.152 | 179.000 | -0.130
15 | c-SnTi2N4 | 8.430 | 0.289 | 0.255 | 233.635 | 0.229
16 | c-ZrHf2N4 | 4.473 | 0.163 | 0.132 | 244.507 | -0.737
17 | c-SiC2N4 | 8.260 | 0.359 | 0.225 | 316.565 | 2.485
18 | c-GeC2N4 | 7.816 | 0.317 | 0.220 | 292.289 | 3.794
19 | c-GeSi2N4 | 8.260 | 0.32 | 0.238 | 255.694 | -1.804
20 | c-TiC2N4 | 8.030 | 0.352 | 0.217 | 301.676 | 3.754
21 | c-TiSi2N4 | 10.806 | 0.303 | 0.349 | 285.132 | 0.677
22 | c-TiGe2N4 | 8.231 | 0.378 | 0.217 | 263.646 | -0.137
23 | c-TiZr2N4 | 8.482 | 0.339 | 0.240 | 244.671 | -0.133
24 | c-CSn2N4 | 7.234 | 0.335 | 0.190 | 242.013 | 0.934
25 | c-SnC2N4 | 7.093 | 0.273 | 0.204 | 266.523 | 4.881
26 | c-CZr2N4 | 8.169 | 0.312 | 0.236 | 241.437 | 1.365
27 | c-ZrC2N4 | 7.286 | 0.348 | 0.188 | 300.257 | 6.735
28 | c-SiSn2N4 | 6.740 | 0.441 | 0.134 | 273.041 | 0.290
29 | c-SnSi2N4 | 10.829 | 0.312 | 0.347 | 251.123 | 1.796
30 | c-SiZr2N4 | 9.443 | 0.498 | 0.227 | 249.186 | 0.249
31 | c-ZrSi2N4 | 10.619 | 0.315 | 0.338 | 276.216 | 1.707
32 | c-GeSn2N4 | 7.184 | 0.311 | 0.196 | 218.946 | 0.197
33 | c-SnGe2N4 | 7.536 | 0.291 | 0.217 | 232.701 | 0.368
34 | c-GeZr2N4 | 8.442 | 0.289 | 0.255 | 219.483 | 1.017
35 | c-ZrGe2N4 | 8.219 | 0.388 | 0.213 | 256.952 | 0.844
36 | c-TiSn2N4 | 7.705 | 0.384 | 0.193 | 236.675 | 0.150
37 | c-SnZr2N4 | 8.419 | 0.27 | 0.261 | 205.696 | 1.398
38 | c-ZrSn2N4 | 7.753 | 0.397 | 0.191 | 228.689 | 1.127
39 | c-HfZr2N4 | 4.379 | 0.199 | 0.116 | 240.239 | -0.798

Direct Eg (eV) and Indirect Eg (eV), entries in source order: 1.140 3.450 2.220 1.290 0.250 0.070 0.400 0.230 Metal 1.343 1.259 1.356 1.850 Metal Metal Metal Metal 0.000 -0.020 Metal 0.707 2.635 2.554 0.965 0.620 2.620 2.510 2.269 1.870 0.320 0.150 Metal 1.000 0.990 Metal 0.390 1.680 2.630 2.580 Metal 3.315 2.780 Metal 2.310 2.280 Metal 2.640 2.400 2.270 1.880 Metal 2.630 2.600 0.240 0.100

Table 6.1: List of single and double nitrides which are used in this thesis, taken from Ching et al. [1, 90] (BO: bond order, Eg: Band gap energy)


6.3 Results and Discussion

For cluster analysis based on K-nearest neighbor (KNN) and for PCA, the following eighteen variables for the 39 spinel nitrides are used:
• Ab initio descriptors: lc, u, BLA-N, BLB-N, Q*tet, Q*oct, Q*N, BOcry, BOA-N, BOB-N
• Developed descriptors: EN, |BLA-N - BLB-N|, Rσ, Rπ, rσA, rσB
• PLS derived descriptors: B, ΔE

6.3.1 Cluster Analysis based on KNN

As discussed in appendix A, Euclidean distances between spinel nitrides are calculated for the above eighteen descriptors. The similarities between the clusters are shown in Figure 6.2. In Figure 6.2, five clusters (red, green, black, orange, and gray) were assigned at a distance of 3. Here, the blue lines represent the seven individual data points which do not belong to any of the five clusters. The constituents of each cluster are:
• Red cluster: c-SiC2N4 (#17), c-GeC2N4 (#18), c-TiC2N4 (#20), c-SnC2N4 (#25), c-ZrC2N4 (#27)
• Green cluster: c-Hf3N4 (#7), c-ZrHf2N4 (#16), c-HfZr2N4 (#39)
• Black cluster: c-CGe2N4 (#9), c-CSn2N4 (#24), c-CZr2N4 (#26)
• Orange cluster: c-TiSi2N4 (#21), c-SnSi2N4 (#29), c-ZrSi2N4 (#31)
• Gray cluster: c-Si3N4 (#2), c-Ge3N4 (#3), c-Sn3N4 (#4), c-Ti3N4 (#5), c-Zr3N4 (#6), c-SiTi2N4 (#12), c-GeTi2N4 (#13), c-SnTi2N4 (#15), c-GeSi2N4 (#19), c-TiGe2N4 (#22), c-TiZr2N4 (#23), c-GeSn2N4 (#32), c-SnGe2N4 (#33), c-GeZr2N4 (#34), c-ZrGe2N4 (#35), c-TiSn2N4 (#36), c-SnZr2N4 (#37), c-ZrSn2N4 (#38)
• Others (blue lines): c-C3N4 (#1), c-CSi2N4 (#8), c-SiGe2N4 (#10), c-CTi2N4 (#11), c-ZrTi2N4 (#14), c-SiSn2N4 (#28), c-SiZr2N4 (#30)
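The clustering procedure described above (autoscaling, Euclidean distances, and grouping at a chosen distance) can be sketched in Python. The sketch is deliberately reduced: it uses only three of the eighteen descriptors (BOcry, B, and ΔE from Table 6.1) for six of the 39 spinels, it assumes nearest-neighbour (single-linkage) grouping, and the cutoff of 1.5 is chosen for this reduced example rather than taken from Figure 6.2.

```python
import math

# (BOcry, B in GPa, dE in eV) copied from Table 6.1 for six of the 39 spinels.
data = {
    7: (4.718, 267.209, -1.901),   # c-Hf3N4
    16: (4.473, 244.507, -0.737),  # c-ZrHf2N4
    39: (4.379, 240.239, -0.798),  # c-HfZr2N4
    17: (8.260, 316.565, 2.485),   # c-SiC2N4
    18: (7.816, 292.289, 3.794),   # c-GeC2N4
    20: (8.030, 301.676, 3.754),   # c-TiC2N4
}

def autoscale(rows):
    """Mean-centre each column and divide by its sample standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - mu) ** 2 for v in c) / (n - 1))
            for c, mu in zip(cols, means)]
    return [[(v - mu) / sd for v, mu, sd in zip(row, means, stds)] for row in rows]

ids = sorted(data)
scaled = dict(zip(ids, autoscale([data[i] for i in ids])))

def distance(i, j):
    """Euclidean distance between two samples in autoscaled descriptor space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(scaled[i], scaled[j])))

def clusters_at(cutoff):
    """Group samples linked by a chain of neighbours no farther apart than cutoff."""
    groups = []
    unassigned = set(ids)
    while unassigned:
        seed = min(unassigned)
        unassigned.remove(seed)
        group, frontier = {seed}, [seed]
        while frontier:
            current = frontier.pop()
            near = {j for j in unassigned if distance(current, j) <= cutoff}
            unassigned -= near
            group |= near
            frontier.extend(near)
        groups.append(sorted(group))
    return groups

nearest = {i: min((j for j in ids if j != i), key=lambda j: distance(i, j)) for i in ids}
print("nearest neighbours:", nearest)
print("clusters at cutoff 1.5:", clusters_at(1.5))
```

Even with three descriptors, the Hf-containing spinels and the spinels with C at the octahedral site separate into two groups, in line with the green and red clusters listed above.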


Figure 6.2: A dendrogram of 39 spinel nitrides with 18 variables (preprocessing: autoscale). All numbers on the dendrogram correspond to the spinels of Table 6.1. (Axis: distance to K-nearest neighbor.)

From these clusters, a few trends can be seen. For example, all the members of the red cluster have a C atom at the octahedral site. The same type of argument can be applied to the other clusters. Each compound on a blue line can be assigned to a specific cluster from the connecting lines in the dendrogram (Figure 6.2). For example, c-C3N4 (#1) is assigned to the red cluster, while c-SiZr2N4 (#30) belongs to the gray cluster. KNN analysis makes it easy to identify the clusters existing in the dataset. However, it is not easy to interpret which attributes (variables) contribute to specific clusters. CA alone is therefore not enough to probe the structure of the dataset completely. Nevertheless, it is a useful technique when combined with other techniques such as PCA. The combined CA-PCA approach to explaining trends in the dataset is discussed in section 6.3.2.

6.3.2 Principal Component Analysis

6.3.2.1 PC1 versus PC2 Configuration

PCA summarizes the variance of the eighteen variables across the spinels. The first and second principal components (PCs) contain 61.50% of the variance of the data. The first three PCs are linear combinations of the eighteen descriptors as follows:
• PC1 = 0.344 lc - 0.079 |BLA-N - BLB-N| - 0.315 EN - 0.037 u + 0.256 BLA-N + 0.301 BLB-N - 0.028 Q*tet - 0.234 Q*oct + 0.236 Q*N + 0.342 Rσ + 0.253 Rπ + 0.220 rσA + 0.293 rσB - 0.189 BOcry - 0.083 BOA-N - 0.174 BOB-N - 0.304 B - 0.162 ΔE
• PC2 = -0.048 lc - 0.394 |BLA-N - BLB-N| - 0.203 EN + 0.489 u + 0.333 BLA-N - 0.239 BLB-N - 0.372 Q*tet + 0.131 Q*oct + 0.033 Q*N - 0.049 Rσ - 0.035 Rπ + 0.317 rσA - 0.216 rσB + 0.003 BOcry - 0.182 BOA-N + 0.081 BOB-N + 0.025 B + 0.216 ΔE
• PC3 = -0.002 lc - 0.195 |BLA-N - BLB-N| + 0.001 EN + 0.025 u + 0.026 BLA-N - 0.021 BLB-N - 0.115 Q*tet - 0.394 Q*oct + 0.432 Q*N - 0.101 Rσ - 0.344 Rπ - 0.116 rσA - 0.061 rσB + 0.388 BOcry + 0.309 BOA-N + 0.297 BOB-N - 0.182 B - 0.304 ΔE

The score and loading plots of PC1-PC2 are shown in Figure 6.3. Figures 6.3 (a-c) are the same score plot. Figures 6.3 (a) and (b) are labeled by the tetrahedral and octahedral site occupancy of each atom, respectively, while Figure 6.3 (c) is labeled with the sample ID (Table 6.1). Although the scores of Sn, Zr, and Hf at tetrahedral and octahedral sites are slightly mixed in Figure 6.3 (a) and (b), the scores of the spinel nitrides are generally well distributed diagonally according to the specific atoms in the tetrahedral and octahedral sites. These score plots, Figure 6.3 (a) and (b), correspond to Figure 5.2, which was generated from two variables (lc and u). Because Figure 6.3 (a, b) and Figure 5.2 have almost the same patterns, the two variables in Figure 5.2 are identified as important factors in spinel crystal chemistry.
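The way scores, loadings, and explained variance arise can be illustrated with a minimal Python sketch of PCA computed by the NIPALS iteration on autoscaled data. The data matrix is hypothetical (six samples, three descriptors), the function names are chosen for this sketch, and the use of NIPALS for the PCA step is an assumption here rather than a statement about the software used in this chapter.

```python
import math

def autoscale(rows):
    """Mean-centre each column and scale it to unit sample standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - mu) ** 2 for v in c) / (n - 1))
            for c, mu in zip(cols, means)]
    return [[(v - mu) / sd for v, mu, sd in zip(row, means, stds)] for row in rows]

def nipals_pca(X, n_pc, tol=1e-12, max_iter=500):
    """Return (scores, loadings, explained-variance fractions) for scaled data X."""
    n, m = len(X), len(X[0])
    E = [list(row) for row in X]                      # residual matrix
    total_ss = sum(v * v for row in E for v in row)
    scores, loadings, explained = [], [], []
    for _ in range(n_pc):
        # Start from the residual column with the largest sum of squares.
        j0 = max(range(m), key=lambda j: sum(row[j] ** 2 for row in E))
        t = [row[j0] for row in E]
        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
            p_norm = math.sqrt(sum(v * v for v in p))
            p = [v / p_norm for v in p]               # unit-length loading vector
            t_new = [sum(E[i][j] * p[j] for j in range(m)) for i in range(n)]
            change = sum((a - b) ** 2 for a, b in zip(t, t_new))
            t = t_new
            if change < tol * sum(v * v for v in t):
                break
        # Remove the modelled part before extracting the next component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        scores.append(t)
        loadings.append(p)
        explained.append(sum(v * v for v in t) / total_ss)
    return scores, loadings, explained

# Hypothetical descriptors: the first two columns are strongly correlated.
raw = [[2.0, 1.9, 0.5], [4.0, 4.2, 1.5], [6.0, 5.8, 0.7],
       [8.0, 8.1, 1.9], [10.0, 9.9, 1.1], [12.0, 12.2, 1.6]]
X = autoscale(raw)
scores, loadings, explained = nipals_pca(X, 2)
for k, (p, frac) in enumerate(zip(loadings, explained), start=1):
    print(f"PC{k}: loadings = {[round(v, 3) for v in p]}, variance = {100 * frac:.2f}%")
```

Each PC1 score is the loading-weighted sum of a sample's autoscaled descriptors, which is the same form as the PC1 to PC3 expressions listed above.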


Figure 6.3: The PC1-PC2 plots for the complete data using principal components: score plots (a) by tetrahedral site, (b) by octahedral site, and (c) with sample ID; (d) loading plot of PC1-PC2. Note: All numbers in the score plot (c) correspond to the spinels in Table 6.1. Principal component plot indicating clustering of lattice stability associated with crystal chemistry and site occupancy based on informatics based predictions. The color of each box corresponds to those of the dendrogram, Figure 6.2. (Axes: PC1 (39.98%) versus PC2 (21.52%).)


The importance of these two variables is also shown in the loading plot, Figure 6.3 (d), which illustrates the relationships between the various descriptors and the PCs. The first principal component, PC1, is a strong function of lc, EN, B, and Rσ. The second principal component, PC2, is a strong function of u and |BLA-N - BLB-N|. The third principal component in Figure 6.4 has high loadings for Q*oct and Q*N.

The relationship between tetrahedral site preference and the variables is illustrated in Figure 6.3 (a) and (d). For a C atom at the tetrahedral site (Ctet), c-C3N4 (#1) and c-CSi2N4 (#8) have large values of B and EN and are separated from the other Ctet spinels. c-SiC2N4 (#17) and c-GeC2N4 (#18) are highly metastable phases (large value of ∆E) and have large values of Q*oct among the Sitet and Getet spinels, respectively. The locations of c-TiC2N4 (#20) and c-TiSi2N4 (#21) among the Titet spinels are due to large values of BOB-N, ∆E, and Q*oct. Large values of these variables are also related to the locations of c-SnC2N4 (#25), c-SnSi2N4 (#29), c-ZrC2N4 (#27), and c-ZrSi2N4 (#31). Thus the positions of the scores in the second quadrant (negative PC1, positive PC2) are due to large values of B, EN, BOB-N, BOcry, ∆E, and Q*oct.

The octahedral site occupancy and the variables are considered in Figure 6.3 (b) and (d). All the samples in the second quadrant of Figure 6.3 (b) are highly metastable and include a C or Si atom at the octahedral site. When C atoms occupy the octahedral site (Coct), the phase is much more metastable. c-C3N4 (#1) sits away from the other Coct spinels, again because of its high values of B and EN. The same argument applies to c-CSi2N4 (#8). The two clusters of Sioct spinels in the second quadrant, (#21, #29, and #31) and (#2 and #19), arise from the difference in BOB-N. High values of Q*tet and |BLA-N - BLB-N| lead to the different position of c-CTi2N4 (#11) among the Tioct spinels. The abnormal location of c-ZrTi2N4 (#14) among the Tioct spinels is due to large values of Q*N, lc, Rσ, Rπ, BLB-N, and rσB.

Relationships between descriptors can be identified from the loading plot itself, Figure 6.3 (d). No variables sit near the center of the PC1-PC2 configuration, meaning that every variable contributes appreciably to at least one of PC1 and PC2. When PC1 and PC2 are considered separately, however, u and Q*tet contribute only weakly to PC1. Most crystallographic (or geometric) descriptors and Q*N are loaded positively in PC1. BLA-N and BLB-N affect B almost equally, as can be seen from the distance between B and BLA-N (or BLB-N). From their positions, it is clear that B has a negative relationship with all positively loaded descriptors in PC1. Moreover, lc has a strong negative relationship with B, as can be inferred from their almost exactly opposite locations. The negative relationships between all bond orders and size are intuitively reasonable. The stabilization energy, ∆E, is positively related to all variables in the second quadrant of Figure 6.3 (d). A shorter BLB-N makes compounds more metastable. The internal anion parameter u and Q*tet have a negative relationship. Since the first two PCs (PC1 and PC2) contain 61.50% of the variance of the data, the third PC, PC3, was used to explain more of the variance.

When the results of CA are considered on the PC1-PC2 configuration (Figure 6.3 (c)), all clusters defined in CA are also well defined here. Two spinel nitrides, c-CTi2N4 (#11) and c-SiSn2N4 (#28), are near the black cluster. According to Figure 6.3 (d), this is due to similarities in three descriptors: |BLA-N - BLB-N|, BOA-N, and Q*tet.

6.3.2.2 PC1 versus PC3 Configuration

Since only 61.50% of the variance of the data is captured by PC1 and PC2, the third PC, PC3, is used to add more information. The PC1 versus PC3 space is shown in Figure 6.4. This configuration explains 50.43% of the variance, and the first three PCs together capture 71.95% of the variance. From the score plots of Figure 6.4 (a-c), c-ZrTi2N4 (#14) is clearly identified as an outlier due to its large value of Q*N. The cluster of c-Hf3N4 (#7), c-ZrHf2N4 (#16), and c-HfZr2N4 (#39) in the fourth quadrant is due to large values of Rπ. In this PC1-PC3 configuration, the variables in the loading plot (Figure 6.4 (d)) are more clearly separated by their characteristics: Q*N in the first quadrant, bond orders in the second quadrant, size factors in the fourth quadrant, and the others in the third quadrant. The internal anion parameter u lies near the center of the loading plot and therefore contributes little to PC1 and PC3.
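The variance figures quoted in this section follow directly from the per-component percentages shown on the axes of Figures 6.3 and 6.4. As a quick arithmetic check in Python (the variable names are chosen for this sketch):

```python
# Percentage of variance per principal component, as given on the figure axes.
explained = {"PC1": 39.98, "PC2": 21.52, "PC3": 10.45}

pc1_pc2 = explained["PC1"] + explained["PC2"]   # PC1-PC2 configuration
pc1_pc3 = explained["PC1"] + explained["PC3"]   # PC1-PC3 configuration
first_three = sum(explained.values())           # PC1 + PC2 + PC3

print(f"PC1 + PC2: {pc1_pc2:.2f}%")
print(f"PC1 + PC3: {pc1_pc3:.2f}%")
print(f"PC1 + PC2 + PC3: {first_three:.2f}%")
```

The PC1-PC3 plane therefore shows less of the total variance than the PC1-PC2 plane, but it exposes the part carried by PC3, which is dominated by the effective charges and bond orders.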


Figure 6.4: The PC1-PC3 score plots for the complete data using principal components (a) by tetrahedral site (b) by octahedral site. [Plots not reproduced. Axes: PC1 (39.98%) horizontal, PC3 (10.45%) vertical. Legend: (a) Ctet, Sitet, Getet, Titet, Sntet, Zrtet, Hftet; (b) Coct, Sioct, Geoct, Tioct, Snoct, Zroct, Hfoct.]


Figure 6.4: The score plots for the complete data using principal components. (c) Score plot with sample ID. The relationships between each of the principal components and the loadings of variables can be seen in loading plot (d). [Plots not reproduced. Axes: PC1 (39.98%) horizontal, PC3 (10.45%) vertical.]


All the variables in the third quadrant have strongly negative relationships with Q*N. It follows that if Q*N is larger, that is, if more charge is transferred to the N ion, the spinel phase will be more stable. The score and loading plots also show that B and ∆E lie close together. All clusters defined with KNN are clearly separated in PC1-PC3 space.

6.3.3 Physical Interpretation of Data Mining Results

Since PCA provides a dimensionally reduced space in which variation in the data is easy to see, other properties can be overlaid on the PC space. Doing so allows specific properties to be identified within the reduced-dimension space. As in the score plots of Figures 6.3 and 6.4, the direct band gap is shown as a color code in Figures 6.5 (a) and 6.6 (a). The band gap energies are taken from the literature [1]. This method is often useful when the property is categorized and its trends need to be visualized in structure maps. With this approach, relationships between band gap energy and other parameters, including bulk modulus in spinel nitrides, can be identified. Specifically, there have been few efforts to identify the relationships between bulk modulus (or hardness) and band gap, and only for a few systems [44, 136, 137]. Hardness depends on both microscopic and macroscopic structural properties: plastic deformation, defects, and dislocations. During the deformation process, electron-pair bonds are broken. Energetically, breaking an electron-pair bond corresponds to two electrons being excited from the valence band to the conduction band [136]. Therefore, hardness (H) for covalent crystals can be expressed in terms of the energy gap Eg as shown in equation (6.1) [136].

$$H\,(\mathrm{GPa}) = A N_a E_g \tag{6.1}$$

where Na is the covalent bond number per unit area and A is a constant. Since ionic contributions should be considered for polar covalent crystals, the average band gap Eg needs to be introduced as suggested by Phillips [138]. For a binary system, the average band gap is described by the covalent gap Eh and the ionic gap C. Thus

$$E_g^2 = E_h^2 + C^2 \tag{6.2}$$


Phillips' homopolar (covalent) band gap Eh measures the strength of the covalent bond. Since the lattice constant depends on Eh and is independent of C, Cohen has shown that the bulk modulus B of zinc-blende semiconductors can be expressed in terms of Eh [44].

$$B = 45.6\,E_h\,d^{-1} \tag{6.3}$$

where d is the nearest neighbor distance. This equation is related to the discussion in section 3.3.5. Consequently, because hardness generally scales with bulk modulus for perfect crystals [67, 68, 123], bulk modulus can also be expressed in terms of band gap energy using equations (6.1)-(6.3).

In Figure 6.5 (a), most of the metals sit in the lower right region while most of the spinel nitrides, which have wide band gaps, sit in the upper left region. This is related to the small values of the size-related descriptors and the large bond orders seen in the loading plot, Figure 6.5 (b). Thus when Zunger's pseudopotential radii or BLB-N become larger, the band gap becomes narrower. This effect can also be found in the relationship between lattice constant and band gap in chalcopyrite semiconductors [109]. On the other hand, the band gap becomes wider as bulk modulus (B), bond orders (BOB-N, BOcry), effective charge (Q*oct), and stabilization energy (ΔE) increase. Since bond orders represent the strength of the bond, they correspond to the Eh term in equation (6.3). In Figure 6.6, bond orders are strongly associated with wide band gaps and high values of bulk modulus. Figure 6.5 also reveals outliers. Among the metallic systems, c-SiC2N4 (#17), in the second quadrant of PC1-PC2 space, was identified as a metal even though its bond orders are quite large. The behaviors of c-Hf3N4 (#7) and c-ZrTi2N4 (#14) are also quite abnormal, due to large Rπ and Q*N respectively, in the PC1-PC3 configuration (Figure 6.6 (b)). For these outliers, further study of their electronic structures is needed.
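Equations (6.1)-(6.3) can be chained numerically. The sketch below is illustrative only: the function names are hypothetical, the input values are placeholders rather than data for any real compound, and the constant A in equation (6.1) is left as a caller-supplied parameter because its value is not given here. Equation (6.3) is applied with Eh in eV and d in Å to give B in GPa, which is an assumption about the units of Cohen's relation [44].

```python
import math

def average_gap(e_h, c):
    """Equation (6.2): average band gap from covalent gap E_h and ionic gap C."""
    return math.sqrt(e_h ** 2 + c ** 2)

def hardness(a_const, n_a, e_g):
    """Equation (6.1): H = A * N_a * E_g (A is a constant not specified here)."""
    return a_const * n_a * e_g

def bulk_modulus(e_h, d):
    """Equation (6.3): B = 45.6 * E_h / d (assumed units: eV, angstrom, GPa)."""
    if d <= 0:
        raise ValueError("nearest-neighbor distance d must be positive")
    return 45.6 * e_h / d

# Placeholder inputs for illustration only; not values for a real compound.
e_h, c, d = 5.0, 3.0, 2.0

e_g = average_gap(e_h, c)
b = bulk_modulus(e_h, d)
print(f"E_g = {e_g:.3f}, B = {b:.1f}")
```

At fixed Eh, doubling d halves B in this relation, which matches the direction of the trend read from the loading plots: larger size descriptors accompany lower bulk modulus.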


Figure 6.5: (a) The PC1-PC2 score plot with direct band gap (Eg) (b) Loading plot for PC1-PC2. [Plots not reproduced. Axes: PC1 (39.98%) horizontal, PC2 (21.52%) vertical. In (a), samples are labeled by ID and color-coded by direct Eg on a scale from -0.5 to 3.5, with metals marked separately.]


Figure 6.6: (a) The PC1-PC3 score plot with direct band gap (Eg) (b) Loading plot for PC1-PC3. [Plots not reproduced. Axes: PC1 (39.98%) horizontal, PC3 (10.45%) vertical. In (a), samples are labeled by ID and color-coded by direct Eg on a scale from -0.5 to 3.5, with metals marked separately.]


6.4 Summary

In this chapter, we show how data mining methods can be used to discover key attributes governing specific properties using information from a dimensionally reduced sample/variable space. We compare and search for patterns and associations (structure and property) that lead to ways of relating information within the spinel nitride dataset. Such a pattern search process can potentially yield associations between seemingly disparate data sets as well as establish possible correlations between parameters that are not easily studied experimentally in a coupled manner. Consequently, data mining methods such as PCA should be incorporated as part of design and testing methodologies to increase the efficiency of the materials development process.


CHAPTER 7 CONCLUSIONS AND FUTURE WORK

7.1 Conclusions

PLS and PCA methods were combined to develop a larger heuristically-derived database to flesh out the unknown entries in a pre-existing library of materials based on ab initio calculations. The main goal was to predict the bulk moduli of single and double spinel nitrides based on crystallographic and electronegativity properties. Phase stabilities of spinel nitrides were also predicted by adding new descriptors derived from Villars' and Miedema's linear weighting schemes. In this area, the model performed well on both training and testing data, and predictions on additional data can be made with high confidence. For bulk modulus prediction, c-Hf3N4 and c-C3N4 in the training data have large errors and represent a region of the space where the model does not predict well. These are the materials we should target for ab initio computational or experimental validation to improve our model and better understand the underlying phenomena. The PLS models allowed us to explore a broader range of new trends and correlations (in this case modulus-lattice spacing relationships) that have not been established before, and thus to create a "virtual" library. This new "virtual library" forms a unique database, from which one can use ab initio calculations and eventually actual testing to verify trends and relationships. It is important to recognize that this library can in principle be built up by repeating complex atomistic calculations for each chemistry or compound of interest. However, this is prohibitive as a screening tool, since these calculations, even for one compound, are extremely difficult and time consuming despite advances in parallel computation. Having predicted target properties, new structure maps of spinel nitrides were created by PCA. From these dimensionally reduced structure maps, the variance of each variable was demonstrated and new relationships among variables and samples were identified.

In a recursive process, one can increase the size of the database with predictive models and new relationships among variables, which guide the filling in of missing descriptors, and then repeat the information cycle process (Figure 3.1). This can significantly accelerate the identification of promising materials by cutting down the combinatorial explosion of possible alloys. Even with relatively modest data mining tools, the initial results in this thesis show much promise for integrating fundamental materials data with data mining. This opens an extraordinary set of research opportunities, allowing compounds to be identified or targeted for further research.

7.2 Future Work

The materials informatics process in this thesis should be validated using out-of-sample testing on existing data, high-throughput computational screening of proposed compounds, and experimental property measurements. In other words, there are three approaches to validating the hypotheses and predictive models generated. The first uses the existing data with typical out-of-sample model validation strategies, such as the cross-validation used in machine learning. The second uses computational modeling to calculate ab initio properties. Finally, if the hypothetical materials can be synthesized, it will be possible to validate and test the data mining results experimentally in the laboratory.

As mentioned in chapter 1, the comprehensive integration of thermodynamic, crystallographic, and electronic databases in materials science has not yet been accomplished, despite its critical value for the materials science community. The linking of databases in which the taxonomy and organization of data are very different has been, and still is, a major challenge that has to be tackled in the future. Thermodynamic databases permit us to identify the possible chemistries that can exist based on thermodynamic principles. Crystallographic databases characterize potential atomic arrangements of the compounds predicted from thermodynamic databases. Electronic structure databases using interatomic potential databases permit us to build the electronic structures of a wide array of multi-component systems. With these meaningful "virtual" data, "materials by design" can be performed more readily. This thesis, focused on modulus and phase stability prediction, serves as a pointer to an informatics infrastructure for tackling similar problems in the future.

From the physics perspective, issues such as defect arrays and allotropic phase transformations at high temperature or high pressure can significantly alter properties for a given chemistry. Since the statistical tools described here, such as PCA and PLS, deal with the material property itself, data mining tools that can capture such nonlinearities in property changes are indispensable. Nonlinear PLS [139, 140], Support Vector Machines [141-143], and Association Rule Mining [144, 145] are a few such techniques available for future study.
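The first validation route named above, out-of-sample testing, can be sketched with leave-one-out cross-validation. The example below is a minimal sketch under stated assumptions: it uses a one-descriptor ordinary least-squares line as a stand-in for the PLS models of this thesis, and the `descriptor` and `target` values are synthetic placeholders rather than thesis data.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def loo_rmse(xs, ys):
    """Root-mean-square error of leave-one-out predictions."""
    sq_errors = []
    for i in range(len(xs)):
        # Refit on every sample except the i-th, then predict the held-out one.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        sq_errors.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Synthetic placeholder data (hypothetical descriptor and property values).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
target = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

rmsecv = loo_rmse(descriptor, target)
print(f"leave-one-out RMSE = {rmsecv:.3f}")
```

Because each prediction is made by a model that never saw the held-out sample, a large gap between this error and the fitting error would flag compounds, like the poorly predicted ones noted in the conclusions, that deserve ab initio or experimental follow-up.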


LITERATURE CITED

[1]

W.Y. Ching, S. Mo, L. Ouyang, and P. Rulis, Theoretical Prediction of the Structure and Properties of Cubic Spinel Nitrides, J. Am. Ceram. Soc., 85, 75 (2002).

[2]

T. Wang, N. Moll, K. Cho, and J.D. Joannopoulos, Deliberately Designed Materials for Optoelectronics Applications, Phys. Rev. Lett., 82, 3304 (1999).

[3]

A. Franceschetti and A. Zunger, The Inverse Band-structure Problem of Finding an Atomic Configuration with Given Electronic Properties, Nature, 402, 60 (1999).

[4]

J.R. Salvador, F. Guo, T. Hogan, and M.G. Kanatzidis, Zero Thermal Expansion in YbGaGe Due to An Electronic Valence Transition, Nature, 425, 702 (2003).

[5]

T. Saito, T. Furuta, J. Hwang, S. Kuramoto, K. Nishino, N. Suzuki, R. Chen, A. Yamada, K. Ito, Y. Seno, T. Nonaka, H. Ikehata, N. Nagasako, C. Iwamoto, Y. Ikuhara, and T. Sakuma, Multifunctional Alloys Obtained via a Dislocation-free Plastic Deformation Mechanism, Science, 300, 464 (2003).

[6]

G. Shiflet, The More Elements The Merrier, Science, 300, 443 (2003).

[7]

H.W. Hugosson, U. Jansson, B. Johansson, and O. Eriksson, Restricting Dislocation Movement in Transition Metal Carbides by Phase Stability Tuning, Science, 293, 2434 (2001).

[8]

P. Zhang, V.H. Crespi, E. Chang, S.G. Louie, and M.L. Cohen, Computational Design of Direct-Bandgap Semiconductors that Lattice-Match Silicon, Nature, 409, 69 (2001).

[9]

Y. Song, R. Yang, D. Li, Z. Hu, and Z. Guo, Calculation of Bulk Modulus of Titanium Alloys by First Principles, J. Comput. Aid. Mat. Des., 6, 355 (1999).

[10]

G. Grimvall, Thermophysical Properties of Materials, Amsterdam: Elsevier (1999).

[11]

G.A. Landrum and H. Genin, Application of Machine Learning Methods to Solid State Chemistry: Ferromagnetism in Transition Metal Alloy, J. Solid State Chem., 176, 587 (2003).

[12]

F.J. DiSalvo, Challenges and Opportunities in Solid-State Chemistry, Pure Appl. Chem., 72, 1799 (2000).

[13]

C. Glymour, D. Madigan, D. Pregibon, and P. Smyth, Statistical Themes and Lessons for Data Mining, Data Min. Knowl. Disc., 1, 11 (1997).

[14]

J.R. Rodgers, Spotlight on Technology: Database Tools for CMS, AMPTIAC Newslett., 5, 1 (2001).

[15]

H. Ledbetter and S. Kim, Handbook of Elastic Properties of Solids, Liquids, and Gases, M. Levy, H. Gass, and R. Stern, Eds., II, 65, San Diego: Academic Press (2001).

[16]

A. Zerr, G. Miehe, G. Serghiou, M. Schwarz, E. Kroke, R. Riedel, H. Fueß, P. Kroll, and R. Boehler, Synthesis of Cubic Silicon Nitride, Nature, 400, 340 (1999).

[17]

J. Dong, O.F. Sankey, S.K. Deb, G. Wolf, and P.F. McMillan, Theoretical Study of β-Ge3N4 and Its High-Pressure Spinel γ Phase, Phys. Rev. B, 61, 11979 (2000).

[18]

R. Belkada, T. Shibayanagi, and M. Naka, Ab Initio Calculations of the Atomic and Electronic Structure of β Silicon Nitride, J. Am. Ceram. Soc., 83, 2449 (2000).

[19]

S. Ogata, N. Hirosaki, C. Kocer, and H. Kitagawa, An Ab Initio Calculation of the Ideal Tensile Strength of Beta-Silicon Nitride, Phys. Rev. B, 64, 172102 (2001).

[20]

N. Hirosaki, S. Ogata, C. Kocer, H. Kitagawa, and Y. Nakamura, Molecular Dynamics Calculation of The Ideal Thermal Conductivity of Single-Crystal α and β-Si3N4, Phys. Rev. B, 65, 134110 (2002).

[21]

S. Mo, L. Ouyang, W.Y. Ching, I. Tanaka, Y. Koyama, and R. Riedel, Interesting Physical Properties of the New Spinel Phase of Si3N4 and C3N4, Phys. Rev. Lett., 83, 5046 (1999).

[22]

W.Y. Ching, L. Ouyang, and J.D. Gale, Full Ab Initio Geometry Optimization of All Known Crystalline Phases of Si3N4, Phys. Rev. B, 61, 8696 (2000).

[23]

Y. Xu, P. Rulis, and W.Y. Ching, Prediction of Ferromagnetic Cubic Spinel Phase of Fe3N4, J. Appl. Phys., 91, 7352 (2002).

[24]

M.P. Shemkunas, G.H. Wolf, K. Leinenweber, and W.T. Petuskey, Rapid Synthesis of Crystalline Spinel Tin Nitride by A Solid-State Metathesis Reaction, J. Am. Ceram. Soc., 85, 101 (2002).

[25]

J. Dong, J. Deslippe, O.F. Sankey, E. Soignard, and P.F. McMillan, Theoretical Study of the Ternary Spinel Nitride System Si3N4-Ge3N4, Phys. Rev. B, 67, 094104 (2003).

[26]

J.E. Lowther, M. Amkreutz, T. Frauenheim, E. Kroke, and R. Riedel, Potential Ultrahard Nitride Materials Containing Silicon, Carbon and Nitrogen, Phys. Rev. B, 68, 033201 (2003).

[27]

C. Kocer, N. Hirosaki, and S. Ogata, Ab Initio Calculation of the Ideal Tensile and Shear Strength of Cubic Silicon Nitride, Phys. Rev. B, 67, 035210 (2003).

[28]

A. Zerr, G. Miehe, and R. Riedel, Synthesis of Cubic Zirconium and Hafnium Nitride Having Th3P4 Structure, Nat. Mater., 2, 185 (2003).

[29]

J.Z. Jiang, J. Lindelov, L. Gerward, K. Ståhl, J.M. Recio, P. Mori-Sanchez, S. Carlson, M. Mezouar, E. Dooryhee, A. Fitch, and D.J. Frost, Compressibility and Thermal Expansion of Cubic Silicon Nitride, Phys. Rev. B, 65, 161202 (2002).

[30]

A. Zerr, M. Kempf, M. Schwarz, E. Kroke, M. Göken, and R. Riedel, Elastic Moduli and Hardness of Cubic Silicon Nitride, J. Am. Ceram. Soc., 85, 86 (2002).

[31]

K. Tatsumi, I. Tanaka, H. Adachi, F. Oba, and T. Sekine, Theoretical Prediction of Post-Spinel Phases of Silicon Nitride, J. Am. Ceram. Soc., 85, 7 (2002).

[32]

K.E. Sickafus, J.M. Wills, and N.W. Grimes, Structure of Spinel, J. Am. Ceram. Soc., 82, 3279 (1999).

[33]

W.Y. Ching, S. Mo, and L. Ouyang, Electronic and Optical Properties of the Cubic Spinel Phase of c-Si3N4, c-Ge3N4, c-SiGe2N4, and c-GeSi2N4, Phys. Rev. B, 63, 245110 (2001).

[34]

W.Y. Ching, S. Mo, L. Ouyang, I. Tanaka, and M. Yoshiya, Prediction of the New Spinel Phase of Ti3N4 and SiTi2N4 and the Metal-Insulator Transition, Phys. Rev. B, 61, 10609 (2000).

[35]

R.J. Hill, J.R. Craig, and G.V. Gibbs, Systematics of the Spinel Structure Type, Phys. Chem. Miner., 4, 317 (1979).

[36]

V.A. Fedorov, Y.A. Kesler, and E.G. Zhukov, Magnetic Semiconducting Chalcogenide Spinels: Preparation and Physical Chemistry, Inorg. Mater., 39, S68 (2003).

[37]

S. Wei and S.B. Zhang, First-Principles Study of Cation Distribution in Eighteen Closed-Shell AIIBIII2O4 and AIVBII2O4 Oxides, Phys. Rev. B, 63, 045112 (2001).

[38]

H.C. O'Neill and A. Navrotsky, Simple Spinels: Crystallographic Parameters, Cation Radii, Lattice Energies, and Cation Distribution, Am. Mineral., 68, 181 (1983).

[39]

O. Muller and R. Roy, The Major Ternary Structural Families, New York: Springer-Verlag (1974).

[40]

H. Ledbetter and S. Kim, Handbook of Elastic Properties of Solids, Liquids, and Gases, M. Levy, H. Gass, and R. Stern, Eds., II, 249, San Diego: Academic Press (2001).

[41]

P. Mori-Sánchez, M. Margués, A. Beltrán, J.Z. Jiang, L. Gerward, and J.M. Recio, Origin of the Low Compressibility in Hard Nitride Spinels, Phys. Rev. B, 68, 064115 (2003).

[42]

C. Li, Y.L. Chin, and P. Wu, Correlation between Bulk Modulus of Ternary Intermetallic Compounds and Atomic Properties of Their Constituent Elements, Intermetallics, 12, 103 (2004).

[43]

E. Soignard, P.F. McMillan, T.D. Chaplin, S.M. Farag, C.L. Bull, M.S. Somayazulu, and K. Leinenweber, High-Pressure Synthesis and Study of Low-Compressibility Molybdenum Nitride Phases, Phys. Rev. B, 68, 132101 (2003).

[44]

M.L. Cohen, Calculation of Bulk Moduli of Diamond and Zinc-blende Solids, Phys. Rev. B, 32, 7988 (1985).

[45]

V.V. Brazhkin, A.G. Lyapin, and R.J. Hemley, Harder Than Diamond: Dreams and Reality, Philos. Mag. A, 82, 231 (2002).

[46]

T. Soma, The Electronic Theory of III-V and II-VI Tetrahedral Compounds: I. Crystal Energy and Bulk Modulus, J. Phys. C: Solid State Phys., 11, 2669 (1978).

[47]

P.K. Lam, M.L. Cohen, and G. Martinez, Analytic Relation between Bulk Moduli and Lattice Constants, Phys. Rev. B, 35, 9190 (1987).

[48]

H. Schlosser and J. Ferrante, Universality Relationships in Condensed Matter: Bulk Modulus and Sound Velocity, Phys. Rev. B, 37, 4351 (1988).


[49]

D. Cheng, S. Wang, and H. Ye, Calculations Showing a Correlation between Electronic Density and Bulk Modulus in fcc and bcc Metals, Phys. Rev. B, 64, 024107 (2001).

[50]

A. Simunek and J. Vackar, Correlation between Core-Level Shift and Bulk Modulus in Transition-Metal Carbides and Nitrides, Phys. Rev. B, 64, 235115 (2001).

[51]

M. Fang, P. Ye, and Z. Yan, The Effect of Temperature on the Product of Bulk Modulus and Volume Thermal Expansion Coefficient, and Its Application to the Thermal Expansion of MgO and Other Minerals, Phys. Stat. Sol. (b), 241, 2464 (2004).

[52]

O.L. Anderson and J.E. Nafe, The Bulk Modulus-Volume Relationship for Oxide Compounds and Related Geophysical Problems, J. Geophys. Res., 70, 3951 (1965).

[53]

Y. Makino and S. Miyake, Estimation of Bulk Moduli of Compounds by Empirical Relations between Bulk Modulus and Interatomic Distance, J. Alloy Compd., 313, 235 (2000).

[54]

C. Li and P. Wu, Correlation of Bulk Modulus and the Constituent Element Properties of Binary Intermetallic Compounds, Chem. Mater., 13, 4642 (2001).

[55]

R.M. Hazen and L.W. Finger, Bulk Modulus-Volume Relationship for Cation-Anion Polyhedra, J. Geophys. Res., 84, 6723 (1979).

[56]

S.A. Serebrinsky, J.L. Gervasoni, J.P. Abriata, and V.H. Ponce, Characterization of the Electronic Density of Metals in Terms of the Bulk Modulus, J. Mater. Sci., 33, 167 (1998).

[57]

H.G. Zimmer, H. Winzen, and K. Syassen, High-Pressure Phase Transition in CaTe and SrTe, Phys. Rev. B, 32, 4066 (1985).

[58]

E. Ziambaras and E. Schröder, Theory for Structure and Bulk Modulus Determination, Phys. Rev. B, 68, 064112 (2003).

[59]

B.W. Dodson, Universal Scaling Relations in Compressibility of Solids, Phys. Rev. B, 35, 2619 (1987).

[60]

W.B. Holzapfel, Physics of Solids Under Strong Compression, Rep. Prog. Phys., 59, 29 (1996).


[61]

A. Jayaraman, A.K. Singh, A. Chatterjee, and S.U. Devi, Pressure-Volume Relationship and Pressure-Induced Electronic and Structural Transformations in Eu and Yb Monochalcogenides, Phys. Rev. B, 9, 2513 (1974).

[62]

F. Birch, Finite Strain Isotherm and Velocities for Single-Crystal and Polycrystalline NaCl at High Pressures and 300°K, J. Geophys. Res., 83, 1257 (1978).

[63]

A. Jayaraman, B. Batlogg, R.G. Maines, and H. Bach, Effective Ionic Charge and Bulk Modulus Scaling in Rocksalt-Structured Rare-Earth Compounds, Phys. Rev. B, 26, 3347 (1982).

[64]

M.J. Mehl, J.E. Osburn, D.A. Papaconstantopoulos, and B.M. Klein, Structural Properties of Ordered High-Melting-Temperature Intermetallic Alloys from First-Principles Total-Energy Calculations, Phys. Rev. B, 41, 10311 (1990).

[65]

H. Schlosser, J. Ferrante, and J.R. Smith, Global Expression for Representing Cohesive-Energy Curves, Phys. Rev. B, 44, 9696 (1991).

[66]

H. Schlosser, Cohesive Energy-Lattice Constant and Bulk Modulus-Lattice Constant Relationships: Alkali Halides, Ag Halides, Tl Halides, J. Phys. Chem. Solids, 53, 855 (1992).

[67]

M.L. Cohen, Predicting Useful Materials, Science, 261, 307 (1993).

[68]

J.M. Léger, J. Haines, M. Schmidt, J.P. Petitet, A.S. Pereira, and J.A.H. da Jornada, Discovery of Hardest Known Oxide, Nature, 383, 401 (1996).

[69]

A.A. Quong and A.Y. Liu, First-Principles Calculations of the Thermal Expansion of Metals, Phys. Rev. B, 56, 7767 (1997).

[70]

R. Pandey, J.D. Gale, S.K. Sampath, and J.M. Recio, Atomistic Simulation Study of Spinel Oxides: Zinc Aluminate and Zinc Gallate, J. Am. Ceram. Soc., 82, 3337 (1999).

[71]

M. Kumar, Equation of State and Bulk Modulus under the Effect of High Pressure-High Temperature, Phys. Chem. Miner., 27, 650 (2000).

[72]

A.M. Pendas, A. Costales, M.A. Blanco, J.M. Recio, and V. Luana, Local Compressibilities in Crystals, Phys. Rev. B, 62, 13970 (2000).

[73]

Q. He and Z. Yan, Study of Temperature Dependence of Bulk Modulus and Interatomic Separation for Ionic Solids, Phys. Stat. Sol. (b), 223, 767 (2001).


[74]

P. Vinet, J. Ferrante, J.R. Smith, and J.H. Rose, A Universal Equation of State for Solids, J. Phys. C: Solid State Phys., 19, L467 (1986).

[75]

J.M. Recio, R. Franco, A.M. Pendas, M.A. Blanco, L. Pueyo, and R. Pandey, Theoretical Explanation of the Uniform Compressibility Behavior Observed in Oxide Spinels, Phys. Rev. B, 63, 184101 (2001).

[76]

Y. Al-Douri, H. Abid, and H. Aourag, Empirical Formula Relating the Bulk Modulus to the Lattice Constant in Tetrahedral Semiconductors, Mater. Chem. Phys., 87, 14 (2004).

[77]

G.B. Olson, Computational Design of Hierarchically Structured Materials, Science, 277, 1237 (1997).

[78]

K. Rajan, C. Suh, A. Rajagopalan, and X. Li, Quantitative Structure-Activity Relationships (QSARs) for Materials Science, Mat. Res. Soc. Symp. Proc., 700, S7.5.1 (2002).

[79]

A.R. Katritzky, U. Maran, V.S. Lobanov, and M. Karelson, Structurally Diverse Quantitative Structure-Property Relationship Correlations of Technologically Relevant Physical Properties, J. Chem. Inf. Comp. Sci., 40, 1 (2000).

[80]

E. Mooser and W.B. Pearson, On the Crystal Chemistry of Normal Valence Compounds, Acta Cryst., 12, 1015 (1959).

[81]

D.G. Pettifor, Intermetallic Compounds: Principles and Practice, J.H. Westbrook and R.L. Fleischer, Eds., 1, 419, Chichester: John Wiley & Sons (1995).

[82]

P. Villars, Intermetallic Compounds: Crystal Structures of Intermetallic Compounds, J.H. Westbrook and R.L. Fleischer, Eds., Chichester: John Wiley & Sons (2000).

[83]

J.C. Phillips and J.A. Van Vechten, Dielectric Classification of Crystal Structures, Ionization Potentials, and Band Structures, Phys. Rev. Lett., 22, 705 (1969).

[84]

J. St. John and A.N. Bloch, Quantum-Defect Electronegativity Scale for Nontransition Elements, Phys. Rev. Lett., 33, 1095 (1974).

[85]

E.S. Machlin and T.P. Chow, Structural Stability of Suboctet Simple Binary Compounds, Phys. Rev. Lett., 38, 1292 (1977).

[86]

R.E. Watson and L.H. Bennett, A Mulliken Electronegativity Scale and the Structural Stability of Simple Compounds, J. Phys. Chem. Solids, 39, 1235 (1978).

[87]

R.E. Watson and L.H. Bennett, Transition Metals: d-Band Hybridization, Electronegativities and Structural Stability of Intermetallic Compounds, Phys. Rev. B, 18, 6439 (1978).

[88]

W. Andreoni, A. Baldereschi, E. Biémont, and J.C. Phillips, Hard-Core Pseudopotentials and Structural Maps of Solids, Phys. Rev. B, 20, 4814 (1979).

[89]

A.N. Bloch and G.C. Schatteman, Structure and Bonding in Crystals, M. O'Keeffe and A. Navrotsky, Eds., I, 49, New York: Academic Press (1981).

[90]

A. Zunger, Structure and Bonding in Crystals, M. O'Keeffe and A. Navrotsky, Eds., I, 73, New York: Academic Press (1981).

[91]

P. Villars, A Three-Dimensional Structure Stability Diagram for 998 Binary AB Intermetallic Compounds, J. Less-Common Met., 92, 215 (1983).

[92]

J.K. Burdett and T.J. McLarnan, The Structures of Transition Metal-Transition Metal Alloys, J. Solid State Chem., 53, 382 (1984).

[93]

D.G. Pettifor, Structure Maps for Pseudobinary and Ternary Phases, Mater. Sci. Tech., 4, 675 (1988).

[94]

U. Walzer, Systematization of Binary Intermetallic Compounds, Phys. Stat. Sol. (b), 162, 75 (1990).

[95]

Y. Harada, M. Morinaga, J. Saito, and Y. Takagi, New Crystal Structure Maps for Intermetallic Compounds, J. Phys.: Condens. Matter, 9, 8011 (1997).

[96]

P. Villars and F. Hulliger, Structural-Stability Domains for Single-Coordination Intermetallic Phases, J. Less-Common Met., 132, 289 (1987).

[97]

P. Villars and J.C. Phillips, Quantum Structural Diagrams and High-Tc Superconductivity, Phys. Rev. B, 37, 2345 (1988).

[98]

G.G. Koerber, Properties of Solids, Englewood Cliffs: Prentice Hall (1962).

[99]

K. Kugimiya and H. Steinfink, The Influence of Crystal Radii and Electronegativities on the Crystallization of AB2X4 Stoichiometries, Inorg. Chem., 7, 1762 (1968).

[100] J.K. Burdett, G.D. Price, and S.L. Price, Factors Influencing Solid-State Structure-An Analysis Using Pseudopotential Radii Structural Maps, Phys. Rev. B, 24, 2903 (1981).

[101] J.K. Burdett, G.D. Price, and S.L. Price, Role of the Crystal-Field Theory in Determining the Structures of Spinels, J. Am. Chem. Soc., 104, 92 (1982).

[102] E. Hovestreydt, A Three-Dimensional Structure-Stability Diagram for Ternary Equiatomic RTM Intermetallic Compounds, J. Less-Common Met., 143, 25 (1988).

[103] H. Haeuseler, Structure Field Maps for Sulfides of Composition AB2X4, J. Solid State Chem., 86, 275 (1990).

[104] B. Zhang and W.A. Jesser, Formation Energy of Ternary Alloy Systems Calculated by an Extended Miedema Model, Physica B, 315, 123 (2002).

[105] C. Li, J.L. Hoe, and P. Wu, Empirical Correlation between Melting Temperature and Cohesive Energy of Binary Laves Phases, J. Phys. Chem. Solids, 64, 201 (2003).

[106] N. Chen, C. Li, S. Yao, and X. Wang, Regularities of Melting Behavior of Some Binary Alloy Phases. Part 1. Criteria for Congruent and Incongruent Melting, J. Alloy Compd., 234, 125 (1996).

[107] R. Griessen and A. Driessen, Heat of Formation and Band Structure of Binary and Ternary Metal Hydrides, Phys. Rev. B, 30, 4372 (1984).

[108] J.F. Herbst, On Estimating The Enthalpy of Formation and Hydrogen Content of Quaternary Hydrides, J. Alloy Compd., 368, 221 (2004).

[109] C. Suh and K. Rajan, Combinatorial Design of Semiconductor Chemistry for Bandgap Engineering: "Virtual" Combinatorial Experimentation, Appl. Surf. Sci., 223, 148 (2004).

[110] M. Karelson, V.S. Lobanov, and A.R. Katritzky, Quantum-Chemical Descriptors in QSAR/QSPR Studies, Chem. Rev., 96, 1027 (1996).

[111] P.F. McMillan, New Materials from High-Pressure Experiments, Nat. Mater., 1, 19 (2002).

[112] A. Rajagopalan, C. Suh, X. Li, and K. Rajan, "Secondary" Descriptor Development for Zeolite Framework Design: An Informatics Approach, Appl. Catal. A-Gen., 254, 147 (2003).

[113] D. Livingstone, Data Analysis for Chemists: Applications to QSAR and Chemical Product Design, Oxford: Oxford University Press (1995).

[114] K. Park, H. Lee, C. Jun, K. Park, J. Jung, and S. Kim, Rapid Determination of FeO Content in Sinter Ores Using DRIFT Spectra and Multivariate Calibrations, Chemometr. Intell. Lab., 51, 163 (2000).

[115] L. Eriksson, E. Johansson, N. Kettaneh-Wold, and S. Wold, Multi- and Megavariate Data Analysis - Principles and Applications, Umeå: Umetrics Academy (1999).

[116] D.M. Hawkins, S.C. Basak, and D. Mills, Assessing Model Fit by Cross-Validation, J. Chem. Inf. Comp. Sci., 43, 579 (2003).

[117] B.M. Wise, N.B. Gallagher, R. Bro, and J.M. Shaver, PLS Toolbox 3.0 for Use with Matlab, Manson: Eigenvector Research, Inc. (2002).

[118] M.D. Segall, R. Shah, C.J. Pickard, and M.C. Payne, Population Analysis of Plane-Wave Electronic Structure Calculations of Bulk Materials, Phys. Rev. B, 54, 16317 (1996).

[119] H. Ledbetter and S. Kim, Handbook of Elastic Properties of Solids, Liquids, and Gases, M. Levy, H. Gass, and R. Stern, Eds., II, 281, San Diego: Academic Press (2001).

[120] W.Y. Ching, S. Mo, I. Tanaka, and M. Yoshiya, Prediction of Spinel Structure and Properties of Single and Double Nitrides, Phys. Rev. B, 63, 064102 (2001).

[121] S. Raju, K. Sivasubramanian, and E. Mohandas, An Integrated Thermodynamic Approach Towards Correlating Thermal and Elastic Properties: Development of Some Simple Scaling Relations, Solid State Commun., 124, 151 (2002).

[122] H. Cynn, J.E. Klepeis, C. Yoo, and D.A. Young, Osmium Has the Lowest Experimentally Determined Compressibility, Phys. Rev. Lett., 88, 135701 (2002).

[123] C. Sung and M. Sung, Carbon Nitride and Other Speculative Superhard Materials, Mater. Chem. Phys., 43, 1 (1996).

[124] L.E. Ramos, L.K. Teles, L.M.R. Scolfaro, J.L.P. Castineira, A.L. Rosa, and J.R. Leite, Structural, Electronic, and Effective-Mass Properties of Silicon and Zincblende Group-III Nitride Semiconductor Compounds, Phys. Rev. B, 63, 165210 (2001).

[125] L.S. Dubrovinsky, N.A. Dubrovinskaia, Y. Swamy, J. Muscat, N.M. Harrison, R. Ahuja, B. Holm, and B. Johansson, The Hardest Known Oxide, Nature, 410, 653 (2001).

[126] C. Kittel, Introduction to Solid State Physics, 7th Edn., New York: John Wiley & Sons (1996).

[127] H.E. Lowther, Symmetric Structures of Ultrahard Materials, J. Am. Ceram. Soc., 85, 55 (2002).

[128] S.B. Zhang, M.L. Cohen, and J.C. Phillips, Determination of Diatomic Crystal Bond Lengths Using Atomic s-Orbital Radii, Phys. Rev. B, 38, 12085 (1988).
[129] S.B. Zhang and M.L. Cohen, Determination of AB Crystal Structures from Atomic Properties, Phys. Rev. B, 39, 1077 (1989).
[130] E. Soignard, M. Somayazulu, J. Dong, O.F. Sankey, and P.F. McMillan, High Pressure-High Temperature Synthesis and Elasticity of the Cubic Nitride Spinel γ-Si3N4, J. Phys.: Condens. Matter, 13, 557 (2001).
[131] G. Serghiou, G. Miehe, O. Tschauner, A. Zerr, and R. Boehler, Synthesis of a Cubic Ge3N4 Phase at High Pressures and Temperatures, J. Chem. Phys., 111, 4659 (1999).
[132] K. Leinenweber, M. O'Keeffe, M. Somayazulu, H. Hubert, P.F. McMillan, and G.H. Wolf, Synthesis and Structure Refinement of the Spinel, γ-Ge3N4, Chem. Eur. J., 5, 3076 (1999).
[133] N. Scotti, W. Kockelmann, J. Senker, S. Traßel, and H. Jacobs, Sn3N4, A Tin(IV) Nitride-Syntheses and the First Crystal Structure Determination of a Binary Tin-Nitrogen Compound, Z. Anorg. Allg. Chem., 625, 1435 (1999).
[134] E. Soignard, M. Somayazulu, H. Mao, J. Dong, O.F. Sankey, and P.F. McMillan, High Pressure-High Temperature Investigation of the Stability of Nitride Spinels in the Systems Si3N4-Ge3N4, Solid State Commun., 120, 237 (2001).
[135] J.E. Iglesias and H. Steinfink, Crystal Chemistry of AB2X4 (X=S, Se, Te) Compounds, J. Solid State Chem., 6, 119 (1973).
[136] F. Gao, J. He, E. Wu, S. Liu, D. Yu, D. Li, S. Zhang, and Y. Tian, Hardness of Covalent Crystals, Phys. Rev. Lett., 91, 015502 (2003).
[137] H. Siethoff, Homopolar Band Gap and Thermal Activation Parameters of Plasticity of Diamond and Zinc-Blende Semiconductors, J. Appl. Phys., 87, 3301 (2000).
[138] J.C. Phillips, Ionicity of the Chemical Bond in Crystals, Rev. Mod. Phys., 42, 317 (1970).
[139] E.C. Malthouse, A.C. Tamhane, and R.S.H. Mah, Nonlinear Partial Least Squares, Comput. Chem. Eng., 21, 875 (1997).
[140] T. Li, H. Mei, and P. Cong, Combining Nonlinear PLS with the Numeric Genetic Algorithm for QSAR, Chemometr. Intell. Lab., 45, 177 (1999).
[141] K.P. Bennett and C. Campbell, Support Vector Machines: Hype or Hallelujah? SIGKDD Explorations, 2, 1 (2000).
[142] C.J.C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Min. Knowl. Disc., 2, 121 (1998).
[143] V.N. Vapnik, The Nature of Statistical Learning Theory, New York: Springer-Verlag (1995).
[144] K. Rajan and M. Zaki, Data Mining through Information Association: A Knowledge Discovery Tool for Materials Science, in CODATA Proceedings, (2000).
[145] B. Goethals, Efficient Frequent Pattern Mining, Ph. D. thesis, University of Limburg (2002).
[146] F. Torrens, Table of Periodic Properties of Fullerenes Based on Structural Parameters, J. Chem. Inf. Comp. Sci., 44, 60 (2004).
[147] D.L. Massart, B.G.M. Vandeginste, S.N. Deming, Y. Michotte, and L. Kaufman, Chemometrics: A Textbook, Amsterdam: Elsevier (1988).
[148] I.T. Jolliffe, Principal Component Analysis, 2nd Edn., New York: Springer-Verlag (2002).
[149] J.C. Davis, Statistics and Data Analysis in Geology, 2nd Edn., New York: John Wiley & Sons (1986).
[150] B. Jørgensen, Multivariate Data Analysis and Chemometrics, Department of Statistics, University of Southern Denmark, http://statmaster.sdu.dk/courses/ST02/ (2003).
[151] NIST/Sematech, e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/ (2005).


[152] S. Curtarolo, Coarse-Graining and Data Mining Approaches to the Prediction of Structures and Their Dynamics, Ph. D. thesis, Massachusetts Institute of Technology (2003).
[153] A. Phatak and S. de Jong, The Geometry of Partial Least Squares, J. Chemometr., 11, 311 (1997).
[154] S. Wold, M. Sjöström, and L. Eriksson, PLS-Regression: A Basic Tool of Chemometrics, Chemometr. Intell. Lab., 58, 109 (2001).
[155] P. Geladi and B.R. Kowalski, Partial Least-Squares Regression: A Tutorial, Anal. Chim. Acta, 185, 1 (1986).
[156] S. de Jong, SIMPLS: An Alternative Approach to Partial Least Squares Regression, Chemometr. Intell. Lab., 18, 251 (1993).
[157] M.T. Yin and M.L. Cohen, Theory of Static Structural Properties, Crystal Stability, and Phase Transformations: Application to Si and Ge, Phys. Rev. B, 26, 5668 (1982).
[158] M.L. Cohen, Novel Materials from Theory, Nature, 338, 291 (1989).
[159] S.B. Zhang and M.L. Cohen, High-Pressure Phases of III-V Zinc-blende Semiconductors, Phys. Rev. B, 35, 7604 (1987).
[160] Meister and W.H.E. Schwarz, Principal Components of Ionicity, J. Phys. Chem., 98, 8245 (1994).
[161] Z. Gu and W.Y. Ching, Electronic-Structure and Magnetic-Moment Calculation for Y2Fe14B, Phys. Rev. B, 33, 2868 (1986).
[162] J. Cioslowski, P.J. Hay, and J.P. Ritchie, Charge Distributions and Effective Atomic Charges in Transition-Metal Complexes using Generalized Atomic Polar Tensors and Topological Analysis, J. Phys. Chem., 94, 148 (1990).
[163] R.S. Mulliken, Electronic Population Analysis on LCAO-MO Molecular Wave Functions, J. Chem. Phys., 23, 1833 (1955).
[164] M. Cocchi, M.C. Menziani, P.G. De Benedetti, and G. Cruciani, Theoretical versus Empirical Molecular Descriptors in Monosubstituted Benzenes, Chemometr. Intell. Lab., 14, 209 (1992).


APPENDIX A
CLUSTER ANALYSIS BASED ON K-NEAREST NEIGHBOR METHOD

A.1 Introduction

The K-nearest neighbor (KNN) technique is a useful and mathematically simple tool for cluster analysis (CA) and pattern recognition. The key idea of KNN is that similar observations belong to similar classes. An unknown (test) observation is classified by the decision of its K nearest neighbors in the training set in N-dimensional space. The number of nearest neighbors, K, should be kept small to avoid possible misclassifications (low sensitivity). In Figure A.1, the unknown object is assigned to class 1 for K=1, but it belongs to class 2 for K=5. The choice of a reasonable value of K is therefore crucial in KNN. If K is too small, however, KNN can be unstable and sensitive to noise.

[Figure A.1 image: scatter plot in the X-Y plane showing Class 1, Class 2, Class 3, and the unknown object, with the K=1 and K=5 neighborhoods marked.]

Figure A.1: 1-NN and 5-NN classification of unknown object


In KNN, the matrix of distances between the data objects is calculated, and the relationships between clusters are visualized using a dendrogram. In other words, multidimensional distances between samples (objects) can be shown as a cluster tree, i.e. a dendrogram. A short distance indicates that two objects are similar, whereas a long distance represents dissimilarity [146]. The most widely used measure of distance between two data objects is the Minkowski distance, defined as

\[ d(i,j) = \sqrt[q]{\, |x_{i1}-x_{j1}|^{q} + |x_{i2}-x_{j2}|^{q} + \cdots + |x_{ip}-x_{jp}|^{q} \,} \tag{A.1} \]

where $i=(x_{i1}, x_{i2}, \ldots, x_{ip})$ and $j=(x_{j1}, x_{j2}, \ldots, x_{jp})$ are two p-dimensional objects, and q is a positive integer. If q is 1, it is called the Manhattan distance. If q is 2, it is the Euclidean distance. In most applications of KNN, the Euclidean distance is used. While KNN is simple in theory and easy to interpret for multi-category problems, a drawback is its sensitivity when the unknown object is situated near the center of the other classes [147]. Moreover, the distances to all data points must be calculated.
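Equation (A.1) can be illustrated with a minimal Python sketch. The function name `minkowski_distance` and the two sample objects are hypothetical and chosen only for illustration; they are not part of the thesis data.

```python
def minkowski_distance(a, b, q=2):
    """Minkowski distance of order q between two p-dimensional objects, equation (A.1)."""
    if len(a) != len(b):
        raise ValueError("objects must have the same number of variables")
    if q < 1:
        raise ValueError("q must be a positive integer")
    return sum(abs(x - y) ** q for x, y in zip(a, b)) ** (1.0 / q)


# Hypothetical two-dimensional objects chosen for illustration.
obj_i = (0.0, 0.0)
obj_j = (3.0, 4.0)

manhattan = minkowski_distance(obj_i, obj_j, q=1)  # |3| + |4|
euclidean = minkowski_distance(obj_i, obj_j, q=2)  # sqrt(3^2 + 4^2)
print(manhattan, euclidean)
```

Setting q=1 or q=2 recovers the Manhattan and Euclidean distances, respectively.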

A.2 Numerical Example

The easiest way to understand KNN is to use a numerical example. Suppose the following data are to be analyzed by KNN.

Element   1st Ionization    Atomic Radius   Melting Point   Coefficient of Thermal
          Potential (eV)    (Å)             (K)             Expansion (10^-6 K^-1)
Li        5.39              1.52            453.69          56
Mg        7.64              1.6             922             26.1
Si        8.15              1.17            1683            4.2
Sc        6.56              1.61            1814            10

Table A.1: Example of dataset for KNN


Element   1st Ionization    Atomic Radius   Melting Point   Coefficient of Thermal
          Potential (eV)    (Å)             (K)             Expansion (10^-6 K^-1)
Li        -1.2614           0.2171          -1.1876         1.3754
Mg        0.5756            0.603           -0.4601         0.0872
Si        0.9919            -1.4714         0.7221          -0.8562
Sc        -0.3062           0.6513          0.9256          -0.6064

Table A.2: Autoscaled dataset for KNN

Element   Li        Mg        Si        Sc
Li        0         2.3900    4.0688    3.0812
Mg                  0         2.6008    1.7836
Si                            0         2.5089
Sc                                      0

Table A.3: Distance matrix for KNN

Since the four variables have different ranges and scales, the first step of KNN is to normalize the given dataset. Using the autoscaling method, each variable is scaled to have zero mean and unit variance. The autoscaled dataset is shown in Table A.2. The second step of KNN is to calculate the distances between the objects to create the distance matrix (Table A.3). For example, the Euclidean distance between Li and Si is calculated as follows.

\[ d(\mathrm{Li},\mathrm{Si}) = \sqrt{\, |x_{\mathrm{Li},1}-x_{\mathrm{Si},1}|^{2} + |x_{\mathrm{Li},2}-x_{\mathrm{Si},2}|^{2} + |x_{\mathrm{Li},3}-x_{\mathrm{Si},3}|^{2} + |x_{\mathrm{Li},4}-x_{\mathrm{Si},4}|^{2} \,} = 4.0688 \tag{A.2} \]
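The two steps above (autoscaling, then the Euclidean distance matrix) can be reproduced with a short sketch that uses only the Python standard library. The data are those of Table A.1; the helper names `autoscale` and `euclidean` are illustrative, and the sample standard deviation (divisor n-1) is assumed because it reproduces the values in Table A.2.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev

# Raw data from Table A.1: ionization potential (eV), atomic radius (Å),
# melting point (K), coefficient of thermal expansion (10^-6 K^-1).
data = {
    "Li": [5.39, 1.52, 453.69, 56.0],
    "Mg": [7.64, 1.60, 922.0, 26.1],
    "Si": [8.15, 1.17, 1683.0, 4.2],
    "Sc": [6.56, 1.61, 1814.0, 10.0],
}


def autoscale(table):
    """Scale every variable (column) to zero mean and unit variance."""
    columns = list(zip(*table.values()))
    means = [mean(c) for c in columns]
    stds = [stdev(c) for c in columns]  # sample standard deviation (n - 1)
    return {
        name: [(v - m) / s for v, m, s in zip(row, means, stds)]
        for name, row in table.items()
    }


def euclidean(a, b):
    """Euclidean distance, i.e. equation (A.1) with q = 2."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


scaled = autoscale(data)
distances = {
    (a, b): euclidean(scaled[a], scaled[b]) for a, b in combinations(data, 2)
}

for pair, d in distances.items():
    print(pair, round(d, 4))
```

The printed distances can be compared entry by entry with Table A.3.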

With the calculated distance matrix, a dendrogram can be generated. From the distance matrix, the distances rank as $d_{Mg-Sc} < d_{Li-Mg} < d_{Si-Sc} < d_{Mg-Si} < d_{Li-Sc} < d_{Li-Si}$. Therefore the sequence of partitions to create the dendrogram is

• Step 1: Combine Mg and Sc at a distance of 1.7836, [(Mg, Sc), Li, Si]
• Step 2: Combine (Mg, Sc) and Li at a distance of 2.39, [(Mg, Sc, Li), Si]
• Step 3: Combine (Mg, Sc, Li) and Si at a distance of 2.5089, [(Mg, Sc, Li, Si)]

The number of clusters is set by the distance at which the dendrogram is cut. In Figure A.2, there are three clusters at distance=2. If the cut is made at distance=2.5, there are two clusters.
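The merge sequence above can be sketched in a few lines of Python. The merge distances quoted in Steps 1-3 are consistent with single-linkage (nearest-neighbor) agglomeration, so that linkage rule is assumed here; the function names are illustrative.

```python
# Pairwise distances taken from Table A.3.
pair_distance = {
    frozenset(("Li", "Mg")): 2.3900,
    frozenset(("Li", "Si")): 4.0688,
    frozenset(("Li", "Sc")): 3.0812,
    frozenset(("Mg", "Si")): 2.6008,
    frozenset(("Mg", "Sc")): 1.7836,
    frozenset(("Si", "Sc")): 2.5089,
}


def linkage_distance(c1, c2):
    """Single linkage (assumed): smallest distance between members of two clusters."""
    return min(pair_distance[frozenset((a, b))] for a in c1 for b in c2)


def single_linkage(labels):
    """Return the list of (merged cluster, merge distance) in merge order."""
    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        d, i, j = min(
            (linkage_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = clusters[i] + clusters[j]
        merges.append((merged, d))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


def n_clusters(merges, n_objects, cut):
    """Number of clusters left when the dendrogram is cut at the given distance."""
    return n_objects - sum(1 for _, d in merges if d <= cut)


merges = single_linkage(["Li", "Mg", "Si", "Sc"])
for members, d in merges:
    print(members, d)
print(n_clusters(merges, 4, 2.0), n_clusters(merges, 4, 2.5))
```

Cutting at distance 2 and at distance 2.5 returns the cluster counts discussed in the text.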

[Figure A.2 image: three panels (1st Step, 2nd Step, 3rd Step), each titled "Dendrogram of Data with Preprocessing: Autoscale". Horizontal axis: Distance to K-Nearest Neighbor (0 to 2.5). Leaves from bottom to top: Mg, Sc, Li, Si.]

Figure A.2: Steps for generating a dendrogram


APPENDIX B
PRINCIPAL COMPONENT ANALYSIS

B.1 Introduction

Since there is a vast array of variables for crystal chemistry, the appropriate use of multivariate analysis is crucial in the field of materials informatics. However, a statistical assessment and search for each descriptor in a multivariate way is a computationally formidable task. Principal Component Analysis (PCA) is a projection technique that reduces the dimensionality of a dataset consisting of a large number of interrelated variables, obtained from a combinatorial experiment or from a well organized database, in a way that minimizes the loss of information. A graphical representation of PCA is shown in Figure B.1. In this example, to probe the structure of a dataset using PCA, the point of view (the projection) is rotated while the dataset itself is fixed. To describe PCA in an effective way, the treatments from many sources [147-152] are summarized here using a unified mathematical notation.

[Figure B.1 image: a ring-shaped dataset in the (x1, x2, x3) space. One projection shows a rectangle; another projection shows a circle.]

Figure B.1: A graphical depiction of the PCA method with the assumption of a ring shaped dataset


B.2 Projection and Maximum Variance

Following Massart et al. [147], consider two variables, y1 and y2, that describe six samples, which can be plotted in two-dimensional space (Figure B.2 (a)). Since the spread of the data points can be expressed as the sum of squares of their distances to their centroid 0, equation (B.1) describes the spread in Figure B.2 (a).

\[ \text{the spread of data} = \overline{01}^{\,2} + \overline{02}^{\,2} + \cdots + \overline{06}^{\,2} \tag{B.1} \]

If 1', 2', ..., 6' on the line shown in Figure B.2 (b) are the projections of each point onto that line, then

\[ \overline{0i}^{\,2} = \overline{0i'}^{\,2} + \overline{ii'}^{\,2} \tag{B.2} \]

Therefore, the total spread of the data decomposes as

\[ \text{the spread of data} = \underbrace{\overline{01'}^{\,2} + \overline{02'}^{\,2} + \cdots + \overline{06'}^{\,2}}_{\text{to be maximized}} + \underbrace{\overline{11'}^{\,2} + \overline{22'}^{\,2} + \cdots + \overline{66'}^{\,2}}_{\text{to be minimized}} \tag{B.3} \]

Because the total spread is fixed, maximizing the first term is equivalent to minimizing the second.

[Figure B.2 image: (a) six samples, numbered 1 to 6, plotted against y1 and y2 with centroid 0; (b) the same samples with their projections 1' to 6' onto a line through 0, in the new coordinates x1 and x2.]

Figure B.2: Two dimensional data representation for six samples with two variables, y1 and y2, where 0 is the centroid of the six samples after mean centering. The reduction of dimensionality of the space from two to one is shown in (b). The line is created by the projection from the two dimensional space [147].


The line of maximum variance describes the trend of the six data points in Figure B.2 (a). More generally, PCA describes multivariate data by maximizing variances. Thus the first principal component captures the maximum variance in the original dataset, and the second principal component, which is orthogonal (uncorrelated) to the first, explains most of the remaining variance.

B.3 Covariance Matrix

Suppose $X^{T}$ is our n×p mean-centered data matrix with n observations of p variables. Mean centering is expressed as

\[ x_i \leftarrow x_i - \bar{x} \tag{B.4} \]

where $x_i$ is the i-th element of the vector x and $\bar{x}$ is the mean of its elements [153]. The mean-centered data matrix is

\[ X^{T} = \begin{bmatrix} x_{1,1} & \cdots & x_{1,p} \\ \vdots & \ddots & \vdots \\ x_{n,1} & \cdots & x_{n,p} \end{bmatrix} \tag{B.5} \]

The covariance matrix S of X is defined as

\[ S = \operatorname{cov}(X) \equiv \frac{1}{n-1} X X^{T} = \begin{bmatrix} s_{1,1} & \cdots & s_{1,p} \\ \vdots & \ddots & \vdots \\ s_{p,1} & \cdots & s_{p,p} \end{bmatrix} \tag{B.6} \]

The (i, j) element of the covariance matrix represents the covariance between the i-th and j-th elements of x when i≠j, and the variance of the j-th element of x when i=j.

B.4 Derivation of Principal Components

Having the above information, we now consider the case of a vector x of p variables. The objective is to describe the variances of the p variables and the structure of the covariances between them. In PCA, the variance of the linear function $z_1=\alpha_1^{T}x$ is maximized, where

\[ \alpha_1^{T} = [\alpha_{11}, \alpha_{12}, \ldots, \alpha_{1p}] \tag{B.7} \]

\[ z_1 = \alpha_1^{T} x = \alpha_{11} x_1 + \alpha_{12} x_2 + \cdots + \alpha_{1p} x_p = \sum_{j=1}^{p} \alpha_{1j} x_j \tag{B.8} \]

The linear function $z_2=\alpha_2^{T}x$, which is uncorrelated with $z_1=\alpha_1^{T}x$, can then be calculated to capture the remaining variance. In general, the k-th linear function, $z_k=\alpha_k^{T}x$, is calculated to have maximum variance while being uncorrelated with $\alpha_1^{T}x, \alpha_2^{T}x, \ldots, \alpha_{k-1}^{T}x$. Consider the case where the vector of random variables x has a known covariance matrix S. As explained before, for k=1, 2, …, p, the k-th principal component is given by $z_k=\alpha_k^{T}x$. The vector $\alpha_k$ is an eigenvector of the covariance matrix S corresponding to its k-th largest eigenvalue $\lambda_k$. If $\alpha_k$ is chosen to have unit length ($\alpha_k^{T}\alpha_k=1$), then $\operatorname{var}(z_k)=\lambda_k$. In the following sections, the mathematical and graphical representations of PCA are discussed in order to introduce the eigenvalue problem.

B.4.1 1st Principal Component Calculation

The vector $\alpha_1$ in $z_1=\alpha_1^{T}x$ maximizes the variance, $\operatorname{var}(\alpha_1^{T}x)=\alpha_1^{T}S\alpha_1$, subject to the constraint of unit length ($\alpha_1^{T}\alpha_1=1$). Since the variance is to be maximized under the constraint $\alpha_1^{T}\alpha_1=1$, the method of Lagrange multipliers is used. This method states that the variation of $f(x)-\lambda(g(x)-c)$ is zero at a stationary point if a differentiable function of p variables $f(x_1, \ldots, x_p)$ is subject to a constraint $g(x_1, \ldots, x_p)=c$ [152]. Mathematically, it is summarized as follows

\[ \frac{\delta}{\delta x}\left\{ f(x) - \lambda\,(g(x) - c) \right\} = 0 \tag{B.9} \]

The $\lambda$ in the above equation is called a Lagrange multiplier. For the case of principal components, the method of Lagrange multipliers is written as

\[ \max(L) = \left[ \alpha_1^{T} S \alpha_1 - \lambda\,(\alpha_1^{T}\alpha_1 - 1) \right] \tag{B.10} \]

Setting the derivative with respect to $\alpha_1$ to zero gives

\[ S\alpha_1 - \lambda\alpha_1 = 0 \quad \text{or} \quad (S - \lambda I_p)\,\alpha_1 = 0 \tag{B.11} \]

where $I_p$ is the (p×p) identity matrix. This is known as the eigenstructure problem for the covariance matrix. To avoid the trivial null solution, the determinant of $(S-\lambda I_p)$ must be zero, so that $\lambda$ is an eigenvalue of S and $\alpha_1$ is the corresponding eigenvector. Written out for the i-th eigenvalue $\lambda_i$ and its eigenvector $\alpha_i$,

\[ \begin{bmatrix} S_{1,1}-\lambda_i & S_{1,2} & \cdots & S_{1,p} \\ S_{2,1} & S_{2,2}-\lambda_i & \cdots & S_{2,p} \\ \vdots & \vdots & \ddots & \vdots \\ S_{p,1} & S_{p,2} & \cdots & S_{p,p}-\lambda_i \end{bmatrix} \alpha_i = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \tag{B.12} \]

The eigenvalue $\lambda$ therefore represents the variance, because

\[ \operatorname{var}(\alpha_1^{T} x) = \alpha_1^{T} S \alpha_1 = \alpha_1^{T} \lambda \alpha_1 = \lambda\,\alpha_1^{T}\alpha_1 = \lambda \tag{B.13} \]

Since the variance should be maximized in PCA, the eigenvalue $\lambda$ must be as large as possible. The vector $\alpha_1$ is therefore the eigenvector corresponding to the largest eigenvalue $\lambda_1$ of S.
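As a numerical check of equations (B.4), (B.6), and (B.13), the sketch below mean-centers a small dataset, forms its covariance matrix, and finds the first loading vector by power iteration. The five observations are hypothetical, and power iteration is only one simple way to reach the dominant eigenvector; it is not the algorithm used in the thesis.

```python
from math import sqrt

# Hypothetical data: 5 observations (rows) of 2 variables (columns).
rows = [[2.0, 1.0], [3.0, 3.0], [4.0, 3.0], [5.0, 6.0], [6.0, 7.0]]


def mean_center(rows):
    """Subtract the column mean from every element, equation (B.4)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Covariance matrix of mean-centered data, equation (B.6)."""
    n, p = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(S, iterations=500):
    """Power iteration: unit eigenvector of S for its largest eigenvalue."""
    p = len(S)
    alpha = [1.0] * p
    for _ in range(iterations):
        v = [sum(S[i][j] * alpha[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in v))
        alpha = [x / norm for x in v]
    # Rayleigh quotient alpha^T S alpha gives the eigenvalue for unit alpha.
    lam = sum(alpha[i] * S[i][j] * alpha[j] for i in range(p) for j in range(p))
    return lam, alpha


centered = mean_center(rows)
S = covariance(centered)
lam1, alpha1 = first_component(S)

# Scores z1 = alpha1^T x for every observation, and their sample variance.
scores = [sum(a * x for a, x in zip(alpha1, row)) for row in centered]
var_z1 = sum(z * z for z in scores) / (len(scores) - 1)
print(lam1, var_z1, alpha1)
```

The variance of the scores equals the largest eigenvalue, which is the statement of equation (B.13).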

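B.4.2 2nd Principal Component Calculation

The second principal component maximizes the variance, $\operatorname{var}(\alpha_2^{T}x)=\alpha_2^{T}S\alpha_2$, subject to the constraint that it is uncorrelated with $\alpha_1^{T}x$. Therefore

\[ \operatorname{cov}(\alpha_1^{T}x, \alpha_2^{T}x) = 0 = \alpha_1^{T} S \alpha_2 = \alpha_2^{T} S \alpha_1 = \lambda_1 \alpha_2^{T}\alpha_1 = \lambda_1 \alpha_1^{T}\alpha_2 \tag{B.14} \]

Using the method of Lagrange multipliers,

\[ \max(L) = \left[ \alpha_2^{T} S \alpha_2 - \lambda\,(\alpha_2^{T}\alpha_2 - 1) - \phi\,(\alpha_2^{T}\alpha_1 - 0) \right] \tag{B.15} \]

where $\lambda$ and $\phi$ are Lagrange multipliers. Differentiation with respect to $\alpha_2$ gives

\[ S\alpha_2 - \lambda\alpha_2 - \phi\alpha_1 = 0 \tag{B.16} \]

or, by multiplying on the left by $\alpha_1^{T}$,

\[ \alpha_1^{T} S \alpha_2 - \lambda\,\alpha_1^{T}\alpha_2 - \phi\,\alpha_1^{T}\alpha_1 = 0 \tag{B.17} \]

The first and second terms are zero from equation (B.14), and since $\alpha_1^{T}\alpha_1=1$, $\phi$ is also zero. The following relations result

\[ S\alpha_2 - \lambda\alpha_2 = 0 \quad \text{or} \quad (S - \lambda I_p)\,\alpha_2 = 0 \tag{B.18} \]

The eigenvalue, $\lambda$, of S is expressed as

\[ \lambda = \alpha_2^{T} S \alpha_2 \quad \text{or} \quad (S - \lambda I_p)\,\alpha_2 = 0 \tag{B.19} \]

where $\alpha_2$ is its eigenvector. The vector $\alpha_k$ is called the loadings for the k-th principal component.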


B.5 The Graphical Representations of Eigenvalues

As discussed above, the determination of the eigenvalues and eigenvectors is crucial for PCA. The mathematical procedure for solving an eigenvalue problem is quite simple; the real difficulty, as noted by J. C. Davis [149], is to understand the meaning of the eigenvalues and eigenvectors. Most of the following explanation of eigenvalue problems is based on the treatment of J. C. Davis [149]. The eigenvalue problem can be stated as follows.

\[ [A] \cdot [X] = \lambda [X] \tag{B.20} \]

where [A] is a matrix of coefficients, [X] is a vector of unknowns, and $\lambda$ is a constant. The object of an eigenvalue problem is to find $\lambda$. Equation (B.20) is also expressed as

\[ ([A] - \lambda [I]) \cdot [X] = [0] \tag{B.21} \]

where [I] is the identity matrix. Since null vectors are not under consideration, the determinant of ([A] - $\lambda$[I]) must be zero for a solution to exist. Suppose [A] is a (2×2) matrix with eigenvalues r1 and r2, where r1 > r2.

\[ [A] = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \tag{B.22} \]

The eigenvalues then represent the lengths (magnitudes) of the major and minor axes of an ellipse whose boundary passes through the two data points, (a, b) and (c, d). This is shown graphically in Figure B.3. If the two data points are closer together than in this example, the length of the minor axis decreases. If the two points are identical, the ellipse becomes a line. However, the ellipse becomes a circle when the vectors to the two data points are perpendicular. A few numerical examples of the changing shape of the ellipse are given in the literature [149].
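As a worked instance of equations (B.21) and (B.22), assume the matrix is built from the two points plotted in Figure B.3, (a, b) = (4, 8) and (c, d) = (8, 4). Requiring the determinant of ([A] - λ[I]) to vanish gives the characteristic equation:

```latex
\[
\det\left([A]-\lambda[I]\right)
= \begin{vmatrix} 4-\lambda & 8 \\ 8 & 4-\lambda \end{vmatrix}
= (4-\lambda)^{2} - 64 = 0
\quad\Longrightarrow\quad
4-\lambda = \pm 8
\quad\Longrightarrow\quad
\lambda_1 = r_1 = 12, \qquad \lambda_2 = r_2 = -4
\]
```

Under this assumption the magnitudes of the two eigenvalues, 12 and 4, are the axis lengths referred to above.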


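[Figure B.3 image: the two data points (a, b) = (4, 8) and (c, d) = (8, 4) lying on an ellipse; the major axis has length λ1 = r1 and the minor axis has length λ2 = r2. Axes run from 0 to 12 (horizontal) and 0 to 8 (vertical).]

Figure B.3: A graphical representation of the data points and their eigenvalues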

B.6 The Graphical Representations of Eigenvectors

Once the eigenvalues, r1 and r2, have been calculated, the corresponding eigenvectors can be calculated. Let us assign the eigenvectors as follows. For the eigenvalue $\lambda_1=r_1$, the corresponding eigenvector is

\[ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} \tag{B.23} \]

For the eigenvalue $\lambda_2=r_2$, the corresponding eigenvector is

\[ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} e_1' \\ e_2' \end{bmatrix} \tag{B.24} \]

While the eigenvector of the largest eigenvalue gives the slope of the major axis of the ellipse, the eigenvector of the second largest eigenvalue gives the slope of the minor axis. The graphical representation of the eigenvectors is shown in Figure B.4.


[Figure B.4 image: the same two data points, (a, b) = (4, 8) and (c, d) = (8, 4), and ellipse as in Figure B.3, with the axes λ1 = r1 and λ2 = r2 annotated as follows. The slope of the major axis is the ratio of the eigenvector elements for r1, i.e. slope of major axis = e2/e1. The slope of the minor axis is the ratio of the eigenvector elements for r2, i.e. slope of minor axis = e2'/e1'.]

Figure B.4: A graphical representation of the data points and their eigenvectors

The eigenvectors of a symmetric matrix that belong to distinct eigenvalues are always orthogonal (at right angles). This property is useful in PCA because the covariance matrix is symmetric by definition. In the following section, the relationship between the eigenvalue problem and PCA is described.
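The slopes and the orthogonality property can be checked with a small sketch. The matrix entries are assumed to be the two points shown in Figure B.4, (a, b) = (4, 8) and (c, d) = (8, 4); the closed-form 2×2 solution and the function names are illustrative only.

```python
# Matrix assumed from Figure B.4: rows (a, b) = (4, 8) and (c, d) = (8, 4).
a, b, c, d = 4.0, 8.0, 8.0, 4.0


def eigenvalues_2x2(a, b, c, d):
    """Roots of lambda^2 - (a + d) lambda + (ad - bc) = 0 (assumes real roots)."""
    trace, det = a + d, a * d - b * c
    root = (trace * trace / 4.0 - det) ** 0.5
    return trace / 2.0 + root, trace / 2.0 - root


def eigenvector_2x2(a, b, lam):
    """Solve (a - lam) e1 + b e2 = 0 with e1 fixed to 1 (assumes b != 0)."""
    return (1.0, (lam - a) / b)


r1, r2 = eigenvalues_2x2(a, b, c, d)
e = eigenvector_2x2(a, b, r1)        # eigenvector for r1, equation (B.23)
e_prime = eigenvector_2x2(a, b, r2)  # eigenvector for r2, equation (B.24)

slope_major = e[1] / e[0]
slope_minor = e_prime[1] / e_prime[0]
dot = e[0] * e_prime[0] + e[1] * e_prime[1]  # zero for orthogonal vectors
print(r1, r2, slope_major, slope_minor, dot)
```

Because the assumed matrix is symmetric, the dot product of the two eigenvectors vanishes, so the major and minor axes are perpendicular.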

B.7 The Eigenvalue Problem and PCA

PCA is an eigenvalue decomposition of the covariance matrix, S, of a given data matrix. The covariance matrix is calculated using equation (B.6), and the graphical representation is shown in Figure B.5.


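[Figure B.5 image: a dataset (filled circles) plotted against y1 and y2, together with its covariance matrix S = [a b; c d].]

Figure B.5: A covariance matrix, S, calculated from a given dataset (filled circles), data taken from [149]

[Figure B.6 image: the same dataset with the principal axes PC1 (length λ1) and PC2 (length λ2) drawn in the new coordinates x1 and x2, superimposed on the original coordinates y1 and y2.]

Figure B.6: Determination of two principal components (PC1 and PC2) in a new scaled coordinate, x1 and x2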


While the eigenvalues represent the length of each of the principal axes (i.e. scores), the eigenvectors of the covariance matrix represent the orientation of the principal axes of the ellipsoid (i.e. loadings).

B.8 Singular Value Decomposition

The algorithms for the calculation of principal components are mainly based on the factorization of matrices. Singular value decomposition (SVD) and eigenvalue decomposition are the main techniques for the factorization of matrices. Using SVD, the (I×J) matrix X can be expressed as

\[ X = U D P^{T} \tag{B.25} \]

where U is an (I×I) orthonormal matrix, D is an (I×J) diagonal matrix, and P is a (J×J) orthonormal matrix. The diagonal elements of the D matrix are called the singular values, which are assumed to be in decreasing order such that

\[ d_1 \geq d_2 \geq \cdots \geq d_m \geq 0 \tag{B.26} \]

where m is the smaller of I and J. If X is a matrix of rank r, then

\[ d_1 \geq d_2 \geq \cdots \geq d_r > 0 \quad \text{and} \quad d_{r+1} = \cdots = d_m = 0 \tag{B.27} \]

B.9 Eigenvalue Decomposition

For a symmetric (I×I) matrix A with an orthonormal matrix of eigenvectors P, the eigenvalue problem can be expressed as

\[ AP = P\Lambda \tag{B.28} \]

where $\Lambda$ is the eigenvalue matrix, $\Lambda=\operatorname{diag}\{\lambda_1, \ldots, \lambda_I\}$. Then matrix A is

\[ A = P \Lambda P^{T} = \sum_{i=1}^{I} \lambda_i\, p_i\, p_i^{T} \tag{B.29} \]

Here, the property $P^{T}=P^{-1}$ was used, which follows from the fact that P is orthonormal. Equation (B.29) represents the eigenvalue decomposition.


B.10 SVD, Eigenvalue Decomposition, and Principal Components

From SVD and eigenvalue decomposition, the logic of PCA can be understood. Consider the covariance matrix S from equation (B.6). In this section X denotes the (I×J) data matrix of equation (B.25), with samples in rows, i.e. the matrix written as $X^{T}$ in equation (B.5). If the 1/(n-1) term in equation (B.6) is ignored,

\[ S = X^{T} X \tag{B.30} \]

Using equation (B.25), equation (B.30) can be expressed as

\[ S = X^{T} X = P D^{T} U^{T} U D P^{T} \tag{B.31} \]

Since U is orthonormal, $U^{T}U=I$ and $D^{T}U^{T}UD=D^{T}D=\Lambda$. Therefore

\[ S = X^{T} X = P \Lambda P^{T} \tag{B.32} \]

If T=UD, equation (B.25) becomes $X=TP^{T}$, and

\[ T^{T} T = D^{T} U^{T} U D = D^{T} D = \Lambda \tag{B.33} \]

Since $X=TP^{T}$ and P is orthonormal,

\[ S = X^{T} X = P\, T^{T} T\, P^{T} = P \Lambda P^{T} \tag{B.34} \]

\[ T = XP \tag{B.35} \]

Equation (B.34) corresponds exactly to equation (B.13); the columns of T are known as scores and those of P are called loadings.
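The relations (B.30), (B.33), and (B.35) can be verified numerically. The sketch below is restricted to two variables so that the eigenvectors of $S=X^{T}X$ can be written in closed form; this closed form stands in for a general SVD routine and is an assumption of the example, as are the hypothetical data and the helper names.

```python
from math import hypot, sqrt

# Hypothetical mean-centered data matrix X: 5 samples (rows) x 2 variables (columns).
X = [[-2.0, -3.0], [-1.0, -1.0], [0.0, -1.0], [1.0, 2.0], [2.0, 3.0]]


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def symmetric_eig_2x2(S):
    """Eigenvalues (descending) and orthonormal eigenvectors (columns of P)
    of a symmetric 2x2 matrix; assumes a nonzero off-diagonal element."""
    (a, b), (_, d) = S
    mid = (a + d) / 2.0
    root = sqrt(((a - d) / 2.0) ** 2 + b * b)
    lams = [mid + root, mid - root]
    vecs = []
    for lam in lams:
        v = (b, lam - a)  # satisfies (a - lam) v1 + b v2 = 0
        n = hypot(*v)
        vecs.append((v[0] / n, v[1] / n))
    P = [[vecs[0][0], vecs[1][0]], [vecs[0][1], vecs[1][1]]]
    return lams, P


S = matmul(transpose(X), X)        # equation (B.30): S = X^T X
lams, P = symmetric_eig_2x2(S)     # loadings are the columns of P
T = matmul(X, P)                   # equation (B.35): scores T = X P
TtT = matmul(transpose(T), T)      # equation (B.33): should be diag(lambda)
X_back = matmul(T, transpose(P))   # X = T P^T recovers the data
print(lams)
print(TtT)
```

The off-diagonal elements of `TtT` vanish to rounding error, which shows that the score vectors are mutually orthogonal.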

B.11 Optimal Number of Principal Components

As discussed above, PCA is a method to decompose the data matrix X as

\[ X = t_1 p_1^{T} + t_2 p_2^{T} + \cdots + t_F p_F^{T} + E_F = T_F P_F^{T} + E_F \tag{B.36} \]

where $T_F P_F^{T}$ is the information in X and $E_F$ is the noise. The optimum number F should be found such that $E_F$ is small. Since the eigenvalues $\lambda$ ($\lambda_1, \ldots, \lambda_p$) represent variances from equation (B.13), the percent variation explained by each principal component can be calculated from the respective eigenvalue. Thus, for the first principal component,

\[ \frac{\lambda_1}{\lambda_1 + \cdots + \lambda_p} \times 100 \tag{B.37} \]

Therefore the percent variation explained by the first f components is

\[ \frac{\lambda_1 + \cdots + \lambda_f}{\lambda_1 + \cdots + \lambda_p} \times 100 \tag{B.38} \]

Generally, F should be chosen to explain at least about 80-90% of the variation of the data in PCA. Kaiser's rule-of-thumb, which retains only the components whose eigenvalues are larger than unity, can also be used to choose the optimal number of principal components [151].
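Equations (B.37) and (B.38) and the two selection rules can be sketched as follows. The eigenvalues are hypothetical (they are chosen to resemble those of a correlation matrix of four autoscaled variables), and the function names and the default 80% threshold are illustrative.

```python
def explained_variance(eigenvalues):
    """Percent variation per component, eq. (B.37), and cumulative, eq. (B.38)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    percent = [100.0 * lam / total for lam in ordered]
    cumulative = []
    running = 0.0
    for p in percent:
        running += p
        cumulative.append(running)
    return percent, cumulative


def components_for_threshold(eigenvalues, threshold=80.0):
    """Smallest F whose cumulative explained variation reaches the threshold."""
    _, cumulative = explained_variance(eigenvalues)
    for f, c in enumerate(cumulative, start=1):
        if c >= threshold:
            return f
    return len(cumulative)


def kaiser_count(eigenvalues):
    """Kaiser's rule-of-thumb: count the eigenvalues larger than unity."""
    return sum(1 for lam in eigenvalues if lam > 1.0)


# Hypothetical eigenvalues, for illustration only.
eigenvalues = [2.4, 1.2, 0.3, 0.1]
percent, cumulative = explained_variance(eigenvalues)
print(percent, cumulative)
print(components_for_threshold(eigenvalues), kaiser_count(eigenvalues))
```

For these values both criteria select the same number of components, although in general they need not agree.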


APPENDIX C
PARTIAL LEAST SQUARES

C.1 Introduction

In Appendix B, PCA was discussed as a classification tool that reduces the dimensionality of a multivariate space. In this thesis, partial least squares (PLS) is used to predict bulk modulus and phase stability by handling a multivariate dataset with collinearities. The sequences and mathematical expressions taken from the literature [115, 117, 153-156] are integrated and summarized here to describe the logic of PLS.

C.2 Ordinary Least Squares

Since ordinary least squares (OLS) is a well known technique, it is described prior to PLS. OLS is also known as multiple linear regression. Suppose y is an n×1 observation vector, X is an n×p matrix, β is a p×1 vector of parameters, and ε is an n×1 vector of errors. The general regression model can be written as

\[ \underbrace{y}_{n\times 1} = \underbrace{X}_{n\times p}\;\underbrace{\beta}_{p\times 1} + \underbrace{\varepsilon}_{n\times 1} \tag{C.1} \]

Generally in the field of QSAR (Quantitative Structure-Activity Relationships), X is a set of descriptors for chemical structures and the y variable is a measure of activity [154]. In this thesis, y is an objective function such as bulk modulus or phase stability, while X consists of crystallographic, thermodynamic, and quantum mechanical parameters. For multiresponse data, equation (C.1) is generalized as

\[ \underbrace{Y}_{n\times q} = \underbrace{X}_{n\times p}\;\underbrace{B}_{p\times q} + \underbrace{E}_{n\times q} \tag{C.2} \]

In handling equation (C.1), the following cases need to be considered [155].

• p>n: There is an infinite number of solutions for β. This case is not considered in the general problem.
• p=n: It is possible to get a unique solution for β, and the error, ε, is zero.
• p<n: An exact solution for β cannot be calculated. A linear least squares method, which minimizes the ε term, is used for this case.

Using the approach of Phatak and de Jong [153], the procedure to get the least squares solution is as follows. Equation (C.1) can be modified as

\[ X^{T} y = X^{T} X \beta \tag{C.3} \]

If X has full column rank, $X^{T}X$ is nonsingular. Therefore the least squares solution is expressed as

\[ \beta_{OLS} = (X^{T} X)^{-1} X^{T} y = X^{+} y \tag{C.4} \]

where $X^{+}$ is called the pseudoinverse of X. From equation (C.4), the prediction of y is

\[ \hat{y}_{OLS} = X \beta_{OLS} = X X^{+} y \tag{C.5} \]

The matrix $XX^{+}$ is symmetric and idempotent (i.e. $(XX^{+})(XX^{+})=(XX^{+})$). If there is collinearity in X, then $(X^{T}X)^{-1}$ in equation (C.4) does not exist. Therefore, OLS is not useful in the case of collinearity.
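Equations (C.3) to (C.5), and the failure of OLS under collinearity, can be illustrated with a minimal sketch. The data are hypothetical, only two predictors are used so that $(X^{T}X)^{-1}$ can be written explicitly, and the helper names are illustrative.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse_2x2(M):
    """Explicit inverse of a 2x2 matrix; fails when the matrix is singular."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("X^T X is singular: the columns of X are collinear")
    return [[d / det, -b / det], [-c / det, a / det]]


def ols(X, y):
    """Least squares solution beta = (X^T X)^-1 X^T y, equation (C.4)."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = matmul(Xt, [[v] for v in y])
    return [row[0] for row in matmul(inverse_2x2(XtX), Xty)]


# Hypothetical data: n = 4 observations, p = 2 predictors (the case p < n).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [1.0, 2.0, 3.0, 5.0]

beta = ols(X, y)
y_hat = [sum(x * b for x, b in zip(row, beta)) for row in X]  # equation (C.5)
residual = [yi - yh for yi, yh in zip(y, y_hat)]

# Collinear predictors (second column = 2 x first column): the inverse does not exist.
try:
    ols([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]], [1.0, 2.0, 3.0])
    collinear_failed = False
except ValueError:
    collinear_failed = True

print(beta, y_hat, collinear_failed)
```

The residual vector is orthogonal to both columns of X, which is the content of equation (C.3).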

C.3 Principal Components Regression

As discussed in Appendix B, the principal components capture all aspects of the variables. In principal components regression (PCR), the scores calculated from the PCA become the data matrix. PCR is therefore a linear fitting method in reduced dimensions. If we choose the first m principal components, the prediction of y is

\[ \hat{y}^{\,m}_{PCR} = X (X^{T} X)^{-1} X^{T} y = T_m (T_m^{T} T_m)^{-1} T_m^{T} y \tag{C.6} \]

where the data matrix X in the OLS projection is replaced by $T_m$, the scores for the first m principal components, and $T_m(T_m^{T}T_m)^{-1}T_m^{T}$ is an orthogonal projection matrix. Since $T_m^{T}T_m=\Lambda_m$ and $T_m=XP_m$ from equations (B.33) and (B.35), equation (C.6) is also expressed as

\[ \hat{y}^{\,m}_{PCR} = X P_m \Lambda_m^{-1} P_m^{T} X^{T} y \tag{C.7} \]

The least squares solution is

\[ \beta^{\,m}_{PCR} = P_m \Lambda_m^{-1} P_m^{T} X^{T} y \tag{C.8} \]

The term $(T_m^{T}T_m)^{-1}$ in equation (C.6) exists because the scores are mutually orthogonal. For this reason, PCR is a useful tool for handling collinearity. If all the principal components are used, the result is the same as that of OLS.


C.4 Partial Least Squares

Partial least squares (PLS) and PCR both find directions of maximum variance in the predictor variables (X). While OLS finds a factor that correlates the predictor variables with the predicted variables (Y), PCR does not consider the response variables. PLS finds factors that both capture maximum variance and correlate X with Y. In PLS, two linear combinations are generated from X and Y respectively, and the maximum covariance between them is calculated. Consider an X matrix of size N×K and an N×M matrix Y (Figure C.1).

[Figure C.1 image: block diagram of the decompositions X (N×K) = T (N×A) P^T (A×K) + E (N×K) and Y (N×M) = U (N×A) C^T (A×M) + G (N×M).]

Figure C.1: The matrix description of PLS method [155]

The following descriptions are mainly based on the treatment of Wold et al. [154]. The scores of X, $t_a$ (a=1, 2, …, A, where A is the number of PLS components), are calculated as linear combinations of the original variables with the weights $w^{*}_{ka}$. The mathematical expression is

\[ t_{ia} = \sum_{k} w^{*}_{ka}\, x_{ik} \quad \text{or} \quad T = XW^{*} \tag{C.9} \]

where k=(1, …, K) and K is the number of X variables. The predictor variables, X, are also expressed in the same way as described by equation (B.36).

\[ x_{ik} = t_{i1} p_{1k} + t_{i2} p_{2k} + \cdots + t_{iA} p_{Ak} + e_{ik} = \sum_{a} t_{ia}\, p_{ak} + e_{ik} \quad \text{or} \quad X = TP^{T} + E \tag{C.10} \]

where $e_{ik}$ are the X residuals. Similarly, for the predicted variables Y, if the scores of Y are $u_a$ and the weights are $c_{am}$,

\[ y_{im} = \sum_{a} u_{ia}\, c_{am} + g_{im} \quad \text{or} \quad Y = UC^{T} + G \tag{C.11} \]

Since the X scores are good predictors of Y in PLS,

\[ y_{im} = \sum_{a} t_{ia}\, c_{am} + f_{im} \quad \text{or} \quad Y = TC^{T} + F \tag{C.12} \]

where F represents the error between the observed values and the predicted response. Using equation (C.9), equation (C.12) is also expressed as

\[ y_{im} = \sum_{a} c_{am} \sum_{k} w^{*}_{ka}\, x_{ik} + f_{im} = \sum_{k} b_{mk}\, x_{ik} + f_{im} \quad \text{or} \quad Y = XW^{*}C^{T} + F = XB + F \tag{C.13} \]

From equation (C.13), the PLS regression coefficients $b_{mk}$ are written as

\[ b_{mk} = \sum_{a} c_{am}\, w^{*}_{ka} \quad \text{or} \quad B = W^{*}C^{T} \tag{C.14} \]

Geometrically, all the above parameters are shown in Figure C.2. As discussed before, the multidimensional space of X is reduced to an A-dimensional hyperplane. Since the scores are good predictors of Y, the correlation with Y is formed on this hyperplane. As in PCA, the loadings of X (P) represent the orientation of each of the components of the hyperplane. According to the approach of Phatak and de Jong [153], after n dimensions have been extracted the following equations are available.

\[ T_n = XW^{*}_n, \quad P_n = X^{T} T_n (T_n^{T} T_n)^{-1}, \quad W^{*}_n = W_n (P_n^{T} W_n)^{-1} \tag{C.15} \]

The prediction of y then has the general form given by equation (C.6)

\[ \hat{y}^{\,n}_{PLS} = T_n (T_n^{T} T_n)^{-1} T_n^{T} y \tag{C.16} \]

From equations (C.15) and (C.5), equation (C.16) is written as

\[ \hat{y}^{\,n}_{PLS} = X \beta^{\,n}_{PLS} = X W^{*}_n (W_n^{T} X^{T} X W^{*}_n)^{-1} W_n^{T} X^{T} X \beta_{OLS} \tag{C.17} \]

[Figure C.2 image: data points projected onto an A-dimensional hyperplane whose directions are defined by the loadings, p. The projections give the scores t1 (1st component) and t2 (2nd component). A direction in the plane defines the best correlation with Y (c1t1+c2t2+…): Eqn. (C.12).]

Figure C.2: Geometrical representation of PLS method [154]

C.5 Algorithms for Partial Least Squares

C.5.1 Nonlinear Iterative Partial Least Squares

All of the above equations are implemented in the algorithms. Since Nonlinear Iterative Partial Least Squares (NIPALS) is the most common algorithm for PLS, it is explained here following the treatment of Wold et al. [154] and Geladi et al. [155]. After the data are scaled and centered, the steps of the NIPALS algorithm are as follows.

(1) Take $u_{start}$ = some $y_j$ (for a single y, u = y). Usually the column of Y with the greatest variance is chosen.

In the X block:
(2) Calculate the weights, $w = X^{T}u / (u^{T}u)$
(3) Normalize, $w_{new} = w_{old} / \lVert w_{old} \rVert$
(4) Calculate the X scores, $t = Xw / (w^{T}w)$

In the Y block:
(5) Calculate the weights, $c = Y^{T}t / (t^{T}t)$
(6) Normalize, $c_{new} = c_{old} / \lVert c_{old} \rVert$
(7) Calculate the Y scores, $u = Yc / (c^{T}c)$
(8) Compare t in step (4) with the one from the preceding iteration. If they are equal, go to step (9); otherwise go to step (2) and use the u calculated in step (7). If Y has only one variable, the procedure converges in a single iteration and proceeds to step (9).
(9) Calculate the X loadings, $p = X^{T}t / (t^{T}t)$
(10) Rescale the loadings, scores, and weights, $p_{new} = p_{old} / \lVert p_{old} \rVert$, $t_{new} = t_{old} \lVert p_{old} \rVert$, $w_{new} = w_{old} \lVert p_{old} \rVert$
(11) Find the regression coefficient b, $b = u^{T}t / (t^{T}t)$
(12) Remove the present component from X and Y and use the deflated matrices as the new X and Y for the next component, $X = X - tp^{T}$, $Y = Y - b\,tc^{T}$
(13) Continue with the next component (step (1))
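The steps above can be sketched for the single-response case (one y variable), where step (8) is not needed because the procedure converges in one pass. This is a minimal illustration under stated assumptions, not the PLS Toolbox implementation used in the thesis: the data are hypothetical and already mean-centered, the function names are illustrative, and the response is deflated with $b\,t\,c$, which is the form consistent with the normalized c of step (6).

```python
from math import sqrt


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def xt_times(X, v):
    """X^T v for a row-major matrix X."""
    return [dot(col, v) for col in zip(*X)]


def x_times(X, w):
    """X w for a row-major matrix X."""
    return [dot(row, w) for row in X]


def nipals_pls1(X, y, n_components):
    """NIPALS for one centered response; assumes y is never orthogonal to t."""
    X = [row[:] for row in X]
    y = y[:]
    fitted = [0.0] * len(y)
    components = []
    for _ in range(n_components):
        u = y[:]                                      # step (1): u = y
        w = [v / dot(u, u) for v in xt_times(X, u)]   # step (2)
        norm_w = sqrt(dot(w, w))
        w = [v / norm_w for v in w]                   # step (3)
        t = [v / dot(w, w) for v in x_times(X, w)]    # step (4)
        c = dot(y, t) / dot(t, t)                     # step (5): scalar for one y
        c = c / abs(c)                                # step (6)
        u = [v * c / (c * c) for v in y]              # step (7)
        p = [v / dot(t, t) for v in xt_times(X, t)]   # step (9)
        norm_p = sqrt(dot(p, p))                      # step (10)
        p = [v / norm_p for v in p]
        t = [v * norm_p for v in t]
        w = [v * norm_p for v in w]
        b = dot(u, t) / dot(t, t)                     # step (11)
        # step (12): deflate X and y before the next component
        X = [[x - ti * pj for x, pj in zip(row, p)] for row, ti in zip(X, t)]
        y = [yi - b * ti * c for yi, ti in zip(y, t)]
        fitted = [f + b * ti * c for f, ti in zip(fitted, t)]
        components.append({"t": t, "p": p, "w": w, "c": c, "b": b})
    return components, fitted


# Hypothetical mean-centered data: 5 samples, 2 predictors, 1 response.
X = [[-2.0, -3.0], [-1.0, -1.0], [0.0, -1.0], [1.0, 2.0], [2.0, 3.0]]
y = [-4.0, -1.0, -1.0, 2.0, 4.0]

components, fitted = nipals_pls1(X, y, n_components=2)
print(fitted)
print(dot(components[0]["t"], components[1]["t"]))
```

With as many components as X has independent columns (two here), the fitted values coincide with the OLS fit, and the two score vectors are orthogonal; with fewer components the fit is a regularized approximation.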

C.5.2 SIMPLS

SIMPLS, as described by de Jong [156], is another common PLS algorithm. It produces the same results as NIPALS for univariate Y but slightly different results for multivariate Y [156]. SIMPLS directly calculates the PLS factors as linear combinations of the original variables by maximizing a covariance criterion with orthogonality and normalization restrictions. In addition, there are no deflated data matrices as in the NIPALS algorithm [156]. Computationally, the SIMPLS algorithm is faster than NIPALS [117, 156]. Detailed descriptions of this algorithm can be found in the literature [153, 156].


APPENDIX D QUANTUM MECHANICAL DESCRIPTORS

D.1 Bulk Modulus

The bulk modulus data in Table 3.1 were obtained from ab initio calculations [1] based on total energy calculations. According to Yin and Cohen [157], the total energy is given by

\[ E_{tot} = E_{kin} + E'_{ec} + E'_{H} + E_{xc}[\rho] + E'_{cc} \tag{D.1} \]

where $E_{kin}$ is the electronic kinetic energy, $E'_{ec}$ is the electron-core interaction energy, $E'_{H}$ is the electron-electron Coulomb energy, $E_{xc}[\rho]$ is the electronic exchange and correlation energy, and $E'_{cc}$ is the core-core Coulomb energy. As described by Cohen [158], the core-core repulsion is calculated as the Coulomb electrostatic energy between fixed cores using Madelung sums. The electron-electron contribution is evaluated from the electron density as a function of position in the crystal. The electron-core contribution has two parts: the attractive Coulomb interaction and a repulsive part arising from the Pauli exclusion principle. The combined potential, attraction plus repulsion, is called the pseudopotential [67]. For structural properties, we can assume that each valence electron moves in the average potential of the other valence electrons. Moreover, this potential depends only on the electron density at that position. This is called the "local density approximation" [67]. The pseudopotential and local density approximations are the main tools used to calculate structural properties in modern computational approaches.

For perfect crystals, the bulk modulus is a measure of hardness that depends heavily on microscopic properties [67]. Accurate calculations of the bulk modulus are therefore crucial both for understanding the microscopic aspects of the crystal and for high-pressure studies. The bulk modulus ($B$) of a material at zero temperature is defined as

\[ B = -V \frac{\partial p}{\partial V} = V \frac{\partial^2 E}{\partial V^2} \tag{D.2} \]

where $V$, $p$, and $E$ are the volume, pressure, and total energy, respectively. This definition results in reasonable values of $B$ for inert-gas solids and alkali-halide crystals [44]. For metals, a free electron gas model is generally used to determine the value of $B$:

\[ B = \frac{2}{3} n E_F = \left[ \frac{6.13}{r_s} \right]^5 \text{GPa} \tag{D.3} \]

where $n$ is the electron concentration, $r_s$ is the electron gas parameter (expressed in Bohr radii), and $E_F$ is the Fermi energy. Although the effects of exchange, correlation, and the ionic potential are ignored in equation (D.3), it estimates $B$ reasonably well [44]. From the calculated total energy, the bulk modulus becomes

\[ B = V \frac{\partial^2 E}{\partial V^2} = B_0 + B_0' P \tag{D.4} \]

where $E$, $B_0$, and $B_0'$ are the total energy, the equilibrium bulk modulus, and the pressure derivative of the bulk modulus, respectively [159]. If an equation of state is used to connect the bulk modulus and the total energy, the total energy can be expressed, for example with Murnaghan's equation of state, as

\[ E_{tot}(V) = \frac{B_0 V}{B_0'} \left[ \frac{(V_0/V)^{B_0'}}{B_0' - 1} + 1 \right] + \text{const.} \tag{D.5} \]

where $B_0$ and $B_0'$ are the bulk modulus and its pressure derivative at the equilibrium volume $V_0$. As discussed above, the pseudopotential ab initio method based on density functional theory can optimize crystal geometries as well as perform molecular dynamics simulations at constant temperature and pressure. In the first stage of the calculation, the pseudopotentials for the chemical elements forming the compounds are optimized. Next, a series of simulations is performed at zero kelvin and zero pressure to minimize the energy of the system with respect to (1) the electronic configuration, (2) the atomic positions, and (3) the lattice parameters. Finally, a similar procedure is applied at high pressure and/or high temperature. The results of these calculations are then compared with experimental studies, which provides an opportunity to validate the theoretical predictions. Because calculating $B$ is complex, many semiempirical methods have been proposed that use parameters such as the effective valences of the cation and anion [52, 63], the sound velocity [48], and the nearest-neighbor distance [44, 47].
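As a numerical illustration of equations (D.2) to (D.5), the sketch below evaluates the free electron estimate of (D.3) and checks that a Murnaghan energy curve (D.5), differentiated numerically as in (D.2), reproduces the linear relation (D.4). The function names, the finite-difference step, and the parameter values ($V_0$ = 20 Å³, $B_0$ = 0.6 eV/Å³, $B_0'$ = 4) are hypothetical choices made for this example and are not taken from Table 3.1; the integration constant in (D.5) is fixed here so that $E(V_0) = E_0$.

```python
import math

def free_electron_bulk_modulus_gpa(rs_bohr):
    """Right-hand side of (D.3): (6.13 / r_s)^5 in GPa, with r_s in Bohr radii."""
    return (6.13 / rs_bohr) ** 5

def two_thirds_n_ef_gpa(rs_bohr):
    """Middle expression of (D.3), (2/3) n E_F, evaluated from SI constants."""
    a0 = 5.29177e-11      # Bohr radius, m
    hbar = 1.054572e-34   # reduced Planck constant, J s
    m_e = 9.109384e-31    # electron mass, kg
    rs = rs_bohr * a0
    n = 3.0 / (4.0 * math.pi * rs ** 3)            # one electron per sphere of radius r_s
    k_f = (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0)  # Fermi wave vector
    e_f = (hbar * k_f) ** 2 / (2.0 * m_e)          # Fermi energy, J
    return (2.0 / 3.0) * n * e_f / 1e9             # Pa -> GPa

def murnaghan_energy(V, V0, B0, B0p, E0=0.0):
    """Equation (D.5); the constant is chosen so that E(V0) = E0 (an assumption)."""
    const = E0 - B0 * V0 / (B0p - 1.0)
    return (B0 * V / B0p) * ((V0 / V) ** B0p / (B0p - 1.0) + 1.0) + const

def pressure(energy, V, h=1e-3):
    """P = -dE/dV by a central finite difference."""
    return -(energy(V + h) - energy(V - h)) / (2.0 * h)

def bulk_modulus(energy, V, h=1e-3):
    """Equation (D.2): B = V d2E/dV2 by a central second difference."""
    return V * (energy(V + h) - 2.0 * energy(V) + energy(V - h)) / h ** 2

# Hypothetical parameters: volumes in cubic angstroms, moduli in eV per cubic angstrom.
V0, B0, B0p = 20.0, 0.6, 4.0

def energy(V):
    return murnaghan_energy(V, V0, B0, B0p)

B_equilibrium = bulk_modulus(energy, V0)            # should be close to B0
B_compressed = bulk_modulus(energy, 18.0)           # left-hand side of (D.4)
B_linear = B0 + B0p * pressure(energy, 18.0)        # right-hand side of (D.4)
ratio_d3 = free_electron_bulk_modulus_gpa(2.67) / two_thirds_n_ef_gpa(2.67)
```

The two sides of (D.4) agree to within the finite-difference error, and the two forms of (D.3) agree to within the rounding of the prefactor 6.13.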

D.2 Effective Charges

Since bonding in crystal structures arises from the distribution of electron density, which can be represented by charge values for individual atoms, the concept of atomic charge is indispensable for describing the nature of chemical interactions [160]. Although there are various ways to assign charge distributions to each constituent atom, Mulliken population analysis is the most popular method. Mulliken's approach calculates the charge distribution in a compound by decomposing the electronic density into atomic contributions. It is based on the linear combination of atomic orbitals (LCAO), with basis orbitals $\chi_i$, which is a common tool for representing wave functions. Furthermore, the LCAO approach assigns the electron density described by the orbital products $\chi_i \chi_j$ to each atom [160]. The effective charge on each atom is determined from the atomic positions (crystal potential) and the concomitant interactions between the various orbitals involved [161]. Mulliken's effective charge is defined as

\[ Q_a^* = Z_a - \left( \sum_{i \in a} P_{ii} + \sum_{i \in a} \sum_{j \neq i} S_{ij} P_{ij} \right) \tag{D.6} \]

where $Q_a^*$ is the charge assigned to atom $a$, $Z_a$ is the atomic number of $a$, $S_{ij}$ is the overlap matrix, and $P_{ij}$ is the bond order matrix [160, 162]. The first summation runs over the basis functions centered on atom $a$, and the second summation describes the overlap contributions involving basis functions centered on other atoms [162].
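Equation (D.6) can be illustrated with a small sketch for a hypothetical two-orbital system, one basis function on each of two atoms labeled A and B, with a single doubly occupied molecular orbital. The overlap value of 0.5, the orbital coefficients, and the function names are assumptions chosen for this example only; the basis functions are assumed to be normalized so that $S_{ii} = 1$.

```python
import math

def density_matrix(coeffs, S, occupation=2.0):
    """P_ij = occupation * c_i * c_j for one occupied orbital, normalized under overlap S."""
    n = len(coeffs)
    norm2 = sum(coeffs[i] * S[i][j] * coeffs[j] for i in range(n) for j in range(n))
    c = [x / math.sqrt(norm2) for x in coeffs]
    return [[occupation * c[i] * c[j] for j in range(n)] for i in range(n)]

def mulliken_charges(P, S, atom_of_orbital, Z):
    """Equation (D.6): Q_a = Z_a - (sum_i P_ii + sum_i sum_{j != i} S_ij P_ij), i on atom a."""
    population = {atom: 0.0 for atom in Z}
    n = len(P)
    for i in range(n):
        atom = atom_of_orbital[i]
        population[atom] += P[i][i]                                   # first summation
        population[atom] += sum(S[i][j] * P[i][j] for j in range(n) if j != i)  # overlap term
    return {atom: Z[atom] - population[atom] for atom in Z}

# Hypothetical two-center system with one basis function per atom.
S = [[1.0, 0.5], [0.5, 1.0]]
atom_of_orbital = ["A", "B"]
Z = {"A": 1.0, "B": 1.0}

# Symmetric bonding orbital: the two electrons are shared equally.
q_symmetric = mulliken_charges(density_matrix([1.0, 1.0], S), S, atom_of_orbital, Z)

# Polarized orbital weighted toward atom A: A becomes negative, B positive.
q_polarized = mulliken_charges(density_matrix([0.8, 0.4], S), S, atom_of_orbital, Z)
```

In both cases the charges sum to the total charge of the system (zero here), because each overlap term $S_{ij} P_{ij}$ is counted once for each of the two atoms involved.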


The calculated σ- and π-electron densities on a particular atom represent the possible orientations of the chemical interactions, and the net charges on atoms describe non-directional reactions [110]. By investigating the Mulliken effective charges, the bonding character can also be identified from the overlap charge between atoms (a positive overlap population indicates a bonding state, a negative value an antibonding state) [163]. Detailed descriptions of various atomic charges and their related descriptors can be found in the literature [95, 110, 118, 160, 163, 164].

