CONTENTS

6.4 Time for doers: popular aggregation meta-algorithms . . . 150
6.4.1 Bagging . . . 151
6.4.2 Boosting . . . 152
6.4.3 Forests for bipartite ranking and scoring . . . 154
6.5 Time for thinkers: Theory of aggregated rules . . . 157
6.5.1 Aggregation of classification rules . . . 157
6.5.2 Consistency of Forests . . . 158
6.5.3 From bipartite consistency to K-partite consistency . . . 160

7 MIXTURE MODELS, Christophe Biernacki . . . 165
7.1 Mixture models as a many-purpose tool . . . 165
7.1.1 Starting from applications . . . 165
7.1.2 The mixture model answer . . . 168
7.1.3 Classical mixture models . . . 170
7.1.4 Other models . . . 175
7.2 Estimation . . . 175
7.2.1 Overview . . . 175
7.2.2 Maximum likelihood and variants . . . 176
7.2.3 Theoretical difficulties related to the likelihood . . . 179
7.2.4 Estimation algorithms . . . 180
7.3 Model selection in density estimation . . . 186
7.3.1 Need to select a model . . . 186
7.3.2 Frequentist approach and deviance . . . 189
7.3.3 Bayesian approach and integrated likelihood . . . 194
7.4 Model selection in (semi-)supervised classification . . . 200
7.4.1 Need to select a model . . . 200
7.4.2 Error rates-based criteria . . . 203
7.4.3 A predictive deviance criterion . . . 205
7.5 Model selection in clustering . . . 208
7.5.1 Need to select a model . . . 208
7.5.2 Partition-based criteria . . . 209
7.5.3 The Integrated Completed Likelihood criterion . . . 211
7.6 Experiments on real data sets . . . 217
7.6.1 BIC: extra-solar planets . . . 218
7.6.2 AIC_cond/BIC/AIC/BEC/ê_cv: benchmark data sets . . . 219

Model Choice and Model Aggregation, F. Bertrand - Editions Technip

For over forty years, choosing a statistical model from data consisted in optimizing a criterion based on penalized likelihood (H. Aka...
