
High Dimensional Statistical Learning


Introduction to statistical learning and to model selection

  • Predictive quality of a statistical model, notion of risk, optimal rules, risk estimation by penalisation or by simulation
  • Model selection and variable selection in linear models: AIC and BIC criteria, Ridge and Lasso methods
  • Projection and regularisation methods: splines, wavelet bases and thresholding, Reproducing Kernel Hilbert Spaces
  • Support Vector Machines for classification
  • Kernel estimators for density estimation or regression
  • Introduction to deep learning
  • Anomaly detection in functional data



  • Lectures: 20 h
  • Practical work applying the methods to real data sets with R or Python libraries (scikit-learn): 18 h


At the end of this module, the student will have understood and be able to explain the following main concepts:

  • Fitting statistical models in regression or classification in high dimension with various approaches
  • Estimation of the prediction error
  • Optimal model selection for prediction
  • Application of statistical learning methods on real data sets


 The student will be able to:

  • Fit and select a statistical model in high dimension for prediction purposes
  • Implement statistical learning methods in high dimension on real data sets with R or Python libraries

Prerequisites

Probability and statistics
Elements of statistical modelling [I4MMMS71]

Form of assessment

Learning outcomes are assessed continuously throughout the semester. Depending on the course, the assessment takes different forms: a written exam, an oral exam, a record, a written report, peer review...