Machine Learning
Presentation
- Introduction to machine learning
- Optimizing the bias-variance trade-off
- PLS regression, parametric linear and quadratic discriminant analysis, k-nearest neighbors
- Neural networks, multilayer perceptron, introduction to deep learning
- Classification and regression trees
- Bagging, random forests, gradient boosting
- Missing data imputation
- Outlier detection and one-class classification
- Scientific ethics and statistical decision-making
Objectives
By the end of this module, the student will understand and be able to explain the following main concepts:
- Properties and limitations of the main machine learning methods.
- The bias-variance trade-off.
- Risk estimation algorithms: bootstrap, cross-validation.
- Optimization and algorithmic implementation of the main methods in Python (scikit-learn).
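As an illustration of the risk estimation item above, here is a minimal sketch of how the misclassification risk of a classifier might be estimated with 5-fold cross-validation and with an out-of-bag bootstrap. The synthetic dataset, the choice of a k-nearest neighbors model, and the numbers of folds and bootstrap replicates are assumptions made for the example, not part of the module description.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data (assumption: stands in for a real course dataset).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
model = KNeighborsClassifier(n_neighbors=5)

# 5-fold cross-validation: average the error over the held-out folds.
cv_scores = cross_val_score(model, X, y, cv=5)
cv_error = 1.0 - cv_scores.mean()

# Out-of-bag bootstrap: fit on a resample drawn with replacement,
# then evaluate on the observations that were not drawn.
rng = np.random.RandomState(0)
n = len(y)
boot_errors = []
for _ in range(50):
    idx = rng.randint(0, n, size=n)          # bootstrap sample indices
    oob = np.setdiff1d(np.arange(n), idx)    # out-of-bag indices
    model.fit(X[idx], y[idx])
    boot_errors.append(1.0 - model.score(X[oob], y[oob]))
boot_error = float(np.mean(boot_errors))

print(f"cross-validation error estimate: {cv_error:.3f}")
print(f"out-of-bag bootstrap error estimate: {boot_error:.3f}")
```

Both estimators approximate the same quantity, the expected error on unseen data; they differ in how the data are split and therefore in their bias and variance.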
The student will be able to:
- Analyse big data from different domains (insurance, marketing, industry) using Python libraries.
- Apply the main machine learning methods and algorithms (PLS, discriminant analysis, k-NN, classification and regression trees, neural networks, boosting, random forests, SVM...).
- Optimize hyper-parameter values and construct Python pipelines for automation.
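The last skill above can be sketched as follows: a scikit-learn pipeline that chains scaling and a k-nearest neighbors classifier, with the number of neighbors tuned by a cross-validated grid search. The synthetic data, the step names ("scale", "knn"), and the candidate grid are illustrative assumptions, not the module's actual exercises.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption: stands in for a real course dataset).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The pipeline refits the scaler inside each cross-validation fold,
# so no information from a validation fold leaks into preprocessing.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Hyper-parameters are addressed as <step name>__<parameter name>.
param_grid = {"knn__n_neighbors": [1, 3, 5, 11, 21]}

search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

best_k = search.best_params_["knn__n_neighbors"]
test_accuracy = search.score(X_test, y_test)  # uses the refitted best model

print(f"selected n_neighbors: {best_k}")
print(f"held-out accuracy: {test_accuracy:.3f}")
```

Keeping a separate test set, untouched during the grid search, gives an estimate of performance that is not biased by the hyper-parameter selection itself.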
Recommended prerequisites
Statistical modelling
Exploratory Data Analysis
R and Python languages
Form of assessment
Learning outcomes are evaluated through continuous assessment during the semester. Depending on the course, the assessment takes different forms: a written exam, an oral exam, a record, a written report, peer review...