# Machine Learning

## Presentation

Program (detailed contents):

• Introduction to machine learning
• Optimizing the bias-variance trade-off
• PLS regression, linear and quadratic parametric discriminant analysis, k-nearest neighbors
• Neural networks, multilayer perceptron, introduction to deep learning.
• Classification and regression trees
• Bagging, random forests, gradient boosting
• Missing data imputation
• Outlier detection and one class classification
• Scientific ethics and statistical decision-making

Organization:

• Lectures: 20 h
• Practical work on real data sets with Python libraries (scikit-learn): 18 h

Main difficulties for students:

• Grasping new methods and applying them to complex data sets.

## Objectives

At the end of this module, the student will have understood and be able to explain (main concepts):

• Properties and limits of the main machine learning methods.
• Bias-variance trade-off.
• Risk estimation algorithms: bootstrap, cross-validation.
• Optimization and algorithmic implementations of the main methods in R and Python (scikit-learn).
• Ethical concepts of artificial intelligence.
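
As an illustration of risk estimation by cross-validation and of the bias-variance trade-off, here is a minimal sketch in plain Python. The synthetic data set, the helper names `knn_predict` and `cv_risk`, and the candidate values of k are assumptions chosen for this example; they are not taken from the course material.

```python
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    neighbors = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in neighbors) / len(neighbors)


def cv_risk(data, k_neighbors, n_folds=5):
    """Estimate the mean squared prediction error by n_folds-fold cross-validation."""
    # Interleaved folds: observation i goes to fold i % n_folds.
    folds = [data[i::n_folds] for i in range(n_folds)]
    squared_errors = []
    for i, held_out in enumerate(folds):
        # Train on every fold except the held-out one.
        train = [pair for j, fold in enumerate(folds) if j != i for pair in fold]
        squared_errors += [
            (knn_predict(train, x, k_neighbors) - y) ** 2 for x, y in held_out
        ]
    return sum(squared_errors) / len(squared_errors)


# Synthetic data (an assumption for illustration): y = x^2 plus Gaussian noise.
rng = random.Random(0)
data = [
    (x, x * x + rng.gauss(0, 0.1))
    for x in (rng.uniform(-1, 1) for _ in range(100))
]

# Compare the estimated risk for a flexible, a moderate, and a very smooth model.
risks = {k: cv_risk(data, k) for k in (1, 5, 50)}
best_k = min(risks, key=risks.get)
print(risks, best_k)
```

A small k gives a flexible, high-variance predictor, while a very large k over-smooths and introduces bias; the cross-validated risk is one way to compare such choices without a separate test set.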

The student will be able to:

• Analyse big data from different domains (insurance, marketing, industry) using R and Python libraries.
• Apply the main machine learning methods and algorithms (PLS, discriminant analysis, k-NN, classification and regression trees, neural networks, boosting, random forests, SVM...)
• Optimize hyper-parameter values and construct pipelines for automation.
• Optimize the handling of missing values.
• Detect ethical or legal failures (bias, discrimination, opacity) of machine learning algorithms.
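
To make the pipeline, hyper-parameter, and missing-value skills above concrete, here is a minimal scikit-learn sketch. The synthetic data set, the 5% missingness rate, the choice of a k-NN classifier, and the parameter grid are all assumptions made for illustration, not the course's actual exercises.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic classification data (assumption), with about 5% of entries set to NaN.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Chain imputation, scaling, and the classifier so that every step
# is re-fitted on the training part of each cross-validation split.
pipeline = Pipeline([
    ("impute", SimpleImputer()),
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Hyper-parameters are addressed as "<step name>__<parameter name>".
param_grid = {
    "impute__strategy": ["mean", "median"],
    "knn__n_neighbors": [3, 5, 11],
}

search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_  # mean cross-validated accuracy of the best setting
print(best_params, best_score)
```

Placing the imputer inside the pipeline, rather than imputing the whole data set first, keeps information from the validation folds out of the fitted imputation values.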

## Prerequisites

• Elements of statistical modelling [I4MMMS71]
• Software and methods for exploratory statistical data analysis [I4MMSL81]
• R and Python languages

## Form of assessment

Learning outcomes are assessed continuously throughout the semester. Depending on the teaching unit, the assessment takes different forms: a written exam, an oral exam, a record, a written report, peer review...

## Bibliography

http://wikistat.fr

Coelho, L. P., Richert, W. « Building Machine Learning Systems with Python », 2nd Edition, Packt Publishing, 2015.

Hastie, T., Tibshirani, R., Friedman, J. « The Elements of Statistical Learning », Springer, 2001.