# Machine Learning

## Presentation

Program (detailed contents):

• Introduction to machine learning
• Optimization of the bias / variance trade-off
• Model selection via penalized criteria: Mallows' Cp, BIC, Ridge, Lasso…
• Linear and quadratic discriminant analysis, k-nearest neighbors
• Classification and regression trees
• Bagging, random forests
• Neural networks, multilayer perceptron, backpropagation algorithms, optimization algorithms, introduction to deep learning
• Missing data imputation
• Scientific ethics and statistical decision-making
• Legal framework and societal impacts of AI

Organization:

• Lectures: 20 h
• Practical sessions applying the methods to real data sets with Python libraries (Scikit-learn): 30 h

Main difficulties for students:

Grasping new methods and applying them to complex data sets.

## Objectives

At the end of this module, the student will have understood and be able to explain the following main concepts:

• Properties and limits of the main machine learning algorithms.
• Bias-variance trade-off, model selection.
• Algorithms for risk estimation: bootstrap, cross-validation.
• Optimization and implementation of the studied algorithms with R and Python (Scikit-learn).
• Ethical and legal concepts of artificial intelligence.
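
As an illustration of the risk-estimation item above, here is a minimal sketch of k-fold cross-validation in plain Python. It is not taken from the course material: the helper names (`k_fold_indices`, `fit_mean`, `cross_validated_risk`), the toy "model" that always predicts the training mean, and the simulated Gaussian data are all assumptions chosen to keep the example self-contained.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_mean(ys):
    """Toy 'model' (hypothetical): predict the training mean everywhere."""
    return sum(ys) / len(ys)


def cross_validated_risk(ys, k=5, seed=0):
    """Estimate the squared-error risk of the mean predictor by k-fold CV."""
    folds = k_fold_indices(len(ys), k, seed)
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        # Fit on every observation that is NOT in the held-out fold.
        train = [ys[i] for i in range(len(ys)) if i not in held]
        prediction = fit_mean(train)
        # Evaluate on the held-out fold only.
        mse = sum((ys[i] - prediction) ** 2 for i in held_out) / len(held_out)
        fold_errors.append(mse)
    # Average the per-fold errors to get the CV risk estimate.
    return sum(fold_errors) / k


# Simulated data (assumption): 200 draws from a standard normal distribution.
rng = random.Random(42)
y = [rng.gauss(0.0, 1.0) for _ in range(200)]
risk = cross_validated_risk(y, k=5)
print(f"5-fold CV estimate of squared-error risk: {risk:.3f}")
```

In practical sessions the same idea would more likely be expressed with Scikit-learn's model-selection utilities; the hand-written version only makes the train/held-out split explicit.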

The student will be able to:

• Analyse large data sets from various domains (insurance, marketing, industry) using R and Python libraries.
• Run the main machine learning methods and algorithms (discriminant analysis, k-NN, classification and regression trees, random forests, neural networks, etc.).
• Optimize hyperparameter values and build pipelines to automate the processing.
• Optimize the handling of missing values.
• Detect ethical or legal failures (bias, discrimination, opacity) of machine learning algorithms.

## Prerequisites

Statistical modelling

Introduction to R and Python languages

## Form of assessment

Learning outcomes are assessed continuously throughout the semester. Depending on the teaching, the assessment takes different forms: a written exam, an oral exam, a record, a written report, peer review...

## Bibliography

http://wikistat.fr

https://github.com/wikistat

Coelho, L. P., Richert, W. « Building Machine Learning Systems with Python », 2nd edition, Packt Publishing, 2015

Hastie, T., Tibshirani, R., Friedman, J. « The Elements of Statistical Learning », Springer, 2001

Goodfellow, I., Bengio, Y., Courville, A. « Deep Learning », MIT Press, 2016