Machine Learning

Presentation

Program (detailed contents):

  • Introduction to machine learning
  • Optimization of the bias-variance trade-off
  • Model selection via penalized criteria: Mallows' Cp, BIC, Ridge, Lasso…
  • Linear and quadratic discriminant analysis, k-nearest neighbors
  • Classification and regression trees
  • Bagging, random forests
  • Neural networks, multilayer perceptron, backpropagation algorithms, optimization algorithms, introduction to deep learning
  • Missing data imputation
  • Scientific ethics and statistical decision-making
  • Legal framework and societal impacts of AI

Organization:

  • Lectures: 20 h
  • Practical sessions applying the methods to real data sets with Python libraries (scikit-learn): 30 h

Main difficulties for students:

Grasping new methods and applying them to complex data sets.

Objectives

At the end of this module, the student will have understood and be able to explain the following main concepts:

  • Properties and limits of the main machine learning algorithms.
  • Bias-variance trade-off, model selection.
  • Algorithms for risk estimation: bootstrap, cross-validation.
  • Optimization and algorithmic implementations with R and Python (scikit-learn) of the studied algorithms.
  • Ethical and legal concepts of artificial intelligence.

The student will be able to:

  • Analyze large data sets from various domains (insurance, marketing, industry) using R and Python libraries.
  • Run the main machine learning methods and algorithms (discriminant analysis, k-NN, classification and regression trees, random forests, neural networks, etc.).
  • Optimize hyperparameter values and build pipelines to automate the workflow.
  • Optimize the handling of missing values.
  • Detect ethical or legal failures (bias, discrimination, opacity) of machine learning algorithms.

Prerequisites

Statistical modelling

Introduction to R and Python languages

Form of assessment

Learning outcomes are assessed continuously throughout the semester. Depending on the course, the assessment takes different forms: a written exam, an oral exam, a record, a written report, peer review...

Bibliography

http://wikistat.fr

https://github.com/wikistat

Coelho, L. P., Richert, W. « Building Machine Learning Systems with Python », 2nd Edition, Packt Publishing, 2015

Hastie, T., Tibshirani, R., Friedman, J. « The Elements of Statistical Learning », Springer, 2001

Goodfellow, I., Bengio, Y., Courville, A. « Deep Learning », MIT Press, 2016