
Machine Learning

Presentation

Program (detailed contents):

  • Introduction to machine learning
  • Optimization of the bias-variance trade-off
  • PLS regression, linear and quadratic parametric discriminant analysis, k-nearest neighbors
  • Neural networks, multilayer perceptron, introduction to deep learning
  • Classification and regression trees
  • Bagging, random forests, gradient boosting
  • Missing data imputation
  • Outlier detection and one class classification
  • Scientific ethics and statistical decision-making

 

Organization:

  • Lectures: 20 h
  • Practical sessions applying the methods to real data sets with Python libraries (scikit-learn): 18 h

 

Main difficulties for students:

  • Grasping new methods and applying them to complex data sets.

Objectives

At the end of this module, the student will have understood and be able to explain the following main concepts:

  • Properties and limits of the main machine learning methods.
  • The bias-variance trade-off.
  • Risk estimation algorithms: bootstrap, cross-validation.
  • Optimization and algorithmic implementation of the main methods in R and Python (scikit-learn).
  • Ethical concepts of artificial intelligence.
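
As an illustration of the cross-validation risk estimation listed above, here is a minimal sketch in plain Python (standard library only), not the scikit-learn tooling used in the practical sessions. The function names `k_fold_indices`, `knn_predict` and `cv_error`, the synthetic data, and the choice of a 1-D k-nearest-neighbor classifier are assumptions made for this example, not course material.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_x, train_y, x, n_neighbors):
    """Majority vote among the n_neighbors closest training points (1-D inputs)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    votes = [train_y[i] for i in order[:n_neighbors]]
    return max(set(votes), key=votes.count)


def cv_error(xs, ys, n_neighbors, k=5, seed=0):
    """Estimate the misclassification risk by k-fold cross-validation."""
    errors = 0
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        # Train on every point outside the current fold.
        train = [i for i in range(len(xs)) if i not in held_out]
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        # Count the mistakes made on the held-out fold only.
        errors += sum(knn_predict(tx, ty, xs[i], n_neighbors) != ys[i] for i in fold)
    return errors / len(xs)


# Synthetic toy data (illustration only): two overlapping 1-D Gaussian classes.
rng = random.Random(42)
xs = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(2.0, 1.0) for _ in range(100)]
ys = [0] * 100 + [1] * 100

# Odd neighbor counts avoid tied votes with two classes.
for n_neighbors in (1, 5, 15, 45):
    print(n_neighbors, round(cv_error(xs, ys, n_neighbors), 3))
```

Comparing the estimated error across neighbor counts is one simple way to see the bias-variance trade-off: very small values tend to overfit (high variance), very large ones to oversmooth (high bias).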

 

The student will be able to:

  • Analyze big data from different domains (insurance, marketing, industry) using R and Python libraries.
  • Apply the main machine learning methods and algorithms (PLS, discriminant analysis, k-nn, classification and regression trees, neural networks, boosting, random forest, SVM...)
  • Optimize hyperparameter values and build pipelines for automation.
  • Optimize the handling of missing values.
  • Detect ethical or legal failures (bias, discrimination, opacity) of machine learning algorithms.

Prerequisites

  • Elements of statistical modelling [I4MMMS71]
  • Software and methods for exploratory statistical data analysis [I4MMSL81]
  • R and Python languages

Form of assessment

Learning outcomes are assessed continuously throughout the semester. Depending on the course, the assessment may take different forms: a written exam, an oral exam, a record, a written report, peer review...

Bibliography

http://wikistat.fr

Coelho, L. P., Richert, W. « Building Machine Learning Systems with Python », 2nd Edition, Packt Publishing, 2015.

Hastie, T., Tibshirani, R., Friedman, J. « The Elements of Statistical Learning », Springer, 2001.