
AI frameworks

Presentation

  • Introduction to the Spark Hadoop framework with PySpark.
  • Big data munging with SparkSQL and SparkML. MapReduce.
  • Comparison of different frameworks (R, Python, Spark) for big data analytics on three or four use cases (workshops).
  • Recommendation systems, image processing, text mining.
  • Introduction to deep learning technologies.
  • Introduction to cloud computing.

Objectives

At the end of this module, the student will have understood and be able to explain (main concepts):

  • Scalability concepts (volume, variety, velocity) of big data analytics methods.
  • Properties of the main big data frameworks (Python, Spark). MapReduce.
  • Implementation on different hardware.

 

The student will be able to:

  • Clean, prepare, transform (munging) big data within Python or Spark frameworks.
  • Identify the appropriate method to analyse big data in classical use cases (images, recommendation systems, text mining...).
  • Execute and optimize these methods and algorithms in the best-suited framework and validate their performance.
  • Learn independently and develop a use case for a recent technology of their choice.

Prerequisites

Probability and statistics
Elements of statistical modelling [I4MMMS71]
Software and methods of statistical exploratory data analysis [I4MMSP81]

Form of assessment

Learning outcomes are evaluated through continuous assessment during the semester. Depending on the teaching, the assessment takes different forms: a written exam, an oral exam, a record, a written report, peer review...