Data Analysis ENSIIE
Projects
- Project 2022: Kmeans++
- Project 2023: Generalized Kmeans
- EM Algorithm (2023)
- Implement the initialization of the EM algorithm
- Implement the E-step
- Implement the M-step
- Test your EM algorithm on the simulated data from Exercise 1 of the worksheet
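The four steps listed above (initialization, E-step, M-step, test on simulated data) can be sketched as follows. This is a minimal illustration, not the expected project solution: it assumes a one-dimensional Gaussian mixture, is written in plain Python rather than R, and uses hypothetical function names (`init_params`, `e_step`, `m_step`, `em`). The simulated sample is also an assumption (two Gaussians centred at -2 and 3) and is not the data from Exercise 1 of the worksheet.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def init_params(data, k, rng):
    """Initialization: equal weights, k distinct data points as means, pooled variance."""
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    return [1.0 / k] * k, rng.sample(data, k), [overall_var] * k


def e_step(data, weights, means, variances):
    """E-step: posterior membership probabilities and the current log-likelihood."""
    resp, loglik = [], 0.0
    for x in data:
        joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        loglik += math.log(total)
        resp.append([j / total for j in joint])
    return resp, loglik


def m_step(data, resp, k, min_var=1e-6):
    """M-step: weighted updates of proportions, means and variances."""
    n = len(data)
    weights, means, variances = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)  # effective number of points in component j
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        weights.append(nj / n)
        means.append(mean)
        variances.append(max(var, min_var))  # floor guards against a degenerate component
    return weights, means, variances


def em(data, k, n_iter=200, tol=1e-8, seed=0):
    """Alternate E- and M-steps until the log-likelihood stops increasing."""
    rng = random.Random(seed)
    weights, means, variances = init_params(data, k, rng)
    history = []
    for _ in range(n_iter):
        resp, loglik = e_step(data, weights, means, variances)
        history.append(loglik)
        weights, means, variances = m_step(data, resp, k)
        if len(history) > 1 and abs(history[-1] - history[-2]) < tol:
            break
    return weights, means, variances, history


if __name__ == "__main__":
    # Hypothetical simulated data: 200 draws from N(-2, 1) and 200 from N(3, 1).
    sim = random.Random(1)
    data = [sim.gauss(-2, 1) for _ in range(200)] + [sim.gauss(3, 1) for _ in range(200)]
    weights, means, variances, history = em(data, k=2)
    print("proportions:", weights)
    print("means:", means)
    print("variances:", variances)
```

A useful sanity check when testing is that the recorded log-likelihood should never decrease from one iteration to the next; a drop usually indicates a bug in the E-step or the M-step.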
Lecture Notes
Exercises
- Exercises on the multivariate normal distribution
- Exercises on the multivariate normal distribution: solutions
- Exercises on clustering
- Exercises on mixture models
- Exercises on PCA
- Exercises on kernels
Solution outlines
Data files
Documents and Links
Specific references
Reference books about machine learning
- Machine Learning: A Probabilistic Perspective by Kevin P. Murphy
- Chapter 4: Gaussian models
- Chapter 11: Mixture models and the EM algorithm (with k-means)
- Chapter 25: Clustering (HAC)
- Chapter 12: Latent Linear Models (PCA)
- Chapter 14: Kernels
- Pattern Recognition and Machine Learning by Christopher M. Bishop
R base
The official manuals for base R are available at
https://cran.r-project.org/manuals.html
Community-contributed documentation is available at
https://cran.r-project.org/other-docs.html
The short introduction by Emmanuel Paradis is a quick way to get started:
- "R for Beginners" by Emmanuel Paradis.
Longer books go into more depth. See for example
- "Using R for Data Analysis and Graphics - Introduction, Examples and Commentary" by John Maindonald.
R from the RStudio developers
- R for Data Science: Wickham's book on more recent R developments for data science
For more, see https://www.rstudio.com/resources/books/