Course website: https://mvcomp2-w2324.tristanbereau.com/
Course description
This lecture introduces basic methods and approaches in computational statistics and data analysis that are of great importance to empirical problems in the natural sciences. It gives an overview of relevant concepts and theorems in probability theory and statistics, and extends to more modern approaches, including automatic differentiation and machine learning. Lectures will be accompanied by computational exercises in Python. Students will learn to analyze data sets and interpret the results from a solid, theoretically grounded statistical perspective; devise statistical and machine learning models of experimental situations; infer the parameters of these models from empirical observations; and test hypotheses.
Prerequisites
- Linear (matrix) algebra
- Basic calculus (derivatives & integrals)
- Basic programming skills in Python
Tentative course outline
- Basic concepts in probability theory
- Random variables; expectations, variances, covariances, and their properties
- Discrete & continuous probability distributions
- Moment-generating functions, central limit theorem, and multivariate distributions
- Statistical models & inference: parameter estimation
- Hypothesis tests: tests, confidence intervals, bootstrap method
- Linear regression: least squares, generalized linear model
- Regularization: Ridge & LASSO regression, MAP estimation
- Nonlinear regression: basis expansions, neural networks
- Classification: k-nearest neighbors, logistic regression, linear discriminant analysis
- Kernel methods: Mercer kernels, Gaussian processes, support vector machines
- Model selection: Jeffreys scale, BIC, bias-variance tradeoff
- Dimensionality reduction: principal component analysis, factor analysis
- (Information theory)
Main references
- Wackerly, D., Mendenhall, W., & Scheaffer, R. L. (2014). Mathematical Statistics with Applications. Cengage Learning.
- Murphy, K. P. (2022). Probabilistic Machine Learning: An Introduction. MIT Press. https://probml.github.io/pml-book/book1.html
- Murphy, K. P. (2022). Probabilistic Machine Learning: Advanced Topics. MIT Press. https://probml.github.io/pml-book/book2.html
- Mehta, P., Bukov, M., Wang, C. H., Day, A. G., Richardson, C., Fisher, C. K., & Schwab, D. J. (2019). A high-bias, low-variance introduction to machine learning for physicists. Physics Reports, 810, 1-124. https://doi.org/10.1016/j.physrep.2019.03.001
- Amendola, L. Lecture notes on Statistical Methods. https://www.thphys.uni-heidelberg.de/%7Eamendola/teaching/compstat-hd.pdf