Internship application season is coming up! If you’re studying for interviews, it can be hard to know where to start, so I’ve created a roundup of my favorite blog posts, course slides, Quora and StackExchange answers, and miscellaneous other resources that cover the basics of most machine learning and data science interviews.

Machine learning internship interviews often have three components: (1) an informational phone call, (2) a coding and/or machine learning phone screen, and (3) an in-person coding and/or machine learning interview, sometimes with a presentation. Depending on the company, you may face only some of these components. The resources listed below will be most useful when studying for the machine learning phone screen, in which you’ll probably be asked lots of basic questions about machine learning algorithms (e.g. “What is precision?” “Do you know what SVMs are? How do they work?”).
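To give a sense of the level these screening questions sit at, here is a minimal sketch of how “What is precision?” might be answered in code. The function name `precision` and the toy labels are my own illustration, not taken from any of the resources below; the sketch assumes binary labels where `1` marks the positive class.

```python
def precision(y_true, y_pred, positive=1):
    """Fraction of predicted positives that are actually positive."""
    # True positives: predicted positive AND actually positive.
    true_pos = sum(
        1 for t, p in zip(y_true, y_pred) if p == positive and t == positive
    )
    # All predicted positives, whether right or wrong.
    pred_pos = sum(1 for p in y_pred if p == positive)
    # Convention assumed here: return 0.0 when nothing was predicted positive.
    return true_pos / pred_pos if pred_pos else 0.0


# Hypothetical toy data: three examples are predicted positive, two correctly.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]
print(precision(y_true, y_pred))
```

In this toy case the classifier predicts three positives and two of them are correct, so precision is 2 out of 3. A natural follow-up question is recall, which divides the same true-positive count by the number of actual positives instead.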

These are the explanations I personally found most useful and intuitive. In particular, I look for resources that are both simple and complete: they avoid clutter, but they also explain every step without glossing over details or assuming prior knowledge on the part of the reader.

This list isn’t meant to be comprehensive, and it certainly shows my biases towards NLP and unsupervised methods. Pick and choose what’s useful for you! And if you’re just getting started with machine learning, I’d recommend taking a structured course, either through your university or through a free online resource like Coursera.

These explanations and visualizations are often very creative and thoughtful. Thank you to each of these authors for taking the time to craft and share these resources!

- Combinatorics Cheat Sheet
- On Measures of Entropy and Information by Gavin E. Crooks
- Harvard Statistics: Math Review for Stat 110 by Joe Blitzstein
- ML Cheatsheet: Calculus
- Stanford Probabilistic Graphical Models: Probability review by Volodymyr Kuleshov and Stefano Ermon
- Probability Cheat Sheet by Joe Blitzstein and William Chen
- PennState Statistics: Confidence Intervals

- Understanding Principal Component Analysis by Rishav Kumar
- A tutorial on Principal Components Analysis by Lindsay I. Smith
- A Tutorial on Principal Component Analysis by Jonathon Shlens
- Singular Value Decomposition Tutorial by Kirk Baker

- Brilliant: K-nearest Neighbors by Akshay Padmanabha and Christopher Williams

- Orebro University: The Simple Perceptron
- StackOverflow: Intuition for perceptron weight update rule by Ami Tavory
- StackExchange: Question regarding weight update rule in Perceptron by Arman
- Deep Learning: The Straight Dope: The Perceptron

- Yale Statistics Course by Michelle Lacey
- ML Cheatsheet: Linear Regression
- Toronto Course by Roger Grosse
- Testing the Assumptions of Linear Regression by Robert Nau
- Ordinary Least Squares Linear Regression: Flaws, Problems, and Pitfalls

- ML Cheatsheet: Logistic Regression
- Is Logistic Regression a linear classifier? by Marco Tulio Ribeiro
- StackExchange: What’s the difference between logistic regression and perceptron? by Antoni Parellada
- Stanford Deep Learning Tutorial: Softmax Regression
- Deep Learning Tutorial - Softmax Regression by Chris McCormick
- Difference Between Softmax and Sigmoid Function by Saimadhu Polamuri
- Deep Learning - The Straight Dope: Binary classification with logistic regression

- SVM Tutorial by Zoya Gavrilov
- An Idiot’s guide to Support vector machines (SVMs) by Robert Berwick
- Support Vector Machines, Succinctly by Alexandre Kowalczyk
- Quora: What is the kernel trick? by Nikhil Garg
- ResearchGate: SVM with large feature space? by Adrian Letchford, Katharina Morik, and Dmytro Prylipko

- Introduction to Probabilistic Topic Models by David Blei + slides
- Probabilistic Topic Models by Mark Steyvers and Tom Griffiths
- Latent Dirichlet Allocation: Toward a Deeper Understanding by Colorado Reed
- Latent Dirichlet Allocation by Nicholas Ruozzi
- Introduction to Latent Dirichlet Allocation by Edwin Chen

- Aylien: Naive Bayes for Dummies: A Simple Explanation by Mike Waldron
- Quora: What are the disadvantages of using a naive bayes for classification? by Simone Scardapane

- fast.ai by Jeremy Howard and Rachel Thomas
- Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
- Understanding LSTM Networks by Christopher Olah
- A Primer on Neural Network Models for Natural Language Processing by Yoav Goldberg

- StackExchange: What is regularization in plain english? by Toby Kelsey
- Quora: What is regularization in machine learning? by Yassine Alouini
- Quora: What’s a good way to provide intuition as to why the lasso (L1 regularization) results in sparse weight vectors? by Phillip Adkins (the second answer)
- L1 Norm Regularization and Sparsity Explained for Dummies by Shi Yan
- Machine Learning Explained: Regularization by Antoine Guillot (?)
- l0-Norm, l1-Norm, l2-Norm, …, l-infinity Norm by Wattanit Hotrakool

- Machine Learning Crash Course: Part 4 - The Bias-Variance Dilemma by Daniel Geng and Shannon Shih
- Choosing a Machine Learning Classifier by Edwin Chen
- Machine Learning Done Wrong by Cheng-Tao Chu
- Machine Learning for Dummies Cheat Sheet by John Paul Mueller and Luca Massaron
- Classification Model Pros and Cons by Chris Tufts
- Handbook of Biological Statistics by John H. McDonald

(I’ve tried to credit the authors wherever possible, but I couldn’t identify a few of them. If you know who created these resources, please contact me!)

*November 19, 2018*