Summary of Stanford's Machine Learning class by Andrew Ng
- Part 1
- Supervised vs. Unsupervised learning, Linear Regression, Logistic Regression, Gradient Descent
- Part 2
- Regularization, Neural Networks
- Part 3
- Debugging and Diagnostics, Machine Learning System Design
- Part 4
- Support Vector Machine, Kernels
- Part 5
- K-means algorithm, Principal Component Analysis (PCA) algorithm
- Part 6
- Anomaly detection, Multivariate Gaussian distribution
- Part 7
- Recommender Systems, Collaborative filtering algorithm, Mean normalization
- Part 8
- Stochastic gradient descent, Mini-batch gradient descent, Map-reduce and data parallelism
Anomaly detection
- Examples
- Fraud Detection
- x(i) = features of user i's activities.
- Model p(x) from data
- Identify unusual users by checking which have p(x) < epsilon
- Manufacturing
- Monitoring computers in a data center
- Algorithm
- Choose features x(i) that you think might be indicative of anomalous examples.
- Fit parameters μ_1, …, μ_n and σ²_1, …, σ²_n on the training set:
- μ_j = (1/m) Σ_i x_j^(i), σ²_j = (1/m) Σ_i (x_j^(i) − μ_j)²
- Given a new example x, compute p(x) = Π_j p(x_j; μ_j, σ²_j) = Π_j (1/(√(2π)·σ_j)) · exp(−(x_j − μ_j)² / (2σ²_j))
- Flag an anomaly if p(x) < ε (a minimal code sketch follows this list)
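A minimal NumPy sketch of the density-estimation algorithm above (the course itself uses Octave; the function and variable names here are my own, and `X` is assumed to be an already-loaded (m, n) training matrix):

```python
import numpy as np

def fit_gaussian_params(X):
    """Estimate per-feature mean and variance from the (m, n) training matrix X."""
    mu = X.mean(axis=0)       # mu_j = (1/m) * sum_i x_j^(i)
    sigma2 = X.var(axis=0)    # sigma_j^2 = (1/m) * sum_i (x_j^(i) - mu_j)^2
    return mu, sigma2

def p(X, mu, sigma2):
    """Compute p(x) for each row of X as the product of per-feature Gaussian densities."""
    densities = (1.0 / np.sqrt(2 * np.pi * sigma2)) * np.exp(-(X - mu) ** 2 / (2 * sigma2))
    return densities.prod(axis=1)

def is_anomaly(X, mu, sigma2, epsilon):
    """Flag examples whose density falls below the threshold epsilon (chosen on a CV set)."""
    return p(X, mu, sigma2) < epsilon
```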
- Aircraft engines example
- 10000 good (normal) engines
- 20 flawed engines (anomalous)
- Alternative 1
- Training set: 6000 good engines
- CV: 2000 good engines (y=0), 10 anomalous (y=1)
- Test: 2000 good engines (y=0), 10 anomalous (y=1)
- Alternative 2 (less recommended, since the same examples are reused in both the CV and test sets):
- Training set: 6000 good engines
- CV: 4000 good engines (y=0), 10 anomalous (y=1)
- Test: 4000 good engines (y=0), 10 anomalous (y=1)
- Algorithm Evaluation
- Fit the model p(x) on the training set
- On a CV/test example x, predict y = 1 (anomaly) if p(x) < ε, else y = 0
- The data is very skewed, so evaluate with precision/recall or the F1-score rather than classification accuracy
- Use the CV set to choose the threshold ε (sketch below)
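A hedged sketch of choosing ε on the CV set by maximizing the F1-score (the function name and the 1000-step threshold sweep are my own choices, not from the course; `p_cv` would come from the `p(...)` function in the earlier sketch):

```python
import numpy as np

def select_epsilon(p_cv, y_cv):
    """Sweep thresholds over the CV densities and keep the one with the best F1-score.

    p_cv : array of p(x) values for the CV examples
    y_cv : array of labels (1 = anomaly, 0 = normal)
    """
    best_epsilon, best_f1 = 0.0, 0.0
    for epsilon in np.linspace(p_cv.min(), p_cv.max(), 1000):
        preds = (p_cv < epsilon).astype(int)   # predict anomaly when the density is low
        tp = np.sum((preds == 1) & (y_cv == 1))
        fp = np.sum((preds == 1) & (y_cv == 0))
        fn = np.sum((preds == 0) & (y_cv == 1))
        if tp == 0:
            continue                           # precision/recall undefined without true positives
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_epsilon, best_f1 = epsilon, f1
    return best_epsilon, best_f1
```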
- Anomaly detection vs. Supervised learning
- Anomaly detection:
- Very small number of positive examples (y = 1)
- Large number of negative examples (y = 0)
- Many different "types" of anomalies. Hard for any algorithm to learn from positive examples what the anomalies look like; future anomalies may look nothing like any of the anomalous examples we've seen so far.
- Examples: Fraud detection, Manufacturing (e.g. aircraft engines), Monitoring machines in a data center
- Supervised learning:
- Large number of positive and negative examples
- Enough positive examples for the algorithm to get a sense of what positive examples are like; future positive examples are likely to be similar to the ones in the training set.
- Examples: Email spam classification, Weather prediction (sunny/rainy/etc.), Cancer classification
- Choosing what features to use
- Plot a histogram of each feature to check that it looks roughly Gaussian; if it doesn't, transform it (e.g. log(x), √x) so that it does (illustrated in the sketch below)
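A small illustrative sketch (synthetic data of my own, not from the course) of inspecting a skewed feature and applying a log transform to make it look more Gaussian:

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative right-skewed feature; in practice this column would come from your dataset.
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=5000)

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.hist(x, bins=50)                 # heavily right-skewed: a poor fit for a Gaussian
ax1.set_title("original feature")
ax2.hist(np.log(x + 1), bins=50)     # log(x + c) transform looks much more bell-shaped
ax2.set_title("log-transformed")
plt.show()
```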
- Original vs. Multivariate Gaussian model
- Original model: treats features as independent (p(x) = Π_j p(x_j; μ_j, σ²_j)), so correlations must be captured by manually creating extra features; computationally cheaper and works even when m is small
- Multivariate Gaussian p(x; μ, Σ): captures correlations between features automatically, but is computationally more expensive and needs m > n so that Σ is invertible (sketch below)
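A minimal sketch of the multivariate Gaussian density, assuming `X` is the (m, n) training matrix (function names are illustrative):

```python
import numpy as np

def fit_multivariate_gaussian(X):
    """Estimate the mean vector and full covariance matrix from the (m, n) matrix X."""
    m = X.shape[0]
    mu = X.mean(axis=0)
    diff = X - mu
    Sigma = (diff.T @ diff) / m      # (1/m) * sum_i (x^(i) - mu)(x^(i) - mu)^T
    return mu, Sigma

def p_multivariate(X, mu, Sigma):
    """p(x; mu, Sigma) for each row of X; unlike the original model, this captures feature correlations."""
    n = mu.shape[0]
    diff = X - mu
    inv = np.linalg.inv(Sigma)       # requires m > n so that Sigma is invertible
    norm = 1.0 / (np.power(2 * np.pi, n / 2) * np.sqrt(np.linalg.det(Sigma)))
    exponent = -0.5 * np.einsum("ij,jk,ik->i", diff, inv, diff)
    return norm * np.exp(exponent)
```

Constraining Σ to be diagonal, with the per-feature variances σ²_j on the diagonal, recovers the original model.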