Saturday, January 24, 2015

Machine Learning Course Summary (Part 6)

A summary of Stanford's Machine Learning class by Andrew Ng


  • Part 1
    • Supervised vs. Unsupervised learning, Linear Regression, Logistic Regression, Gradient Descent
  • Part 2
    • Regularization, Neural Networks
  • Part 3
    • Debugging and Diagnostic, Machine Learning System Design
  • Part 4
    • Support Vector Machine, Kernels
  • Part 5
    • K-means algorithm, Principal Component Analysis (PCA) algorithm
  • Part 6
    • Anomaly detection, Multivariate Gaussian distribution
  • Part 7
    • Recommender Systems, Collaborative filtering algorithm, Mean normalization
  • Part 8
    • Stochastic gradient descent, Mini batch gradient descent, Map-reduce and data parallelism

Anomaly detection

  • Examples
    • Fraud Detection
      • x(i) = features of user i's activities.
      • Model p(x) from the data.
      • Identify unusual users by flagging those with p(x) < epsilon.
    • Manufacturing
    • Monitoring computers in a data center
  • Algorithm
    • Choose features x_j that you think might be indicative of anomalous examples.
    • Fit parameters μ1, …, μn and σ²1, …, σ²n (the mean and variance of each feature on the training set).
    • Given a new example x, compute p(x) as the product over the features of the univariate Gaussian densities p(x_j; μj, σ²j).
    • Flag x as an anomaly if p(x) < epsilon (a minimal sketch follows below).

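A minimal sketch of this recipe in NumPy (not from the course notes; the function names fit_gaussian and p_of_x are my own): fit a mean and variance per feature on normal data, then flag new examples whose p(x) falls below epsilon.

    import numpy as np

    def fit_gaussian(X_train):
        """Estimate the mean and variance of each feature (columns of X_train)."""
        mu = X_train.mean(axis=0)
        sigma2 = X_train.var(axis=0)
        return mu, sigma2

    def p_of_x(X, mu, sigma2):
        """p(x) = product over features j of the univariate Gaussian density."""
        densities = np.exp(-((X - mu) ** 2) / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
        return densities.prod(axis=1)

    # Toy usage: mostly normal points plus one obvious outlier.
    rng = np.random.default_rng(0)
    X_train = rng.normal(loc=[5.0, 10.0], scale=[1.0, 2.0], size=(1000, 2))
    X_new = np.array([[5.2, 9.5],      # looks like the training data
                      [20.0, -3.0]])   # far from the training data

    mu, sigma2 = fit_gaussian(X_train)
    epsilon = 1e-4                      # in practice, chosen on a labeled CV set
    print(p_of_x(X_new, mu, sigma2) < epsilon)   # expect [False  True]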

  • Aircraft engines example
    • 10000 good (normal) engines
    • 20 flawed engines (anomalous)
    • Alternative 1
      • Training set: 6000 good engines
      • CV: 2000 good engines (y=0), 10 anomalous (y=1)
      • Test: 2000 good engines (y=0), 10 anomalous (y=1)
    • Alternative 2 (the same 4000 good engines and 10 anomalies are reused for both CV and test; not recommended):
      • Training set: 6000 good engines
      • CV: 4000 good engines (y=0), 10 anomalous (y=1)
      • Test: 4000 good engines (y=0), 10 anomalous (y=1)
    • Algorithm Evaluation
      • Fit p(x) on the training set; on the CV and test sets, predict y = 1 if p(x) < epsilon, else y = 0.
      • Because the labels are very skewed, evaluate with precision/recall or the F1 score rather than accuracy, and use the CV set to pick epsilon (see the sketch below).

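A rough sketch of that evaluation loop in NumPy (the helper names f1 and select_epsilon are my own, not the course's): given p(x) evaluated on the labeled CV examples, scan candidate thresholds and keep the epsilon with the best F1 score.

    import numpy as np

    def f1(y_true, y_pred):
        """F1 score from true positives, false positives, and false negatives."""
        tp = np.sum((y_pred == 1) & (y_true == 1))
        fp = np.sum((y_pred == 1) & (y_true == 0))
        fn = np.sum((y_pred == 0) & (y_true == 1))
        if tp == 0:
            return 0.0
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        return 2 * precision * recall / (precision + recall)

    def select_epsilon(p_cv, y_cv):
        """Scan thresholds between the min and max of p(x) on the CV set, keep the best F1."""
        best_eps, best_f1 = 0.0, 0.0
        for eps in np.linspace(p_cv.min(), p_cv.max(), 1000):
            pred = (p_cv < eps).astype(int)    # y = 1 means "anomaly"
            score = f1(y_cv, pred)
            if score > best_f1:
                best_eps, best_f1 = eps, score
        return best_eps, best_f1

    # p_cv is p(x), fitted on the training set, evaluated on the CV examples;
    # y_cv holds the CV labels (1 = anomalous, 0 = normal).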

  • Anomaly detection vs. Supervised learning
    • Anomaly detection
      • Very small number of positive examples.
      • Large number of negative examples.
      • Many different “types” of anomalies. Hard for any algorithm to learn from positive examples what the anomalies look like; future anomalies may look nothing like any of the anomalous examples we’ve seen so far.
      • Examples: fraud detection, manufacturing (e.g. aircraft engines), monitoring machines in a data center.
    • Supervised learning
      • Large number of positive and negative examples.
      • Enough positive examples for the algorithm to get a sense of what positive examples are like; future positive examples are likely to be similar to ones in the training set.
      • Examples: email spam classification, weather prediction (sunny/rainy/etc.), cancer classification.
  • Choosing what features to use
    • Plot a histogram of each feature and check whether it looks roughly Gaussian; if it doesn’t, a transform such as log(x) often makes it more Gaussian (see the sketch below).

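A small illustration with matplotlib (synthetic data, not from the course): plot the histogram of a skewed feature before and after a log transform to see which looks more Gaussian.

    import numpy as np
    import matplotlib.pyplot as plt

    # Synthetic skewed feature with a long right tail, clearly non-Gaussian.
    rng = np.random.default_rng(1)
    x = rng.exponential(scale=2.0, size=5000)

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.hist(x, bins=50)
    ax1.set_title("raw feature (skewed)")
    ax2.hist(np.log(x + 1), bins=50)    # log(x + 1) keeps zero values finite
    ax2.set_title("log(x + 1), closer to Gaussian")
    plt.show()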

  • Original vs. Multivariate Gaussian model
    • The original model treats each feature independently (a product of univariate Gaussians), so it can miss anomalies that only show up as unusual combinations of features unless such combinations are added as features by hand.
    • The multivariate Gaussian models p(x) with a mean vector mu and a full covariance matrix Sigma, capturing correlations between features automatically, but it is computationally more expensive and needs m > n (more examples than features) so that Sigma is invertible.

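For reference, a minimal NumPy sketch of the multivariate Gaussian density (the function name is my own; scipy.stats.multivariate_normal computes the same quantity): fit mu and Sigma on the training data, then flag p(x) < epsilon exactly as before.

    import numpy as np

    def multivariate_gaussian(X, mu, Sigma):
        """p(x) = (2*pi)^(-n/2) * |Sigma|^(-1/2) * exp(-0.5 * (x - mu)^T Sigma^-1 (x - mu))."""
        n = mu.shape[0]
        diff = X - mu                                      # shape (m, n)
        quad = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(Sigma), diff)
        norm = (2 * np.pi) ** (-n / 2) * np.linalg.det(Sigma) ** (-0.5)
        return norm * np.exp(-0.5 * quad)

    # Fit mu and a full covariance matrix Sigma; the off-diagonal entries
    # let the model capture correlations between features.
    X_train = np.random.default_rng(2).normal(size=(500, 2))
    mu = X_train.mean(axis=0)
    Sigma = np.cov(X_train, rowvar=False)
    p = multivariate_gaussian(X_train, mu, Sigma)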
