Summary of Stanford's Machine Learning class by Andrew Ng
 Part 1
 Supervised vs. Unsupervised learning, Linear Regression, Logistic Regression, Gradient Descent
 Part 2
 Regularization, Neural Networks
 Part 3
 Debugging and Diagnostics, Machine Learning System Design
 Part 4
 Support Vector Machine, Kernels
 Part 5
 K-means algorithm, Principal Component Analysis (PCA) algorithm
 Part 6
 Anomaly detection, Multivariate Gaussian distribution
 Part 7
 Recommender Systems, Collaborative filtering algorithm, Mean normalization
 Part 8
 Stochastic gradient descent, Mini-batch gradient descent, MapReduce and data parallelism
K-means Algorithm
 Clustering
 K-means algorithm
 Place K cluster centroids at random positions (e.g., at K randomly chosen training examples)
 Assign each point to its nearest centroid, then move each centroid to the mean of the points assigned to it
 Repeat until the assignments stop changing
 *Note – If a centroid has NO POINTS assigned to it, then eliminate that cluster centroid
 If the initial centroids are chosen badly, K-means can get stuck in a “local optimum”
 To guard against bad random initialization, run the algorithm many times (e.g., 100) with different random starts and keep the clustering with the lowest cost (see the sketch below)
 Note:
 if K is small (between 2 and 10), multiple random initializations will usually find a better local optimum
 if K is large, multiple random initializations may not make much of a difference
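A minimal sketch of K-means with multiple random initializations, assuming a NumPy array X of shape (m examples, n features); the function names and structure are illustrative, not from the course materials.

```python
import numpy as np

def kmeans(X, K, n_iters=100, rng=None):
    """One run of K-means; returns (centroids, assignments, cost J)."""
    rng = rng or np.random.default_rng()
    # Initialize: place the K centroids at K randomly chosen training examples.
    centroids = X[rng.choice(len(X), size=K, replace=False)].astype(float)
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2).argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points;
        # a centroid with NO points assigned is re-seeded here (it could also be dropped).
        for k in range(K):
            pts = X[assign == k]
            centroids[k] = pts.mean(axis=0) if len(pts) else X[rng.integers(len(X))]
    # Cost (distortion) J: average squared distance of each point to its centroid.
    assign = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2).argmin(axis=1)
    cost = np.mean(np.sum((X - centroids[assign]) ** 2, axis=1))
    return centroids, assign, cost

def kmeans_best_of(X, K, n_init=100):
    """Run K-means n_init times with different random starts; keep the lowest-cost run."""
    return min((kmeans(X, K) for _ in range(n_init)), key=lambda run: run[2])
```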
 Note:
 How to choose the number of clusters K?
 Elbow method: plot the cost J for increasing values of K and pick the K at the “elbow”, where the cost stops dropping sharply
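A sketch of the elbow method, building on the kmeans_best_of helper above; matplotlib and the synthetic data are my additions for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

X = np.random.default_rng(0).normal(size=(300, 2))        # stand-in data
Ks = range(1, 11)
costs = [kmeans_best_of(X, K, n_init=20)[2] for K in Ks]   # cost J for each K

plt.plot(list(Ks), costs, marker="o")
plt.xlabel("K (number of clusters)")
plt.ylabel("cost J (distortion)")
plt.title("Elbow method")
plt.show()
```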
Dimensionality Reduction

Reduce data from 2D to 1D and 3D to 2D
Principal Component Analysis (PCA) Algorithm
 Data preprocessing: mean normalization (subtract the mean of each feature) and, if features are on different scales, feature scaling
 Algorithm
 Reduce data from n dimensions to k dimensions
 Compute the “covariance matrix”: Sigma = (1/m) * X^T * X
 Compute the “eigenvectors” of matrix Sigma (in practice via singular value decomposition, [U, S, V] = svd(Sigma)); the first k columns of U form U_reduce, and each example is projected as z = U_reduce^T * x
 Summary
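A minimal PCA sketch following the steps above (NumPy only; the variable names are mine, not the course's). X is an m x n data matrix; the function returns the k-dimensional projection plus everything needed to map new examples.

```python
import numpy as np

def pca(X, k):
    """Reduce X (m examples x n features) from n to k dimensions."""
    # Data preprocessing: mean normalization and feature scaling
    # (assumes no constant features, so the standard deviation is nonzero).
    mu = X.mean(axis=0)
    scale = X.std(axis=0)
    X_norm = (X - mu) / scale
    m = X_norm.shape[0]
    # Covariance matrix: Sigma = (1/m) * X^T * X   (n x n).
    Sigma = (X_norm.T @ X_norm) / m
    # Eigenvectors of Sigma via SVD; the columns of U are the principal components.
    U, S, _ = np.linalg.svd(Sigma)
    U_reduce = U[:, :k]            # first k eigenvectors, n x k
    Z = X_norm @ U_reduce          # projected data: z = U_reduce^T * x per example
    return Z, U_reduce, S, mu, scale
```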
 Choosing K (number of principal components)
 Typically choose k to be the smallest value such that the desired fraction of the variance is retained:
 retaining 95% to 99% of the variance is a common choice
 Increase k and keep the smallest value for which the retained variance (sum of the first k singular values of Sigma divided by the sum of all of them) reaches 99% (see the sketch below)
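A sketch of choosing k from the singular values S returned by the pca() sketch above: keep the smallest k that retains the desired fraction (e.g., 99%) of the variance.

```python
def choose_k(S, variance_retained=0.99):
    """Smallest k such that the first k singular values cover the target variance."""
    total = S.sum()
    running = 0.0
    for k, s in enumerate(S, start=1):
        running += s
        if running / total >= variance_retained:
            return k
    return len(S)
```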
Applying Principal Component Analysis (PCA)
 Supervised Learning Speedup
 Extract the inputs x (drop the labels y)
 Apply PCA to the training inputs only; the learned mapping x → z is then reused for cross-validation and test data
 Get a new, lower-dimensional training set (z, y)
 Train logistic regression (or another learning algorithm) on it, as sketched below
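A sketch of the speedup pipeline above, reusing the pca() helper; scikit-learn's LogisticRegression is my choice here (the course just says "logistic regression"), and the data is a synthetic stand-in.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 50))            # stand-in high-dimensional inputs
y_train = (X_train[:, 0] > 0).astype(int)       # stand-in labels
X_test = rng.normal(size=(50, 50))

# Extract inputs, apply PCA to the TRAINING inputs only, get the new training set.
Z_train, U_reduce, S, mu, scale = pca(X_train, k=10)
# Train logistic regression on the lower-dimensional z instead of the raw x.
clf = LogisticRegression(max_iter=1000).fit(Z_train, y_train)

# New examples are mapped with the SAME mu/scale/U_reduce learned on the training data.
Z_test = ((X_test - mu) / scale) @ U_reduce
predictions = clf.predict(Z_test)
```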
 Application of PCA
 Compression
 Reduce memory/disk needed to store data
 Speed up learning algorithm
 Visualization
 DO NOT use PCA to prevent overfitting (i.e., as a way to reduce the number of features); use regularization instead.
 Before implementing PCA, first try whatever you want to do with the original/raw data; only if that doesn't give the results you want should you implement PCA
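For the compression application, an approximate original example can be recovered from its compressed representation; a one-line sketch using the names from the pipeline above:

```python
# Decompress: x_approx = U_reduce * z, then undo the feature scaling and mean normalization.
X_approx = (Z_train @ U_reduce.T) * scale + mu   # back to the original n dimensions
```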