I am not satisfied with the unsupervised learning courses at Udacity. They are just not well organized! They feel like a random collection of the “Intro” course and the “Gatech” course, and I lost focus several times during the study.
To me, unsupervised learning is actually more important than supervised learning, because all human knowledge begins with unlabeled data. Only after humans discover the natural patterns behind phenomena do they begin to label things, accumulate knowledge, and gain further insights. Unsupervised learning is difficult to teach because, in the first place, you don’t even know whether there is a pattern to look for, let alone which features are important.
Unsupervised algorithms
- K-means clustering. Cons: bad starting points may lead to a bad local minimum (mitigated by random restarts; see the sketch after this list).
- Single-linkage clustering: treat each object as its own cluster, then repeatedly merge the two closest clusters.
- Expectation Maximization (EM): soft clustering; each point gets a probability of belonging to each cluster.
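A minimal sketch of all three algorithms with scikit-learn on synthetic blob data; the dataset and the parameter choices are illustrative, not from the course:

```python
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.mixture import GaussianMixture
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# K-means: n_init restarts from several random seeds and keeps the
# lowest-inertia run, mitigating the bad-starting-point problem.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)

# Single linkage: every point starts as its own cluster; the two
# closest clusters are merged repeatedly.
single = AgglomerativeClustering(n_clusters=3, linkage="single").fit(X)

# EM via a Gaussian mixture: soft clustering, each point gets a
# probability of belonging to each component.
gmm = GaussianMixture(n_components=3, random_state=42).fit(X)
soft_assignments = gmm.predict_proba(X)
```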
Feature selection
Selecting the best n features out of m is an NP-hard problem: there are m-choose-n candidate subsets, so exhaustive search is exponential in the number of features.
method | speed | main characteristics | implementation |
---|---|---|---|
filtering | fast | ignores the learner; no feedback from it | information gain |
wrapping | slow | takes the model's bias and learning into account | forward selection (adding), backward elimination (removing); see the sketch below |
- Relevance: how much information a feature gives the Bayes optimal classifier (no bias toward any particular learner).
- Usefulness: how much a feature reduces the error of a specific model; the model's bias helps break ties.
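A minimal sketch of the two styles from the table above, assuming scikit-learn; the iris data and the choice of 2 features are illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import (SelectKBest, mutual_info_classif,
                                       SequentialFeatureSelector)
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Filtering: score features by mutual information (an information-gain
# analogue) without consulting any learner.
filt = SelectKBest(mutual_info_classif, k=2).fit(X, y)

# Wrapping: forward selection adds features one at a time, judged by the
# cross-validated score of the actual model, so the learner's bias is
# taken into account.
wrap = SequentialFeatureSelector(
    DecisionTreeClassifier(random_state=0),
    n_features_to_select=2, direction="forward").fit(X, y)

print(filt.get_support(), wrap.get_support())  # selected-feature masks
```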
PCA
- A systematic way to transform the input features into principal components (PCs).
- Use the PCs as the new features.
- The direction of maximum variance becomes the first principal component, so as to minimize information loss.
- The PCs are mutually orthogonal, so the new features are uncorrelated.
When to use:
- latent features are driving the patterns in the data
- dimensionality reduction (humans can only read 2-D scatterplots!)
- as a data preprocessing step: PCA can be used in both supervised and unsupervised learning to reduce noise and overfitting (a minimal usage sketch follows)
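A minimal sketch of PCA as preprocessing, assuming scikit-learn; the dataset and the choice of 2 components are illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

# Keep the top 2 components (the directions of maximum variance) so the
# data can be drawn as a 2-D scatterplot.
pca = PCA(n_components=2).fit(X)
X_2d = pca.transform(X)
print(pca.explained_variance_ratio_)  # variance retained by each PC
```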
Facial recognition
http://scikit-learn.org/stable/auto_examples/applications/face_recognition.html
Detailed analysis: see my GitHub.
How many PCs should we use? (measured by F1 score because the labels are multi-class)
No. of PCs | F1 score |
---|---|
15 | 0.65 |
25 | 0.74 |
50 | 0.81 |
100 | 0.85 |
250 | 0.82 |
The score peaks around 100 PCs and then drops slightly, suggesting that the extra components mostly add noise.
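A minimal sketch of how such a sweep could be run with scikit-learn on the LFW faces data from the linked example; the SVM classifier and its settings here are assumptions, not the exact pipeline behind the numbers above:

```python
from sklearn.datasets import fetch_lfw_people
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from sklearn.svm import SVC

faces = fetch_lfw_people(min_faces_per_person=70)  # downloads on first run
X_train, X_test, y_train, y_test = train_test_split(
    faces.data, faces.target, random_state=42)

for n in [15, 25, 50, 100, 250]:
    pca = PCA(n_components=n, whiten=True).fit(X_train)
    clf = SVC(kernel="rbf", class_weight="balanced")
    clf.fit(pca.transform(X_train), y_train)
    pred = clf.predict(pca.transform(X_test))
    # Weighted F1 handles the multi-class labels.
    print(n, f1_score(y_test, pred, average="weighted"))
```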
Feature transformation
It overlaps with feature selection: selection keeps a subset of the original features, while transformation builds new features from combinations of the originals.
Independent Component Analysis (ICA): unlike PCA, which produces merely uncorrelated components, ICA looks for statistically independent components; the classic use case is blind source separation, as in the sketch below.
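A minimal sketch of blind source separation with scikit-learn's FastICA; the signals and the mixing matrix are made up for illustration:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
s1 = np.sin(2 * t)               # source 1: sinusoid
s2 = np.sign(np.sin(3 * t))      # source 2: square wave
S = np.c_[s1, s2] + 0.05 * rng.standard_normal((2000, 2))

A = np.array([[1.0, 0.5], [0.5, 2.0]])  # hypothetical mixing matrix
X = S @ A.T                              # observed mixed signals

# ICA recovers the independent sources from the mixtures (up to
# permutation and scale).
ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)
```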
Customer segments
The content has been merged into p3_Customer Segments.