I totally understand why Udacity developed the Deep Learning Foundation Nanodegree: its previous course, Intro to Deep Learning by Vincent, rushes to “sell” TensorFlow without laying a solid foundation.
However, I am still skeptical about Siraj’s rap-style lectures. The fly-in video clips are actually very distracting for learners. Well, he is trying to show that “these deep concepts are really fun”.
Luckily, the other lecturers are much more patient and guide you step by step, with carefully crafted exercises to make sure you get it. This is the solid effort that is worth the money.
I am now one capstone away from graduating from the Machine Learning Engineer Nanodegree, so I already know quite a bit. The learning notes here are therefore only complementary and fill my knowledge gaps. The most valuable things are the projects and the project feedback, which get you familiar with how things actually work.
7 Intro to Neural Network
The derivation is very nice here.
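There are many error functions; one is the squared difference between the target $y$ and the prediction $\hat{y}$:
$$E = \frac{1}{2}(y - \hat{y})^2$$
There are many activation functions; one is the sigmoid:
$$f(h) = \frac{1}{1 + e^{-h}}$$
With $\hat{y} = f(h)$ and $h = \sum_i w_i x_i$, gradient descent gives the weight update:
$$\Delta w_i = \eta \, (y - \hat{y}) \, f'(h) \, x_i$$
where $\eta$ is the learning rate.
For convenience, the middle term is defined as the error term:
$$\delta = (y - \hat{y}) \, f'(h)$$
so the weight update function becomes more universal and covers different activation functions:
$$\Delta w_i = \eta \, \delta \, x_i$$
If there is one hidden layer, the weight update for the hidden-to-output weights is similar; just replace the input $x$ with the hidden sigmoid activation $a$, and write $\delta^o$ for the error term of the output unit:
$$\Delta w_j = \eta \, \delta^o \, a_j$$
The weight update for the input-to-hidden weights needs another application of the chain rule:
$$\delta^h_j = \delta^o \, w_j \, f'(h_j), \qquad \Delta w_{ij} = \eta \, \delta^h_j \, x_i$$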
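To make the update rules above concrete, here is a minimal sketch of one gradient-descent step for a network with one hidden layer and a single sigmoid output, written in plain Python. The function name `train_step`, the toy inputs, the starting weights, and the learning rate of 0.5 are all assumptions chosen for illustration; they are not taken from the course material.

```python
import math


def sigmoid(h):
    return 1.0 / (1.0 + math.exp(-h))


def train_step(x, y, w_in_hid, w_hid_out, lr=0.5):
    """One gradient-descent step for a 1-hidden-layer net with sigmoid units.

    x: list of inputs; w_in_hid[i][j]: weight from input i to hidden unit j;
    w_hid_out[j]: weight from hidden unit j to the single output.
    Returns the updated weights and the prediction made before the update.
    """
    n_in, n_hid = len(x), len(w_hid_out)
    # Forward pass: hidden activations a_j, then the output y_hat.
    a = [sigmoid(sum(x[i] * w_in_hid[i][j] for i in range(n_in)))
         for j in range(n_hid)]
    y_hat = sigmoid(sum(a[j] * w_hid_out[j] for j in range(n_hid)))
    # Output error term: (y - y_hat) * f'(h), where f'(h) = y_hat * (1 - y_hat).
    delta_out = (y - y_hat) * y_hat * (1 - y_hat)
    # Hidden error terms: chain rule back through the hidden-to-output weights.
    delta_hid = [delta_out * w_hid_out[j] * a[j] * (1 - a[j])
                 for j in range(n_hid)]
    # Weight updates: learning rate * error term * value feeding the weight.
    new_w_hid_out = [w_hid_out[j] + lr * delta_out * a[j] for j in range(n_hid)]
    new_w_in_hid = [[w_in_hid[i][j] + lr * delta_hid[j] * x[i]
                     for j in range(n_hid)] for i in range(n_in)]
    return new_w_in_hid, new_w_hid_out, y_hat


# Hypothetical toy example: 3 inputs, 2 hidden units, one target value.
x = [0.5, -0.2, 0.1]
target = 0.6
w_in_hid = [[0.5, -0.6], [0.1, -0.2], [0.1, 0.7]]
w_hid_out = [0.1, -0.3]

predictions = []
for _ in range(200):
    w_in_hid, w_hid_out, y_hat = train_step(x, target, w_in_hid, w_hid_out)
    predictions.append(y_hat)

print("first prediction:", predictions[0])
print("last prediction: ", predictions[-1])
```

Repeating the step on the same instance should move the prediction toward the target, which is a quick sanity check that the signs of the error terms are right.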
Because the maximum derivative of the sigmoid function is 0.25, the gradient that drives the descent shrinks quickly as more hidden layers are added.
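A quick numerical check of this claim, as a rough sketch: the sigmoid derivative peaks at 0.25 when the input is 0, so the product of sigmoid derivatives along a path through n layers can be at most 0.25 to the power n. This ignores the weights, which also scale the gradient, so it only illustrates the contribution of the activation function.

```python
import math


def sigmoid(h):
    return 1.0 / (1.0 + math.exp(-h))


def sigmoid_prime(h):
    s = sigmoid(h)
    return s * (1 - s)


# The derivative is largest at h = 0.
peak = sigmoid_prime(0.0)

# Best-case factor contributed by the sigmoid derivatives alone
# when the error is propagated back through n_layers sigmoid layers.
for n_layers in (1, 2, 5, 10):
    print(n_layers, peak ** n_layers)
```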
bonus: A great video from Frank Chen about the history of deep learning. It’s a 45-minute video, sort of a short documentary, starting in the 1950s and bringing us to the current boom in deep learning and artificial intelligence.
8 Project 1: your first neural network
Note for the bike-sharing project: I spent some time figuring out the right shape for each term. The shapes are the opposite of those in the previous exercise, and each training step feeds in only one instance.
Detailed implementation: https://github.com/jychstar/NanoDegreeProject/tree/master/DeepND
Dataset documentation: https://github.com/jychstar/datasets/blob/master/bikeShare/bikeShare_DC.md.
9 model evaluation
# Classification accuracy
from sklearn.metrics import accuracy_score
# Regression metrics
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.linear_model import LinearRegression
# K-fold cross-validation
from sklearn.model_selection import KFold
# model_selection.KFold takes the number of folds; n_splits=5 is an assumed value
kf = KFold(n_splits=5, shuffle=True)
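To show what K-fold cross-validation does with the data, here is a small sketch in plain Python rather than scikit-learn. The helper names `k_fold_indices` and `mae`, the toy targets, and the mean-predictor "model" are all made up for illustration; in practice `KFold` and the metric functions imported above would play these roles.

```python
import random


def k_fold_indices(n_samples, n_splits, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is in a test fold once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
                  for i in range(n_splits)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def mae(y_true, y_pred):
    """Mean absolute error between two equally long lists."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy regression targets (made up for illustration).
y = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0, 17.0, 19.0, 21.0]

fold_errors = []
for train_idx, test_idx in k_fold_indices(len(y), n_splits=5):
    # Baseline "model": always predict the mean of the training targets.
    prediction = sum(y[i] for i in train_idx) / len(train_idx)
    fold_errors.append(mae([y[i] for i in test_idx],
                           [prediction] * len(test_idx)))

# The cross-validation score is the average of the per-fold errors.
cv_score = sum(fold_errors) / len(fold_errors)
print(len(fold_errors), cv_score)
```

Averaging over folds gives a less noisy estimate than a single train/test split, at the cost of training the model once per fold.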
- underfitting: error due to bias; the model oversimplifies the problem
- overfitting: error due to variance; the model overcomplicates the problem and memorizes the training data instead of generalizing
model complexity graph
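The idea behind a model complexity graph can be sketched with a tiny experiment. This example swaps in k-nearest-neighbour regression on made-up noisy sine data, purely as an assumption for illustration: a small k gives a very flexible model (prone to overfitting) and a large k gives a very rigid one (prone to underfitting). Comparing training and test error across k gives the points you would plot on such a graph.

```python
import math
import random

rng = random.Random(0)

# Hypothetical toy data: a sine curve plus Gaussian noise.
xs = [i * 0.1 for i in range(60)]
ys = [math.sin(x) + rng.gauss(0, 0.2) for x in xs]

# Alternate points between the training set and the test set.
train = list(zip(xs[0::2], ys[0::2]))
test = list(zip(xs[1::2], ys[1::2]))


def knn_predict(x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(data, k):
    """Mean squared error of the k-NN predictor on a list of (x, y) pairs."""
    return sum((y - knn_predict(x, k)) ** 2 for x, y in data) / len(data)


# k = 1: most complex model; k = len(train): simplest (a constant prediction).
results = {k: (mse(train, k), mse(test, k)) for k in (1, 5, len(train))}
for k, (train_err, test_err) in results.items():
    print(f"k={k:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

With k = 1 the model reproduces every training point exactly, so its training error is zero even though its test error is not; that gap between the two curves is the signature of overfitting.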
4 Apply Deep Learning
This is a fancy lesson, a collection of some interesting examples.
conda create -n style python=2
source activate style
conda install -c conda-forge tensorflow=0.11.0
conda install scipy pillow
python evaluate.py --checkpoint ./rain_princess.ckpt --in-path ./examples/content/chicago.jpg --out-path ./output_image.jpg