Update on 2017.2.16: TensorFlow 1.0
I started learning machine learning in September 2016, and recently found that some Python functions from the machine learning libraries were already deprecated. I now realize that machine learning is growing so fast that the APIs are updated non-stop to better suit real-world requirements.
Here are some notes tracking the syntax changes that affect my projects.
tf.initialize_all_variables()  # 0.11
tf.global_variables_initializer()  # 0.12
from sklearn.model_selection import validation_curve, train_test_split, GridSearchCV, KFold, cross_val_score  # 0.18
from sklearn.cross_validation import train_test_split  # 0.17
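If a project has to run on both sides of the 0.18 change, one option is to try the new import location first and fall back to the old one. This is only a minimal sketch: the toy X and y are made up for illustration, and the fallback branch only matters on installations older than 0.18.

```python
# Try the 0.18+ location first, then fall back to the pre-0.18 module.
try:
    from sklearn.model_selection import train_test_split  # scikit-learn >= 0.18
except ImportError:
    from sklearn.cross_validation import train_test_split  # scikit-learn <= 0.17

# Toy data, made up for illustration only.
X = [[i] for i in range(10)]
y = [i % 2 for i in range(10)]

# Hold out 30% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
print(len(X_train), len(X_test))
```

The rest of the code can then call train_test_split without caring which module it came from.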
model_selection is a new module, which groups several functionalities together:
- cross_val_score(svc, X, y, cv=KFold(n_splits=3), n_jobs=-1) is very convenient. It splits the data into 3 folds, then fits, predicts, and scores on each fold in one line of code, so you can easily see the variation caused by fluctuations in the data.
- A fancier way is to use validation_curve, which returns both the training score and the test score for a range of hyperparameter values, e.g.
train_scores, test_scores = validation_curve(SVC(), X, y, param_name="gamma", param_range=np.logspace(-6,-1,5), cv=10, scoring="accuracy", n_jobs=1)
- An even fancier way is to use learning_curve, which shows how the score changes with the training-set size, e.g.
train_sizes, train_scores, valid_scores = learning_curve(SVC(kernel='linear'), X, y, train_sizes=[50, 80, 110], cv=5)
- Don't forget that the older GridSearchCV is still powerful.
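To see the four tools from the list side by side, here is a rough end-to-end sketch. It rests on a few assumptions that are not in the notes above: the iris dataset stands in for X and y, the folds are shuffled (iris is sorted by class), and n_jobs is left at its default to keep the run simple. The gamma range and training sizes are the ones used in the snippets above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import (GridSearchCV, KFold, cross_val_score,
                                     learning_curve, validation_curve)
from sklearn.svm import SVC

# Assumption: the iris dataset stands in for the X, y used in the notes.
X, y = load_iris(return_X_y=True)
gammas = np.logspace(-6, -1, 5)

# 1) cross_val_score: one score per fold. shuffle=True is an addition here,
#    because iris is sorted by class and unshuffled folds would each miss one.
cv = KFold(n_splits=3, shuffle=True, random_state=0)
fold_scores = cross_val_score(SVC(), X, y, cv=cv)

# 2) validation_curve: rows are gamma values, columns are CV folds.
train_scores, test_scores = validation_curve(
    SVC(), X, y, param_name="gamma", param_range=gammas,
    cv=10, scoring="accuracy")

# 3) learning_curve: rows are training-set sizes, columns are CV folds.
train_sizes, lc_train_scores, lc_valid_scores = learning_curve(
    SVC(kernel="linear"), X, y, train_sizes=[50, 80, 110], cv=5,
    shuffle=True, random_state=0)

# 4) GridSearchCV: searches the same gamma range and keeps the best setting.
search = GridSearchCV(SVC(), param_grid={"gamma": gammas}, cv=5)
search.fit(X, y)

print("fold scores:", fold_scores)
print("mean test score per gamma:", test_scores.mean(axis=1))
print("mean validation score per size:", lc_valid_scores.mean(axis=1))
print("best params:", search.best_params_)
```

Averaging over the fold axis (axis=1) gives one number per hyperparameter value or training size, which is usually what gets plotted.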