Probability is the cornerstone of AI. Expressing uncertainty, and managing it, is the key to many, many things in AI.
naive Bayes: the pieces of evidence are assumed to be conditionally independent of each other given the class.
Three parameters are needed to determine the joint probability: P(A), P(B|A), P(B|~A).
The caveat is that B depends on A, which is not observable.
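A minimal numeric sketch (made-up probabilities, not from the lecture) of how the three parameters pin down the full joint table, and how Bayes' rule then recovers the unobservable A:

p_a = 0.1  # P(A)
p_b_given_a = 0.9  # P(B|A)
p_b_given_not_a = 0.2  # P(B|~A)

# the three parameters determine all four joint probabilities
p_joint = {
    ('A', 'B'): p_a * p_b_given_a,
    ('A', '~B'): p_a * (1 - p_b_given_a),
    ('~A', 'B'): (1 - p_a) * p_b_given_not_a,
    ('~A', '~B'): (1 - p_a) * (1 - p_b_given_not_a),
}

# Bayes' rule: posterior belief in the hidden A after observing B
p_b = p_joint[('A', 'B')] + p_joint[('~A', 'B')]
p_a_given_b = p_joint[('A', 'B')] / p_b  # = 0.09 / 0.27 = 1/3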
reading: AIMA: Chapter 13.
1st-order Markov models depend only on the state immediately preceding, not on a history of states. We don't necessarily know which state corresponds to which physical event; instead, each state can yield one or more outputs. We observe the outputs over time and infer a sequence of states based on how likely each state was to produce those outputs.
Because the base frequency may change, we compare the two signals using delta frequency and Euclidean distance.
The likelihood of a state path is a product of transition probability × output probability at each step.
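A minimal sketch (made-up two-state model, not from the lecture) of that product: the likelihood that one hypothesized state path produced an observed output sequence is one transition probability times one output probability per step:

import numpy as np

trans = np.array([[0.7, 0.3],  # trans[i, j] = P(next state j | current state i)
                  [0.4, 0.6]])
emit = np.array([[0.9, 0.1],  # emit[i, k] = P(output k | state i)
                 [0.2, 0.8]])
start = np.array([0.5, 0.5])  # P(initial state)

states = [0, 0, 1]  # a hypothesized state path
outputs = [0, 0, 1]  # the observed outputs

p = start[states[0]] * emit[states[0], outputs[0]]
for prev, cur, out in zip(states, states[1:], outputs[1:]):
    p *= trans[prev, cur] * emit[cur, out]
# summing p over all possible paths (the forward algorithm) gives the
# total likelihood of the observation sequence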
AIMA: Chapter 15.1-15.3
Rabiner’s famous tutorial: A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition [errata]
Thad Starner’s MS thesis: Visual Recognition of American Sign Language Using Hidden Markov Models [PDF]
- Please read Chapter 1 The Fundamentals of HTK (pages 3-13) in The HTK Book (version 3.4) [PDF | HTML].
AIMA: Chapter 15.4-15.6 (provides another viewpoint on HMMs with a natural extension to Kalman filters, particle filtering, and Dynamic Bayes Nets), Chapter 20.3 (hidden variables, EM algorithm)
Huang, Ariki, and Jack’s book Hidden Markov Models for Speech Recognition.
Yechiam Yemini’s slides on HMMs used in genetics (gene sequencing, decoding).
Sebastian Thrun and Peter Norvig’s AI course
Resources for Segmentally Boosted HMMs
- SBHMM project at Georgia Tech
- HMM Tool Kit (HTK)
- Gesture and Activity Recognition Toolkit (GART; formerly Georgia Tech Gesture Toolkit)
Pei Yin’s dissertation: Segmental discriminative analysis for American Sign Language recognition and verification
HMMs for Speech Synthesis
Junichi Yamagishi’s An Introduction to HMM-Based Speech Synthesis
Heiga Zen’s Deep Learning in Speech Synthesis
DeepMind’s WaveNet
project: build a sign language recognizer
In this project, you will build a system that can recognize words communicated using American Sign Language (ASL). You will be provided a preprocessed dataset of tracked hand and nose positions extracted from video. Your goal is to train a set of Hidden Markov Models (HMMs) on part of this dataset to identify individual words from test sequences.
learning notes
It is best to dig into the original asl_data.py to see how the data is wrangled. asl.df is a DataFrame with 15,746 entries and 7 columns; the rows are indexed from (98, 0) to (125, 56). Six columns come from hands_condensed.csv and one from speaker.csv.
asl.build_training() gives an asl_data.WordsData object. Each word in the training set has multiple examples from various videos.
feature selection: features_ground, features_norm, features_polar, features_delta.
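A sketch of how two of these feature sets can be derived (the raw column names like right-x and nose-x are from the project's DataFrame; treat them as assumptions if your copy differs):

# ground features: hand position relative to the nose, to factor out
# where the signer stands in the frame
asl.df['grnd-rx'] = asl.df['right-x'] - asl.df['nose-x']
asl.df['grnd-ry'] = asl.df['right-y'] - asl.df['nose-y']
asl.df['grnd-lx'] = asl.df['left-x'] - asl.df['nose-x']
asl.df['grnd-ly'] = asl.df['left-y'] - asl.df['nose-y']
features_ground = ['grnd-rx', 'grnd-ry', 'grnd-lx', 'grnd-ly']

# delta features: frame-to-frame differences of the raw coordinates
raw = {'rx': 'right-x', 'ry': 'right-y', 'lx': 'left-x', 'ly': 'left-y'}
for k, col in raw.items():
    asl.df['delta-' + k] = asl.df[col].diff().fillna(0)
features_delta = ['delta-rx', 'delta-ry', 'delta-lx', 'delta-ly']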
The base model is GaussianHMM: https://hmmlearn.readthedocs.io/
Each model is trained on a single word. The hyperparameter we are mostly interested in is the number of hidden states.
from hmmlearn.hmm import GaussianHMM

training = asl.build_training(features)  # asl_data.WordsData object
X, lengths = training.get_word_Xlengths(word)  # concatenated sequences and per-sequence lengths
model = GaussianHMM(n_components=num_hidden_states, n_iter=1000).fit(X, lengths)
logL = model.score(X, lengths)  # log likelihood of the data under the fitted model
Note that in the example above the model is scored on the same data it was trained on, i.e. the training set doubles as the test set.
Submission
Once you have completed the project and met all the requirements set in the rubric (see below), please save the notebook as an HTML file. You can do this by going to the File menu in the notebook and choosing “Download as” > HTML. Submit the following files (and only these files) in a .zip archive:
asl_recognizer.ipynb
asl_recognizer.html
my_model_selectors.py
my_recognizer.py
The goal is to get at least 40% correct, i.e. 72 of 178 test words.
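The recognition step behind the numbers in the table below can be sketched as follows (models is a hypothetical dict mapping each word to its trained GaussianHMM, and test_set is the project's asl_data.SinglesData object; adjust names to your code):

probabilities, guesses = [], []
for X, lengths in test_set.get_all_Xlengths().values():
    scores = {}
    for word, model in models.items():
        try:
            scores[word] = model.score(X, lengths)  # logL of this sequence under this word's model
        except Exception:
            scores[word] = float('-inf')  # some models fail on some sequences
    probabilities.append(scores)
    guesses.append(max(scores, key=scores.get))  # best-scoring word is the guess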
features | selector | correct/178 | time
---|---|---|---
ground | SelectorConstant | 59 | 24
ground | SelectorCV | 59 | 82
ground | SelectorBIC | 67 | 63
ground | SelectorDIC | 71 | 174
polar | SelectorBIC | 69 | 64
polar | SelectorDIC | 75 | 187
delta | SelectorDIC | 63 | 176
norm | SelectorDIC | 68 | 200
norm | SelectorBIC | 67 | 66
norm | SelectorCV | 67 | 81
norm-delta | SelectorBIC | 78 | 71
norm-delta | SelectorDIC | 78 | 201
number of free parameters
In this project, however, we use “diag” covariances in the hmmlearn model, and we do not fix the starting probabilities, so they are estimated too. If we let m = num_components and f = num_features…
The free parameters are a sum of:
- the free transition probabilities: the transmat matrix has m rows that each sum to 1, so the last entry of each row is determined, giving m*(m-1)
- the free starting probabilities: startprob sums to 1.0, so its last entry is determined, giving m-1
- the means: one per state per feature, giving m*f
- the covariances: the size of the covars matrix, which for “diag” is m*f

which equals m*(m-1) + (m-1) + m*f + m*f = m^2 + 2*m*f - 1
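A sketch (assuming hmmlearn's GaussianHMM with covariance_type="diag", as above) of turning this parameter count into a BIC score, where a lower score indicates a better size/fit trade-off:

import numpy as np

def bic_score(model, X, lengths):
    logL = model.score(X, lengths)
    m = model.n_components
    f = X.shape[1]  # number of features
    p = m ** 2 + 2 * m * f - 1  # free parameters, as derived above
    n = len(X)  # number of data points
    return -2 * logL + p * np.log(n)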