
Saturday, August 26, 2017

Artificial Intelligence A3, planning

Note: Due to time constraints, I have to pause AIND at the planning section. If I have time in the future, I may resume my study with the following:
  1. pac-man lab
  2. simulated annealing lab
  3. airport transportation planning project.

Peter Norvig: AI is all about figuring out what to do when you don’t know what to do. Regular programming is writing instructions to make the computer do what you want when you do know what to do. In AI, it’s for when you don’t. Search is one of the key areas where we can figure out what to do by planning a sequence of steps, even when we have no idea what the first step should be until we solve the search problem.
  • breadth first search (shortest)
  • uniform cost search (cheapest)
  • depth first search (least storage space, fastest computing time)
A* search: also called best estimated total path cost first; always expand the node with the minimum value of f = g + h
g(path) = path cost so far
h(path) = h(s) = estimated distance to goal, e.g. straight-line distance
g keeps the path short, h keeps us focused on the goal.
A* search finds the lowest-cost path if h(s) ≤ the true cost, i.e. h is optimistic (admissible).
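A minimal sketch of A* on a toy graph; the graph and heuristic values below are made up for illustration, with h chosen to be optimistic (it never overestimates the true remaining cost):

```python
import heapq

def a_star(graph, h, start, goal):
    """Best-first search expanding the frontier node with minimum f = g + h."""
    frontier = [(h[start], 0, start, [start])]   # (f, g, node, path)
    explored = set()
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if node in explored:
            continue
        explored.add(node)
        for nbr, step in graph.get(node, []):
            heapq.heappush(frontier,
                           (g + step + h[nbr], g + step, nbr, path + [nbr]))
    return None, float("inf")

# toy graph (edge costs) and an optimistic straight-line-style heuristic
graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 2), ("D", 5)], "C": [("D", 1)]}
h = {"A": 3, "B": 2, "C": 1, "D": 0}
path, cost = a_star(graph, h, "A", "D")
```

Here A* returns the path A→B→C→D with cost 4, preferring the cheap detour over the direct but costlier edges.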
A problem solver that uses a heuristic function given to it really is not intelligent at all.
crossing out parts of the rules is called generating a relaxed problem.
searching is one of the most fundamental methods of solving problems in AI.
Peter: To me, AI is programming a computer to do the right thing when you don’t know for sure what the right thing is.
The book was successful because of a fortuitous accident of timing. We happened to come along just as AI was making the transition from Boolean logic to probability and from handcrafted knowledge to machine learning.

lab: pac man

simulated annealing

do the stupid thing first, and only add intelligence when necessary.
global optimum: an element of randomness is what you need
  • random restart
  • genetic algorithms
  • simulated annealing
tabu (taboo) search: a move is tabu if it takes you somewhere you have already been.
stage algorithm: keep a list of all local maxima to get the general shape of the problem, then try to predict where a new maximum might be in the areas not yet explored.
local beam search / stochastic beam search: keep track of k particles. At each time frame, we randomly generate neighbors of these particles and keep the best ones for the next iteration. The particles share information with each other.
genetic algorithm is a fancy version of beam search that happens to make a nice analogy to biology. Some people call it the 2nd best solution to any problem.
We’ll see these ideas repeatedly in AI, in techniques like particle filters, pattern recognition with hidden Markov models, and Markov chain Monte Carlo. Sometimes letting go and embracing randomness is the key to the solution.

lab

Follow the instructions to implement Simulated Annealing from AIMA
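A sketch of the AIMA-style algorithm on a made-up 1-D toy problem; the cooling schedule, neighbor function, and value function here are illustrative assumptions, not the lab’s actual code:

```python
import math, random

def simulated_annealing(initial, neighbors, value, schedule, seed=0):
    """Hill-climb, but accept a downhill move with probability
    exp(dE / T); T cools over time so late search becomes greedy."""
    rng = random.Random(seed)
    current = initial
    for t in range(1, 100000):
        T = schedule(t)
        if T < 1e-9:                     # effectively frozen: stop
            break
        nxt = rng.choice(neighbors(current))
        dE = value(nxt) - value(current)
        if dE > 0 or rng.random() < math.exp(dE / T):
            current = nxt
    return current

# toy 1-D landscape: maximize -(x - 3)^2 on the integers 0..6
value = lambda x: -(x - 3) ** 2
neighbors = lambda x: [max(0, x - 1), min(6, x + 1)]
best = simulated_annealing(0, neighbors, value, schedule=lambda t: 0.9 ** t)
```

Early on, high T lets the walk wander (the element of randomness); as T decays the acceptance test degenerates into pure hill climbing.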

constraint satisfaction

techniques:
  • backtracking
  • forward checking
  • minimum remaining values
  • least constraining value
problems with preference constraints are called constraint optimization problems and are often solved with linear programming.
Examples:
  • car assembly
  • job scheduling in factories
  • class scheduling
  • spreadsheets
  • transportation scheduling
  • floor planning
Constraint Satisfaction Problems by solving the N-queens problem using symbolic constraints and backtracking search: https://github.com/udacity/AIND-Constraint_Satisfaction
complete the constraints for the N-Queens CSP and implement Backtracking-Search from AIMA (chapter 6)
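As a minimal sketch of backtracking search on N-queens in plain Python (not the repo’s symbolic-constraint version; `backtracking_nqueens` is a hypothetical helper):

```python
def backtracking_nqueens(n):
    """Backtracking search: place one queen per column; a partial
    assignment is extended only if it violates no constraint."""
    def consistent(rows, col, row):
        # no shared row, no shared diagonal with any earlier column
        return all(r != row and abs(r - row) != col - c
                   for c, r in enumerate(rows))

    def backtrack(rows):
        col = len(rows)
        if col == n:
            return rows
        for row in range(n):
            if consistent(rows, col, row):
                solution = backtrack(rows + [row])
                if solution is not None:
                    return solution
        return None                      # dead end: undo and try next row

    return backtrack([])

solution = backtracking_nqueens(8)       # row index of the queen in each column
```

The consistency check plays the role of forward checking in miniature: inconsistent extensions are pruned before recursing.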

logic

AI started in the 50s–70s with the idea that what we needed was a better language and systems for writing down logical statements. Once we had written down enough of them, we should be able to do almost anything. There was some progress, but 2 major issues:
  1. the world is uncertain. We didn’t get good algorithms for dealing with probability until the mid 1980s.
  2. it is too laborious to write down everything by hand.
Today, logic is still useful in many domains where we can eliminate much of the uncertainty of real world and solve significant problems. Like FedEx planning all the deliveries of its packages.
Expert system.
future of planning algorithms:
  • Learning from examples would be an important area.
  • transfer learning across domains
  • interactive planning
  • explainable, adaptable.
value iteration is a key concept for Markov decision processes.
  • first order logic: relation , object,function
  • propositional logic: facts
  • probability theory
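The value iteration mentioned above can be sketched with the Bellman update on a made-up two-state MDP (states, transitions, and rewards below are illustrative assumptions):

```python
def value_iteration(states, actions, P, R, gamma=0.9, eps=1e-6):
    """Bellman update: V(s) <- max_a sum_s' P(s'|s,a) (R(s,a,s') + gamma V(s'))."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(sum(p * (R[(s, a, s2)] + gamma * V[s2])
                           for s2, p in P[(s, a)].items())
                       for a in actions(s))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

# two states: taking "go" in A reaches B and pays reward 1; B is absorbing
P = {("A", "stay"): {"A": 1.0}, ("A", "go"): {"B": 1.0},
     ("B", "stay"): {"B": 1.0}, ("B", "go"): {"B": 1.0}}
R = {("A", "stay", "A"): 0.0, ("A", "go", "B"): 1.0,
     ("B", "stay", "B"): 0.0, ("B", "go", "B"): 0.0}
V = value_iteration(["A", "B"], lambda s: ["stay", "go"], P, R)
```

The values converge to V(A) = 1 (take "go" once) and V(B) = 0 (no reward is reachable from B).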

planning

Blindfold Hiker Test: We can’t just plan ahead and come up with a whole plan. We need feedback from environment. We have to interweave planning and execution:
  • environment is stochastic.
  • multiagent
  • partial observability
  • unknown/incomplete knowledge
  • hierarchical
When we start, we only have the high-level parts of the plan; then during execution we schedule the rest of the low-level parts of the plan.
sensorless planning in a deterministic world.
observation updates our belief state. Observations alone can’t create new states, but they can partition the belief state into pieces.

project

implement my_air_cargo_problems.py and test it with python run_search.py -p 1 -s 5, in which 5 is uniform cost search / best_first_graph_search.
my_planning_graph.py
hierarchy structure of code:
void main (vector<int> p, vector<int> s){
  void run_search(Problem p, Search_func(),para){
    void show_solution()
  }
}
FluentState(vector<Expr>,vector<Expr>)

string encode_state(FluentState fs,vector<Expr> fluent_map){}  // state 

AirCargoProblem(vector<string> cargos, vector<string> planes, vector<string> airports, FluentState init, vector<Expr> goal){
  vector<Action> get_action()
}

Action(Expr action,array<vector<Expr>,2> precond, array<vector<Expr>,2> effect)
-> represents the return type, : represents the input type.
expr('At(C1, SFO)') is an Expr object.
precondition is only a very small part of the whole initial state.
what next?:
  • try a single Action and Expr, see how it works
  • try to understand various planning algorithms
files required for udacity submission:
  • my_air_cargo_problems.py
  • my_planning_graph.py
  • heuristic_analysis.pdf (if present)
  • research_review.pdf (one-page report on three of these developments, highlighting the relationships between the developments and their impact on the field of AI as a whole. Appropriate sources (such as books or magazine or journal articles) should be cited. There’s a lot of resources at the end of AIMA chapter 10 )

AIMA

codes: aima.cs.berkeley.edu

AIMA chapter 2, Intelligent Agents

In a sense, all areas of engineering can be seen as designing artifacts that interact with the world; AI operates at the most interesting end of the spectrum, where the artifacts have significant computational resources and the task environment requires nontrivial decision making.
As a general rule, it is better to design performance measures according to what one actually wants in the environment, rather than according to how one thinks the agent should behave.
which is better— a reckless life of highs and lows, or a safe but humdrum existence?
rational agent: given percept sequence and built-in knowledge it has, select an action that is expected to maximize its performance measure.
Rationality maximizes expected performance, while perfection maximizes actual performance. The latter is impossible unless you have a crystal ball or time machine.
autonomous: an agent relies not only on the prior knowledge of its designer; it can use its own percepts to learn and compensate for partial or incorrect prior knowledge.
Task environment: performance, environment, actuators, sensors.
the “percept, action” table-driven approach is doomed to failure due to the super large number of entries, say, 10^{250B}, much larger than 10^{80} (the number of atoms in the observable universe).
4 kinds of agent programs:
  1. Simple reflex agents: condition-action rule
  2. Model-based. handle partial observability: maintain internal state. The model of “how the world works”: how the world evolves independently of the agent, how the agent’s own actions affect the world. Uncertainty about the current state is unavoidable, but the agent still has to make a decision—best guess.
  3. Goal-based: external performance measure, crude binary distinction between happy and unhappy states.
  4. Utility-based: use an internal utility function to measure goals in greater detail. If there are conflicting goals (e.g. speed and safety), the utility function specifies the appropriate tradeoff. Under uncertainty, it provides a way in which the likelihood of success can be weighed against the importance of the goals.
Any rational agent must behave as if it possesses a utility function whose expected value it tries to maximize.
learning agent:
  1. performance element: responsible for selecting external actions.
  2. learning element: responsible for making improvements
  3. critic: a fixed performance standard. It can be thought of as outside the agent because the agent must not modify it to fit its own standard.
  4. problem generator: suggests new exploratory actions to discover much better actions in the long run.
state representations:
  • atomic: the inside is a black box with no internal structure. Algorithms: search, game playing, hidden Markov models, Markov decision processes.
  • factored: splits up each state into a fixed set of key-value pairs. Algorithms: constraint satisfaction, propositional logic, planning, Bayesian networks, machine learning.
  • structured: has things in it that are related to each other, like a class object. Algorithms: relational databases, first-order logic, first-order probability models, knowledge-based learning, natural language processing.
A more expressive representation is much more concise.

AIMA chapter 10, classical planning

fully observable, deterministic, static environments with single agents
factored representation.
a fluent is a condition that can change over time. In logical approaches to reasoning about actions, fluents can be represented in first-order logic by predicates having an argument that depends on time.
To describe the planning problem concisely, we use PDDL (Planning Domain Definition Language):
  • initial state: a conjunction of fluents that are ground, functionless atoms.
  • actions: a set of action schemas that consists of action name, a list of all the variables, a precondition and an effect.
  • result
  • goal test
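As a sketch, the classic Fly action schema from AIMA’s air-cargo domain can be encoded with plain Python sets; the string-based fluent encoding below is an illustrative assumption, not the project’s representation:

```python
def fly(p, frm, to):
    """Ground instance of the Fly(p, from, to) action schema from the
    AIMA air-cargo domain (fluents encoded as plain strings)."""
    return {
        "precond": {f"At({p}, {frm})", f"Plane({p})",
                    f"Airport({frm})", f"Airport({to})"},
        "add":    {f"At({p}, {to})"},
        "delete": {f"At({p}, {frm})"},
    }

def result(state, action):
    # an action is applicable when its preconditions hold in the state;
    # the result removes the delete list and adds the add list
    assert action["precond"] <= state, "preconditions not satisfied"
    return (state - action["delete"]) | action["add"]

init = {"At(P1, SFO)", "Plane(P1)", "Airport(SFO)", "Airport(JFK)"}
new_state = result(init, fly("P1", "SFO", "JFK"))
```

Flying P1 from SFO to JFK deletes At(P1, SFO) and adds At(P1, JFK); all other fluents carry over unchanged.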

Friday, August 25, 2017

Artificial Intelligence A4, Hidden Markov Model

Probability is the cornerstone of AI. Expressing uncertainty and managing uncertainty is the key to many, many things in AI.
naive Bayes: the effects are independent of each other given the cause.
Need 3 parameters to determine the joint probability: P(A), P(B|A), P(B|¬A).
The caveat is that B depends on the unobservable A.
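With made-up numbers, those 3 parameters determine everything else via total probability and Bayes’ rule:

```python
# hypothetical numbers: a rare cause A and a noisy observation B
p_a = 0.1
p_b_given_a = 0.9
p_b_given_not_a = 0.2

# P(A), P(B|A), P(B|~A) fix the full joint over (A, B)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)  # total probability
p_a_given_b = p_b_given_a * p_a / p_b                  # Bayes' rule
```

Here p_b = 0.27 and p_a_given_b = 1/3, so observing B more than triples the belief in the unobservable A.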
reading: AIMA: Chapter 13.
1st order Markov models only depend on the state immediately preceding them and not a history of states. We don’t necessarily know which state matches which physical event. Instead, each state can yield one or more outputs. We observe the output over time and determine a sequence of states, based on how likely they were to produce the output.
Because the base frequency may change, we use delta frequency and euclidean distance to compare the 2 signals.
transition probability x output probability.
AIMA: Chapter 15.1-15.3
  • Please read Chapter 1 The Fundamentals of HTK (pages 3-13) in The HTK Book (version 3.4) [PDF | HTML].
AIMA: Chapter 15.4-15.6 (provides another viewpoint on HMMs with a natural extension to Kalman filters, particle filtering, and Dynamic Bayes Nets), Chapter 20.3 (hidden variables, EM algorithm)
Huang, Ariki, and Jack’s book Hidden Markov Models for Speech Recognition.
Yechiam Yemini’s slides on HMMs used in genetics (gene sequencing, decoding).
Sebastian Thrun and Peter Norvig’s AI course:
Resources for Segmentally Boosted HMMs
HMMs for Speech Synthesis
DeepMind’s WaveNet

project: build a sign language recognizer

In this project, you will build a system that can recognize words communicated using the American Sign Language (ASL). You will be provided a preprocessed dataset of tracked hand and nose positions extracted from video. Your goal would be to train a set of Hidden Markov Models (HMMs) using part of this dataset to try and identify individual words from test sequences.

learning notes

It is best to dig into the original asl_data.py to see how the data is wrangled.
asl.df is a dataframe with ~15,746 entries and 7 columns. The row index runs from (98,0) to (125,56). 6 columns are from hands_condensed.csv, 1 column is from speaker.csv.
asl.build_training() gives an asl_data.WordsData object. Each word in the training set has multiple examples from various videos.
feature selection: features_ground, features_norm, features_polar, features_delta.
The base model is GaussianHMM: https://hmmlearn.readthedocs.io/
Each time the model is trained on a single word. We are mostly interested in the number of hidden states.
from hmmlearn.hmm import GaussianHMM  # base model
training = asl.build_training(features)  # WordsData object
X, lengths = training.get_word_Xlengths(word)
model = GaussianHMM(n_components=num_hidden_states, n_iter=1000).fit(X, lengths)
logL = model.score(X, lengths)
Note that the training set and test set are the same in the above example.

Submission

Once you have completed the project and met all the requirements set in the rubric (see below), please save the notebook as an HTML file. You can do this by going to the File menu in the notebook and choosing “Download as” > HTML. Submit the following files (and only these files) in a .zip archive:
  • asl_recognizer.ipynb
  • asl_recognizer.html
  • my_model_selectors.py
  • my_recognizer.py
the goal is to get 40% correct, or 72 of 178.
features     selector          correct/178  time
ground       SelectorConstant  59           24
ground       SelectorCV        59           82
ground       SelectorBIC       67           63
ground       SelectorDIC       71           174
polar        SelectorBIC       69           64
polar        SelectorDIC       75           187
delta        SelectorDIC       63           176
norm         SelectorDIC       68           200
norm         SelectorBIC       67           66
norm         SelectorCV        67           81
norm-delta   SelectorBIC       78           71
norm-delta   SelectorDIC       78           201

number of free params

In this project, however, we are using the “diag” covariance type in the hmmlearn model and we are not specifying starting probabilities. Therefore, if we say that m = num_components and f = num_features…
The free parameters are a sum of:
  • free transition probabilities: the size of the transmat matrix less one entry per row, because each row adds up to 1 and its final entry is therefore determined, so m*(m-1)
  • free starting probabilities: the size of startprob minus 1, because it adds up to 1.0 and the last one can be calculated, so m-1
  • number of means: m*f
  • number of covariances, the size of the covars matrix, which for “diag” is m*f
which equals: m*(m-1) + (m-1) + m*f + m*f = m^2 + 2*m*f - 1
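A quick sanity check of the formula (`hmm_free_params` is a hypothetical helper, not one of the submission files):

```python
def hmm_free_params(m, f):
    """Free parameters of a GaussianHMM with 'diag' covariance:
    transitions m*(m-1) + start probs (m-1) + means m*f + covars m*f."""
    return m * (m - 1) + (m - 1) + m * f + m * f

# algebraically identical to the closed form m**2 + 2*m*f - 1
```

For example, m = 3 hidden states over f = 4 features gives 32 free parameters.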

Monday, July 10, 2017

Artificial intelligence ND A2, isolation



15 Lessons, including 4 Projects:
  1. sudoku game
  2. isolation game: due in 2017.5.15
  3. planning: due in 2017.6.12
  4. hidden Markov models: due in 2017.7.10
This project 2 is much more difficult than I originally expected. But once I accomplished it, I felt so great. I really enjoy the beauty of the minimax algorithm and alpha-beta pruning!

5 Intro to AI

Examples of AI application:
  • Navigation
  • tic tac toe game
  • Monty Hall problem: prior vs posterior probability
We often tend to use the word intelligence to describe things that we don’t know how to explain. Once we can explain them, we just say that they’re algorithms or formulas. They’re only intelligent if we can’t explain them.
We consider ourselves intelligent because right now we don’t fully understand how our brains work.
Equating intelligence with human behavior will pretty much rule out all artificial systems. Besides, even humans are not great at all kinds of tasks.
Intelligence should be defined within the context of a task.
  • agent (only the software part)
  • environment (including the physical body of the agent: sensors, motors)
cognition is perception and actions.
Environment states:
  • fully observable vs partially observable
  • deterministic vs stochastic
  • discrete vs continuous
  • benign vs adversarial
Bounded Optimality. An intelligent agent is one that takes actions to maximize its expected utility given the desired goal. Given these constraints, we can not expect an agent to always behave optimally. But we can come up with a level of performance or bound that we desire the agent to meet.

6 Intro to Game playing

Course teachers:
  • Peter Norvig: search and logical planning
  • Sebastian: probability and Bayes nets
  • Thad Starner: the rest
Main topics:
  • Adversarial search
  • minimax algorithm
  • alpha-beta pruning
  • evaluation function
  • isolation game
  • multi-player, probabilistic

7 isolation game

depth limit search.
  • limit time to 2 seconds per move. If the computer speed is 2 GHz and the average branching factor is 8, then we have a depth limit of 9.
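The arithmetic behind that depth limit can be sketched as follows, assuming (hypothetically) about 10 machine cycles per node visited — the lesson does not state the per-node cost, so this constant is made up to make the numbers concrete:

```python
cycles = 2 * 10 ** 9 * 2    # 2 GHz machine, 2 seconds per move
budget = cycles // 10       # assume ~10 cycles per node (made-up constant)
b = 8                       # average branching factor
depth, total = 0, 1         # total nodes in a complete tree of this depth
while total + b ** (depth + 1) <= budget:
    depth += 1
    total += b ** depth
```

Under these assumptions the loop stops at depth 9; since the true per-node cost is unknown, treat 9 as an order-of-magnitude estimate.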
Prof.: I always like to test my assumptions because they are often wrong.
quiescent search
24-27: search depth=3. The first moves on the edge have the best score.
28-29: search depth=2. The first moves on the center have the best score.
A state of quiescent: the recommended branches are not changing much.
iterative deepening means the computer always has an answer ready in case it runs out of time, and it can search as far as possible within its time constraints.
It doesn’t waste that much time: because of the exponential nature of the problem, the total time is dominated by the last level of the search.
  • If we limit the total time a player can take for the entire game, a player will want to search deeper in some parts of the game and shallower in other parts of the game.
  • A strategy for how deep we want to search: standard initial moves for the beginning of the game, then search deeper in the middle, and use less time toward the end.
  • Horizon effect: the computer may not see deep enough and settles for a local optimum.
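The claim that the last level dominates can be checked numerically (b = 8 is the branching factor from above, D = 9 the final depth):

```python
b, D = 8, 9                          # branching factor, final search depth
last_level = b ** D                  # nodes at the deepest level alone
all_levels = sum(b ** d for d in range(D + 1))  # iterative deepening total
overhead = all_levels / last_level   # ratio of total work to last-level work
```

The ratio comes out to about b/(b−1) ≈ 8/7 ≈ 1.14, so re-searching every shallower level costs only ~14% extra work.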
change the evaluation function: number of my moves − 2 × number of opponent moves
Shelly: the only way to really know is to try lots of variants of evaluation functions and see which ones are the best.
The strategy for an adversarial game is the minimax algorithm, which has a run time of branch^{depth}.
Alpha-Beta Pruning:
  • our goal is to play the game, not keep a list of all the equally good moves on a given level. It means we don’t need to look at most of the nodes.
  • run time is branch^{depth/2} if the nodes are ordered optimally with the best moves being first, or branch^{0.75depth} for random ordering.
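A compact sketch of minimax with alpha-beta pruning over a generic game interface; `moves`, `result`, and `score` are assumed callbacks for this illustration, not the project’s actual API:

```python
def alphabeta(state, depth, alpha, beta, maximizing, moves, result, score):
    """Minimax with alpha-beta pruning; prunes branches that cannot
    change the decision at an ancestor node."""
    legal = list(moves(state))
    if depth == 0 or not legal:
        return score(state)
    if maximizing:
        value = float("-inf")
        for m in legal:
            value = max(value, alphabeta(result(state, m), depth - 1,
                                         alpha, beta, False, moves, result, score))
            alpha = max(alpha, value)
            if alpha >= beta:       # MIN above will never let us get here
                break
        return value
    value = float("inf")
    for m in legal:
        value = min(value, alphabeta(result(state, m), depth - 1,
                                     alpha, beta, True, moves, result, score))
        beta = min(beta, value)
        if beta <= alpha:           # MAX above already has something better
            break
    return value

# toy game tree as nested lists; leaves are utilities for MAX
tree = [[3, 5], [2, 9]]
root_value = alphabeta(tree, 2,
                       float("-inf"), float("inf"), True,
                       moves=lambda s: range(len(s)) if isinstance(s, list) else [],
                       result=lambda s, m: s[m],
                       score=lambda s: s)
```

MAX picks the left branch here: min(3, 5) = 3 beats min(2, 9) = 2, and the 9 leaf is pruned without ever being examined.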
Many problems in artificial intelligence are exponential in time or space. In fact, NP-hard problems are so common in AI that researchers joke that AI is the study of finding clever hacks to exponential problems. Once a solution is found or computers become fast enough to address a particular problem, the world no longer thinks the problem belongs to artificial intelligence.
AI consists of all the NP hard problems that have not yet been solved.
What is exciting about AI is you get a chance to work on everyday problems that can improve people’s lives.
searching complex games: reading AIMA 5.3-5.4
tricks for 5x5 isolation:
  • the 2nd player can always win
  • the 1st player’s best strategy is to play the center and then play reflections
  • there are 8 positions where you can’t play reflection.

project

starting code breakdown:
isolation.py: 1 class Board, 18 class methods.
Note the board index is numbered vertically.
class Board(Player p1,Player p2, int width=7, int height=7)
    vector<mix> _board_state(width*height+3,0); // reserve 3 for player state
bool is_winner(Player p)
    return (p == inactive_player && active player has no legal move);
bool is_loser(Player p)
    return (p==active player && has no legal move);
vector<array<int,2>> get_legal_moves(Player p)
    return __get_moves(get_player_location(p));
array<int,2> get_player_location(Player p)
  int index = ...; //identify player and the move
  return (index%height, index/height);
vector<array<int,2>> __get_moves(array<int,2> loc)
  // propose 8 positions for L-shaped motion, like a knight
  // only return the legal one
bool move_is_legal(array<int,2> move)
  int idx = move[0] + move[1] * height;
   return (0 <= move[0] < height && 0 <= move[1] < width && _board_state[idx] == None);
void apply_move(array<int,2>)
  int idx = move[0] + move[1] * height;
 // add board state, who's last player, last player's move
// swap active_player and inactive player
tuple<Player,vector<array<int,2>>,string> play(time_limit=150)
    curr_move=_active_player.get_move(game, time_left);
    move_end = time_left();
    if move_end < 0:
      return _inactive_player, move_history, "timeout";
    if curr_move not in legal_player_moves:
      if len(legal_player_moves) > 0:
          return _inactive_player, move_history, "forfeit";
      return _inactive_player, move_history, "illegal move";
sample_player.py: 4 methods, 3 classes (RandomPlayer, GreedyPlayer, HumanPlayer), 1 main method
float null_score(Board game, Player p)
    if (p win) return float("inf");
    else if (lose) return float("-inf"); 
    else return 0.;
float open_move_score(Board game, Player p)
  return game.get_legal_moves(p).size();
float improved_score(Board game, Player p)
  int own_moves = game.get_legal_moves(p).size();
  int opp_moves = (p == _active_player)
      ? game.get_legal_moves(_inactive_player).size()
      : game.get_legal_moves(_active_player).size();
  return float(own_moves - opp_moves);
float center_score(Board game, Player p)
    float w = game.width / 2., h = game.height / 2.;
    auto [y, x] = game.get_player_location(p);
    return float((h - y)**2 + (w - x)**2);
class RandomPlayer()
  array<int,2> get_move(Board game, Player p)
    auto legal_moves = game.get_legal_moves();
    if (legal_moves.size()==0) return make_tuple(-1,-1);
    return legal_moves[randint(0, len(legal_moves) - 1)];
class GreedyPlayer()
  array<int,2> get_move(Board game, Player p)
       legal_moves = game.get_legal_moves()
        _, move = max([(self.score(game.forecast_move(m), self), m) for m in legal_moves]) // ? 
        return move;
class HumanPlayer()
  array<int,2> get_move(Board game, Player p)
    legal_moves = game.get_legal_moves()
    // print board state and possible move, manually choose
int main()
    //RandomPlayer and GreedyPlayer play against each other
tournament.py: 3 methods, 1 main method
Agent = namedtuple("Agent", ["player", "name"])
int main()
  array<Agent,4> test_agents;
  array<Agent,7> cpu_agents;
  play_matches(cpu_agents,test_agents,5);
    return 0;
array<int,2> play_round(cpu_agent, test_agents, map<Agent,int> wins, int num_matches)
     // each round is num_matches*2*len(test_agents) games  
     return {timeout_count, forfeit_count};                  
int update(map total_wins, map new_wins)
    // absorb new wins into total wins
      return total_wins; // no need to return if use &
void play_matches(cpu_agents, test_agents, num_matches)
game_agent.py: 3 custom_score methods, 3 classes: IsolationPlayer, MinimaxPlayer, AlphaBetaPlayer.
Notes:
  • dirty jobs are done by get_move of individual class.
  • RandomPlayer and GreedyPlayer from sample_player.py only deal with the first level search, so there is no timeout issue.
  • For depth-limited search, we have to check the time before going to the next level. TIME_LIMIT_MILLIS is set to 150 in isolation.py. time_left is a lambda defined in the Board.play method and passed into Agent.get_move. In the parent class IsolationPlayer, TIMER_THRESHOLD is set to 10, so an answer should be returned once time_left() falls below this value.
function nesting relations:
play_matches(cpu_agents, test_agents, NUM_MATCHES):
    play_round(cpu_agent, test_agents, win, NUM_MATCHES):
        game.play(time_limit=TIME_LIMIT):
            game._active_player.get_move():
                player.minimax(game, depth):
                    game.get_legal_moves():
                        game.score(game.forecast_move(m))