Note: Due to time constraint, I have to pause AIND at the planning section. If I have time in the future, I may restore my study in the following:
- pac-man lab
- simulated annealing lab
- airport transportation planning project.
Peter Norvig: AI is all about figuring out what to do when you don’t know what to do. Regular programming is writing instructions to make the computer do what you want when you do know what to do. In AI, it’s for when you don’t know. search is one of key areas where we can figure out what to do by planning a sequence of steps, even when we have no idea what the first step should be until we solve the search problem.
Rubik’s Cube: God’s Number is 26 in the Quarter-Turn Metric.
- breadth first search (shortest)
- uniform cost search (cheapest)
- depth first search (least storage space, fastest computing time)
A* search
Also called best estimated total path cost first
only expanse the minimum value. f = g+h
g(path) = path cost so far
h(path) = h(s) = estimated distance to goal = straigh line distance
g is to keep the path short, h is to keep us focus on the goal.
A search finds the lowest cost path if *h(s) < true cost. h is optimistic.
A problem solver that uses a heuristic function given to it really is not intelligent at all.
crossing out parts of the rules is called generating a relaxed problem.
searching is one of the most fundamental methods of solving problems in AI.
Peter: To me, AI is programming a computer to do the right thing when you don’t know for sure the right thing is.
The book was successful because of a fortuitous accident of timing. we happened to come along just as AI was making this transition from Boolean logic to probability and from handcarfted knowledge to machine learning.
lab: pac man
simulated annealing
do the stupid thing first, and only add intelligence when necessary.
global optimum: an element of randomness is what you need
- random restart
- genetic algorithms
- simulated annealing
taboo search: it is taboo if somewhere you have already been.
stage algorithm: keep a list of all local maximum and try to get the general shape of the problem,try to predict where a new maximum might be, given the areas I haven’t explored yet.
local beam search/ stochastic beam search: keep track of k particles. At each time frame, we randomly generate neighbors of these particles and keep the best ones for the next iteration. There are messages sharing between these particles.
genetic algorithm is a fancy version of beam search that happens to make a nice analogy to biology. some people call it the 2nd best solution to any problem.
We’ll see these ideas repeatedly in aI, in techniques like particle filters, pattern recognition with hidden Markov models, and Monte Carlo Markov chains. Sometimes let them go and embrace randomness is the key to the solution.
lab
Follow the instructions to implement Simulated Annealing from AIMA
constraint satisfaction
techniques:
- backtracking
- forward checking
- minimum remaining values
- least constraining value
problems with preference constraints are called constraint optimization problems and are with solving linear programming.
Examples:
- car assembly
- job scheduling in factories
- class scheduling
- spreadsheets
- transportation scheduling
- floor planning
Constraint Satisfaction Problems by solving the N-queens problem using symbolic constraints and backtracking search: https://github.com/udacity/AIND-Constraint_Satisfaction
complete the constraints for the N-Queens CSP and implement Backtracking-Search from AIMA (chapter 6)
logic
AI started in the 50s~70s with the idea that what we needed was better language and systems for writing down logical statements. Once we have done enough of them, we should be able to do anything of them. Some progress but 2 major issues:
- the world is uncertain. We didn’t get good algorithm to deal with probability until mid 1980s
- too laborious to try to write down everything by hand.
Today, logic is still useful in many domains where we can eliminate much of the uncertainty of real world and solve significant problems. Like FedEx planning all the deliveries of its packages.
Expert system.
future of planning algorithms:
- Learning from examples would be an important area.
- transfer learning across domain.
- interactive planning
- explainable, adaptable.
value iteration is a key concept for Markov decision processing.
- first order logic: relation , object,function
- propositional logic: facts
- probability theory
planning
Blindfold Hiker Test: We can’t just plan ahead and come up with a whole plan. We need feedback from environment. We have to interweave planning and execution:
- environment is stochastic.
- multiagent
- partial observability
- unkown/imcomplete knowledge
- hierarchical
When we start, we only have the high level parts of the plan, then during execution that we schedule the rest of low level parts of the plan.
sensorless planning in a deterministic world.
observation updates our believe state. Observation alone can’t create new states, but can partition them into pieces.
project
implement
my_air_cargo_problems.py
and test it by python run_search.py -p 1 -s 5
in which 5 is uniform cost search/ best_first_graph_search.my_planning_graph.py
hierachy structure of code:
void main (vector<int> p, vector<int> s){
void run_search(Problem p, Search_func(),para){
void show_solution()
}
}
FluentState(vector<Expr>,vector<Expr>)
string encode_state(FluentState fs,vector<Expr> fluent_map){} // state
AirCargoProblem(vector<string> cargos, vector<string> planes, vector<string> airports, FluentState init, vector<Expr> goal){
vector<Action> get_action()
}
Action(Expr action,array<vector<Expr>,2> precond, array<vector<Expr>,2> effect)
->
represent return type, :
represent input type.expr('At(C1, SFO)')
is an Expr
object.
precondition is only very small part of the whole initial state.
what next?:
- try single Action and Expr, see how it works
- try understand various planning algorithms
submit requirement when
udacity submit
my_air_cargo_problems.py
my_planning_graph.py
heuristic_analysis.pdf
(if present)research_review.pdf
(one-page report on three of these developments, highlighting the relationships between the developments and their impact on the field of AI as a whole. Appropriate sources (such as books or magazine or journal articles) should be cited. There’s a lot of resources at the end of AIMA chapter 10 )
AIMA
codes: aima.cs.berkeley.edu
AIMA chapter 2, Intelligent Agents
In a sense, all areas of engineering can be seen as designing artifacts that interact with the world; AI operates at the most interesting end of the spectrum, where the artifacts have signficant computational resources and the task environment requires nontrivial decision making.
As a general rule, it is better to design performance measures according to what one actually wants in the environment, rather than according to how one thinks the agent should behave.
which is better— a reckless life of highs and lows, or a safe but humdrum existence?
rational agent: given percept sequence and built-in knowledge it has, select an action that is expected to maximize its performance measure.
Rationality maximizes expected performance, while perfection maximizes actual performance. The latter is impossible unless you have a crystall ball or time machine.
autonomous: an agent not only relies on its prior knowledeg of its designer; it can use its own percepts to learn and compensate for partial or incorrect prior knowledge.
Task environment: performance, environment, actuatoors, sensors.
“percept,action” table-driven approach is doomed to failture due to super large number of entries, say, much larger than (number of atoms in the observable universe).
4 kinds of agent programs:
- Simple reflex agents: condition-action rule
- Model-based. handle partial observability: maintain internal state. The model of “how the world works”: how the world evolves independently of the agent, how the agent’s own actions affect the world. Uncertainty about the current state is unavoidable, but the agent still has to make a decision—best guess.
- Goal-based: external performance measure, crude binary distinction between happy and unhappy states.
- Utility-based: use an internal utility function to measure goals in greater details. If there are conflicting goals (e.g. speed and safety), the utility function specifies the appropriate tradeoff. when uncertainty, provides a way in which the likelihood of success can be weighted against the importance of the goals.
Any rational agent must behave as if it possesses a utility function whose expected value it tries to maximize.
learning agent:
- performance element: responsible for selecting external actions.
- learning element: responsible for making improvements
- critic: a fixed performance standard. It can be thought as outside the agent because the agent must not modify it to fit its own standard.
- probelm generator: suggest new exploratory actions to discover much better actions for the long run.
state representations:
- atomic: inside is an identical black box. Algorithms: search, game-playing, hidden markov models, markov decision process
- factored: splits up each state into a fixed set of key-value pairs. Algorithms: constraint satisfaction, propositional logic, planning, Bayesian networks, machine learning.
- structured: having thins in it that are related to each other, like a class object. Algorithms: relational databases, first-order logic, first-order probablity models, knowledge-based learning, natural language processing.
More expressive representation is much more concise.
AIMA chapter 10, classical planning
fully observable, deterministic, static environments with single agents
factored representation.
a fluent is a condition that can change over time. In logical approaches to reasoning about actions, fluents can be represented in first-order logic by predicates having an argument that depends on time.
To describe the planning problem concisely, we use PDDL (Planning Domain Definition Language):
- initial state: a conjunction of fluents that are ground, functionless atoms.
- actions: a set of action schemas that consists of action name, a list of all the variables, a precondition and an effect.
- result
- goal test