Lecture 2: Analysing Algorithms
See the sample exam. The stuff with
Lecture 3: Search Problems
Lecture 4: Local Search Methods
Reward, error surface. Imagine your solutions as coordinates, and you'll get a landscape.
Lecture 5: Partial Solution Search, I
Lecture 8: Probability
Transforming Distributions
Randomness and Information
Monte Carlo Simulation
There are two parts: Part A and Part B.
Part A
This is basically: say you have a blob whose area is too hard to work out algebraically, so you draw a box around it and poke it at random points, counting how many hits you get inside the blob. The fraction of hits, times the area of the box, estimates the area of the blob.
Part B
Part B is: you have a curve f(x) on a graph, and you want the area under it between two points a and b on the x axis. You poke at random x values between a and b, and keep track of the average of the results you get from f(x) - the function. That average, times the width of the interval (b - a), is taken as the area under the graph.
(see the napkin)
You select the sample points randomly because this covers you if there's some regular pattern in the function - evenly spaced points could line up with the pattern and bias the estimate.
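A minimal sketch of both parts in Python (the blob, the box, and the example function are made up for illustration):

    import random
    import math

    def estimate_area_hit_or_miss(inside_blob, box_w, box_h, n=100000):
        # Part A: throw random darts at the bounding box and count how many land in the blob.
        hits = sum(inside_blob(random.uniform(0, box_w), random.uniform(0, box_h))
                   for _ in range(n))
        return (hits / n) * box_w * box_h   # fraction of hits times the area of the box

    def estimate_integral(f, a, b, n=100000):
        # Part B: average f at random x values in [a, b], then scale by the width of the interval.
        total = sum(f(random.uniform(a, b)) for _ in range(n))
        return (total / n) * (b - a)

    # Example "blob": a unit circle inside a 2x2 box, so the answer should come out near pi.
    print(estimate_area_hit_or_miss(lambda x, y: (x - 1) ** 2 + (y - 1) ** 2 <= 1, 2, 2))
    # Example integral: area under sin(x) from 0 to pi, which should come out near 2.
    print(estimate_integral(math.sin, 0, math.pi))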
Simulated Annealing
Temperature: The higher the temperature, the more likely the search will accept a move to a worse position in the search space (it always accepts improvements).
The search is done multiple times for progressively cooler temperatures, so the simulated annealing algorithm becomes less likely to accept worse positions as it continues to search.
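A rough sketch of that acceptance rule (the cost and neighbour functions, the starting temperature, and the cooling rate are all placeholders, not values from the lectures):

    import math
    import random

    def simulated_annealing(start, cost, random_neighbour,
                            temp=10.0, cooling=0.95, steps_per_temp=100, min_temp=1e-3):
        current = start
        while temp > min_temp:
            for _ in range(steps_per_temp):
                candidate = random_neighbour(current)
                delta = cost(candidate) - cost(current)
                # Always accept improvements; accept worse moves with probability
                # exp(-delta / temp), which shrinks as the temperature drops.
                if delta <= 0 or random.random() < math.exp(-delta / temp):
                    current = candidate
            temp *= cooling   # progressively cooler temperatures
        return current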
Evolutionary Algorithms
Lecture 15: The Church-Turing Thesis
Lecture 16: P and NP
Lecture 17: NP Completeness
Probability
Distribution transformation
discrete random numbers
distribution
continuous - how is it different from discrete?
Real numbers and integers are both infinite, but the reals are a larger infinity!
Heuristic Evaluation.
This has to do with making a kind of educated guess as to the 'best next' state to visit.
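As a concrete example (a made-up route-finding setting, not one from the lectures), the educated guess could be straight-line distance to the goal:

    import math

    def heuristic(state, goal):
        # Educated guess at how promising a state is: straight-line distance to the goal.
        # It never tells us the true remaining cost, just which neighbour looks best next.
        return math.dist(state, goal)

    def best_next(neighbours, goal):
        # Pick the neighbour with the lowest heuristic value as the 'best next' state.
        return min(neighbours, key=lambda s: heuristic(s, goal))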
L'Hospital's Rule
As x -> inf, f(x)/g(x) does something, and this rule says f'(x)/g'(x) does the same thing (provided the original ratio is an indeterminate form like inf/inf or 0/0 and the limit of f'(x)/g'(x) exists), which is handy when the functions are nasty.
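For example (a standard illustration, not one from the lectures):

    \[
      \lim_{x\to\infty}\frac{x}{e^{x}} \text{ is of the form } \frac{\infty}{\infty},
      \quad\text{so}\quad
      \lim_{x\to\infty}\frac{x}{e^{x}} = \lim_{x\to\infty}\frac{1}{e^{x}} = 0.
    \]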
Search"Solution"This term is used somewhat ambiguously in the subject. Generically means any point in the state space of the problem.
It could also mean:
Optimal Solution
A solution that maximises some evaluation criterion or function.
Candidate Solution
An assignment of values to variables, which may or may not be optimal or match the goal. Hey, candidate solutions may even violate hard constraints! Which means they're infeasible - they're just not proper.
Constraints
Hard Constraints
State
For a system identified by a set of random variables, a state is a joint assignment of values to those variables.
For example:
Board Games The position on the board, plus whose turn it is.
TSP, complete-solution space search: a permutation of the cities.
TSP, partial-solution space search: a permutation over a subset of the cities.
Hmmm I'm not really getting these...
Apparently, the state space of a problem means the space of all possible joint assignments of values to variables.
Search Space
This is the all-mighty space which a search algorithm searches. Typically (but not always) the same as the state space for a problem. Often, we can do away with much of the search space because it's impossible for our answer to be in that space.
Local Search
This one is kind of intuitive. It works the way any old computing student would answer if you asked them how they think a search should work:
it begins somewhere in the search space and looks for something better in the neighbourhood of that state, until we can do no better.
It can be either deterministically greedy (examine every state in the neighbourhood and select the best) or stochastically greedy (randomly select some subset of the neighbourhood and select the best of those).
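A tiny sketch of both flavours (the evaluate and neighbourhood functions are whatever the problem defines; the names here are made up):

    import random

    def greedy_local_search(start, evaluate, neighbourhood, sample_size=None):
        # Deterministic if sample_size is None (examine every neighbour);
        # stochastic if we only examine a random subset of the neighbourhood.
        current = start
        while True:
            neighbours = neighbourhood(current)
            if sample_size is not None:
                neighbours = random.sample(neighbours, min(sample_size, len(neighbours)))
            best = max(neighbours, key=evaluate, default=current)
            if evaluate(best) <= evaluate(current):
                return current   # can do no better in the neighbourhood: local optimum
            current = best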
Complexity Theory
There's a handy diagram of the grand hierarchy of complexity theory. Although the one in the supplementary notes isn't that clear... No wait, I get it. It's not badly formed LaTeX; it's just, well, minimal.
We use this to give labels to problems (like, search problems), to put them in their category, and thus derive all kinds of delicious information about them from it.
Let's read it!
NP-Hard: problems that every problem in NP can be transformed (reduced) to within poly-time.
An NP-Hard problem is NP-Complete if it also happens to be in NP (that's that little overlap there). But some NP-Hard problems are too hard to be in NP, meaning that their solutions require more than poly time to check.
NP: problems whose proposed solutions can be checked in polynomial time.
EXP stands for exponential - problems which can be solved in exponential time.
P stands for polynomial - problems which can be solved in polynomial time.
Linear stands for .. whoops
A problem is in P if it can be solved in polynomial time on a deterministic Turing machine (think: ordinary computer). It's in NP if it can be solved in polynomial time on a nondeterministic Turing machine. That's harder to explain, but Cook proved that any problem in NP is reducible to 3-SAT, so it's in NP if it's reducible to (i.e. not harder than) an NP-complete problem.
Difference between checking and solving? (Robyn)
Use the SAT as an example.
All we need to do to check a proposed solution is try out each clause until one fails (so it's not a solution after all), or we get to the end, in which case, hooray! It really is a solution. We can do this in polynomial time.
But to actually solve a SAT instance, you may have to go through each instantiation in the search space and apply that check routine to all such possible solutions. The solution space for SAT with n variables has size 2^n. Woweee Rik! We've got ourselves an NP-Hard problem where the best known algorithms are exponential.
All this is really saying is that a check applies to one item in the search space, like our eval function, while to solve is to find an item in the whole space that passes the check.
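A rough sketch of that difference (the clause encoding, lists of signed variable numbers, is an assumption of mine): checking one assignment is a quick loop over the clauses, while solving may mean checking all 2^n assignments.

    from itertools import product

    def check(assignment, clauses):
        # Polynomial-time check: every clause must contain at least one satisfied literal.
        # A literal +v is true if variable v is True; -v is true if variable v is False.
        return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
                   for clause in clauses)

    def solve(n_vars, clauses):
        # Brute-force solve: apply the check to all 2^n possible assignments.
        for values in product([False, True], repeat=n_vars):
            assignment = {v + 1: values[v] for v in range(n_vars)}
            if check(assignment, clauses):
                return assignment
        return None

    # (x1 or not x2) and (x2 or x3)
    print(solve(3, [[1, -2], [2, 3]]))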
Cook's Theorem
Speaking quite roughly, what it does is turn any nondeterministic
Turing machine into input for the satisfiability problem, and shows
that if the machine halts in polynomial time, then this transformation
can also be done in polynomial time.
Since the definition of an NP problem is a problem that can be solved
in polynomial time on a nondeterministic Turing machine, this shows
that any NP problem can be transformed into input for the SAT problem
(in polynomial time), and so can't be harder than SAT. (Joel)
Kolmogorov
iii) F - Kolmogorov complexity: the length of a computer program to compute something, like the square root of 2, or π. It's not infinite - you can write a finite program to do it. Related to randomness: it's the length of the shortest possible program to produce something. If that program is shorter than the string, then you have compressed the string.
Kraft's Inequality
Says that if the codeword lengths l_i satisfy sum of 2^(-l_i) <= 1, then you can actually place proper codewords in the binary tree so that the code is prefix-free.
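A quick sketch of the check for a binary code (assuming the usual statement of the inequality, which the notes don't spell out):

    def kraft_ok(lengths):
        # Kraft's inequality for a binary code: sum over codewords of 2^(-length) <= 1.
        return sum(2 ** -l for l in lengths) <= 1

    print(kraft_ok([1, 2, 3, 3]))   # True: e.g. 0, 10, 110, 111 fit in a prefix tree
    print(kraft_ok([1, 1, 2]))      # False: no prefix-free code has these lengths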
Prefix Trees
Greedy Search
May be deterministic, making it blind.
Stochastic
A stochastic process is one whose behaviour is non-deterministic - the next state of the environment is not fully determined by the previous state of the environment. Stochastic search incorporates some degree of randomness.
At some decision points, we'll decide what to do next, yes, randomly!
Just means random - algorithms that have some random aspect to them. (Politics?) Just one algorithmic RULE with some random variables. It would be predictable if all the pollies made the same decision in a predictable manner. But they don't, so the algorithm as defined by each law becomes random!
There are degrees of randomness.
Examples:
Stochastic Hill Climber
2-opt
You might remember this one from that online Java example of it. Looked at it a lot while trying to fathom assignment 2 so many Sundays ago.
Basically it randomly swaps two non-adjacent edges of the tour (see the sketch just after this list).
Greedy search with random restarts,
GSAT too
Yes, it's that random restarting that's making it random.
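A minimal sketch of the 2-opt move mentioned above (the tour is just a list of city indices; this is a generic version I've written, not the one from the Java example):

    import random

    def two_opt_move(tour):
        # Pick two non-adjacent edges at random and reconnect the tour the other way,
        # which amounts to reversing the segment between the chosen cut points.
        n = len(tour)
        i, j = sorted(random.sample(range(n), 2))
        if j - i < 2 or (i == 0 and j == n - 1):
            return tour[:]   # the chosen edges were adjacent: no real change this time
        return tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]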
Simulated Annealing
Evolutionary Algorithms
My personal favourite!
Genotype is what is inherited from one generation to the next.
Phenotype is whatever develops out of that genotype - like, in our case, height, intelligence, hair colour, etc.
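A bare-bones sketch of how that plays out in code (the bit-string genotype, the fitness function, and all the parameters here are invented for illustration):

    import random

    def evolve(fitness, length=20, pop_size=30, generations=100, mutation_rate=0.05):
        # Genotype: a bit string that gets inherited; phenotype: whatever fitness() measures on it.
        pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=fitness, reverse=True)
            parents = pop[:pop_size // 2]                      # keep the fitter half
            children = []
            while len(children) < pop_size - len(parents):
                mum, dad = random.sample(parents, 2)
                cut = random.randrange(1, length)
                child = mum[:cut] + dad[cut:]                  # crossover
                child = [(1 - g) if random.random() < mutation_rate else g for g in child]
                children.append(child)
            pop = parents + children
        return max(pop, key=fitness)

    # Toy example: evolve towards an all-ones string.
    print(evolve(fitness=sum))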
Normal Distribution
Deterministic Search
A search that will run the same steps every time it is given the same input. (A stochastic search will not.)
Examples:
Dynamic Programming
Exhaustive Search
Dijkstra
BFS
DFS
A*
Greedy Search
Gradient Descent
Joel gave a good description for this one:
Normal local search finds a local minimum by (possibly randomly)
walking around looking for points that are lower.
Gradient descent does not immediately walk around; instead, it "looks"
at the land (search space) and figures out (usually mathematically)
which direction is downhill, and *then* walks.
It finds a local minimum much faster, but it's not always possible to
do this kind of calculation, which is why we still need ordinary local
search. (There's a small code sketch of this idea a little further below.)
Hill Climbing
GSAT's inner loop
Tabu Search
Takes the best state in the neighbourhood, with the neighbourhood reduced by tabu considerations.
Greedy
Limit it by the number of solutions tested. They store the best answer so far, and when you reach the limit, return that.
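A minimal sketch of the gradient descent idea from Joel's description above, for a one-dimensional function (the learning rate, starting point, and example derivative are all made up):

    def gradient_descent(df, x=0.0, learning_rate=0.1, steps=100):
        # Instead of poking around the neighbourhood, look at the slope df(x)
        # and step directly downhill.
        for _ in range(steps):
            x -= learning_rate * df(x)
        return x

    # Example: f(x) = (x - 3)^2 has derivative 2(x - 3); the minimum is at x = 3.
    print(gradient_descent(lambda x: 2 * (x - 3)))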
LOTS of informal proofs in the lecture notes, so you can read them there.
Note that neither deterministic local search nor stochastic algorithms can be complete. Local search gets stuck at local optima (oh dear!) and the latter can take a random step and miss the global optimum.
Supplementary Notes 3
Huffman Code
This is the only stuff I got down in lectures on Huffman code:
Wikipedia is ok, but this is also a good description.
Symbol:      a    b    c    d    e    f
Probability: 2/9  1/9  1/9  1/6  1/3  1/18

Sorted by probability: e (1/3), a (2/9), d (1/6), b (1/9), c (1/9), f (1/18)
(The sketch of the merge tree got mangled here; merging the two least probable symbols, f and c, gives 3/18, and merging that with b gives 5/18.)
Anyway, it seems like a nifty (say, genius) way of creating a table of symbols which can then be used to compress data in a lossless way!
It uses a variable-length code table based on the estimated probability of occurrence for each possible source symbol. It's so nifty: it uses a specific method to generate a code that is prefix-free, and expresses the most frequently occurring characters with shorter strings of bits. You have to have a good estimate of the true probability of each input symbol to get good compression.
What a beautiful algorithm. It's so simple yet efficient.
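A compact sketch of the construction with a priority queue (using the probabilities from the table above; the function name and encoding here are mine, not the lecture's):

    import heapq
    from fractions import Fraction

    def huffman(probs):
        # Repeatedly merge the two least-probable nodes; one side gets '0', the other '1'.
        heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
        heapq.heapify(heap)
        counter = len(heap)
        while len(heap) > 1:
            p1, _, codes1 = heapq.heappop(heap)
            p2, _, codes2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in codes1.items()}
            merged.update({s: "1" + c for s, c in codes2.items()})
            heapq.heappush(heap, (p1 + p2, counter, merged))
            counter += 1
        return heap[0][2]

    probs = {"a": Fraction(2, 9), "b": Fraction(1, 9), "c": Fraction(1, 9),
             "d": Fraction(1, 6), "e": Fraction(1, 3), "f": Fraction(1, 18)}
    print(huffman(probs))   # the rarest symbols (f, c) end up with the longest codewords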
Is mp3 compression really lossy? If the only information taken from the recording is that that human ears can't hear anyway, is that really information loss? After all, if there's music playing in a forest, and there's nobody there to hear it, is there any music at all?
Huffman encoding is really useful. It's NOT lossy. Compression is really useful.
Busy Beaver
Being an example of a noncomputable function is pretty much all it's for.
There are two versions of the function, that both take a weeny little parameter,
n.
One version is the largest number of 1s that can be written by a Turing machine with n states.
The other version is the largest number of steps (execution time) that can be taken by a Turing machine with n states.
In both cases only Turing machines which halt are allowed to be
considered.
(lecture on Church-Turing)
Sigma(n) is not computable; it produces the biggest string of 1s.
The point of the function is to have a direct and clear illustration of a noncomputable function.
We can produce an enumeration of all Turing machines.
AUTOSOUP/FLIB
Cantor's Argument
There are infinite sets which cannot be put into one-to-one correspondence with the infinite set of natural numbers.
Maximal clique problem
Aleph-null
In case this is confusing anyone, aleph-null is "countable infinity", i.e. how many natural numbers, or integers, or rationals there are.
Euler problem
the perturbation of bridges (as opposed to cities, as in the TSP)
Perturbation operators
For the maximal clique problem: like 2-opt or mutation, or any number of things you can do to generate the next candidate to try while searching.
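For instance, a made-up perturbation operator for a clique-style problem might just toggle one vertex in or out of the current candidate set:

    import random

    def perturb(candidate, vertices):
        # Move to a neighbouring candidate solution by adding or removing one random vertex.
        new = set(candidate)
        v = random.choice(list(vertices))
        if v in new:
            new.remove(v)
        else:
            new.add(v)
        return new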
Stochastic