Formal Methods II

CSE3305




Information Theory

Core of Computer Science!
It is the basis of coding theory, which is very practical (storage, transmission, telecommunications, etc.) and is where the field arose.

Optimal Codes
Maximally efficient codes: you minimise the average code length.
Huffman codes
A simple algorithm. It constructs a code that is optimal, and therefore satisfies Shannon's source coding bound.
As you increase the block size, the Huffman code becomes more efficient.
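To make this concrete, here is a minimal sketch of the Huffman construction in Python (my own illustration, not code from the lecture; the symbol probabilities are invented):

```python
# A minimal Huffman-coding sketch: repeatedly merge the two least
# probable nodes until a single tree remains.
import heapq
from itertools import count

def huffman(probs):
    """Return a prefix code (symbol -> bit string) for a dict of symbol probabilities."""
    tie = count()  # tie-breaker so the heap never compares the dicts
    heap = [(p, next(tie), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        # Prepend '0' to every code in one subtree, '1' in the other.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman(probs)
avg = sum(probs[s] * len(code[s]) for s in probs)
print(code)  # e.g. {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
print(avg)   # 1.75 bits/symbol, which equals the entropy: optimal here
```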

Next lecture:

Entropy
"The average information in a code"
It is non negative.

The more uncertainty, the greater the entropy.
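A quick sanity check (my own sketch, not the lecture's) that the formula H = -sum(p * log2(p)) behaves exactly this way:

```python
# Shannon entropy: non-negative, and larger when the distribution
# is more uncertain (closer to uniform).
from math import log2

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([1.0]))        # 0.0   - a certain outcome carries no information
print(entropy([0.9, 0.1]))   # ~0.47 bits - mostly predictable
print(entropy([0.5, 0.5]))   # 1.0 bit - maximum uncertainty for two outcomes
```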

Channel Capacity


Randomness

Problem 3:
w is a number.
The questions are not about c and h; they are presumed to be given relative to some problem.
The answers are intended to be of the form "when and only when w is in the interval (a, b)", where a and b are specified, of course.
The answer might be never!

Problem 1, part 3.

Problem 4
Parts i. and ii. are too easy; the heart of the problem is in iii.

Lecture 10.

Cover and Thomas can be used as a reference for this section of the lectures.

What is randomness? An event is random iff it is produced by an indeterministic process.

But then, either
everything is random,
or nothing is (if there is no such thing as indeterminism).

We'll leave that to philosophy and physics.

We think about the computer science side of things.

The star (as in {0,1}*) is to indicate that these bit strings might be of any finite length.

How about trying to explain it in perfectly plain English? You might remember it better.

You can compress the bit string that represents pi: the compressed form is just a program that generates the string (and a finite program is certainly smaller than an infinite expansion!).

Algorithmic randomness
Computable numbers (in their long expansions) are by definition algorithmically non-random!
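Kolmogorov complexity itself is uncomputable, but a general-purpose compressor gives a crude, hands-on stand-in for the idea (my own illustration, not from the lecture):

```python
# A patterned string (like the output of a short program) compresses
# enormously; bytes with no algorithmic pattern barely compress at all.
import os
import zlib

structured = b"01" * 50_000        # 100,000 bytes generated by a tiny "program"
random_ish = os.urandom(100_000)   # 100,000 bytes of OS randomness

print(len(zlib.compress(structured)))  # a few hundred bytes
print(len(zlib.compress(random_ish)))  # ~100,000 bytes: no pattern to exploit
```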

Cantor's Diagonal Argument
Something every CS student should know!

You just have to imagine that you have a list of all the real numbers, written as infinite bit strings.

Turing - "There are Non computable numbers"
We suppose that all these numbers are computable.
There has to be a program to compute it.
Let's enumerate all possible programs.
We can't enumerate their output, but we can pretend.
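A sketch of the diagonal step in Python (illustrative only; the toy enumeration below is invented, and a real enumeration of program outputs is exactly what we are only pretending to have):

```python
# Given any enumeration of infinite bit strings (string n is a
# function from position to bit), build a string that differs from
# string n at position n - so it cannot be anywhere in the list.

def diagonal(enumeration):
    return lambda n: 1 - enumeration(n)(n)  # flip the n-th bit of string n

# Toy "enumeration": string n is n in binary, padded with zeros.
def enum(n):
    bits = bin(n)[2:]
    return lambda i: int(bits[i]) if i < len(bits) else 0

d = diagonal(enum)
print([d(i) for i in range(8)])
# For every n, d disagrees with enum(n) at position n, so the
# supposedly complete list was missing d: contradiction.
```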

We can't actually get a truly random number, because such numbers are uncomputable. So we have to put up with "almost" or "good enough" randomness that we can compute.

The more random something is, the less redundancy there is, and the less it can be compressed.

English and logic are inefficient because there is more than one way to say the same thing, which is an indirect way of saying that they can't achieve the Shannon limit.


Assignment 2 Hints

(i) Theoretic union for everything
(ii) Basically working with limits and infinite sequences etc.
It doesn't have to be a proof: if it's a clear argument, then that's OK.

TSP problem
In each case you figure out the best and worst case, define the complexity in those terms, and exemplify by showing how it could be the best case: find a graph where that happens, and then indicate how it happens.

Problem 3.
Take the eval function and throw in a weight; the expressions are...
Complete: if there is a solution in the search space, it will find it eventually.

Problem 4
Fuel costs, etc. You can have a heuristic graph next to it. When you're doing with...


Lecture: Probability 2

Expectation Functions or operators

Law of large numbers.
In a statistical context, laws of large numbers imply that the average of a random sample from a large population is likely to be close to the mean of the whole population.
(Wikipedia)
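A tiny simulation makes this vivid (my own sketch; the bias 0.3 is arbitrary):

```python
# Law of large numbers: the observed frequency of heads drifts
# toward the true bias p as the number of tosses grows.
import random

p = 0.3  # true (made-up) probability of heads
for n in (10, 100, 10_000, 1_000_000):
    heads = sum(random.random() < p for _ in range(n))
    print(n, heads / n)  # gets closer and closer to 0.3
```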

Binomial Distribution
Coin tosses
random bit strings
computer storage
computer programs (when translated into bit strings)

"In probability theory and statistics, the binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial. In fact, when n = 1, then the binomial distribution is the Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance."
(Wikipedia)

B(n, p)
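For example, the B(n, p) probabilities can be computed directly from the formula C(n, k) * p^k * (1-p)^(n-k) (a sketch of mine, with made-up numbers):

```python
# Binomial pmf: probability of exactly k successes in n trials.
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Exactly 5 heads in 10 fair coin tosses:
print(binom_pmf(5, 10, 0.5))  # ~0.246
```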

Poisson Distribution
The number of visitors to a web page, the number of objects sent in a Java program.
The sole parameter is lambda.
Lambda = 5 means that the Poisson distribution peaks around 5.
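Sketching the pmf P(k) = e^(-lambda) * lambda^k / k! for lambda = 5 shows the peak (strictly, k = 4 and k = 5 tie for the mode):

```python
# Poisson pmf with lambda = 5, as in the notes.
from math import exp, factorial

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

for k in range(10):
    print(k, round(poisson_pmf(k, 5), 3))
# k = 4 and k = 5 share the highest probability (~0.175)
```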

Uniform Distributions (the most boring!)
Can have a discrete case or a continuous case.

Normal Distribution


The central limit theorem shows that when you're modelling practically anything, you can get at the mean (which is usually the most important thing you want), because sample means are approximately normally distributed.
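A quick demonstration (my own sketch): averages of samples drawn from a distinctly non-normal distribution still cluster, bell-shaped, around the true mean.

```python
# Central limit theorem in miniature: means of uniform samples
# concentrate around 0.5 in an approximately normal shape.
import random

sample_means = [sum(random.random() for _ in range(30)) / 30
                for _ in range(10_000)]
print(round(sum(sample_means) / len(sample_means), 3))  # ~0.5
# A histogram of sample_means would show the familiar bell curve.
```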


Pseudo-random Numbers
We live, as CS students, in the land of the fake, so we have to somehow generate "random" numbers for our simulations.
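The classic trick is a linear congruential generator (a sketch; these constants are the well-known Numerical Recipes parameters, not anything specific to this course):

```python
# Pseudo-randomness via the recurrence x_{n+1} = (a*x_n + c) mod m.
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    x = seed
    while True:
        x = (a * x + c) % m
        yield x / m  # scaled into [0, 1)

gen = lcg(seed=42)
print([round(next(gen), 4) for _ in range(5)])
# Reseeding with 42 reproduces the identical sequence - which is
# exactly why these numbers are only pseudo-random.
```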


Lecture: Probability

On the other side of that, the frequencies aren't definitive.

E.g.
You're flipping a coin. It's got a bias, and you want to find out what that bias is, so you toss the coin 10 times. Even if you got 10 tails, that wouldn't guarantee that tails-every-time is the bias. There is no guarantee connecting the frequencies to the actual probability.

waffle waffle.

We believe things with varying degrees of strength.

Frequentist: P(h) is (somehow) related to the frequency with which things like h turn out to be true.

Subjective: P(h) expresses the agent's strength of belief in h.

Both interpretations are quite reasonable, and it's not really right to say that one of them is correct and the other isn't.

There are all sorts of uncertainty. As it's a moot point, we're not going to address it here; it's an open question (see Cognitive Psychology).


Venn's identity "probability = frequency" clearly fails in small samples.

Frequencies vary all over in small samples, but in large samples they vary hardly at all.

Small samples can't really tell you very much!

As you build up 10,000 of these small samples, the law of large numbers will make the combined relative frequency look like the actual probability.

Richard von Mises’ Theory
Finite set of outcomes

A place selection is essentially just a bit string, 01101001010101010..., that is infinitely long.
It turns out to be inadequate, because if we choose the outcomes...


So what's the point?
Why, as CS people, are we studying probability? Algorithms are deterministic!
Uncertainty is central to the human condition, and CS is intended to be in the service of humanity, so CS should help us cope with uncertainty. Bring on probabilistic reasoning. The application side is one reason to look at probability.

Google's and Amazon's results are ranked; if they weren't, they'd be useless! A lot of problems cannot be solved exactly, or deterministically; you can take probabilistic samples in order to get probabilistic solutions.

It's about sorting and managing information. If an outcome has low probability, observing it gives you a lot of information; if the probability is high, you get almost no new information.
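That relationship is just the surprisal, -log2(p) (a quick sketch of mine):

```python
# Surprisal -log2(p): rare events carry many bits of information,
# near-certain events carry almost none.
from math import log2

def surprisal(p):
    return -log2(p)

print(surprisal(0.5))    # 1.0 bit     - a fair coin flip
print(surprisal(0.001))  # ~9.97 bits  - a rare event tells you a lot
print(surprisal(0.999))  # ~0.0014 bits - almost nothing new
```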

