Get even by knowing the odds – or the odds will be against you.

If I toss a fair coin under random conditions (i.e. no cheating), the probability of it falling heads or tails is a half each way, or 50:50 (50% each way). It can also be stated as 1 chance in 2, or ½. They are all different ways of saying the same thing.

Odds are expressed slightly differently to probability. They are written to show how much you will get back (or win) relative to your stake. So a 50% probability would have odds of 1:1: you bet 1 and win 1, plus your money back. Odds of 1:5 mean you have a 1/6 chance of winning. In maths, the odds of an outcome are the ratio of the chance that it occurs to the chance that it does not. (The related term odds ratio compares two sets of odds, for example the odds of an outcome with and without a particular exposure.) To go from probability to odds the formula is

numerator:(denominator-numerator) or the (chance of success):(chance of failure)

So if you have a 5/13 chance of winning the odds are 5:8
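The conversion is easy to check by computer. This is only a minimal sketch in Python; the function names are my own, made up for illustration, and the odds are written as (chance of success):(chance of failure) as above.

```python
from fractions import Fraction

def probability_to_odds(p):
    """Return odds as (success, failure) for a probability given as a fraction."""
    p = Fraction(p)
    # numerator : (denominator - numerator)
    return p.numerator, p.denominator - p.numerator

def odds_to_probability(success, failure):
    """Return the probability for odds written as success:failure."""
    return Fraction(success, success + failure)

print(probability_to_odds(Fraction(5, 13)))  # (5, 8)
print(probability_to_odds(Fraction(1, 2)))   # (1, 1)
print(odds_to_probability(1, 5))             # 1/6
```

Note that Fraction always reduces to lowest terms, so a probability of 2/4 comes back as odds of 1:1.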

I will stick to using a fraction for probability (eg 1/6) and : for odds (eg 1:5)

Who knew betting on horses was so complicated? How do non mathematicians intuitively know it?

For a single dice (die) the chance of any particular number coming up is 1/6. There are 6 different numbers (possibilities), each with the same chance. If I have 2 dice, say a red and a blue one, then the probability of throwing a 1 on the blue dice and a 5 on the red dice is 1/36. It is the same probability as for any other combination. When I throw the blue dice first there are 6 possibilities, or a 1/6 chance of a particular number. When I throw the second (red) dice there are 6 possibilities for each of the possibilities on the first throw. So there are 6×6 possibilities with only 1 being the desired outcome. If both dice are the same colour then there are 2 ways to get my desired outcome, either a 1 and a 5 or a 5 and a 1, so my chance of success doubles to 2/36 = 1/18.
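If you would rather count than reason, the 36 possibilities can simply be listed. A small Python sketch (the variable names are my own) that counts the throws matching each version of the question:

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
throws = list(product(faces, repeat=2))  # all 36 (blue, red) results

# Coloured dice: blue must show 1 and red must show 5.
ordered = sum(1 for blue, red in throws if (blue, red) == (1, 5))
# Identical dice: a 1 and a 5 in either order counts.
unordered = sum(1 for a, b in throws if sorted((a, b)) == [1, 5])

p_ordered = Fraction(ordered, len(throws))
p_unordered = Fraction(unordered, len(throws))
print(p_ordered, p_unordered)  # 1/36 1/18
```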

This is called unconditional probability and works for 2 independent, random events. Conditional probability is a whole heap more complicated, and this is where most people become unstuck (including me!).

Let’s stick with the easy stuff first. If I am playing craps with two identical dice then I need the probability of each possible total for any throw.

  +  |  1   2   3   4   5   6
-----+------------------------
  1  |  2   3   4   5   6   7
  2  |  3   4   5   6   7   8
  3  |  4   5   6   7   8   9
  4  |  5   6   7   8   9  10
  5  |  6   7   8   9  10  11
  6  |  7   8   9  10  11  12

The table above shows all the possible outcomes (36 of them). There is only 1 way of throwing a 2 and only 1 way of throwing a 12, so each has a probability of 1/36. There are 6 ways of throwing a 7 so the probability is 6/36 = 1/6.

We can plot the result for more and more dice to get another limit, the bell-shaped curve at the end.

[Figure: distribution of the total for 1 die, 2 dice, 3 dice and many dice]
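The same counting can be done for any number of dice. Here is a rough Python sketch (the function name is my own) that lists every possible throw, counts the totals and draws a crude text plot. It is brute force, so it only suits a handful of dice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def sum_distribution(n_dice):
    """Exact probability of each total when n_dice fair six-sided dice are thrown."""
    counts = Counter(sum(throw) for throw in product(range(1, 7), repeat=n_dice))
    total = 6 ** n_dice
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}

two = sum_distribution(2)
print(two[2], two[7], two[12])  # 1/36 1/6 1/36

# Crude text plot: one '#' per percentage point of probability.
for n in (1, 2, 3, 4):
    print(f"{n} dice")
    for s, p in sum_distribution(n).items():
        print(f"{s:3d} {'#' * round(float(p) * 100)}")
```

With 1 die the plot is flat, with 2 dice it is a triangle, and from 3 dice onward the bell shape starts to appear.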

The central limit theorem (CLT) states that, when many independent random variables are added, their sum tends toward a normal distribution (also called a Gaussian distribution or bell curve), which is a continuous function. The binomial distribution has discrete points instead, but for a large number of trials it is closely approximated by the normal curve. For our purposes they behave in much the same way.

Because random events adding together form a normal distribution, these distributions are common in biology, where random genetic effects are summed. Things like height, weight etc. in a large crowd will be approximately normally distributed.

To define a distribution we usually determine the mean (the average, in this case) and the width or tightness of the curve. The width is measured by the standard deviation (SD): for a normal distribution about 68% of events are within 1 SD of the mean and about 95% are within 2 SDs.
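As a rough check of the 68% and 95% figures, the sketch below builds the exact distribution of the total of 10 dice and measures how much of it lies within 1 and 2 SDs of the mean. The choice of 10 dice and the function names are my own assumptions for illustration. Because a dice total is discrete the percentages should land near 68% and 95%, not exactly on them.

```python
from math import sqrt

def sum_pmf(n_dice):
    """Exact distribution of the total of n_dice fair dice, built one die at a time."""
    pmf = {0: 1.0}
    for _ in range(n_dice):
        nxt = {}
        for total, p in pmf.items():
            for face in range(1, 7):
                nxt[total + face] = nxt.get(total + face, 0.0) + p / 6
        pmf = nxt
    return pmf

def summarise(pmf):
    """Return mean, SD, and the probability within 1 and 2 SDs of the mean."""
    mean = sum(x * p for x, p in pmf.items())
    sd = sqrt(sum((x - mean) ** 2 * p for x, p in pmf.items()))
    within = lambda k: sum(p for x, p in pmf.items() if abs(x - mean) <= k * sd)
    return mean, sd, within(1), within(2)

mean, sd, w1, w2 = summarise(sum_pmf(10))
print(f"mean={mean:.2f} sd={sd:.2f} within 1 SD={w1:.1%} within 2 SD={w2:.1%}")
```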

In the quantum world we get many events occurring which sum into a single event in our macro world. If an event is nearly certain to occur then it will be a tight distribution tending toward a spike. Probability distributions are the way we go from quantum to macro. The wave function that solves Schrödinger’s wave equation gives a probability distribution (through the square of its magnitude).

If we multiply our 2 dice instead of adding we get a distribution that, in the limit, becomes lognormal with few high numbers and many small numbers.

  ×  |  1   2   3   4   5   6
-----+------------------------
  1  |  1   2   3   4   5   6
  2  |  2   4   6   8  10  12
  3  |  3   6   9  12  15  18
  4  |  4   8  12  16  20  24
  5  |  5  10  15  20  25  30
  6  |  6  12  18  24  30  36
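Counting the products shows the lopsided shape directly. This is a short Python sketch with my own variable names. The reason the limit is lognormal is that taking logs turns a product into a sum, so the central limit theorem applies to the logs.

```python
from collections import Counter
from itertools import product

# Count how many of the 36 throws give each product.
products = Counter(a * b for a, b in product(range(1, 7), repeat=2))
for value, ways in sorted(products.items()):
    print(f"{value:2d} {'#' * ways}")

small = sum(ways for value, ways in products.items() if value <= 12)
print(small, "of 36 products are 12 or less")  # 23 of 36 products are 12 or less
```

The products run from 1 to 36, yet nearly two thirds of the throws give 12 or less, with a long thin tail up to 36.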

Lognormal distributions are common in geology and geography where parameters are generally multiplied. Things like lake size, oil and gas pool size, rock porosity are all lognormal distributions (or close to).

Many things can be modelled on these basic distributions, but in reality they are often complicated, mixed versions of these. You might read about a number of different distribution types including Pareto, Beta, Chi squared etc.  These are all useful in different statistical situations.

So a multitude of independent random events can be modelled as various distributions. The trick, and it is often difficult to determine, is to work out how your particular data is distributed.  The various gambling games can be described quite simply in terms of distributions.  Casinos and poker machine owners study these probability distributions and know how to ensure that they profit from them, which means statistically you lose!  You will win some of the time but not in the long run.  If your maths is good enough you can make small percentages in some games.  If playing poker you need a bit of maths, or a very good memory.  The latter is always handy.

From a 52 card pack the chance of drawing 2 aces is 4/52 * 3/51 = 0.45%. There are 4 aces you can pick from 52 cards for your first pick and 3 aces from 51 cards for your second. If you can keep track of exposed cards you can sharpen your odds further down the pack, because you know which cards are left. Drawing any pair is 1 * 3/51 or 5.9%, as any card will do for the first pick. You can use this method to quickly calculate (approximate) your odds, or probability of success. It’s probably easier to read up on all the probabilities and memorise them first.
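The card sums above can be checked with exact fractions. This is a minimal Python sketch; chance_next_card is a hypothetical helper of my own for the card counting idea, taking the number of wanted cards still unseen and the number of cards left in the pack.

```python
from fractions import Fraction

p_two_aces = Fraction(4, 52) * Fraction(3, 51)
p_any_pair = Fraction(1) * Fraction(3, 51)  # any first card, then 3 matches in 51

print(p_two_aces, f"{float(p_two_aces):.2%}")  # 1/221 0.45%
print(p_any_pair, f"{float(p_any_pair):.1%}")  # 1/17 5.9%

def chance_next_card(wanted_left, cards_left):
    """Chance that the next card is one you want, given what is still unseen."""
    return Fraction(wanted_left, cards_left)

# Example: 20 cards have been exposed and only 1 of them was an ace.
print(chance_next_card(3, 32))  # 3/32
```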

I mentioned conditional probability. This is where events are no longer independent.  Health statistics are a good example.  You might do a blood test to find out if you are iron deficient, or have cancer, or diabetes for example.  Here you have two related things going on.  First there is the question of what percentage of people have an iron deficiency (and those who do not) and then there is the accuracy of the test (how many false positives and how many false negatives).

Now we need a couple of formulas to determine what is going on, and this gets very complicated.

The first is the conditional probability equation:

P(B | A) = P(A and B) / P(A)

So the probability of B given that A has already occurred = probability of both A and B occurring / probability of A.

We also need Bayes’ formula:

P(A | B) = P(B | A) P(A) / [P(B | A) P(A) + P(B | Ac) P(Ac)]

Phew. Lucky we don’t have to remember all these details. Your handy computer will do it for you. Ac here is the complement of A, so P(Ac) is the probability of A not occurring.

So if 10% of people have an iron deficiency (90% don’t) and the test is 80% accurate (20% of results are wrong, whether or not you have the deficiency) then your chance of having an iron deficiency given a positive test is only about 31%. Just look up Bayesian calculator and enter your own numbers if you want to have a go. The results are often quite surprising and hard to fathom. Just remember that you don’t have to understand the details but you can accept the results (at least first up). If you want to get into the details you may think up a better methodology and become a famous mathematician. It is probably easiest just to go with the flow.
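The iron deficiency numbers can be run through Bayes’ formula in a few lines. This is a sketch under my own assumption that "80% effective" means the test is right 80% of the time both for people with the deficiency and for people without it; the function and parameter names are made up for illustration.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) from Bayes' formula."""
    true_pos = sensitivity * prior                  # have it and test positive
    false_pos = false_positive_rate * (1 - prior)   # don't have it but test positive
    return true_pos / (true_pos + false_pos)

p = posterior(prior=0.10, sensitivity=0.80, false_positive_rate=0.20)
print(f"{p:.1%}")  # 30.8%
```

Out of 100 people, 8 of the 10 with a deficiency test positive, but so do 18 of the 90 without one, so only 8 of the 26 positives are real.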

This sort of analysis can help you understand medical probabilities, spread of viruses, how vaccinations may work etc.

Combinations and Permutations

Calculating probabilities is largely a question of defining the issue at hand. It quickly becomes complex! Take the Birthday problem that always gets people puzzled. How many people do you need in a room to have a 50% chance that any 2 people in the room have the same birthday? The surprise is that it is the very low number of 23 people. If the question was how many people you need for someone to have the same birthday as you, then the answer is more like 250 people, as this is a more specific probability.

[Figure: probability of a shared birthday against the number of people in the room, from Wikipedia]

The probability comes out as a distribution of all the possible combinations of birthdays amongst any number of people in the room. You can see here that 23 gives about 50 % chance and about 60 people gives a near 100 % chance.
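The exact numbers are easy to compute by multiplying up the chance that everyone's birthday is different. This is a minimal Python sketch, assuming 365 equally likely birthdays and no leap years; the function names are my own.

```python
def p_shared_birthday(n, days=365):
    """Chance that at least two of n people share a birthday."""
    p_all_different = 1.0
    for k in range(n):
        p_all_different *= (days - k) / days
    return 1 - p_all_different

def p_someone_matches_you(n_others, days=365):
    """Chance that at least one of n_others people shares your own birthday."""
    return 1 - ((days - 1) / days) ** n_others

print(f"{p_shared_birthday(23):.3f}")       # 0.507
print(f"{p_shared_birthday(60):.3f}")       # 0.994
print(f"{p_someone_matches_you(253):.3f}")  # 0.500
```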

There are also many different ways to approximate the probability, with Wikipedia showing one using Euler’s number e (above). A simpler one is perhaps looking at 23 people in the room, which gives 23*22/2 = 253 different combinations of 2 people, each with a 364/365 chance of having different birthdays, to give a (364/365)^253 = 49.95 percent chance of not having identical birthdays. It’s pretty easy to show that the results are real by running Monte Carlo style trials. In these the computer program randomly selects dates and plots the results from millions of samples.
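Both the pairs approximation and a Monte Carlo style trial fit in a few lines of Python. This is only a sketch, assuming 365 equally likely birthdays; the function names, the fixed seed and the 100,000 trials are my own choices to keep it quick.

```python
import random

def approx_no_match(n, days=365):
    """Pairs approximation: every pair independently has different birthdays."""
    pairs = n * (n - 1) // 2
    return ((days - 1) / days) ** pairs

def simulate_shared_birthday(n, trials=100_000, days=365, seed=1):
    """Fraction of random rooms of n people with at least one shared birthday."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(days) for _ in range(n)]
        if len(set(birthdays)) < n:  # a repeated date means a shared birthday
            hits += 1
    return hits / trials

print(f"{approx_no_match(23):.4f}")  # 0.4995
print(simulate_shared_birthday(23))  # close to 0.507, varies a little with the seed
```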

… and the odds of drawing a pair in 2 cards from a full pack? The first card has a probability of 1 (you are certain to get a card). There are 3 cards left in the pack which will match your first, and 51 cards left in total. This gives you a 3/51 = 1/17 probability and odds of 1:16.