4 Activity 4

The Binomial Distribution is a model of doing something random several times. For example, rolling multiple dice or flipping multiple coins.

If \(X\) represents the number of successes in \(n\) random experiments with a probability of success of \(p\) and probability of failure \(q = 1 - p\) then \[ P(X = k) = nCk \cdot p^k q^{n - k}, \quad \text{Where } nCk = \frac{n!}{k!(n - k)!}. \tag{4.1}\]

For example, if \(X\) is the number of \(6\)s rolled among \(10\) dice then \(p = \frac16\) is the probability of getting a \(6\) and \(q = 1 - \frac16 = \frac56\) is the probability of not getting a \(6\). So \[ P(X = k) = P(\text{rolling } k \text{ sixes}) = 10Ck \cdot \left( \frac16 \right)^k \left( \frac56 \right)^{10-k}. \]

In R, we calculate these probabilities with the dbinom function. E.g. for \(n = 10, p = \frac16, k = 3\):

Note

This can also be written dbinom(3, 10, 1/6) without writing size= or prob= but I find it helps to write those explicitly to help know which number is \(k, n\) or \(p\).

Exercise 4.1 Suppose in basketball, you are able to successfully make a free throw \(70\%\) of the time or \(p = 0.7\).

What is the probability that you miss: \(q = 1 - p\)?
If you shoot \(5\) times, write down the formula (i.e. write Equation 4.1 with the given \(p, q, n, k\)) for the probability that you land \(3\) out of \(5\) free throws.
Using dbinom(k, size=n, prob=p) compute this probability (fill in your values for \(p, n, k\))

4.1 Cumulative probabilities

If we change dbinom to pbinom, we get not the probability of \(k\) successes but the probability of \(\le k\) successes.

E.g. the probability of rolling \(0, 1\) or \(2\) ones among \(5\) dice rolls is

In math notation: \[ P(X \le 2) = P(X = 0, 1 \text{ or } 2) = P(X = 0) + P(X = 1) + P(X = 2) \]

Exercise 4.2

Go back to Exercise 4.1 and now instead of calculating the probability of making exactly \(3\) free throws to the probability of making \(\le 3\) free throws (change dbinom to pbinom).
To get the probability of making \(> k\) free throws rather than \(\le k\) we can add lower.tail=FALSE to the function (lower tail referring to \(\le\)). Compute the probability of making \(> 4\) free throws this way.

Tip

Alternatively, if you have, say, a \(35\%\) chance of making \(\le k\) throws, then you must have a \(65\%\) chance of making \(> k\) throws. These are complementary events. \[ P(X > k) = 1 - P(X \le k). \] In R: that means that 1 - pbinom(k, n, p) is equivalent to pbinom(k, n, p, lower.tail=FALSE).

4.2 Random number generation

We’ve seen how to calculate probabilities. We can also make simulations. For instance, to simulate this experiment of throwing 5 free throws 100 times, we can use the rbinom function. The output is a list of \(k\)’s. E.g. if the output starts 5 4 2 1 ... that means the first simulation had 5 successes, the second 4, the third 2, the fourth 1, etc.

Exercise 4.3

Run this cell and write down the first 10 numbers.
In those first 10 experiments, what percent of the time did you land exactly \(3\) free throws? Is this relative frequency greater than or smaller than the probability you calculated in Exercise 4.1 Q2?
Now change the line to table(rbinom(...)) to make the summary table we looked at in Lab 1. Write down that table.
How many times out of 100 was the number of successes \(0, 1, 2\) or \(3\)? Is this relative frequency bigger or smaller than the true probability you calculated in Exercise 4.2 Q1?

4.3 Summary

For \(X\) representing the number of successes in \(n\) experiments with a probability \(p\) of success:

\(P(X = k)\) is dbinom(k, size=n, prob=p)
\(P(X \le k)\) is pbinom(k, size=n, prob=p)
\(P(X > k)\) is pbinom(k, size=n, prob=p, lower.tail=FALSE)
Generating \(100\) sample values for \(X\) is rbinom(100, size=n, prob=p)