5 Activity 4
The Binomial Distribution models repeating the same random experiment several times, for example rolling multiple dice or flipping multiple coins.
If \(X\) represents the number of successes in \(n\) independent random experiments, each with probability of success \(p\) and probability of failure \(q = 1 - p\), then \[ P(X = x) = nCx \cdot p^x q^{n - x}, \quad \text{where } nCx = \frac{n!}{x!(n - x)!}. \tag{5.1}\]
Example 5.1 If \(X\) is the number of \(6\)s rolled among \(10\) dice, then \(p = \frac16\) is the probability of getting a \(6\) and \(q = 1 - \frac16 = \frac56\) is the probability of not getting a \(6\). So \[ P(X = x) = P(\text{rolling } x \text{ sixes}) = 10Cx \cdot \left( \frac16 \right)^x \left( \frac56 \right)^{10-x}. \]
In R, we calculate these probabilities with the dbinom function. E.g. for \(x = 3\) the call is dbinom(3, size=10, prob=1/6).
This can also be written dbinom(3, 10, 1/6), without size= or prob=, but naming the arguments makes it clear which number is \(x\), \(n\) or \(p\).
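As a quick check that dbinom really is Equation 5.1, the sketch below computes the probability of exactly \(3\) sixes among \(10\) dice both ways. The variable names (by_formula, by_dbinom) are made up for illustration; choose(n, x) is R's built-in function for \(nCx\).

```r
# Values from Example 5.1: x = 3 sixes among n = 10 dice, p = 1/6.
x <- 3
n <- 10
p <- 1/6
q <- 1 - p

# Equation 5.1 written out by hand; choose(n, x) computes nCx.
by_formula <- choose(n, x) * p^x * q^(n - x)

# The same probability from dbinom, with the arguments named.
by_dbinom <- dbinom(x, size = n, prob = p)

print(by_formula)
print(by_dbinom)
```

Both lines should print the same value, roughly 0.155.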
Exercise 5.1 Suppose that in basketball you make a free throw \(70\%\) of the time, so \(p = 0.7\).
- What is the probability that you miss: \(q = 1 - p\)?
- If you shoot \(5\) times, write down the formula (i.e. write Equation 5.1 with the given \(p, q, n, x\)) for the probability that you land \(3\) out of \(5\) free throws.
- Using dbinom(x, size=n, prob=p), compute this probability (fill in your values for \(p, n, x\)).
5.1 Cumulative probabilities
If we change dbinom to pbinom, we get not the probability of \(x\) successes but the probability of \(\le x\) successes.
E.g. the probability of rolling \(0, 1\) or \(2\) ones among \(5\) dice rolls is pbinom(2, size=5, prob=1/6).
In math notation: \[ P(X \le 2) = P(X = 0, 1 \text{ or } 2) = P(X = 0) + P(X = 1) + P(X = 2) \]
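The sum above can be checked numerically. The sketch below uses the same dice example (\(n = 5\), \(p = \frac16\)) and compares pbinom with adding up the three dbinom values; the variable names are illustrative only.

```r
# Rolling 5 dice and counting ones: n = 5, p = 1/6.
n <- 5
p <- 1/6

# P(X <= 2) in one call.
cumulative <- pbinom(2, size = n, prob = p)

# P(X = 0) + P(X = 1) + P(X = 2); dbinom accepts a vector of x values.
summed <- sum(dbinom(0:2, size = n, prob = p))

print(cumulative)
print(summed)
```

The two results should agree, at roughly 0.965.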
Exercise 5.2
- Go back to Exercise 5.1 and, instead of calculating the probability of making exactly \(3\) free throws, calculate the probability of making \(\le 3\) free throws (change dbinom to pbinom).
- To get the probability of making \(> 3\) free throws rather than \(\le 3\), we can compute \(P(X > 3) = 1 - P(X \le 3)\). Do this above.
- Are you more likely to make \(\le 3\) free throws or \(> 3\)?
If you have, say, a \(35\%\) chance of making \(\le x\) throws, then you must have a \(65\%\) chance of making \(> x\) throws. These are complementary events. \[ P(X > x) = 1 - P(X \le x). \]
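The complement rule can be tried out on the dice example from this section (\(n = 5\), \(p = \frac16\), \(x = 2\)) so as not to give away the free-throw answers. This is only a sketch: the variable names are illustrative, and lower.tail = FALSE is an optional pbinom argument, not used elsewhere in this lab, that returns \(P(X > x)\) directly.

```r
# Dice example: n = 5 rolls, p = 1/6, threshold x = 2.
n <- 5
p <- 1/6
x <- 2

# P(X <= 2), then its complement P(X > 2).
at_most   <- pbinom(x, size = n, prob = p)
more_than <- 1 - at_most

# Two independent checks of P(X > 2):
# the upper tail from pbinom, and the sum P(X = 3) + P(X = 4) + P(X = 5).
upper_tail <- pbinom(x, size = n, prob = p, lower.tail = FALSE)
direct_sum <- sum(dbinom((x + 1):n, size = n, prob = p))

print(c(at_most, more_than, upper_tail, direct_sum))
```

The last three numbers should match, and the first two should add up to 1.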
5.2 Random number generation
We’ve seen how to calculate probabilities. We can also make simulations. For instance, to simulate this experiment of throwing 5 free throws 100 times, we can use the rbinom function. The output is a list of \(x\)’s. E.g. if the output starts 5 4 2 1 ... that means the first simulation had 5 successes, the second 4, the third 2, the fourth 1, etc.
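A minimal sketch of that simulation, assuming the free-throw probability \(p = 0.7\) from Exercise 5.1. The set.seed line and the name sims are additions for illustration: fixing a seed makes the run repeatable, and removing it gives different numbers each time.

```r
# Optional and not part of the exercise: fix the seed for a repeatable run.
set.seed(1)

# 100 simulated experiments, each counting successes among 5 free throws
# with success probability 0.7.
sims <- rbinom(100, size = 5, prob = 0.7)

print(sims)
```

Every entry of the output is a whole number between 0 and 5, one per simulated experiment.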
Exercise 5.3
- Run this cell and write down the first 10 numbers.
- In those first 10 experiments, what percent of the time did you land exactly \(3\) free throws? Is this relative frequency greater than or smaller than the probability you calculated in Exercise 5.1 Q2?
- Now change the line to table(rbinom(...)) to make the summary table we looked at in Lab 1. Write down that table.
- How many times out of 100 was the number of successes \(0, 1, 2\) or \(3\)? Is this relative frequency bigger or smaller than the true probability you calculated in Exercise 5.2 Q1?
5.3 Interval conversions
We looked at the relation \(P(X > x) = 1 - P(X \le x)\) already.
Another one is that \[P(2 \le X \le 8) = P(X \le 8) - P(X < 2) = P(X \le 8) - P(X \le 1).\]
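The interval rule can be verified in R. The relation holds for any \(n\) and \(p\); the sketch below assumes the \(10\)-dice setting of Example 5.1 (\(n = 10\), \(p = \frac16\)) purely for illustration, and the variable names are made up.

```r
# Assumed setting for illustration: 10 dice, counting sixes.
n <- 10
p <- 1/6

# P(2 <= X <= 8) as a difference of two cumulative probabilities.
# Note the lower term is P(X <= 1), not P(X <= 2).
by_pbinom <- pbinom(8, size = n, prob = p) - pbinom(1, size = n, prob = p)

# The same probability by adding P(X = 2) + ... + P(X = 8).
by_dbinom <- sum(dbinom(2:8, size = n, prob = p))

print(by_pbinom)
print(by_dbinom)
```

Subtracting pbinom(2, ...) instead of pbinom(1, ...) would drop the case \(X = 2\) from the interval, which is the usual off-by-one mistake here.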
Exercise 5.4 Suppose we flip a coin 1000 times (\(n = 1000, p = 1/2\))
- What is the probability of getting \(< 480\) heads? Also give the R-function that calculates it.
- What is the probability of getting \(470 \le X \le 530\) heads? Also give the R-function(s) that calculate it.
- Using any LLM, ask it how to do this calculation in Excel and write down its answer. You should notice that the answer it gives is very similar to your answer with R. E.g.
How do I calculate the probability of 470 <= X <= 530 in Excel if X follows a binomial distribution with n = 1000 and p = 1/2?
(And make sure it isn’t using a normal approximation!)
The point: these kinds of calculations in R are transferable skills to other pieces of software.
5.4 Summary (No questions to answer)
For \(X\) representing the number of successes in \(n\) experiments with a probability \(p\) of success:
- \(P(X = x)\) is dbinom(x, n, p)
- \(P(X \le x)\) is pbinom(x, n, p)
- \(P(X > x)\) is 1 - pbinom(x, n, p)
- Generating \(100\) sample values for \(X\) is rbinom(100, n, p)