9  Activity 6

A bag contains \(10\) balls, some black and some white. You experiment to find the number of black and white balls by sampling from the bag with replacement. The following function outputs a 0 if the ball was white and 1 if the ball was black. We do this 10 times.
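The sampling function itself is not reproduced here. A minimal sketch of what such a function could look like, assuming a hypothetical bag with `n_black = 4` black balls (the real bag's contents are unknown, and `draw_ball` is an illustrative name):

```r
# Hypothetical stand-in for the bag: n_black is an assumed value, not the real one.
set.seed(1)   # fixed seed so the sketch is reproducible
n_black <- 4

# Draw one ball with replacement: 1 = black, 0 = white.
draw_ball <- function() rbinom(1, size = 1, prob = n_black / 10)

# Repeat the draw 10 times.
draws <- replicate(10, draw_ball())
draws

sum(draws)    # number of black balls observed
mean(draws)   # proportion of 1s among the 10 draws
```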

Exercise 9.1  

  1. Let \(p = \frac{\# \mathrm{black}}{10}\) be the probability of drawing a black ball. Based on the random experiment above, what is the point estimate for \(p\)?
  2. Make a rough bar plot where you rank (from 1 to 5) the likelihood of each of \(0\) through \(10\) as being the true number of black balls in the bag.

9.1 Using the likelihood function to estimate \(p\)

Note

A likelihood function is just a probability calculation as a function of some parameter. E.g. the probability of drawing \(5\) black balls out of \(10\) is higher (more likely) when \(p = 0.5\) than it is when \(p = 0.05\).
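The comparison in this note can be checked with `dbinom`, which gives the binomial probability of a fixed outcome. This is only a sketch: the outcome of 5 black balls in 10 draws is the example from the note, and `p_grid` is an illustrative name for the eleven possible bags.

```r
# Probability of exactly 5 black balls in 10 draws with replacement,
# evaluated at two candidate values of p.
lik_half  <- dbinom(5, size = 10, prob = 0.5)
lik_small <- dbinom(5, size = 10, prob = 0.05)
lik_half > lik_small   # TRUE: p = 0.5 makes this outcome more likely

# The same calculation for every possible bag (0 through 10 black balls).
p_grid <- (0:10) / 10
round(dbinom(5, size = 10, prob = p_grid), 4)
```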

Open this Desmos graph in a new tab (right-click → open in new tab).

Exercise 9.2 Complete the following steps, then write down your numbers for \(u\) and \(v\):

  1. At the top, let \(b =\) the number of 1s you observed in the previous experiment.
  2. Starting with \(u = 0\) and \(v = 1\), slowly drag \(u\) to the right until \(A\), which represents the area under the curve, is as close to 0.975 (97.5%) as you can make it. (Figure: moving \(u\) to the right until \(A\) is approximately 0.975.)
  3. Now move \(v\) to the left until \(A\) is as close to 0.95 (95%) as you can make it. (Figure: moving \(v\) to the left until \(A\) is approximately 0.95.)
  4. Report \(u\) and \(v\).

Note

Context: moving \(u\) until \(A\) was \(\approx 97.5\%\) means there is a \(2.5\%\) chance of being less than \(u\), and likewise, after moving \(v\), a \(2.5\%\) chance of being bigger than \(v\). So overall there is a \(5\%\) chance of not being between \(u\) and \(v\) and a \(95\%\) chance of being between \(u\) and \(v\).

Exercise 9.3 The qbeta function performs what we just did in Desmos. Specifically, we run qbeta at the quantiles 0.025 and 0.975, with shape parameters \(8 + 1\) and \(2 + 1\), to find \(u\) and \(v\) respectively. (The \(+1\) is due to some \(-1\)s that appear in the beta distribution behind the scenes.)
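A sketch of the two calls, assuming 8 black balls and 2 white balls were observed (the variable names `black`, `white`, and `area` are illustrative). The last line checks the area between `u` and `v` under the same beta curve, which is assumed here to be the quantity \(A\) from the Desmos graph:

```r
black <- 8   # number of 1s observed (assumed for this sketch)
white <- 2   # number of 0s observed (assumed for this sketch)

# Lower and upper endpoints: 2.5% of the area lies below u, 2.5% above v.
u <- qbeta(0.025, black + 1, white + 1)
v <- qbeta(0.975, black + 1, white + 1)
c(u = u, v = v)

# Area under the beta curve between u and v; should be 0.95.
area <- pbeta(v, black + 1, white + 1) - pbeta(u, black + 1, white + 1)
area
```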

  1. Change the \(8\) and \(2\) to your number of 1s (black balls) and 0s (white balls) respectively. Then report the exact values of \(u\) and \(v\).
  2. Rewrite this with your values of \(u\) and \(v\): “a 95% CI for p is \((u, v)\).”
  3. You will notice that this confidence interval is quite wide because with only \(10\) samples it’s hard to pinpoint the true proportion of black balls. Let’s change \(n\) to \(1000\) and try again:

Take these numbers back to the qbeta box to compute a new \(u\) and \(v\) and report a new confidence interval (as in 2.).
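A sketch of this larger experiment, again assuming a hypothetical bag with `n_black = 4` black balls (the real bag is unknown). It counts the 1s and 0s in 1000 draws and passes them to qbeta in the same way as before:

```r
set.seed(2)    # fixed seed so the sketch is reproducible
n <- 1000
n_black <- 4   # hypothetical bag contents, not the real bag

# 1000 draws with replacement: 1 = black, 0 = white.
draws <- rbinom(n, size = 1, prob = n_black / 10)
black <- sum(draws)
white <- n - black
c(black = black, white = white)

# Same qbeta calls as before, now with the larger counts.
u <- qbeta(0.025, black + 1, white + 1)
v <- qbeta(0.975, black + 1, white + 1)
c(u = u, v = v, width = v - u)
```

With 1000 draws the interval comes out much narrower than with 10.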

Note

Technically these qbeta-intervals are “credible intervals” not “confidence intervals” but they function the same.

9.2 Using the Normal approximation

Recall our formula for the normal approximation confidence interval is \[ \hat p \pm 1.96 \sqrt{\frac{\hat p \hat q}{n}} \] or more generally, use qnorm(1 - alpha / 2) instead of 1.96.
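To see where the 1.96 comes from, the threshold can be computed from \(\alpha\) directly. A small sketch, where `z_from_alpha` is an illustrative helper name and not a built-in function:

```r
# Threshold for a (1 - alpha) confidence level under the normal approximation.
z_from_alpha <- function(alpha) qnorm(1 - alpha / 2)

z_from_alpha(0.05)   # about 1.96, the value used in the formula above
z_from_alpha(0.10)   # the analogous threshold for 90% confidence
```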

Exercise 9.4  

  1. With the numbers for \(n = 1000\) what are \(\hat p\) and \(\hat q\)? Hint: if we saw \(413\) black balls, then \(\hat p = 413/1000 = 0.413\) (move the decimal place over). \(\hat q\) is the same but for the white balls.
  2. Compute \(E = 1.96 \sqrt{\hat p \hat q / n}\), using your own values in place of the \(.413\) and \(.587\) from the hint.
  3. Compute \(\hat p - E\) and \(\hat p + E\).
  4. Are these numbers similar to your \(u\) and \(v\) from the previous exercise?
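The steps of this exercise can be sketched as follows, using the 413 black balls out of 1000 from the hint (replace these with your own counts):

```r
n <- 1000
black <- 413            # value from the hint; use your own count

p_hat <- black / n      # proportion of black balls
q_hat <- 1 - p_hat      # proportion of white balls

# Margin of error for a 95% normal approximation interval.
E <- 1.96 * sqrt(p_hat * q_hat / n)
E

# Endpoints of the interval.
c(lower = p_hat - E, upper = p_hat + E)
```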

Exercise 9.5 Now run through the steps of the previous exercise but with the numbers for \(n = 10\) from Exercise 9.1. (Remember to change the /1000 to /10 in part 2.) Are these numbers similar to the first set of \(u\) and \(v\) we found?

9.3 Miscellanea

Using the formula \(n = \frac{1.96^2 \cdot 0.25}{E^2}\) (which we use if \(\hat p\) is unknown) we can estimate how many samples we need to achieve a desired error.
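As a sketch, the formula can be wrapped in a small function. The name `sample_size` is illustrative, and rounding up with `ceiling` is an added assumption (a sample size has to be a whole number):

```r
# Samples needed for a margin of error E when p-hat is unknown
# (0.25 is the largest possible value of p-hat * q-hat).
sample_size <- function(E, z = 1.96) ceiling(z^2 * 0.25 / E^2)

sample_size(0.03)   # 1068 samples for an error of at most 0.03
```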

Exercise 9.6  

  1. Suppose we want to be really sure of \(p\) and we want the error to be no more than \(0.001\) (e.g. our confidence interval would be \(0.399\) to \(0.401\)). Compute this \(n\) (either by typing in the formula or using your own calculator).
  2. Now let’s use this \(n\) for a qbeta confidence interval.
  3. This has all assumed we wanted to be 95% confident. Suppose instead we want to be 99% confident.
    • What is \(\alpha = 1 - 0.99\)?
    • What is \(\alpha / 2\)?
    • What is \(1 - \alpha / 2\)?
  4. Take your value of \(1 - \alpha / 2\) and compute qnorm(1 - alpha / 2).
  5. Replace the \(1.96\) with this value in the normal approximation confidence interval computation.
  6. Compared to the confidence interval from Exercise 9.4, is this 99% confidence interval wider or narrower than the 95% confidence interval?
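The 99% steps can be sketched as below, reusing \(\hat p = 0.413\) and \(n = 1000\) from the hint in Exercise 9.4 as assumed inputs (use your own values instead):

```r
conf_level <- 0.99
alpha <- 1 - conf_level
z <- qnorm(1 - alpha / 2)   # replaces the 1.96 used for 95% confidence
z

# Assumed inputs, taken from the hint in Exercise 9.4.
n <- 1000
p_hat <- 0.413
q_hat <- 1 - p_hat

# Margin of error and interval at the 99% level.
E99 <- z * sqrt(p_hat * q_hat / n)
c(lower = p_hat - E99, upper = p_hat + E99)
```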

9.4 t-distribution

For the following scenarios, compute

  • \(1 - \alpha/2\)
  • qt(1 - alpha / 2, df = n - 1) (as an analogue of the 1.96 threshold)
  • \(E = \text{quantile-for-t} \cdot \frac{s}{\sqrt n}\).

Tip

Reminder: quantiles are percentiles written as a decimal rather than a percent (the \(0.975\) quantile is the 97.5th percentile). The “t distribution” is like the normal distribution but with slightly larger upper quantiles, due to the extra uncertainty in using \(s\) rather than \(\sigma\).

Report something like \(95\%\text{-CI for } \mu = \bar x \pm E = 41 \pm 3\)

Example 9.1 Suppose scores for an exam have a sample mean of \(\bar x = 75\) and sample standard deviation of \(s = 6\) with \(n = 30\). For a 95% confidence interval for \(\mu\) (\(\alpha = 0.05\)):

  • \(1 - 0.05 / 2 = 0.975\)
  • qt(0.975, df=29) = 2.04523
  • \(E = 2.04523 \cdot 6 / \sqrt{30} = 2.240437\)

A 95% CI for \(\mu\) is \(75 \pm 2.24\) or \((72.76, 77.24)\).
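The numbers in Example 9.1 can be reproduced with a few lines of R. This is a sketch with illustrative variable names; changing the four inputs adapts it to other scenarios:

```r
x_bar <- 75     # sample mean
s     <- 6      # sample standard deviation
n     <- 30     # sample size
alpha <- 0.05   # 95% confidence

# t quantile in place of the 1.96 threshold.
t_star <- qt(1 - alpha / 2, df = n - 1)
t_star

# Margin of error and the resulting interval.
E <- t_star * s / sqrt(n)
E
round(c(lower = x_bar - E, upper = x_bar + E), 2)
```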

Exercise 9.7  

  1. \(\bar x = 70, s = 10, n = 49\) and we want a \(99\%\) confidence interval (\(\alpha = 0.01\))
  2. \(\bar x = 125, s = 12, n = 217\) and we want a \(90\%\) confidence interval (\(\alpha = 0.1\))
  3. \(\bar x = 60, s = 8, n = 5103\) and we want a \(96\%\) confidence interval (\(\alpha = 0.04\))