Class 11
I flip a coin. If it lands heads up, I roll a regular die. If it lands tails up, I roll a die with three 2s and three 6s.
Find \(P(T \text{ and } 6)\)
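A quick way to sanity-check an answer like this is to simulate the two-stage experiment. The sketch below uses base R; the vector `c(2, 2, 2, 6, 6, 6)` encodes the special die described above.

``` r
# Simulate: flip a coin; heads -> regular die, tails -> die with three 2s and three 6s
set.seed(1)                              # for reproducibility
sim <- replicate(10000, {
  coin <- sample(c("H", "T"), 1)
  die  <- if (coin == "H") sample(1:6, 1) else sample(c(2, 2, 2, 6, 6, 6), 1)
  coin == "T" && die == 6                # the event "T and 6"
})
mean(sim)                                # estimate of P(T and 6)
```

Compare the estimate with your hand calculation.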
I remove two tokens from a bag that contains 3 blue and 2 yellow tokens.
Find \(P(\text{both tokens are yellow})\)
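The token question can be checked the same way. Note that `sample()` draws without replacement by default, which matches removing tokens from the bag.

``` r
set.seed(1)                              # for reproducibility
bag <- c("B", "B", "B", "Y", "Y")        # 3 blue, 2 yellow
both_yellow <- replicate(10000, all(sample(bag, 2) == "Y"))
mean(both_yellow)                        # estimate of P(both yellow)
```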
A random variable is a mathematical model of a random quantity (measurement, count)
It is a function that assigns each outcome from a sample space a number.
Example: The number of dots on the side of a die that is facing up.
Example: The total number of dots on the sides of two dice that are facing up.
Example: The number of infants that choose the friendly character.
Example: The height of a plant randomly selected from a field.
\(X = {}\) the number of heads when flipping 3 coins:
\(X\) models “the number of heads in 3 tosses of a fair coin”.
3 coin tosses
Taking a token 3 times from a bag with one red and one blue token, with replacement. Counting the number of red tokens.
Asking 3 people to randomly select one of two characters, when they have no preference. Counting the number of times the friendly character was selected.
Randomly selecting 3 people from a large crowd that has 50% males and 50% females. Counting the number of females.
Randomly selecting 3 plants from a field in which 50% of the plants have some specific genetic mutation. Counting the number of plants with the mutation.
Mathematically, all of those are the same.
The distribution of a discrete random variable is the collection of its values and the probabilities associated with those values.
The probability distribution for \(X\) is as follows:
| \(x_i\) | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| \(P(X = x_i)\) | 1/8 | 3/8 | 3/8 | 1/8 |
The probabilities must add up to 1
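Where do the 1/8s and 3/8s come from? One way to see it in base R is to enumerate all \(2^3 = 8\) equally likely toss sequences and count the heads in each:

``` r
# All 8 toss sequences (1 = heads, 0 = tails), one per row
outcomes <- expand.grid(t1 = 0:1, t2 = 0:1, t3 = 0:1)
heads <- rowSums(outcomes)               # number of heads in each sequence
table(heads) / nrow(outcomes)            # 1/8, 3/8, 3/8, 1/8
```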
``` r
tosses <- do(10000) * rflip(3)
gf_props(~heads, data = tosses)
```

If \(X\) has outcomes \(x_1\), …, \(x_k\) with probabilities \(P(X=x_1)\), …, \(P(X=x_k)\), the expected value of \(X\), also called the mean of \(X\), is the sum of each outcome multiplied by its corresponding probability:
\[\begin{aligned} E(X) &= x_1 P(X=x_1) + x_2 P(X = x_2) + \cdots + x_k P(X=x_k)\\\\ &= \sum_{i=1}^{k}x_iP(X=x_i) \end{aligned}\]
The Greek letter \(\mu\) may be used in place of the notation \(E(X)\) and is sometimes written \(\mu_X\).
In the coin tossing example,
| \(x_i\) | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| \(P(X = x_i)\) | \(\color{red}1/8\) | \(\color{green}3/8\) | \(\color{blue}3/8\) | \(\color{orange}1/8\) |
\[\begin{align*} E(X) &= 0\,{\color{red}P(X=0)} + 1\,{\color{green}P(X=1)} + 2\,{\color{blue}P(X=2)} + 3\,{\color{orange}P(X = 3)} \\[1.2em] &= 0\cdot{\color{red}\frac{1}{8}} + 1\cdot{\color{green}\frac{3}{8}} + 2\cdot{\color{blue}\frac{3}{8}} + 3\cdot{\color{orange}\frac{1}{8}} \\[1.2em] &= \frac{12}{8} \\[1.2em] &= 1.5 \end{align*}\]
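The same weighted sum is a one-liner in base R, using the probabilities from the table:

``` r
x <- 0:3                    # possible numbers of heads
p <- c(1, 3, 3, 1) / 8      # their probabilities, from the table
sum(x * p)                  # E(X) = 1.5
```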
Intuitively, the expected value of \(X\) is approximately the number you would get if you took a lot of values of \(X\) and calculated the mean.
``` r
tosses <- do(10000) * rflip(3)
mean(~heads, data = tosses)
```

    [1] 1.5054
If \(X\) takes on outcomes \(x_1\), …, \(x_k\) with probabilities \(P(X=x_1)\), …, \(P(X=x_k)\) and has the expected value \(\mu=E(X)\), then the variance of \(X\), denoted by \(\operatorname{Var}(X)\) or \(\sigma^2\), is
\[\begin{align*} \operatorname{Var}(X) &= (x_1-\mu)^2 P(X=x_1) + (x_2 - \mu)^2 P(X = x_2) + \cdots+ (x_k-\mu)^2 P(X=x_k) \\ &= \sum_{i=1}^{k} (x_i - \mu)^2 P(X=x_i) \end{align*}\]
The standard deviation of \(X\), written as \(\operatorname{SD}(X)\) or \(\sigma\), is the square root of the variance. It is sometimes written \(\sigma_X\).
\[\operatorname{SD}(X) = \sqrt{\operatorname{Var}(X)}\]
In the coin tossing example,
| \(x_i\) | \(\color{red}0\) | \(\color{green}1\) | \(\color{blue}2\) | \(\color{orange}3\) |
|---|---|---|---|---|
| \(P(X = x_i)\) | \(\color{red}1/8\) | \(\color{green}3/8\) | \(\color{blue}3/8\) | \(\color{orange}1/8\) |
\[\begin{align*} \sigma_X^2 &= (x_1-\mu_X)^2P(X=x_1) + \cdots+ (x_4-\mu_X)^2 P(X=x_4) \\[1.2em] &= \left({\color{red}0} - \frac{3}{2}\right)^2{\color{red}\frac{1}{8}} + \left({\color{green}1} - \frac{3}{2}\right)^2{\color{green}\frac{3}{8}} + \left({\color{blue}2} -\frac{3}{2}\right)^2{\color{blue}\frac{3}{8}} + \left({\color{orange}3} -\frac{3}{2}\right)^2{\color{orange}\frac{1}{8}} \\[1.2em] &= \frac{3}{4} \end{align*}\]
The standard deviation \(\sigma_X\) is \(\sqrt{3/4} = \sqrt{3}/2 \approx 0.866\).
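The variance and standard deviation can be computed the same way in base R:

``` r
x  <- 0:3
p  <- c(1, 3, 3, 1) / 8
mu <- sum(x * p)            # E(X) = 1.5
v  <- sum((x - mu)^2 * p)   # Var(X) = 0.75
sqrt(v)                     # SD(X) = 0.8660254
```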
Again, the standard deviation of \(X\) is approximately the number you would get if you took a lot of values of \(X\) and calculated the standard deviation of the data.
``` r
tosses <- do(10000) * rflip(3)
sd(~heads, data = tosses)
```

    [1] 0.8713321
Suppose you roll three fair six-sided dice. Let \(X = \text{the number of sixes rolled}\).
What are the possible values of \(X\)?
What is the probability distribution for \(X\)?
What is the expected value of \(X\)?
What is the variance of \(X\)?
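A simulation sketch for checking your answers, in the same style as the coin-tossing simulations above but in base R rather than with the mosaic helpers:

``` r
set.seed(2)                    # for reproducibility
# Roll 3 dice and count the sixes, many times
sixes <- replicate(100000, sum(sample(1:6, 3, replace = TRUE) == 6))
table(sixes) / length(sixes)   # approximate distribution of X
mean(sixes)                    # approximate E(X)
var(sixes)                     # approximate Var(X)
```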
All these can be modeled by so-called binomial random variables.
\(X\) is a binomial random variable if it represents the number of successes in \(n\) replications of an experiment where each replication results in either a success or a failure, the replications are independent, and the probability of success \(p\) is the same in every replication.
This is also called a binomial process.
A binomial random variable takes on values \(0, 1, 2, \dots, n\).
The numbers \(n\) and \(p\) are called the parameters of the distribution.
We write \(X \sim \operatorname{Binom}(n, p)\).
Suppose \(X\sim \operatorname{Binom}(n,p)\). What is \(P(X = x)\)?
\[\begin{gather} P(\text{first } x \text{ are successes and last } n - x \text{ are failures}) = \\ \underbrace{p\cdot p\cdot p \cdots p}_{x\text{ times}}\cdot\underbrace{(1-p)\cdot(1-p)\cdots(1-p)}_{(n-x) \text{ times}} = p^x\cdot (1-p)^{n-x} \end{gather}\]
\[\begin{gather} P(\text{first } n-x \text{ are failures and last } x \text{ are successes}) = \\ \underbrace{(1-p)\cdot(1-p)\cdots(1-p)}_{(n-x) \text{ times}} \cdot \underbrace{p\cdot p\cdot p \cdots p}_{x\text{ times}} = p^x\cdot (1-p)^{n-x} \end{gather}\]
or it may be that the first trial is a success, then there are two failures, then a success, and so on. Every arrangement of \(x\) successes among the \(n\) trials has the same probability, \(p^x (1-p)^{n-x}\).
How many different ways can we choose \(x\) successes out of \(n\) repetitions?
The binomial coefficient \(\binom{n}{x}\) is the number of ways to choose \(x\) items from a set of size \(n\), where the order of the choices is ignored.
Mathematically,
\[\binom{n} {x} = \frac{\overbrace{n\cdot(n-1)\cdot(n-2)\cdots(n-x+1)}^{x \text{ factors}}} {\underbrace{x\cdot(x-1)\cdot(x-2)\cdots 1}_{x \text{ factors}}}\]
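Binomial coefficients do not have to be computed by hand; base R provides `choose()`:

``` r
choose(3, 1)     # 3 ways to place 1 success among 3 trials
choose(16, 14)   # 120
choose(16, 2)    # also 120: choosing 14 successes = choosing 2 failures
```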
Let \(X\) = number of successes in \(n\) trials, and \(0 \le x \le n\).
\[P(X = x)=\binom{n}{x} p^x (1-p)^{n-x}\]
Parameters of the distribution:
\(n\) = number of trials
\(p\) = probability of success
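Evaluating the formula for some illustrative values (here \(n = 5\) and \(p = 0.3\), chosen arbitrarily) shows the probabilities over all possible \(x\) adding up to 1, as they must:

``` r
n <- 5; p <- 0.3                                 # illustrative parameters
x <- 0:n
pmf <- choose(n, x) * p^x * (1 - p)^(n - x)      # P(X = x) for each x
round(pmf, 4)
sum(pmf)                                         # 1
```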
Let \(n = 3\) and \(p = 1/2\).
Calculate \(P(X = 1)\).
Let \(n = 3\) and \(p = 1/6\).
Calculate \(P(X = 2)\).
Let \(n = 16\) and \(p = 1/2\).
Calculate \(P(X = 14)\).
(Hint: \(\binom{16}{14} = \binom{16}{2}\))
For a binomial distribution with parameters \(n\) and \(p\), it can be shown that:
Mean = \(np\)
Standard Deviation = \(\sqrt{np(1-p)}\)
The derivation is not shown here nor in the text; it will not be asked for on a problem set or exam.
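Although the derivation is omitted, the formulas are easy to check numerically from the probability formula above, e.g. for \(n = 16\) and \(p = 1/2\):

``` r
n <- 16; p <- 1/2
x <- 0:n
probs <- choose(n, x) * p^x * (1 - p)^(n - x)
sum(x * probs)                      # 8, which equals n * p
sqrt(sum((x - n * p)^2 * probs))    # 2, which equals sqrt(n * p * (1 - p))
```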
R
The function `dbinom()` is used to calculate \(P(X = k)\):

``` r
dbinom(k, size = n, prob = p)   # P(X = k)
```
For example, if \(X \sim \operatorname{Binom}(3, 1/6)\), then \(P(X = 2)\) is

``` r
dbinom(2, size = 3, prob = 1/6)
```

    [1] 0.06944444
The `d` in `dbinom` stands for distribution or density.
The function `pbinom()` is used to calculate \(P(X \leq k)\) or \(P(X > k)\).

\(P(X \leq k)\):

``` r
pbinom(k, size = n, prob = p)
```

\(P(X > k)\):

``` r
pbinom(k, size = n, prob = p, lower.tail = FALSE)
```

The `p` stands for probability.
`pbinom` examples: if \(X \sim \operatorname{Binom}(16, 1/2)\), then \(P(X \le 13)\) is

``` r
pbinom(13, size = 16, prob = 1/2)
```

    [1] 0.9979095

while \(P(X \ge 14) = P(X > 13)\) is

``` r
pbinom(13, size = 16, prob = 1/2, lower.tail = FALSE)
```

    [1] 0.002090454

or, equivalently:

``` r
1 - pbinom(13, size = 16, prob = 1/2)
```

    [1] 0.002090454
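As a consistency check, the two tails add to 1, and `pbinom()` agrees with the cumulative sum of `dbinom()`:

``` r
pbinom(13, size = 16, prob = 1/2) +
  pbinom(13, size = 16, prob = 1/2, lower.tail = FALSE)   # 1
sum(dbinom(0:13, size = 16, prob = 1/2))                  # same as pbinom(13, ...)
```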