Class 28
Single categorical variable:
Two levels:
Single proportion
More than two levels:
Goodness of fit test
Two categorical variables:
Both with two levels:
Difference of proportions
At least one with more than two levels:
Independence test
Red-breasted nuthatch (Sitta canadensis)
These insect-eating birds search bark furrows for hidden prey.
Mannan and Meslow (1984) studied red-breasted nuthatch foraging behavior in a managed forest in Oregon. In the forest, 54% of the canopy volume was Douglas fir, 40% was ponderosa pine, 5% was grand fir, and 1% was western larch. They made 156 observations of foraging by red-breasted nuthatches; 70 observations (45% of the total) in Douglas fir, 79 (51%) in ponderosa pine, 3 (2%) in grand fir, and 4 (3%) in western larch.
Do these data show a preference for some species of trees?
We have a single categorical variable with 3 or more levels.
We are asking whether the categorical variable follows a specific distribution.
\(H_0: p_{DF} = .54 \text{ and } p_{PP} = .4 \text{ and } p_{GF} = .05 \text{ and } p_{WL} = .01\)
\(H_A:\) the tree choices do not follow this distribution.
Categorical variable: we can do a simulation
What do we use as a test statistic?
Observed counts: 70, 79, 3, 4
Expected counts (156 × hypothesized proportion): 84.24, 62.40, 7.80, 1.56
\[\begin{alignat}{3} && \text{Observed} &- \text{Expected} &\\[1.2ex] &\text{Douglas fir: } & 70 &- 84.24 &=& &\quad -14.24\\[1.2ex] &\text{Ponderosa pine: } & 79 &- 62.40 &=& &16.60\\[1.2ex] &\text{Grand fir: } & 3 &- 7.80 &=& &-4.80\\[1.2ex] &\text{Western larch: } & 4 &- 1.56 &=& &2.44\\[1.2ex] \end{alignat}\]
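The expected counts and the differences above can be reproduced with a few lines of Python. This is only a sketch: the variable names (`proportions`, `observed`, `expected`, `differences`) are illustrative choices, not part of the original slides. Each expected count is the total number of observations multiplied by the proportion stated in the null hypothesis.

```python
# Hypothesized canopy proportions (from H0) and observed foraging counts.
proportions = {
    "Douglas fir": 0.54,
    "Ponderosa pine": 0.40,
    "Grand fir": 0.05,
    "Western larch": 0.01,
}
observed = {
    "Douglas fir": 70,
    "Ponderosa pine": 79,
    "Grand fir": 3,
    "Western larch": 4,
}

n = sum(observed.values())  # total number of observations (156)

# Expected count under H0 = n * hypothesized proportion.
expected = {tree: n * p for tree, p in proportions.items()}

# Observed minus expected, one value per tree species.
differences = {tree: observed[tree] - expected[tree] for tree in observed}

for tree in observed:
    print(f"{tree}: {observed[tree]} - {expected[tree]:.2f} = {differences[tree]:.2f}")
```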
Convert them to z-scores
\(\displaystyle z = \frac{x - \mu}{\sigma} = \frac{\text{Observed} - \text{Expected}}{\sqrt{\text{Expected}}}\)
Square them and add the squares.
\(\displaystyle \sum z^2 = \sum \left(\frac{x - \mu}{\sigma}\right)^2 = \sum \left(\frac{\text{Observed} - \text{Expected}}{\sqrt{\text{Expected}}}\right)^2\)
\[\chi^2 = \sum \frac{\left(\text{Observed} - \text{Expected}\right)^2}{\text{Expected}}\]
\[\chi^2 = \frac{(70 - 84.24)^2}{84.24} + \frac{(79 - 62.4)^2}{62.4} + \frac{(3 - 7.8)^2}{7.8} + \frac{(4 - 1.56)^2}{1.56} \]
\[= \frac{(-14.24)^2}{84.24} + \frac{(16.6)^2}{62.4} + \frac{(-4.8)^2}{7.8} + \frac{(2.44)^2}{1.56} \]
\[\chi^2 = 13.59 \]
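As a check on the arithmetic, the same statistic can be computed in Python by following the two steps described earlier: convert each difference to a z-score, then square and add. This is a minimal sketch with illustrative variable names (`z_scores`, `chi_square`).

```python
from math import sqrt

observed = [70, 79, 3, 4]              # Douglas fir, ponderosa pine, grand fir, western larch
expected = [84.24, 62.40, 7.80, 1.56]  # 156 times each hypothesized proportion

# Step 1: z-score for each category, using sqrt(Expected) as the standard deviation.
z_scores = [(o - e) / sqrt(e) for o, e in zip(observed, expected)]

# Step 2: square the z-scores and add them up.
chi_square = sum(z ** 2 for z in z_scores)

print(round(chi_square, 2))  # 13.59
```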
Is \(\chi^2 = 13.59\) small or large?
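One way to approach this question is the simulation mentioned earlier: repeatedly draw 156 observations under the null hypothesis, compute the statistic for each draw, and see how often a simulated value is at least as large as 13.59. The sketch below assumes Python's standard `random` module; the seed, the number of simulations, and all names are arbitrary illustrative choices rather than anything specified in the slides.

```python
import random
from collections import Counter

trees = ["Douglas fir", "Ponderosa pine", "Grand fir", "Western larch"]
null_proportions = [0.54, 0.40, 0.05, 0.01]  # proportions stated in H0
observed = [70, 79, 3, 4]

n = sum(observed)                             # 156 observations
expected = [n * p for p in null_proportions]  # expected counts under H0


def chi_square(counts):
    """Sum of (Observed - Expected)^2 / Expected over the four categories."""
    return sum((o - e) ** 2 / e for o, e in zip(counts, expected))


observed_stat = chi_square(observed)

rng = random.Random(28)  # arbitrary seed, only for reproducibility
n_sims = 5000            # arbitrary number of simulated samples

simulated = []
for _ in range(n_sims):
    # Draw n tree choices with the null proportions, then tally them.
    draws = Counter(rng.choices(trees, weights=null_proportions, k=n))
    counts = [draws[tree] for tree in trees]  # a missing tree counts as 0
    simulated.append(chi_square(counts))

# Proportion of simulated statistics at least as large as the observed one.
p_value = sum(s >= observed_stat for s in simulated) / n_sims

print(f"observed chi-square = {observed_stat:.2f}")
print(f"simulated p-value   = {p_value:.4f}")
```

If only a small fraction of the simulated statistics reach the observed value, then 13.59 is large relative to what the null hypothesis would typically produce.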