class: middle, title-slide # Inference for Categorical Data ## Null Hypothesis Significance Tests ### Dennis A. V. Dittrich ### 2021 --- layout: true <div class="my-footer"> <span><img src="img/tcb-logo.png" height="40px"></span> </div> --- ## Hypothesis Tests .row[.col-7[ A process that uses sample statistics to test a claim about the value of a population parameter. * State the claim mathematically and verbally. Identify the null and alternative hypotheses. `\(H_0: ?\)` vs `\(H_a: ?\)` * Specify the level of significance. `\(\alpha = ?\)` * Determine the sampling distribution and draw its graph. ] .col-5[ ![](img/hyp-06.png) ]] --- ## Hypothesis Tests .row[.col-7[ * Calculate the test statistic. Add it to your sketch. ![](img/hyp-07.png) * Find the p-value. * Apply the decision rule: reject or fail to reject the null hypothesis. * Write a statement to interpret the decision in the context of the original claim. ]] --- ## Statistical hypothesis .row[.col-6[ .pink[A statement about a population parameter.] **A null hypothesis `\(H_0\)` ** is a statistical hypothesis that contains a statement of equality, such as `\(\leq\)`, `\(=\)`, or `\(\geq\)`. **The alternative hypothesis `\(H_a\)` ** is the complement of the null hypothesis. It is a statement that must be true if `\(H_0\)` is false, and it contains a statement of strict inequality, such as `\(>\)`, `\(\neq\)`, or `\(<\)`. ] .col-6[ When one of these hypotheses is false, the other must be true. `\(H_0: \mu \leq k\)` vs `\(H_a: \mu > k\)` `\(H_0: \mu \geq k\)` vs `\(H_a: \mu < k\)` `\(H_0: \mu = k\)` vs `\(H_a: \mu \neq k\)` Regardless of which pair of hypotheses you use, you always assume `\(\mu = k\)` and examine the sampling distribution on the basis of this assumption. ]] --- ## Level of Significance `\(\alpha\)` vs p-value .row[.col-6[ **Level of Significance `\(\alpha\)` ** Your maximum allowable probability of making a type I error. 
**P-value** The probability, if the null hypothesis is true, of obtaining a sample statistic with a value as extreme as or more extreme than the one determined from the sample data. ] .col-6[ * A result is statistically significant if it would rarely occur by chance. * By setting the level of significance at a small value, you are saying that you want the probability of rejecting a true null hypothesis to be small. * Compare the p-value with `\(\alpha\)`. 1. If `\(p \leq \alpha\)`, then reject `\(H_0\)`. 2. If `\(p > \alpha\)`, then fail to reject `\(H_0\)`. ]] --- ## Are the proportions equal? .row[.col-6[ Null hypothesis test: Is the proportion of flights in the morning equal to 50%? `$$H_0: p=0.5 \text{ vs } H_a: p\neq 0.5$$` Theoretical null distribution: a binomial experiment with 500 trials and success probability `\(p=0.5\)` ] .col-6[ <br/> <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-2-1.png" width="100%" style="display: block; margin: auto;" /> ]] --- ## Are the proportions equal? 
.row[.col-7[ Area under the null distribution representing values at least as extreme as our observation <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-3-1.png" width="100%" style="display: block; margin: auto;" /> ] .col-5[ ```r pbinom(214,500,0.5) + (1-pbinom(250+250-214-1,500,0.5)) ``` ``` ## [1] 0.001472458 ``` ]] --- ## The exact binomial test .row[.col-7[ ```r binom.test(x=214, n=500, p=0.5) ``` ``` ## ## Exact binomial test ## ## data: 214 and 500 ## number of successes = 214, number of trials = 500, p-value = 0.001472 ## alternative hypothesis: true probability of success is not equal to 0.5 ## 95 percent confidence interval: ## 0.3841682 0.4726881 ## sample estimates: ## probability of success ## 0.428 ``` ]] --- .row[.col-7[ # The binomial test .pink[with simulated null distribution] ```r # compute the test statistic ( p_hat <- fli_small %>% specify(response = day_hour, success = "morning") %>% calculate(stat = "prop") ) # simulate the null distribution null_distn <- fli_small %>% specify(response = day_hour, success = "morning") %>% hypothesize(null = "point", p = .5) %>% generate(reps = 1000, type="simulate") %>% calculate(stat = "prop") # visualize the null distribution visualize(null_distn) + shade_p_value(obs_stat = p_hat, direction = "two_sided") + theme_minimal() # compute the p-value null_distn %>% get_p_value(obs_stat = p_hat, direction = "two_sided") ``` ] .col-5[ ```r DescTools::BinomCI(214, 500) ``` ``` ## est lwr.ci upr.ci ## [1,] 0.428 0.3853418 0.4717561 ``` <br/> ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 0.428 ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-7-1.png" width="100%" style="display: block; margin: auto;" /> ``` ## # A tibble: 1 x 1 ## p_value ## <dbl> ## 1 0 ``` ]] --- ## The contingency table --- test of independence .row[.col-6[ <br/> Are the two variables season and day_hour independent? 
] .col-6[ If there were no (sampling) uncertainty, we would only need to check whether `$$P(\text{summer})=\frac{P(\text{summer} \land \text{morning})}{P(\text{morning})}$$` Since we are dealing with random variables, we have sampling variability. The question changes to whether we have a statistically significant difference in the proportions of flights in the morning between the two seasons: Does the proportion of flights in the morning depend on the season? ]] --- ## Test of no difference in proportions ### based on random samples of permutations .row[.col-7[ ```r # Compute test statistic (d_hat <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% calculate(stat = "diff in props", order = c("winter", "summer")) ) # Create the null distribution null_distn <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% hypothesize(null = "independence") %>% generate(reps = 1000, type="permute") %>% calculate(stat = "diff in props", order = c("winter", "summer")) # Visualize the null distribution visualize(null_distn) + shade_p_value(obs_stat = d_hat, direction = "two_sided") + theme_minimal() # Compute the p-value null_distn %>% get_p_value(obs_stat = d_hat, direction = "two_sided") ``` ] .col-5[ ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 -0.0183 ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-9-1.png" width="100%" style="display: block; margin: auto;" /> ``` ## # A tibble: 1 x 1 ## p_value ## <dbl> ## 1 0.758 ``` ]] --- .row[.col-8[ ## R: Approximate `\(\chi^2\)` test ### for proportions ```r prop.test(c(108, 106), c(247, 253), correct=FALSE) ``` ``` ## ## 2-sample test for equality of proportions without continuity correction ## ## data: c(108, 106) out of c(247, 253) ## X-squared = 0.17049, df = 1, p-value = 0.6797 ## alternative hypothesis: two.sided ## 95 percent confidence interval: ## -0.06846113 0.10501039 ## sample estimates: ## prop 1 prop 2 ## 0.4372470 0.4189723 ``` ] .col-4[ ] ] .row[.col-8[ ```r 
prop.test(c(108, 106), c(247, 253), correct=TRUE) ``` ``` ## ## 2-sample test for equality of proportions with continuity correction ## ## data: c(108, 106) out of c(247, 253) ## X-squared = 0.10402, df = 1, p-value = 0.7471 ## alternative hypothesis: two.sided ## 95 percent confidence interval: ## -0.0724617 0.1090110 ## sample estimates: ## prop 1 prop 2 ## 0.4372470 0.4189723 ``` ] .col-4[ #### Continuity correction ] ] --- ## R: Approximate `\(\chi^2\)` test for proportions .row[.col-8[ ```r prop.test(c(4, 1), c(5, 5), conf.level=0.95, correct=FALSE) ``` ``` ## ## 2-sample test for equality of proportions without continuity correction ## ## data: c(4, 1) out of c(5, 5) ## X-squared = 3.6, df = 1, p-value = 0.05778 ## alternative hypothesis: two.sided ## 95 percent confidence interval: ## 0.104164 1.000000 ## sample estimates: ## prop 1 prop 2 ## 0.8 0.2 ``` ] .col-4[ .pink[without continuity correction] ] ] .row[.col-8[ ```r prop.test(c(4, 1), c(5, 5), conf.level=0.95, correct=TRUE) ``` ``` ## ## 2-sample test for equality of proportions with continuity correction ## ## data: c(4, 1) out of c(5, 5) ## X-squared = 1.6, df = 1, p-value = 0.2059 ## alternative hypothesis: two.sided ## 95 percent confidence interval: ## -0.09583603 1.00000000 ## sample estimates: ## prop 1 prop 2 ## 0.8 0.2 ``` ] .col-4[ .pink[with continuity correction] <br/><br/> With such a small sample, the continuity correction makes a much bigger difference ] ] --- ## Pearson's `\(\chi^2\)` test .row[.col-5[ `$$X^2 = \sum \frac{(n_{ij}-\mu_{ij})^2}{\mu_{ij}}$$` ** `\(n_{ij}\)` ** Count in cell `\(ij\)` ** `\(\mu_{ij}\)` ** Expected frequency in cell `\(ij\)` when `\(H_0\)` is true ( `\(\mu_{ij} = n\pi_{ij}\)`) ** `\(\pi_{ij}\)` ** Cell probability under `\(H_0\)` (independence), computed from the marginal probabilities: `\(\pi_{ij} = \pi_{i}\times \pi_{j}\)` ] .col-7[ For random sampling and large sample sizes, `\(X^2\)` has approximately a `\(\chi^2\)` distribution. It takes its minimum value of zero when all `\(n_{ij} = \mu_{ij}\)`. 
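To make the formula concrete, here is a sketch in base R that computes `\(X^2\)` by hand for the 2×2 table of season by day_hour used above (counts 108/139 in summer, 106/147 in winter):

```r
# observed counts n_ij: rows = day_hour, columns = season (from the slides above)
n_ij <- matrix(c(108, 139, 106, 147), nrow = 2,
               dimnames = list(day_hour = c("morning", "not morning"),
                               season   = c("summer", "winter")))
n <- sum(n_ij)
# expected frequencies under independence: mu_ij = (row total * column total) / n
mu_ij <- outer(rowSums(n_ij), colSums(n_ij)) / n
# Pearson's X^2 statistic and its chi^2 right-tail p-value with 1 df
X2 <- sum((n_ij - mu_ij)^2 / mu_ij)
round(X2, 5)                             # 0.17049, as in prop.test() without correction
pchisq(X2, df = 1, lower.tail = FALSE)   # p-value ~ 0.68
```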
For a fixed sample size, greater differences `\(n_{ij} - \mu_{ij}\)` produce larger `\(X^2\)` values and stronger evidence against `\(H_0\)`. The P-value is the probability, under `\(H_0\)`, that `\(X^2\)` is at least as large as the observed value. This is the `\(\chi^2\)` right-tail probability above the observed `\(X^2\)` value. The `\(\chi^2\)` approximation improves as the `\(\mu_{ij}\)` increase; `\(\mu_{ij} \geq 5\)` is usually sufficient for a decent approximation. ]] --- ## Degrees of freedom for the `\(\chi^2\)` test .row[.col-5[ There are `\(r − 1\)` nonredundant row probabilities: because all `\(r\)` row probabilities sum to 1, the first `\(r − 1\)` determine the last one. Similarly, there are `\(c − 1\)` nonredundant column probabilities, so, under `\(H_0\)`, there are `\((r − 1) + (c − 1)\)` parameters. ] .col-7[ The alternative hypothesis `\(H_a\)` states that the two variables are not independent but does not specify a pattern for the `\(rc\)` cell probabilities. The probabilities are then solely constrained to sum to 1, so there are `\(rc − 1\)` nonredundant parameters. 
The value for df is the difference between the number of parameters under `\(H_a\)` and `\(H_0,\)` or `\begin{align} \text{df} &= (rc − 1) − [(r − 1) + (c − 1)] \\ &= rc − r − c + 1\\ &= (r − 1)(c − 1) \end{align}` ]] --- .row[.col-8[ ## R: Pearson's `\(\chi^2\)` test ```r chisq.test(ct, correct = F) ``` ``` ## ## Pearson's Chi-squared test ## ## data: ct ## X-squared = 0.17049, df = 1, p-value = 0.6797 ``` ] .col-4[ ``` ## season ## day_hour summer winter ## morning 108 106 ## not morning 139 147 ``` ] ] .row[.col-8[ ```r chisq.test(ct, correct = T) ``` ``` ## ## Pearson's Chi-squared test with Yates' continuity correction ## ## data: ct ## X-squared = 0.10402, df = 1, p-value = 0.7471 ``` ] .col-4[ #### with continuity correction ]] .row[.col-8[ ```r chisq.test(ct, simulate.p.value = T, B=2000) ``` ``` ## ## Pearson's Chi-squared test with simulated p-value (based on 2000 replicates) ## ## data: ct ## X-squared = 0.17049, df = NA, p-value = 0.7346 ``` ] .col-4[ #### Computation of p-value via Monte Carlo Simulation ]] --- ## Test of independence - `\(\chi^2\)` ### based on random samples of permutations .row[.col-7[ ```r # Compute test statistic (d_hat <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% calculate(stat = "Chisq", order = c("winter", "summer")) ) # Create the null distribution null_distn <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% hypothesize(null = "independence") %>% generate(reps = 1000, type="permute") %>% calculate(stat = "Chisq", order = c("winter", "summer")) # Visualize the null distribution visualize(null_distn, method="both") + shade_p_value(obs_stat = d_hat, direction = "right") + theme_minimal() # Compute the p-value null_distn %>% get_p_value(obs_stat = d_hat, direction = "right") ``` ] .col-5[ ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 0.104 ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-20-1.png" width="100%" style="display: block; margin: auto;" /> ``` ## # A tibble: 1 
x 1 ## p_value ## <dbl> ## 1 0.73 ``` ]] --- ## Test of independence - relative risk ### based on random samples of permutations .row[.col-7[ ```r # Compute test statistic (d_hat <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% calculate(stat = "ratio of props", order = c("winter", "summer")) ) # Create the null distribution null_distn <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% hypothesize(null = "independence") %>% generate(reps = 1000, type="permute") %>% calculate(stat = "ratio of props", order = c("winter", "summer")) # Visualize the null distribution visualize(null_distn) + shade_p_value(obs_stat = d_hat, direction = "two_sided") + theme_minimal() # Compute the p-value null_distn %>% get_p_value(obs_stat = d_hat, direction = "two_sided") ``` ] .col-5[ ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 0.958 ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-21-1.png" width="100%" style="display: block; margin: auto;" /> ``` ## # A tibble: 1 x 1 ## p_value ## <dbl> ## 1 0.71 ``` ]] --- ## Test of independence - odds ratio ### based on random samples of permutations .row[.col-7[ ```r # Compute test statistic (d_hat <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% calculate(stat = "odds ratio", order = c("winter", "summer")) ) # Create the null distribution null_distn <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% hypothesize(null = "independence") %>% generate(reps = 1000, type="permute") %>% calculate(stat = "odds ratio", order = c("winter", "summer")) # Visualize the null distribution visualize(null_distn) + shade_p_value(obs_stat = d_hat, direction = "two_sided") + theme_minimal() # Compute the p-value null_distn %>% get_p_value(obs_stat = d_hat, direction = "two_sided") ``` ] .col-5[ ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 0.928 ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-22-1.png" width="100%" style="display: block; margin: auto;" /> 
``` ## # A tibble: 1 x 1 ## p_value ## <dbl> ## 1 0.68 ``` ]] --- ## Fisher's exact test .row[.col-7[ Can you taste whether someone put the milk or the tea first in your cup? Let's run an experiment... ]] .row[.col-4[ ```r # create some data tea <- matrix(c(4, 1, 1, 4), nrow = 2, dimnames = list(Guess = c("Milk", "Tea"), Truth = c("Milk", "Tea"))) tea %>% kbl() ``` <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Milk </th> <th style="text-align:right;"> Tea </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Milk </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Tea </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4 </td> </tr> </tbody> </table> ] .col-8[ ```r fisher.test(tea) ``` ``` ## ## Fisher's Exact Test for Count Data ## ## data: tea ## p-value = 0.2063 ## alternative hypothesis: true odds ratio is not equal to 1 ## 95 percent confidence interval: ## 0.45474 968.76176 ## sample estimates: ## odds ratio ## 10.9072 ``` ]] --- ## Fisher's exact test .row[.col-7[ When the cell counts result from two independent binomial samples, the row totals (i.e., the binomial n indices) are fixed. Considering only tables that also have the same column totals as the observed data, the cell counts have exactly the hypergeometric distribution for any `\(n\)`, with no parameters. Since there are no unknown parameters, this allows exact rather than approximate or asymptotic P-value calculations. 
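The exact p-value can be reproduced directly from the hypergeometric distribution; a minimal base-R sketch for the tea table above (both margins fixed at 5, observed count 4 in the Milk/Milk cell):

```r
# P(a) for a = 0..5 correct "milk first" identifications, with all margins fixed at 5
p_a <- dhyper(0:5, m = 5, n = 5, k = 5)
obs <- 4
# two-sided p-value: total probability of all tables no more likely than the observed one
(p_value <- sum(p_a[p_a <= p_a[obs + 1]]))   # 0.2063, as in fisher.test(tea)
```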
]] --- class: middle ## Larger Contingency Tables --- ### Two categorical variables, one with more than 2 levels .row[.col-6[ ```r fli_small %>% tabyl(origin,day_hour) %>% adorn_totals(c("row", "col")) %>% adorn_title("combined") %>% kableExtra::kbl(format="pipe") ``` |origin/day_hour | morning| not morning| Total| |:---------------|-------:|-----------:|-----:| |EWR | 79| 86| 165| |JFK | 57| 116| 173| |LGA | 78| 84| 162| |Total | 214| 286| 500| <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-25-1.png" width="95%" style="display: block; margin: auto;" /> ] .col-6[ ```r fli_small %>% ggplot() + geom_bar(aes(y=origin,fill=day_hour), position="fill") + scale_fill_grey() + guides(fill = guide_legend(reverse = TRUE)) + theme(legend.position="top") ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-26-1.png" width="95%" style="display: block; margin: auto;" /> ]] --- ### `\(\chi^2\)` test for independence ### of 2 categorical variables, one with more than 2 levels .row[.col-7[ ```r # Compute test statistic ( Chisq_hat <- fli_small %>% specify(formula = day_hour ~ origin) %>% calculate(stat = "Chisq") ) # Create null distribution null_distn <- fli_small %>% specify(day_hour ~ origin) %>% hypothesize(null = "independence") %>% generate(reps = 1000, type = "permute") %>% calculate(stat = "Chisq") # Visualize null distribution visualize(null_distn) + shade_p_value( obs_stat = Chisq_hat, direction = "greater") # Compute p-value null_distn %>% get_p_value( obs_stat = Chisq_hat, direction = "greater") ``` ] .col-5[ ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 10.5 ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-27-1.png" width="95%" style="display: block; margin: auto;" /> ``` ## # A tibble: 1 x 1 ## p_value ## <dbl> ## 1 0.006 ``` ]] --- ### `\(\chi^2\)` test for independence ### of 2 categorical variables, one with more than 2 levels .row[.col-8[ ```r (tab <- fli_small %>% tabyl(origin,day_hour) %>% select(morning, "not morning") %>% 
as.matrix() ) ``` ] .col-4[ ``` ## morning not morning ## [1,] 79 86 ## [2,] 57 116 ## [3,] 78 84 ``` ]] .row[.col-8[ ```r chisq.test(tab) ``` ``` ## ## Pearson's Chi-squared test ## ## data: tab ## X-squared = 10.49, df = 2, p-value = 0.005274 ``` ] .col-4[ #### Data in matrix <br/> <br/> Here: 1st column: counts of morning flights; 2nd column: counts of non-morning flights ]] --- ### Test for equality of proportions with more than 2 groups .row[.col-8[ ```r (tab <- fli_small %>% tabyl(origin,day_hour) %>% select(morning, "not morning") %>% as.matrix() ) ``` ] .col-4[ ``` ## morning not morning ## [1,] 79 86 ## [2,] 57 116 ## [3,] 78 84 ``` ]] .row[.col-8[ ```r prop.test(tab) ``` ``` ## ## 3-sample test for equality of proportions without continuity correction ## ## data: tab ## X-squared = 10.49, df = 2, p-value = 0.005274 ## alternative hypothesis: two.sided ## sample estimates: ## prop 1 prop 2 prop 3 ## 0.4787879 0.3294798 0.4814815 ``` ] .col-4[ #### Data in matrix <br/> <br/> 1st column: counts of successes; 2nd column: counts of failures ]] .row[.col-8[ ```r prop.test(x=c(79, 57, 78), n=c(165,173,162)) ``` ``` ## ## 3-sample test for equality of proportions without continuity correction ## ## data: c(79, 57, 78) out of c(165, 173, 162) ## X-squared = 10.49, df = 2, p-value = 0.005274 ## alternative hypothesis: two.sided ## sample estimates: ## prop 1 prop 2 prop 3 ## 0.4787879 0.3294798 0.4814815 ``` ] .col-4[ #### Data in vectors <br/> <br/> x: vector of success counts; n: vector of trial counts ] ] --- ### `\(\chi^2\)` tests also work for larger `\(n\times k\)` contingency tables (here: 4x4) .row[.col-6[ ```r (Job <- matrix(c(1, 2, 1, 0, 3, 3, 6, 1, 10, 10, 14, 9, 6, 7, 12, 11), 4, 4, dimnames = list( income = c("< 15k", "15-25k", "25-40k", "> 40k"), satisfaction = c("VeryD", "LittleD", "ModerateS", "VeryS") ))) ``` ] .col-6[ ``` ## satisfaction ## income VeryD LittleD ModerateS VeryS ## < 15k 1 3 10 6 ## 15-25k 2 3 10 7 ## 25-40k 1 6 14 12 ## > 40k 0 1 9 11 ``` ]] .row[.col-6[ #### Fisher's exact test ```r fisher.test(Job) ``` ``` ## ## Fisher's Exact Test for Count Data ## ## data: Job ## p-value = 0.7827 ## alternative hypothesis: two.sided ``` ] .col-6[ #### Asymptotic `\(\chi^2\)` test ```r chisq.test(Job) ``` ``` ## ## Pearson's Chi-squared test ## ## data: Job ## X-squared = 5.9655, df = 9, p-value = 0.7434 ``` ]] --- .row[.col-6[ #### Contingency table ``` ## satisfaction ## income VeryD LittleD ModerateS VeryS ## < 15k 1 3 10 6 ## 15-25k 2 3 10 7 ## 25-40k 1 6 14 12 ## > 40k 0 1 9 11 ``` #### Expected counts ```r chisq.test(Job)$expected %>% round(1) ``` ``` ## satisfaction ## income VeryD LittleD ModerateS VeryS ## < 15k 0.8 2.7 9.0 7.5 ## 15-25k 0.9 3.0 9.9 8.2 ## 25-40k 1.4 4.5 14.8 12.4 ## > 40k 0.9 2.8 9.4 7.9 ``` Some expected counts are below 5, so the `\(\chi^2\)` approximation may be poor. 
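When the approximation is in doubt, one remedy is an exact or simulated reference distribution instead of the asymptotic one. A sketch reusing the `simulate.p.value` option of `chisq.test()` (the seed is an arbitrary choice for reproducibility):

```r
# Job satisfaction by income table, as defined on the previous slide
Job <- matrix(c(1, 2, 1, 0, 3, 3, 6, 1, 10, 10, 14, 9, 6, 7, 12, 11), 4, 4,
              dimnames = list(income = c("< 15k", "15-25k", "25-40k", "> 40k"),
                              satisfaction = c("VeryD", "LittleD", "ModerateS", "VeryS")))
set.seed(1)                                          # arbitrary seed for reproducibility
chisq.test(Job, simulate.p.value = TRUE, B = 10000)  # Monte Carlo p-value; X-squared = 5.9655 is unchanged
fisher.test(Job)$p.value                             # exact p-value: 0.7827
```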
] .col-6[ ```r Job.data %>% ggplot() + geom_bar(aes( y = income, fill = fct_rev(satisfaction)), position = "fill") + guides(fill = guide_legend(reverse = TRUE)) + scale_fill_colorblind() + labs(fill= "Satisfaction", x=NULL) ``` <img src="08.inference.cat-2_files/figure-html/job-fig-1.png" width="95%" style="display: block; margin: auto;" /> ]] --- ### R: Creating a long table from a contingency table ### stored as a matrix .row[.col-6[ ```r # create data as.table(Job) %>% as.data.frame() ``` ``` ## income satisfaction Freq ## 1 < 15k VeryD 1 ## 2 15-25k VeryD 2 ## 3 25-40k VeryD 1 ## 4 > 40k VeryD 0 ## 5 < 15k LittleD 3 ## 6 15-25k LittleD 3 ## 7 25-40k LittleD 6 ## 8 > 40k LittleD 1 ## 9 < 15k ModerateS 10 ## 10 15-25k ModerateS 10 ## 11 25-40k ModerateS 14 ## 12 > 40k ModerateS 9 ## 13 < 15k VeryS 6 ## 14 15-25k VeryS 7 ## 15 25-40k VeryS 12 ## 16 > 40k VeryS 11 ``` ] .col-6[ ```r (Job.data <- as.table(Job) %>% as.data.frame() %>% uncount(Freq) ) ``` ``` ## income satisfaction ## 1 < 15k VeryD ## 2 15-25k VeryD ## 3 15-25k VeryD ## 4 25-40k VeryD ## 5 < 15k LittleD ## 6 < 15k LittleD ## 7 < 15k LittleD ## 8 15-25k LittleD ## 9 15-25k LittleD ## 10 15-25k LittleD ## 11 25-40k LittleD ## 12 25-40k LittleD ## 13 25-40k LittleD ## 14 25-40k LittleD ## 15 25-40k LittleD ## 16 25-40k LittleD ## 17 > 40k LittleD ## 18 < 15k ModerateS ## 19 < 15k ModerateS ## 20 < 15k ModerateS ## 21 < 15k ModerateS ## 22 < 15k ModerateS ## 23 < 15k ModerateS ## 24 < 15k ModerateS ## 25 < 15k ModerateS ## 26 < 15k ModerateS ## 27 < 15k ModerateS ## 28 15-25k ModerateS ## 29 15-25k ModerateS ## 30 15-25k ModerateS ## 31 15-25k ModerateS ## 32 15-25k ModerateS ## 33 15-25k ModerateS ## 34 15-25k ModerateS ## 35 15-25k ModerateS ## 36 15-25k ModerateS ## 37 15-25k ModerateS ## 38 25-40k ModerateS ## 39 25-40k ModerateS ## 40 25-40k ModerateS ## 41 25-40k ModerateS ## 42 25-40k ModerateS ## 43 25-40k ModerateS ## 44 25-40k ModerateS ## 45 25-40k ModerateS ## 46 25-40k ModerateS ## 47 25-40k 
ModerateS ## 48 25-40k ModerateS ## 49 25-40k ModerateS ## 50 25-40k ModerateS ## 51 25-40k ModerateS ## 52 > 40k ModerateS ## 53 > 40k ModerateS ## 54 > 40k ModerateS ## 55 > 40k ModerateS ## 56 > 40k ModerateS ## 57 > 40k ModerateS ## 58 > 40k ModerateS ## 59 > 40k ModerateS ## 60 > 40k ModerateS ## 61 < 15k VeryS ## 62 < 15k VeryS ## 63 < 15k VeryS ## 64 < 15k VeryS ## 65 < 15k VeryS ## 66 < 15k VeryS ## 67 15-25k VeryS ## 68 15-25k VeryS ## 69 15-25k VeryS ## 70 15-25k VeryS ## 71 15-25k VeryS ## 72 15-25k VeryS ## 73 15-25k VeryS ## 74 25-40k VeryS ## 75 25-40k VeryS ## 76 25-40k VeryS ## 77 25-40k VeryS ## 78 25-40k VeryS ## 79 25-40k VeryS ## 80 25-40k VeryS ## 81 25-40k VeryS ## 82 25-40k VeryS ## 83 25-40k VeryS ## 84 25-40k VeryS ## 85 25-40k VeryS ## 86 > 40k VeryS ## 87 > 40k VeryS ## 88 > 40k VeryS ## 89 > 40k VeryS ## 90 > 40k VeryS ## 91 > 40k VeryS ## 92 > 40k VeryS ## 93 > 40k VeryS ## 94 > 40k VeryS ## 95 > 40k VeryS ## 96 > 40k VeryS ``` ]] --- ## Permutation-based `\(\chi^2\)` test for larger contingency tables .row[.col-6[ ```r # Compute test statistic ( Chisq_hat <- Job.data %>% specify(formula = satisfaction ~ income) %>% calculate(stat = "Chisq") ) # Create null distribution null_distn <- Job.data %>% specify(formula = satisfaction ~ income) %>% hypothesize(null = "independence") %>% generate(reps = 1000, type = "permute") %>% calculate(stat = "Chisq") # Visualize null distribution visualize(null_distn) + shade_p_value( obs_stat = Chisq_hat, direction = "greater") # Compute p-value null_distn %>% get_p_value( obs_stat = Chisq_hat, direction = "greater") ``` ] .col-6[ ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 5.97 ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-42-1.png" width="60%" style="display: block; margin: auto;" /> ``` ## # A tibble: 1 x 1 ## p_value ## <dbl> ## 1 0.776 ``` ]] --- class: middle ## Paired Categorical Data - McNemar Test --- ## McNemar test for paired categorical data .row[.col-7[ The McNemar test is 
applied to a `\(2\times 2\)` contingency table, which tabulates the outcomes of two tests on a sample of `\(N\)` subjects: ||Test 2 positive| Test 2 negative| Row total| |---|---|---|---| |Test 1 positive| a| b |a + b| |Test 1 negative| c| d |c + d| |Column total| a + c |b + d|N| The null hypothesis of marginal homogeneity states that the two marginal probabilities for each outcome are the same: `\(p_a + p_b = p_a + p_c\)` and `\(p_c + p_d = p_b + p_d\)`. Therefore: `\begin{align*} H_{0}&:~p_{b}=p_{c}\\ H_{1}&:~p_{b}\neq p_{c} \end{align*}` ] .col-5[ The McNemar test statistic (with continuity correction) is: `$$\chi^2= \frac{(|b-c|-1)^2}{b+c}.$$` If either `\(b\)` or `\(c\)` is small (`\(b + c < 25\)`), then `\(\chi^2\)` is not well approximated by the `\(\chi^2\)` distribution with 1 degree of freedom. ]] --- ## R: McNemar test for paired categorical data .row[.col-7[ ```r (tab <- xtabs(~y1 + y2, data = Opinions)) ``` ``` ## y2 ## y1 1 2 ## 1 227 132 ## 2 107 678 ``` ]] .row[.col-7[ ```r mcnemar.test(tab, correct=FALSE) ``` ``` ## ## McNemar's Chi-squared test ## ## data: tab ## McNemar's chi-squared = 2.6151, df = 1, p-value = 0.1059 ``` ] .col-5[ #### McNemar's test without continuity correction ] ] .row[.col-7[ ```r mcnemar.test(tab, correct=T) ``` ``` ## ## McNemar's Chi-squared test with continuity correction ## ## data: tab ## McNemar's chi-squared = 2.41, df = 1, p-value = 0.1206 ``` ] .col-5[ #### McNemar's test with continuity correction ] ] --- ## McNemar's test approximates a binomial test .row[.col-6[ <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-47-1.png" width="85%" style="display: block; margin: auto;" /> ```r binom.test(107,107+132,0.5) ``` ``` ## ## Exact binomial test ## ## data: 107 and 107 + 132 ## number of successes = 107, number of trials = 239, p-value = 0.1204 ## alternative hypothesis: true probability of success is not equal to 0.5 ## 95 percent confidence interval: ## 0.3835629 0.5131480 ## sample estimates: ## probability of success ## 
0.4476987 ``` ] .col-6[ ```r pbinom(107,107+132,.5) + 1-pbinom(132-1,107+132,.5) ``` ``` ## [1] 0.1203767 ``` ```r 2*pbinom(107,107+132,.5) ``` ``` ## [1] 0.1203767 ``` Since the binomial test allows us to compute an exact p-value without requiring any additional distributional assumptions -- we know the process is a binomial experiment -- we gain nothing by applying a non-parametric permutation test (which yields approximate p-values). ]]
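As a check, the McNemar statistics above can be reproduced by hand from the discordant counts `\(b = 132\)` and `\(c = 107\)`; a base-R sketch (the matrix restates the Opinions table shown earlier):

```r
tab <- matrix(c(227, 107, 132, 678), nrow = 2)   # same counts as xtabs(~y1 + y2, Opinions)
b <- tab[1, 2]; cc <- tab[2, 1]                  # discordant pairs
x2  <- (b - cc)^2 / (b + cc)                     # 2.6151, without continuity correction
x2c <- (abs(b - cc) - 1)^2 / (b + cc)            # 2.41, with continuity correction
pchisq(c(x2, x2c), df = 1, lower.tail = FALSE)   # p-values 0.1059 and 0.1206
```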