class: middle, title-slide # Inference for Categorical Data ## Null Hypothesis Significance Tests ### Dennis A. V. Dittrich ### 2021 --- layout: true <div class="my-footer"> <span><img src="img/tcb-logo.png" height="40px"></span> </div> --- ## Hypothesis Tests .row[.col-7[ A process that uses sample statistics to test a claim about the value of a population parameter. * State the claim mathematically and verbally. Identify the null and alternative hypotheses. `\(H_0: ?\)` vs `\(H_a: ?\)` * Specify the level of significance. `\(\alpha = ?\)` * Determine the sampling distribution and draw its graph. ] .col-5[ ![](img/hyp-06.png) ]] --- ## Hypothesis Tests .row[.col-7[ * Calculate the test statistic. Add it to your sketch. ![](img/hyp-07.png) * Find the p-value. * Apply the decision rule: reject or fail to reject the null hypothesis. * Write a statement to interpret the decision in the context of the original claim. ]] --- ## Statistical hypothesis .row[.col-6[ .pink[A statement about a population parameter.] **A null hypothesis `\(H_0\)` ** is a statistical hypothesis that contains a statement of equality, such as `\(\leq\)`, `\(=\)`, or `\(\geq\)`. **The alternative hypothesis `\(H_a\)` ** is the complement of the null hypothesis. It is a statement that must be true if `\(H_0\)` is false, and it contains a statement of strict inequality, such as `\(>\)`, `\(\neq\)`, or `\(<\)`. ] .col-6[ When one of these hypotheses is false, the other must be true. `\(H_0: \mu \leq k\)` vs `\(H_a: \mu > k\)` `\(H_0: \mu \geq k\)` vs `\(H_a: \mu < k\)` `\(H_0: \mu = k\)` vs `\(H_a: \mu \neq k\)` Regardless of which pair of hypotheses you use, you always assume `\(\mu = k\)` and examine the sampling distribution on the basis of this assumption. ]] --- ## Level of Significance `\(\alpha\)` vs p-value .row[.col-6[ **Level of Significance `\(\alpha\)` ** Your maximum allowable probability of making a type I error. 
**P-value** The probability, if the null hypothesis is true, of obtaining a sample statistic with a value as extreme as or more extreme than the one determined from the sample data. ] .col-6[ * A result is statistically significant if it would rarely occur by chance. * By setting the level of significance at a small value, you are saying that you want the probability of rejecting a true null hypothesis to be small. * Compare the p-value with `\(\alpha\)`. 1. If `\(p \leq \alpha\)`, then reject `\(H_0\)`. 2. If `\(p > \alpha\)`, then fail to reject `\(H_0\)`. ]] --- ## Are the proportions equal? .row[.col-6[ Null hypothesis test: Is the proportion of flights in the morning equal to 50%? `$$H_0: p=0.5 \text{ vs } H_a: p\neq 0.5$$` Theoretical null distribution: a binomial experiment with 500 trials and success probability `\(p=0.5\)` ] .col-6[ <br/> <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-2-1.png" width="100%" style="display: block; margin: auto;" /> ]] --- ## Are the proportions equal? 
.row[.col-7[ Area under the null distribution representing values at least as extreme as our observation <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-3-1.png" width="100%" style="display: block; margin: auto;" /> ] .col-5[ ```r pbinom(214,500,0.5) + (1-pbinom(250+250-214-1,500,0.5)) ``` ``` ## [1] 0.001472458 ``` ]] --- ## The exact binomial test .row[.col-7[ ```r binom.test(x=214, n=500, p=0.5) ``` ``` ## ## Exact binomial test ## ## data: 214 and 500 ## number of successes = 214, number of trials = 500, p-value = 0.001472 ## alternative hypothesis: true probability of success is not equal to 0.5 ## 95 percent confidence interval: ## 0.3841682 0.4726881 ## sample estimates: ## probability of success ## 0.428 ``` ]] --- .row[.col-7[ # The binomial test .pink[with simulated null distribution] ```r # compute the test statistic ( p_hat <- fli_small %>% specify(response = day_hour, success = "morning") %>% calculate(stat = "prop") ) # simulate the null distribution null_distn <- fli_small %>% specify(response = day_hour, success = "morning") %>% hypothesize(null = "point", p = .5) %>% generate(reps = 1000, type="simulate") %>% calculate(stat = "prop") # visualize the null distribution visualize(null_distn) + shade_p_value(obs_stat = p_hat, direction = "two_sided") + theme_minimal() # compute the p-value null_distn %>% get_p_value(obs_stat = p_hat, direction = "two_sided") ``` ] .col-5[ ```r DescTools::BinomCI(214, 500) ``` ``` ## est lwr.ci upr.ci ## [1,] 0.428 0.3853418 0.4717561 ``` <br/> ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 0.428 ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-7-1.png" width="100%" style="display: block; margin: auto;" /> ``` ## # A tibble: 1 x 1 ## p_value ## <dbl> ## 1 0 ``` ]] --- ## The contingency table --- test of independence .row[.col-6[ <br/> Are the two variables season and day_hour independent? 
] .col-6[ If there were no (sampling) uncertainty, we would only need to check whether `$$P(\text{summer})=\frac{P(\text{summer} \land \text{morning})}{P(\text{morning})}$$` Since we are dealing with random variables, we have sampling variability. The question changes to whether we have a statistically significant difference in the proportions of flights in the morning between the two seasons: Does the proportion of flights in the morning depend on the season? ]] --- ## Test of no difference in proportions ### based on random samples of permutations .row[.col-7[ ```r # Compute test statistic (d_hat <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% calculate(stat = "diff in props", order = c("winter", "summer")) ) # Create the null distribution null_distn <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% hypothesize(null = "independence") %>% generate(reps = 1000, type="permute") %>% calculate(stat = "diff in props", order = c("winter", "summer")) # Visualize the null distribution visualize(null_distn) + shade_p_value(obs_stat = d_hat, direction = "two_sided") + theme_minimal() # Compute the p-value null_distn %>% get_p_value(obs_stat = d_hat, direction = "two_sided") ``` ] .col-5[ ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 -0.0183 ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-9-1.png" width="100%" style="display: block; margin: auto;" /> ``` ## # A tibble: 1 x 1 ## p_value ## <dbl> ## 1 0.758 ``` ]] --- .row[.col-8[ ## R: Approximate `\(\chi^2\)` test ### for proportions ```r prop.test(c(108, 106), c(247, 253), correct=FALSE) ``` ``` ## ## 2-sample test for equality of proportions without continuity correction ## ## data: c(108, 106) out of c(247, 253) ## X-squared = 0.17049, df = 1, p-value = 0.6797 ## alternative hypothesis: two.sided ## 95 percent confidence interval: ## -0.06846113 0.10501039 ## sample estimates: ## prop 1 prop 2 ## 0.4372470 0.4189723 ``` ] .col-4[ ] ] .row[.col-8[ ```r 
prop.test(c(108, 106), c(247, 253), correct=TRUE) ``` ``` ## ## 2-sample test for equality of proportions with continuity correction ## ## data: c(108, 106) out of c(247, 253) ## X-squared = 0.10402, df = 1, p-value = 0.7471 ## alternative hypothesis: two.sided ## 95 percent confidence interval: ## -0.0724617 0.1090110 ## sample estimates: ## prop 1 prop 2 ## 0.4372470 0.4189723 ``` ] .col-4[ #### Continuity correction ] ] --- ## R: Approximate `\(\chi^2\)` test for proportions .row[.col-8[ ```r prop.test(c(4, 1), c(5, 5), conf.level=0.95, correct=FALSE) ``` ``` ## ## 2-sample test for equality of proportions without continuity correction ## ## data: c(4, 1) out of c(5, 5) ## X-squared = 3.6, df = 1, p-value = 0.05778 ## alternative hypothesis: two.sided ## 95 percent confidence interval: ## 0.104164 1.000000 ## sample estimates: ## prop 1 prop 2 ## 0.8 0.2 ``` ] .col-4[ .pink[without continuity correction] ] ] .row[.col-8[ ```r prop.test(c(4, 1), c(5, 5), conf.level=0.95, correct=TRUE) ``` ``` ## ## 2-sample test for equality of proportions with continuity correction ## ## data: c(4, 1) out of c(5, 5) ## X-squared = 1.6, df = 1, p-value = 0.2059 ## alternative hypothesis: two.sided ## 95 percent confidence interval: ## -0.09583603 1.00000000 ## sample estimates: ## prop 1 prop 2 ## 0.8 0.2 ``` ] .col-4[ .pink[with continuity correction] <br/><br/> With such a small sample, the continuity correction makes a much bigger difference ] ] --- ## Pearson's `\(\chi^2\)` test .row[.col-5[ `$$X^2 = \sum \frac{(n_{ij}-\mu_{ij})^2}{\mu_{ij}}$$` ** `\(n_{ij}\)` ** Count in cell `\(ij\)` ** `\(\mu_{ij}\)` ** Expected frequency in cell `\(ij\)` when `\(H_0\)` is true ( `\(\mu_{ij} = n\pi_{ij}\)`) ** `\(\pi_{ij}\)` ** Cell probability under `\(H_0\)` (independence), computed from the marginal probabilities: `\(\pi_{ij} = \pi_{i}\times \pi_{j}\)` ] .col-7[ For random sampling and large sample sizes, `\(X^2\)` has approximately a `\(\chi^2\)` distribution. It takes its minimum value of zero when all `\(n_{ij} = \mu_{ij}\)`. 
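To make the formula concrete, here is a sketch in base R that computes `\(X^2\)` by hand for the 2×2 table of season by day_hour used above (counts 108/139 in summer, 106/147 in winter):

```r
# observed counts n_ij: rows = day_hour, columns = season (from the slides above)
n_ij <- matrix(c(108, 139, 106, 147), nrow = 2,
               dimnames = list(day_hour = c("morning", "not morning"),
                               season   = c("summer", "winter")))
n <- sum(n_ij)
# expected frequencies under independence: mu_ij = (row total * column total) / n
mu_ij <- outer(rowSums(n_ij), colSums(n_ij)) / n
# Pearson's X^2 statistic and its chi^2 right-tail p-value with 1 df
X2 <- sum((n_ij - mu_ij)^2 / mu_ij)
round(X2, 5)                             # 0.17049, as in prop.test() without correction
pchisq(X2, df = 1, lower.tail = FALSE)   # p-value ~ 0.68
```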
For a fixed sample size, greater differences `\(n_{ij} - \mu_{ij}\)` produce larger `\(X^2\)` values and stronger evidence against `\(H_0\)`. The P-value is the probability, under `\(H_0\)`, that `\(X^2\)` is at least as large as the observed value. This is the `\(\chi^2\)` right-tail probability above the observed `\(X^2\)` value. The `\(\chi^2\)` approximation improves as the `\(\mu_{ij}\)` increase; `\(\mu_{ij} \geq 5\)` is usually sufficient for a decent approximation. ]] --- ## Degrees of freedom for the `\(\chi^2\)` test .row[.col-5[ There are `\(r − 1\)` nonredundant row probabilities: because all `\(r\)` row probabilities sum to 1, the first `\(r − 1\)` determine the last one. Similarly, there are `\(c − 1\)` nonredundant column probabilities, so, under `\(H_0\)`, there are `\((r − 1) + (c − 1)\)` parameters. ] .col-7[ The alternative hypothesis `\(H_a\)` states that the two variables are not independent but does not specify a pattern for the `\(rc\)` cell probabilities. The probabilities are then solely constrained to sum to 1, so there are `\(rc − 1\)` nonredundant parameters. 
The value for df is the difference between the number of parameters under `\(H_a\)` and `\(H_0,\)` or `\begin{align} \text{df} &= (rc − 1) − [(r − 1) + (c − 1)] \\ &= rc − r − c + 1\\ &= (r − 1)(c − 1) \end{align}` ]] --- .row[.col-8[ ## R: Pearson's `\(\chi^2\)` test ```r chisq.test(ct, correct = F) ``` ``` ## ## Pearson's Chi-squared test ## ## data: ct ## X-squared = 0.17049, df = 1, p-value = 0.6797 ``` ] .col-4[ ``` ## season ## day_hour summer winter ## morning 108 106 ## not morning 139 147 ``` ] ] .row[.col-8[ ```r chisq.test(ct, correct = T) ``` ``` ## ## Pearson's Chi-squared test with Yates' continuity correction ## ## data: ct ## X-squared = 0.10402, df = 1, p-value = 0.7471 ``` ] .col-4[ #### with continuity correction ]] .row[.col-8[ ```r chisq.test(ct, simulate.p.value = T, B=2000) ``` ``` ## ## Pearson's Chi-squared test with simulated p-value (based on 2000 replicates) ## ## data: ct ## X-squared = 0.17049, df = NA, p-value = 0.7346 ``` ] .col-4[ #### Computation of p-value via Monte Carlo Simulation ]] --- ## Test of independence - `\(\chi^2\)` ### based on random samples of permutations .row[.col-7[ ```r # Compute test statistic (d_hat <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% calculate(stat = "Chisq", order = c("winter", "summer")) ) # Create the null distribution null_distn <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% hypothesize(null = "independence") %>% generate(reps = 1000, type="permute") %>% calculate(stat = "Chisq", order = c("winter", "summer")) # Visualize the null distribution visualize(null_distn, method="both") + shade_p_value(obs_stat = d_hat, direction = "right") + theme_minimal() # Compute the p-value null_distn %>% get_p_value(obs_stat = d_hat, direction = "right") ``` ] .col-5[ ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 0.104 ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-20-1.png" width="100%" style="display: block; margin: auto;" /> ``` ## # A tibble: 1 
x 1 ## p_value ## <dbl> ## 1 0.73 ``` ]] --- ## Test of independence - relative risk ### based on random samples of permutations .row[.col-7[ ```r # Compute test statistic (d_hat <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% calculate(stat = "ratio of props", order = c("winter", "summer")) ) # Create the null distribution null_distn <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% hypothesize(null = "independence") %>% generate(reps = 1000, type="permute") %>% calculate(stat = "ratio of props", order = c("winter", "summer")) # Visualize the null distribution visualize(null_distn) + shade_p_value(obs_stat = d_hat, direction = "two_sided") + theme_minimal() # Compute the p-value null_distn %>% get_p_value(obs_stat = d_hat, direction = "two_sided") ``` ] .col-5[ ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 0.958 ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-21-1.png" width="100%" style="display: block; margin: auto;" /> ``` ## # A tibble: 1 x 1 ## p_value ## <dbl> ## 1 0.71 ``` ]] --- ## Test of independence - odds ratio ### based on random samples of permutations .row[.col-7[ ```r # Compute test statistic (d_hat <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% calculate(stat = "odds ratio", order = c("winter", "summer")) ) # Create the null distribution null_distn <- fli_small %>% specify(day_hour ~ season, success = "morning") %>% hypothesize(null = "independence") %>% generate(reps = 1000, type="permute") %>% calculate(stat = "odds ratio", order = c("winter", "summer")) # Visualize the null distribution visualize(null_distn) + shade_p_value(obs_stat = d_hat, direction = "two_sided") + theme_minimal() # Compute the p-value null_distn %>% get_p_value(obs_stat = d_hat, direction = "two_sided") ``` ] .col-5[ ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 0.928 ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-22-1.png" width="100%" style="display: block; margin: auto;" /> 
``` ## # A tibble: 1 x 1 ## p_value ## <dbl> ## 1 0.68 ``` ]] --- ## Fisher's exact test .row[.col-7[ Can you taste whether someone put the milk or the tea first in your cup? Let's run an experiment... ]] .row[.col-4[ ```r # create some data tea <- matrix(c(4, 1, 1, 4), nrow = 2, dimnames = list(Guess = c("Milk", "Tea"), Truth = c("Milk", "Tea"))) tea %>% kbl() ``` <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> Milk </th> <th style="text-align:right;"> Tea </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Milk </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Tea </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4 </td> </tr> </tbody> </table> ] .col-8[ ```r fisher.test(tea) ``` ``` ## ## Fisher's Exact Test for Count Data ## ## data: tea ## p-value = 0.2063 ## alternative hypothesis: true odds ratio is not equal to 1 ## 95 percent confidence interval: ## 0.45474 968.76176 ## sample estimates: ## odds ratio ## 10.9072 ``` ]] --- ## Fisher's exact test .row[.col-7[ When the cell counts result from two independent binomial samples, the row totals (i.e., the binomial n indices) are fixed. Considering only tables that also have the same column totals as the observed data, the cell counts have exactly the hypergeometric distribution for any `\(n\)`, with no parameters. Since there are no unknown parameters, this allows exact rather than approximate or asymptotic P-value calculations. 
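The exact p-value can be reproduced directly from the hypergeometric distribution; a minimal base-R sketch for the tea table above (both margins fixed at 5, observed count 4 in the Milk/Milk cell):

```r
# P(a) for a = 0..5 correct "milk first" identifications, with all margins fixed at 5
p_a <- dhyper(0:5, m = 5, n = 5, k = 5)
obs <- 4
# two-sided p-value: total probability of all tables no more likely than the observed one
(p_value <- sum(p_a[p_a <= p_a[obs + 1]]))   # 0.2063, as in fisher.test(tea)
```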
]] --- class: middle ## Larger Contingency Tables --- ### Two categorical variables, one with more than 2 levels .row[.col-6[ ```r fli_small %>% tabyl(origin,day_hour) %>% adorn_totals(c("row", "col")) %>% adorn_title("combined") %>% kableExtra::kbl(format="pipe") ``` |origin/day_hour | morning| not morning| Total| |:---------------|-------:|-----------:|-----:| |EWR | 79| 86| 165| |JFK | 57| 116| 173| |LGA | 78| 84| 162| |Total | 214| 286| 500| <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-25-1.png" width="95%" style="display: block; margin: auto;" /> ] .col-6[ ```r fli_small %>% ggplot() + geom_bar(aes(y=origin,fill=day_hour), position="fill") + scale_fill_grey() + guides(fill = guide_legend(reverse = TRUE)) + theme(legend.position="top") ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-26-1.png" width="95%" style="display: block; margin: auto;" /> ]] --- ### `\(\chi^2\)` test for independence ### of 2 categorical variables, one with more than 2 levels .row[.col-7[ ```r # Compute test statistic ( Chisq_hat <- fli_small %>% specify(formula = day_hour ~ origin) %>% calculate(stat = "Chisq") ) # Create null distribution null_distn <- fli_small %>% specify(day_hour ~ origin) %>% hypothesize(null = "independence") %>% generate(reps = 1000, type = "permute") %>% calculate(stat = "Chisq") # Visualize null distribution visualize(null_distn) + shade_p_value( obs_stat = Chisq_hat, direction = "greater") # Compute p-value null_distn %>% get_p_value( obs_stat = Chisq_hat, direction = "greater") ``` ] .col-5[ ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 10.5 ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-27-1.png" width="95%" style="display: block; margin: auto;" /> ``` ## # A tibble: 1 x 1 ## p_value ## <dbl> ## 1 0.006 ``` ]] --- ### `\(\chi^2\)` test for independence ### of 2 categorical variables, one with more than 2 levels .row[.col-8[ ```r (tab <- fli_small %>% tabyl(origin,day_hour) %>% select(morning, "not morning") %>% 
as.matrix() ) ``` ] .col-4[ ``` ## morning not morning ## [1,] 79 86 ## [2,] 57 116 ## [3,] 78 84 ``` ]] .row[.col-8[ ```r chisq.test(tab) ``` ``` ## ## Pearson's Chi-squared test ## ## data: tab ## X-squared = 10.49, df = 2, p-value = 0.005274 ``` ] .col-4[ #### Data in matrix <br/> <br/> Here: 1st column: counts of morning flights; 2nd column: counts of non-morning flights ]] --- ### Test for equality of proportions with more than 2 groups .row[.col-8[ ```r (tab <- fli_small %>% tabyl(origin,day_hour) %>% select(morning, "not morning") %>% as.matrix() ) ``` ] .col-4[ ``` ## morning not morning ## [1,] 79 86 ## [2,] 57 116 ## [3,] 78 84 ``` ]] .row[.col-8[ ```r prop.test(tab) ``` ``` ## ## 3-sample test for equality of proportions without continuity correction ## ## data: tab ## X-squared = 10.49, df = 2, p-value = 0.005274 ## alternative hypothesis: two.sided ## sample estimates: ## prop 1 prop 2 prop 3 ## 0.4787879 0.3294798 0.4814815 ``` ] .col-4[ #### Data in matrix <br/> <br/> 1st column: counts of successes; 2nd column: counts of failures ]] .row[.col-8[ ```r prop.test(x=c(79, 57, 78), n=c(165,173,162)) ``` ``` ## ## 3-sample test for equality of proportions without continuity correction ## ## data: c(79, 57, 78) out of c(165, 173, 162) ## X-squared = 10.49, df = 2, p-value = 0.005274 ## alternative hypothesis: two.sided ## sample estimates: ## prop 1 prop 2 prop 3 ## 0.4787879 0.3294798 0.4814815 ``` ] .col-4[ #### Data in vectors <br/> <br/> x: vector of success counts; n: vector of trial counts ] ] --- ### `\(\chi^2\)` tests also work for larger `\(n\times k\)` contingency tables (here: 4x4) .row[.col-6[ ```r (Job <- matrix(c(1, 2, 1, 0, 3, 3, 6, 1, 10, 10, 14, 9, 6, 7, 12, 11), 4, 4, dimnames = list( income = c("< 15k", "15-25k", "25-40k", "> 40k"), satisfaction = c("VeryD", "LittleD", "ModerateS", "VeryS") ))) ``` ] .col-6[ ``` ## satisfaction ## income VeryD LittleD ModerateS VeryS ## < 15k 1 3 10 6 ## 15-25k 2 3 10 7 ## 25-40k 1 6 14 12 ## > 40k 0 1 9 11 ``` ]] .row[.col-6[ #### Fisher's exact test ```r fisher.test(Job) ``` ``` ## ## Fisher's Exact Test for Count Data ## ## data: Job ## p-value = 0.7827 ## alternative hypothesis: two.sided ``` ] .col-6[ #### Asymptotic `\(\chi^2\)` test ```r chisq.test(Job) ``` ``` ## ## Pearson's Chi-squared test ## ## data: Job ## X-squared = 5.9655, df = 9, p-value = 0.7434 ``` ]] --- .row[.col-6[ #### Contingency table ``` ## satisfaction ## income VeryD LittleD ModerateS VeryS ## < 15k 1 3 10 6 ## 15-25k 2 3 10 7 ## 25-40k 1 6 14 12 ## > 40k 0 1 9 11 ``` #### Expected counts ```r chisq.test(Job)$expected %>% round(1) ``` ``` ## satisfaction ## income VeryD LittleD ModerateS VeryS ## < 15k 0.8 2.7 9.0 7.5 ## 15-25k 0.9 3.0 9.9 8.2 ## 25-40k 1.4 4.5 14.8 12.4 ## > 40k 0.9 2.8 9.4 7.9 ``` Some expected counts are below 5, so the `\(\chi^2\)` approximation may be poor. 
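When the approximation is in doubt, one remedy is an exact or simulated reference distribution instead of the asymptotic one. A sketch reusing the `simulate.p.value` option of `chisq.test()` (the seed is an arbitrary choice for reproducibility):

```r
# Job satisfaction by income table, as defined on the previous slide
Job <- matrix(c(1, 2, 1, 0, 3, 3, 6, 1, 10, 10, 14, 9, 6, 7, 12, 11), 4, 4,
              dimnames = list(income = c("< 15k", "15-25k", "25-40k", "> 40k"),
                              satisfaction = c("VeryD", "LittleD", "ModerateS", "VeryS")))
set.seed(1)                                          # arbitrary seed for reproducibility
chisq.test(Job, simulate.p.value = TRUE, B = 10000)  # Monte Carlo p-value; X-squared = 5.9655 is unchanged
fisher.test(Job)$p.value                             # exact p-value: 0.7827
```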
] .col-6[ ```r Job.data %>% ggplot() + geom_bar(aes( y = income, fill = fct_rev(satisfaction)), position = "fill") + guides(fill = guide_legend(reverse = TRUE)) + scale_fill_colorblind() + labs(fill= "Satisfaction", x=NULL) ``` <img src="08.inference.cat-2_files/figure-html/job-fig-1.png" width="95%" style="display: block; margin: auto;" /> ]] --- ### R: Creating a long table from a contingency table ### stored as a matrix .row[.col-6[ ```r # create data as.table(Job) %>% as.data.frame() ``` ``` ## income satisfaction Freq ## 1 < 15k VeryD 1 ## 2 15-25k VeryD 2 ## 3 25-40k VeryD 1 ## 4 > 40k VeryD 0 ## 5 < 15k LittleD 3 ## 6 15-25k LittleD 3 ## 7 25-40k LittleD 6 ## 8 > 40k LittleD 1 ## 9 < 15k ModerateS 10 ## 10 15-25k ModerateS 10 ## 11 25-40k ModerateS 14 ## 12 > 40k ModerateS 9 ## 13 < 15k VeryS 6 ## 14 15-25k VeryS 7 ## 15 25-40k VeryS 12 ## 16 > 40k VeryS 11 ``` ] .col-6[ ```r (Job.data <- as.table(Job) %>% as.data.frame() %>% uncount(Freq) ) ``` ``` ## income satisfaction ## 1 < 15k VeryD ## 2 15-25k VeryD ## 3 15-25k VeryD ## 4 25-40k VeryD ## 5 < 15k LittleD ## 6 < 15k LittleD ## 7 < 15k LittleD ## 8 15-25k LittleD ## 9 15-25k LittleD ## 10 15-25k LittleD ## 11 25-40k LittleD ## 12 25-40k LittleD ## 13 25-40k LittleD ## 14 25-40k LittleD ## 15 25-40k LittleD ## 16 25-40k LittleD ## 17 > 40k LittleD ## 18 < 15k ModerateS ## 19 < 15k ModerateS ## 20 < 15k ModerateS ## 21 < 15k ModerateS ## 22 < 15k ModerateS ## 23 < 15k ModerateS ## 24 < 15k ModerateS ## 25 < 15k ModerateS ## 26 < 15k ModerateS ## 27 < 15k ModerateS ## 28 15-25k ModerateS ## 29 15-25k ModerateS ## 30 15-25k ModerateS ## 31 15-25k ModerateS ## 32 15-25k ModerateS ## 33 15-25k ModerateS ## 34 15-25k ModerateS ## 35 15-25k ModerateS ## 36 15-25k ModerateS ## 37 15-25k ModerateS ## 38 25-40k ModerateS ## 39 25-40k ModerateS ## 40 25-40k ModerateS ## 41 25-40k ModerateS ## 42 25-40k ModerateS ## 43 25-40k ModerateS ## 44 25-40k ModerateS ## 45 25-40k ModerateS ## 46 25-40k ModerateS ## 47 25-40k 
ModerateS ## 48 25-40k ModerateS ## 49 25-40k ModerateS ## 50 25-40k ModerateS ## 51 25-40k ModerateS ## 52 > 40k ModerateS ## 53 > 40k ModerateS ## 54 > 40k ModerateS ## 55 > 40k ModerateS ## 56 > 40k ModerateS ## 57 > 40k ModerateS ## 58 > 40k ModerateS ## 59 > 40k ModerateS ## 60 > 40k ModerateS ## 61 < 15k VeryS ## 62 < 15k VeryS ## 63 < 15k VeryS ## 64 < 15k VeryS ## 65 < 15k VeryS ## 66 < 15k VeryS ## 67 15-25k VeryS ## 68 15-25k VeryS ## 69 15-25k VeryS ## 70 15-25k VeryS ## 71 15-25k VeryS ## 72 15-25k VeryS ## 73 15-25k VeryS ## 74 25-40k VeryS ## 75 25-40k VeryS ## 76 25-40k VeryS ## 77 25-40k VeryS ## 78 25-40k VeryS ## 79 25-40k VeryS ## 80 25-40k VeryS ## 81 25-40k VeryS ## 82 25-40k VeryS ## 83 25-40k VeryS ## 84 25-40k VeryS ## 85 25-40k VeryS ## 86 > 40k VeryS ## 87 > 40k VeryS ## 88 > 40k VeryS ## 89 > 40k VeryS ## 90 > 40k VeryS ## 91 > 40k VeryS ## 92 > 40k VeryS ## 93 > 40k VeryS ## 94 > 40k VeryS ## 95 > 40k VeryS ## 96 > 40k VeryS ``` ]] --- ## Permutation-based `\(\chi^2\)` test for larger contingency tables .row[.col-6[ ```r # Compute test statistic ( Chisq_hat <- Job.data %>% specify(formula = satisfaction ~ income) %>% calculate(stat = "Chisq") ) # Create null distribution null_distn <- Job.data %>% specify(formula = satisfaction ~ income) %>% hypothesize(null = "independence") %>% generate(reps = 1000, type = "permute") %>% calculate(stat = "Chisq") # Visualize null distribution visualize(null_distn) + shade_p_value( obs_stat = Chisq_hat, direction = "greater") # Compute p-value null_distn %>% get_p_value( obs_stat = Chisq_hat, direction = "greater") ``` ] .col-6[ ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 5.97 ``` <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-42-1.png" width="60%" style="display: block; margin: auto;" /> ``` ## # A tibble: 1 x 1 ## p_value ## <dbl> ## 1 0.776 ``` ]] --- class: middle ## Paired Categorical Data - McNemar Test --- ## McNemar test for paired categorical data .row[.col-7[ The McNemar test is 
applied to a `\(2\times 2\)` contingency table, which tabulates the outcomes of two tests on a sample of `\(N\)` subjects: ||Test 2 positive| Test 2 negative| Row total| |---|---|---|---| |Test 1 positive| a| b |a + b| |Test 1 negative| c| d |c + d| |Column total| a + c |b + d|N| The null hypothesis of marginal homogeneity states that the two marginal probabilities for each outcome are the same: `\(p_a + p_b = p_a + p_c\)` and `\(p_c + p_d = p_b + p_d\)`. Therefore: `\begin{align*} H_{0}&:~p_{b}=p_{c}\\ H_{1}&:~p_{b}\neq p_{c} \end{align*}` ] .col-5[ The McNemar test statistic (with continuity correction) is: `$$\chi^2= \frac{(|b-c|-1)^2}{b+c}.$$` If either `\(b\)` or `\(c\)` is small (`\(b + c < 25\)`), then `\(\chi^2\)` is not well approximated by the `\(\chi^2\)` distribution with 1 degree of freedom. ]] --- ## R: McNemar test for paired categorical data .row[.col-7[ ```r (tab <- xtabs(~y1 + y2, data = Opinions)) ``` ``` ## y2 ## y1 1 2 ## 1 227 132 ## 2 107 678 ``` ]] .row[.col-7[ ```r mcnemar.test(tab, correct=FALSE) ``` ``` ## ## McNemar's Chi-squared test ## ## data: tab ## McNemar's chi-squared = 2.6151, df = 1, p-value = 0.1059 ``` ] .col-5[ #### McNemar's test without continuity correction ] ] .row[.col-7[ ```r mcnemar.test(tab, correct=T) ``` ``` ## ## McNemar's Chi-squared test with continuity correction ## ## data: tab ## McNemar's chi-squared = 2.41, df = 1, p-value = 0.1206 ``` ] .col-5[ #### McNemar's test with continuity correction ] ] --- ## McNemar's test approximates a binomial test .row[.col-6[ <img src="08.inference.cat-2_files/figure-html/unnamed-chunk-47-1.png" width="85%" style="display: block; margin: auto;" /> ```r binom.test(107,107+132,0.5) ``` ``` ## ## Exact binomial test ## ## data: 107 and 107 + 132 ## number of successes = 107, number of trials = 239, p-value = 0.1204 ## alternative hypothesis: true probability of success is not equal to 0.5 ## 95 percent confidence interval: ## 0.3835629 0.5131480 ## sample estimates: ## probability of success ## 
0.4476987 ``` ] .col-6[ ```r pbinom(107,107+132,.5) + 1-pbinom(132-1,107+132,.5) ``` ``` ## [1] 0.1203767 ``` ```r 2*pbinom(107,107+132,.5) ``` ``` ## [1] 0.1203767 ``` Since the binomial test allows us to compute an exact p-value without requiring any additional distributional assumptions -- we know the process is a binomial experiment -- we gain nothing by applying a non-parametric permutation test (which yields approximate p-values). ]]
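As a check, the McNemar statistics above can be reproduced by hand from the discordant counts `\(b = 132\)` and `\(c = 107\)`; a base-R sketch (the matrix restates the Opinions table shown earlier):

```r
tab <- matrix(c(227, 107, 132, 678), nrow = 2)   # same counts as xtabs(~y1 + y2, Opinions)
b <- tab[1, 2]; cc <- tab[2, 1]                  # discordant pairs
x2  <- (b - cc)^2 / (b + cc)                     # 2.6151, without continuity correction
x2c <- (abs(b - cc) - 1)^2 / (b + cc)            # 2.41, with continuity correction
pchisq(c(x2, x2c), df = 1, lower.tail = FALSE)   # p-values 0.1059 and 0.1206
```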