Power Curve in R

Cinni Patel
6 min read · Feb 12, 2021

Power curves are line plots that show how changes in variables such as effect size and sample size affect the power of a statistical test.

In this tutorial, we will generate and interpret power curves.

We can use the pwr package to perform statistical power analysis in R.

This package provides power analyses for many experiment and study types. They share a common approach: supply three of the four parameters (sample size, effect size, significance level, and power) and the package calculates the fourth.
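As a minimal sketch of this "three in, one out" pattern, the call below leaves `n` unspecified so that `pwr.t.test` solves for it. The effect size of 0.5 is an illustrative value chosen here, not one taken from this tutorial.

```r
library(pwr)

# Leave n out: pwr.t.test solves for the per-group sample size
# given the effect size, significance level, and power.
res <- pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.80,
                  type = "two.sample", alternative = "two.sided")

res$n           # required sample size per group (not rounded)
ceiling(res$n)  # round up to a whole number of subjects
```

The same pattern works for the other parameters: omit `power` to compute power, or omit `d` to find the detectable effect size.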

library(tidyverse)
## -- Attaching packages -------------------------------------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.0 v purrr 0.3.3
## v tibble 2.1.3 v dplyr 0.8.4
## v tidyr 1.0.2 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
library(pwr)

Let’s understand: what is power?

The power of a hypothesis test is the probability that the test correctly rejects the null hypothesis when the null hypothesis is false. Power is affected by the sample size, the size of the difference (the effect size), the variability of the data, and the significance level of the test.

In R, it is fairly straightforward to perform a power analysis for the two-sample t-test using the pwr.t.test function from the pwr package.
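One way to make this definition concrete is to estimate power by simulation: draw many datasets in which the null hypothesis is false, run the test on each, and count how often it rejects. The sketch below assumes normally distributed data with unit variance and uses illustrative values for `n`, `d`, and the number of simulations; it then compares the estimate with the analytic value from `pwr.t.test`.

```r
library(pwr)

set.seed(1)        # arbitrary seed, for reproducibility only
n <- 25            # per-group sample size (illustrative)
d <- 0.8           # true standardized mean difference (illustrative)
n_sims <- 2000     # number of simulated experiments

# TRUE whenever the pooled-variance t-test rejects at alpha = 0.05
rejected <- replicate(n_sims, {
  x <- rnorm(n, mean = 0, sd = 1)
  y <- rnorm(n, mean = d, sd = 1)
  t.test(x, y, var.equal = TRUE)$p.value < 0.05
})

simulated_power <- mean(rejected)
analytic_power  <- pwr.t.test(n = n, d = d, sig.level = 0.05,
                              type = "two.sample")$power

c(simulated = simulated_power, analytic = analytic_power)
```

The two numbers should agree up to simulation noise, which shrinks as `n_sims` grows.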

1. Generate and interpret the power curve for a t-test with a fixed sample size of 25 per group.

Let’s create a function that generates the power curve for a two-sample t-test at a given per-group sample size, with a significance level of α = 5%.

power.curve <- function(n) {

  cd <- seq(.1, 1.5, .1)  # vector of effect sizes (Cohen's d)

  # Power of a two-sample t-test at each effect size, alpha = 0.05
  power <- sapply(cd, function(d) {
    pwr.t.test(d = d, n = n, sig.level = .05, type = "two.sample")$power
  })
  samp.out <- data.frame(effect.size = cd, power = power)

  ggplot(samp.out, aes(effect.size, power)) +
    geom_line() +
    geom_point() +
    theme_minimal() +
    geom_hline(yintercept = .8, lty = 2, color = 'blue') +  # 80% power reference line
    labs(title = paste0("t-test Power Curve for n=", n),
         x = "Cohen's d",
         y = "Power")
}

Call the function to generate the power curve for a sample size of 25

n <- 25
power.curve(n)

Interpretation: The power curve for a sample size of 25 per group shows that the test reaches a power of about 0.8 at an effect size (Cohen's d) of about 0.8. As the effect size approaches 0, the power decreases toward α, the significance level, which is 0.05 in this analysis.
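Reading the crossing point off a plot is only approximate. As a hedged sketch, the same two facts can be checked numerically: omitting `d` asks `pwr.t.test` for the effect size that gives 80% power, and a very small `d` (0.01 is an arbitrary choice here) shows the power settling near α.

```r
library(pwr)

# Effect size at which n = 25 per group reaches 80% power
d_at_80 <- pwr.t.test(n = 25, power = 0.80, sig.level = 0.05,
                      type = "two.sample")$d
d_at_80

# Power for a very small effect: close to the significance level
power_near_zero <- pwr.t.test(n = 25, d = 0.01, sig.level = 0.05,
                              type = "two.sample")$power
power_near_zero
```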

2. Generate and interpret the power curve for a t-test with a fixed sample size of 100 per group.

Call the function (created for problem 1) with a sample size of 100

n <- 100
power.curve(n)

Interpretation: The power curve for a sample size of 100 per group shows that the test reaches a power of about 0.8 at an effect size (Cohen's d) of about 0.4. With four times the sample size, the test detects an effect half as large with the same power. As the effect size approaches 0, the power decreases toward α, the significance level, which is 0.05 in this analysis.
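To compare the two sample sizes directly, both curves can be drawn on one plot. This is a sketch rather than part of the original exercise; the names `grid` and `sample_sizes` are introduced here for illustration.

```r
library(pwr)
library(ggplot2)

effect_sizes <- seq(0.1, 1.5, 0.1)
sample_sizes <- c(25, 100)

# One row per (effect size, sample size) combination
grid <- expand.grid(effect.size = effect_sizes, n = sample_sizes)
grid$power <- mapply(
  function(d, n) pwr.t.test(d = d, n = n, sig.level = 0.05,
                            type = "two.sample")$power,
  grid$effect.size, grid$n
)

# One curve per sample size, with the 80% power reference line
p <- ggplot(grid, aes(effect.size, power, color = factor(n))) +
  geom_line() +
  geom_point() +
  geom_hline(yintercept = 0.8, lty = 2) +
  theme_minimal() +
  labs(x = "Cohen's d", y = "Power", color = "n per group")
p
```

The curve for n = 100 lies above the curve for n = 25 at every effect size, which is the visual counterpart of the two interpretations above.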

3. Generate and interpret the power curve for a 2 proportion test with a fixed sample size of 30 per group

Create a function that generates the power curve for a 2-proportion test at a given per-group sample size

power.p.curve <- function(n) {

  cd <- seq(.1, 1.5, .1)  # vector of effect sizes (Cohen's h)

  # Power of a two-proportion test at each effect size, alpha = 0.05
  power <- sapply(cd, function(h) {
    pwr.2p.test(h = h, n = n, sig.level = .05)$power
  })
  samp.p.out <- data.frame(effect.size = cd, power = power)

  ggplot(samp.p.out, aes(effect.size, power)) +
    geom_line() +
    geom_point() +
    theme_minimal() +
    geom_hline(yintercept = .8, lty = 2, color = 'blue') +  # 80% power reference line
    labs(title = paste0("2 proportion test power curve for n=", n),
         subtitle = "Two proportions",
         x = "Cohen's h",
         y = "Power")
}

Call the function to generate the power curve for the 2-proportion test with a sample size of 30

n <- 30
power.p.curve(n)

Interpretation: The power curve for a sample size of 30 per group shows that the 2 proportion test reaches a power of 0.8 at an effect size (Cohen's h) of about 0.72. As the effect size approaches 0, the power decreases toward α, the significance level, which is 0.05 in this analysis.
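The effect size used by `pwr.2p.test` is Cohen's h, the difference between the arcsine-transformed proportions, not the raw difference in proportions. As a small sketch, `ES.h` from the pwr package converts two proportions into h; the proportions 0.75 and 0.50 below are hypothetical values chosen for illustration.

```r
library(pwr)

# Hypothetical proportions in the two groups (assumption, not from the tutorial)
p1 <- 0.75
p2 <- 0.50

# Cohen's h via the pwr helper, and the same quantity by hand
h        <- ES.h(p1, p2)
h_manual <- 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))
c(h = h, h_manual = h_manual)

# Power to detect this difference with 30 subjects per group
pwr.2p.test(h = h, n = 30, sig.level = 0.05)$power
```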

4. Generate and interpret the power curve for a 2 proportion test with a fixed sample size of 50 per group.

Call the function (created for problem 3) with a sample size of 50

n <- 50
power.p.curve(n)

Interpretation: The power curve for a sample size of 50 per group shows that the 2 proportion test reaches a power of 0.8 at an effect size (Cohen's h) of about 0.56. As the effect size approaches 0, the power decreases toward α, the significance level, which is 0.05 in this analysis.
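Because the curves are drawn on a grid with steps of 0.1, the crossing point with the 80% line can only be read off approximately. A short sketch of the numeric alternative: omit `h` and let `pwr.2p.test` solve for it at each sample size.

```r
library(pwr)

sample_sizes <- c(30, 50)

# Effect size (Cohen's h) detectable with 80% power at alpha = 0.05
h_at_80 <- sapply(sample_sizes, function(n) {
  pwr.2p.test(n = n, power = 0.80, sig.level = 0.05)$h
})

data.frame(n = sample_sizes, h = round(h_at_80, 2))
```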

The plots for problems 5–7 are slightly different since we have fixed power at 80%. Think about what values you will use for the x-axis and which values you will use for the y-axis.

5. Generate and interpret the power curve for a t-test with a fixed sample size of 50 per group, power of 80% for values of the significance level between 0.01 and 0.10.

Here, power is fixed at 80% and the sample size at 50 per group, while the significance level varies from 0.01 to 0.10. For each significance level, we solve for the effect size the test can detect and plot one against the other.

  • X axis: effect size
  • Y axis: significance level
sig.level.list <- seq(.01, 0.10, .01)  # vector of significance levels

# Detectable effect size at each significance level (power = 0.80, n = 50 per group)
effect.size <- sapply(sig.level.list, function(a) {
  pwr.t.test(power = .80, sig.level = a, n = 50, type = "two.sample")$d
})
samp.out <- data.frame(sig.level = sig.level.list, effect.size = effect.size)

ggplot(samp.out, aes(effect.size, sig.level)) +
  geom_line() +
  geom_point() +
  geom_hline(yintercept = .05, lty = 2, color = 'blue') +  # alpha = 0.05 reference line
  theme_minimal() +
  labs(title = "Significance level vs effect size, power=0.80, n=50",
       y = "Significance Level",
       x = "Cohen's d")

Interpretation: With 50 per group and power fixed at 80%, the t-test can detect an effect size of about 0.57 at a significance level of 0.05, which is a medium effect by Cohen's conventions. Detecting a smaller effect at the same power and significance level requires a larger sample size.
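As a cross-check that does not depend on the pwr package, base R's `power.t.test` (from the stats package) can solve the same problem. The sketch below assumes a standard deviation of 1, so that its `delta` is on the same scale as Cohen's d.

```r
library(pwr)

# Detectable effect size from pwr (Cohen's d)
d_pwr <- pwr.t.test(n = 50, power = 0.80, sig.level = 0.05,
                    type = "two.sample")$d

# Same question in base R: with sd = 1, delta is a standardized difference
d_base <- power.t.test(n = 50, power = 0.80, sd = 1,
                       sig.level = 0.05, type = "two.sample")$delta

c(pwr = d_pwr, base = d_base)
```

The two values should agree closely; small differences can arise because `power.t.test` by default ignores rejections in the opposite tail.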

6. Generate and interpret the power curve for a two proportion test with a fixed sample size of 60 per group, power of 80% for values of the significance level between 0.01 and 0.10.

sig.level.list <- seq(.01, 0.10, .01)  # vector of significance levels

# Detectable effect size at each significance level (power = 0.80, n = 60 per group)
effect.size <- sapply(sig.level.list, function(a) {
  pwr.2p.test(power = .80, sig.level = a, n = 60)$h
})
samp.p.out <- data.frame(sig.level = sig.level.list, effect.size = effect.size)

ggplot(samp.p.out, aes(effect.size, sig.level)) +
  geom_line() +
  geom_point() +
  geom_hline(yintercept = .05, lty = 2, color = 'blue') +  # alpha = 0.05 reference line
  theme_minimal() +
  labs(title = "Significance level vs effect size, power=0.80, n=60",
       subtitle = "Two proportions",
       y = "Significance Level",
       x = "Cohen's h")

Interpretation: With 60 per group and power fixed at 80%, the 2 proportion test can detect an effect size (Cohen's h) of about 0.51 at a significance level of 0.05, which is a medium effect by Cohen's conventions. Detecting a smaller effect at the same power and significance level requires a larger sample size.
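An effect size on the h scale is easier to interpret once it is translated back into proportions. The sketch below assumes a hypothetical baseline proportion of 0.50 (not a value from this tutorial) and inverts the definition of h to find the second proportion that would be detectable.

```r
library(pwr)

p1 <- 0.50   # hypothetical baseline proportion (assumption)

# Detectable effect size for n = 60 per group, power = 0.80, alpha = 0.05
h <- pwr.2p.test(n = 60, power = 0.80, sig.level = 0.05)$h

# Invert h = 2*asin(sqrt(p2)) - 2*asin(sqrt(p1)) to solve for p2
p2 <- sin(asin(sqrt(p1)) + h / 2)^2

c(h = h, p1 = p1, p2 = p2)

# Check: converting the two proportions back recovers h
ES.h(p2, p1)
```

Because h is based on a nonlinear transform, the same h corresponds to different raw differences depending on the baseline proportion.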

7. Generate and interpret the power curve for a t-test with power of 80%, effect size of 0.7 for values of the significance level between 0.01 and 0.10.

sig.level.list <- seq(.01, 0.10, .01)  # vector of significance levels

# Required per-group sample size at each significance level (power = 0.80, d = 0.7)
sample.size <- sapply(sig.level.list, function(a) {
  pwr.t.test(power = .80, sig.level = a, d = .7, type = "two.sample")$n
})
samp.out <- data.frame(sig.level = sig.level.list, sample.size = sample.size)

ggplot(samp.out, aes(sample.size, sig.level)) +
  geom_line() +
  geom_point() +
  geom_hline(yintercept = .05, lty = 2, color = 'blue') +  # alpha = 0.05 reference line
  theme_minimal() +
  labs(title = "t-test Power Curve for Significance level vs Sample Size, power=0.80, effect=.7",
       y = "Significance Level",
       x = "Sample size")

Interpretation: For a power of 0.80 and an effect size of 0.7, the required sample size rises from roughly 26 per group at α = 0.10 to roughly 33 per group at α = 0.05, and to roughly 49 per group at α = 0.01. In other words, if we want a lower risk of rejecting the null hypothesis when it is actually true (a smaller Type I error rate), we need a larger sample size to keep the same power.
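Since a study cannot recruit a fraction of a subject, the solved sample sizes are usually rounded up. A brief sketch of that step for three of the significance levels on the plot:

```r
library(pwr)

alphas <- c(0.10, 0.05, 0.01)

# Required per-group sample size for d = 0.7 and 80% power
n_required <- sapply(alphas, function(a) {
  pwr.t.test(d = 0.7, power = 0.80, sig.level = a,
             type = "two.sample")$n
})

# Round up: rounding down would leave the test slightly underpowered
data.frame(sig.level = alphas, n.per.group = ceiling(n_required))
```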
