Power analysis (English)
This text is an edited version of the AMC sample size calculation manual [1]. It provides a practical guide to the sample size calculations used in clinical research. After reading the manual, a researcher will know:
- why power analysis is used to plan and evaluate medical research
- what power and statistical significance mean
- what information is needed for a sample size calculation
- where to find the information needed
- how to perform a simple sample size calculation
- how to write down a power calculation.
In addition, the manual contains two practical examples of sample size calculations.
Why perform a sample size calculation?
The main reasons for performing a sample size calculation are ethical. If the number of subjects tested in a study is too small to detect the effect being investigated, the subjects are exposed to the risks of participating in the study in vain, and the study will easily result in a false negative conclusion. On the other hand, testing too many subjects may also lead to undesirable situations. If an intervention turns out to be effective, too many subjects have missed out on this intervention. If the intervention is not effective, too many have been exposed to this ineffective intervention. For these reasons, researchers planning a trial should always consider what number of subjects would be appropriate to answer the study question. A sample size calculation prior to a study helps to focus on the number of subjects that is both needed and sufficient. Moreover, it helps one to focus on a clinically relevant effect, instead of following the erroneous strategy of testing as many subjects as needed to reach statistical significance for an irrelevant effect.
The CONSORT statement (a guideline for reporting clinical trials) states that a researcher should calculate the study size beforehand and should report this calculation in the methods section of the resulting scientific paper. The AMC Medical Ethics Board (MEC) and the Animal Experiments Committee (DEC) also ask for a power calculation in the approval process. The same holds for most study grant applications (e.g., ZonMW).
Finally, the logistic planning of a study benefits from a sample size calculation.
Power and statistical significance
The term ‘power’ pops up everywhere in medical research, certainly in sample size calculations. Often, the term is interpreted as a synonym for the number of patients tested in a study. ‘Our study did not have enough power to control for possible confounders’ is understood as ‘you didn’t test enough patients to account for several effects’. ‘Our study had 80% power to detect an OR of 1.1 at a significance level of 5%’ is understood as ‘you have tested enough patients to pick up a possible effect’. Although these interpretations are not (absolutely) wrong, we need to understand the exact meaning of power in order to use the concept in a sample size calculation.
Formally: the power of a study testing the null hypothesis H0 against the alternative hypothesis H1 is the probability that the test (based on a sample from the population) rejects H0, given that H0 is false (in the whole population). So the power is the chance of correctly rejecting a null hypothesis, that is, of rejecting a null hypothesis that should be rejected. In most tests H0 is stated as ‘no difference between groups’ or ‘no effect of the intervention’, for example H0 = ‘no difference in survival between treated and control group’, so rejecting H0 means you have reason to believe there is a difference. In other words, the power reflects the ability to pick up an effect that is present in a population using a test based on a sample from that population (true positive).
The power of a study is closely related to the so-called type II error (β), the probability of falsely accepting H0 (failing to reject H0 although an effect exists). The power of a study is 1 − β, so it is the probability of rightfully rejecting H0 (see Table 1). The table also shows the significance level α. Alpha is the probability of falsely rejecting H0, i.e., falsely picking up an effect (false positive). Note that α only concerns situations in which no true effect exists in the population.
In a sample size calculation one determines the number of patients needed to test the hypothesis with large enough power and small enough significance level. In this way one protects oneself against false negative and false positive conclusions.
Table 1: Possible conclusions and errors of a study in relation to the truth.
| Study conclusion | Whole population: effect exists (H1 is true) | Whole population: no effect exists (H0 is true) |
|---|---|---|
| Effect observed (H1 appears true) | True positive: power (1 − β) | False positive: type I error (α) |
| No effect observed (H0 appears true) | False negative: type II error (β) | True negative (1 − α) |
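The relation between power, α and the number of subjects can be made concrete with a small calculation. The sketch below is an illustration and is not taken from the manual: it assumes a two-sided comparison of two group means with equal group sizes and a common, known standard deviation (normal approximation). The function name `power_two_sample_z` and the example values (a difference of 5 with a standard deviation of 10) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample_z(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two group means.

    Assumptions (illustrative, not from the manual): equal group sizes,
    a common known standard deviation `sd`, and a true difference in
    means of `delta` in the whole population.
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)     # critical value of the test
    se = sd * sqrt(2 / n_per_group)       # standard error of the difference
    shift = abs(delta) / se               # true difference in standard errors
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Power grows with the number of subjects per group.
    for n in (20, 40, 63, 100):
        print(n, round(power_two_sample_z(delta=5, sd=10, n_per_group=n), 3))
    # Without a true effect, the rejection probability equals alpha.
    print("no true effect:", round(power_two_sample_z(0, 10, 63), 3))
```

With a true difference of zero the function returns α, which mirrors the right-hand column of Table 1: when no effect exists, the only way to 'observe' an effect is a type I error.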
Information required to calculate a sample size
To make a sample size calculation based on the power of a study, one needs information about each of the following:
- Desired power of the study (1 − β): How much power do you want in the study? Or, stated differently, how certain do you want to be of preventing a type II error?
- Desired significance level (α): How certain do you want to be of preventing a type I error?
- Desired test direction: One-sided or two-sided test?
- Clinically relevant (or expected) difference: Which difference or which effect are you trying to find?
- Expected variance / standard deviation: How much variation is expected among subjects belonging to the same study group?
- Test to be used in the statistical analysis: How will the hypothesis test be performed in the analysis phase of the study?
- Attrition rate: Anticipate the number of included subjects who will not be available for the study analysis.
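To show how these inputs combine, the following minimal sketch uses the common normal-approximation formula for comparing two means with equal group sizes, n per group = 2 (z_α + z_β)² σ² / δ², and then inflates the result for attrition. This formula is an assumption chosen for illustration; the manual may use a different method, and other tests (proportions, survival, paired data) need other formulas. The function name `sample_size_two_means` and all parameter names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_means(delta, sd, power=0.80, alpha=0.05,
                          two_sided=True, attrition=0.0):
    """Subjects to include per group for comparing two means.

    Illustrative normal-approximation sketch (equal group sizes,
    common standard deviation); not the manual's own procedure.
    delta     -- clinically relevant (or expected) difference in means
    sd        -- expected standard deviation within each group
    power     -- desired power, 1 - beta
    alpha     -- desired significance level
    two_sided -- test direction
    attrition -- expected fraction of included subjects lost to analysis
    """
    if delta == 0:
        raise ValueError("delta must be non-zero")
    if not 0 <= attrition < 1:
        raise ValueError("attrition must be in [0, 1)")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)
    # Subjects per group that must be available for the analysis.
    n_analysable = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    # Include more subjects so that enough remain after attrition.
    return ceil(n_analysable / (1 - attrition))


if __name__ == "__main__":
    print(sample_size_two_means(delta=5, sd=10))
    print(sample_size_two_means(delta=5, sd=10, attrition=0.10))
    print(sample_size_two_means(delta=5, sd=10, two_sided=False))
```

Under these assumptions, a difference of 5 with a standard deviation of 10, 80% power and a two-sided α of 5% requires 63 analysable subjects per group, or 70 included subjects per group when 10% attrition is expected. A smaller relevant difference or a larger standard deviation increases the required number quickly, because both enter the formula squared.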
References
[1] van Geloven N, Dijkgraaf M, Tanck M, Reitsma J. AMC biostatistics manual - Sample size calculation. 2009. Amsterdam: Academic Medical Center.