How does sample size affect power?

Sample size estimation with odds ratio

In a case-control study, when the outcome variable of interest is categorical, the data are usually summarized as an odds ratio rather than as a difference between two proportions. The sample size for such a study can be estimated as follows.

If P1 and P2 are the proportions of cases and controls, respectively, exposed to a risk factor, then the odds ratio is OR = [P1/(1 − P1)] / [P2/(1 − P2)], and the approximate sample size per group is

n = (Z_{1−α/2} + Z_{1−β})² × [1/(P1(1 − P1)) + 1/(P2(1 − P2))] / (ln OR)²
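To make the formula concrete, here is a minimal Python sketch; the function name and the example exposure proportions are illustrative choices, not taken from the paper:

```python
# Sketch of the per-group sample size for an odds-ratio comparison,
# following the formula reconstructed above (illustrative, not the
# paper's own code).
import math
from scipy.stats import norm

def n_per_group_odds_ratio(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group for a case-control study."""
    odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))
    z_alpha = norm.ppf(1 - alpha / 2)   # 1.96 for alpha = 0.05, two-sided
    z_beta = norm.ppf(power)            # 0.84 for 80% power
    variance_term = 1 / (p1 * (1 - p1)) + 1 / (p2 * (1 - p2))
    return math.ceil((z_alpha + z_beta) ** 2 * variance_term
                     / math.log(odds_ratio) ** 2)

# Hypothetical example: 40% of cases vs 25% of controls exposed
print(n_per_group_odds_ratio(0.40, 0.25))
```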

The equations in this paper assume, first, that the selection of individuals is random and unbiased: the decision to include a subject in the study cannot depend on whether or not that subject has the characteristic or the outcome studied. Second, in studies in which a mean is calculated, the measurements are assumed to follow a normal distribution. Statistical power is closely tied to sample size: the power of a study increases as the sample size increases. Hence, sample size calculation is critical and fundamental to the design of a study protocol.

Even after completion of the study, a retrospective power analysis can be useful, especially when a statistically non-significant result is obtained: it re-emphasizes whether a negative finding is a true negative finding. The ideal study is one in which the power is high. This means that the study has a high chance of detecting a difference between groups if one exists; consequently, if the study demonstrates no difference between groups, the researcher can be reasonably confident in concluding that none exists.

The power of a study depends on several factors, but as a general rule, higher power is achieved by increasing the sample size. Sample size calculation is an essential step in research protocols and is necessary to justify the size of clinical studies in papers, reports, and so on.

Nevertheless, one of the most common errors in papers reporting clinical trials is a lack of justification of the sample size, and it is a major concern that important therapeutic effects are being missed because of inadequately sized studies. Often, researchers face various constraints that may force them to use an inadequate sample size, for both practical and statistical reasons. These constraints may include budget, time, personnel, and other resource limitations.

In these cases, researchers should report the appropriate sample size along with the sample size actually used in the study, the reasons for using an inadequate sample size, and a discussion of the effect the inadequate sample size may have on the results of the study.

The researcher should exercise caution when making pragmatic recommendations based on research with an inadequate sample size. Sample size determination is a major step in the design of a research study. Appropriately sized samples are essential to infer with confidence that sample estimates are reflective of the underlying population parameters. The sample size required to reject or accept a study hypothesis is determined by the desired power of the statistical test.

A study that is sufficiently powered has a reasonable chance of answering the questions put forth at the beginning of the research study. Inadequately sized studies often result from investigators' unrealistic assumptions about the effectiveness of the study treatment. Conducting a study that has little chance of answering the hypothesis at hand is a misuse of time and valuable resources and may unnecessarily expose participants to potential harm or unwarranted expectations of therapeutic benefit.

As scientific and ethical issues go hand-in-hand, awareness of the minimum required sample size and the application of appropriate sampling methods are extremely important in achieving scientifically and statistically sound results.

Using an adequate sample size along with high-quality data collection will yield more reliable, valid, and generalizable results; it can also save resources. This paper was designed as a tool that a researcher can use in planning and conducting quality research.


Determining the optimal sample size for a study assures adequate power to detect statistical significance.

Factors that affect the sample size

The calculation of an appropriate sample size relies on the choice of certain factors and, in some instances, on crude estimates.

[Table 1: Factors that affect sample size calculations. Table 2: The normal deviates for Type I error (alpha). Table 3: The normal deviates for statistical power.]
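The bodies of Tables 2 and 3 did not survive extraction, but the normal deviates they tabulate are ordinary standard normal quantiles, so they can be regenerated; the alpha and power values below are the conventional choices, not necessarily the tables' exact rows:

```python
# Regenerate the usual normal deviates for alpha (Table 2) and for
# statistical power (Table 3); these are standard quantiles.
from scipy.stats import norm

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha}: two-sided Z = {norm.ppf(1 - alpha / 2):.3f}")

for power in (0.80, 0.90, 0.95):
    print(f"power = {power}: Z = {norm.ppf(power):.3f}")
```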

Study design, outcome variable and sample size

Study design has a major impact on the sample size.

Alpha level

Alpha is the probability of detecting a significant difference when the treatments are in fact equally effective; that is, the risk of a false-positive finding.

Variance or standard deviation

The variance or standard deviation for the sample size calculation is obtained either from previous studies or from a pilot study.
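For example, the standard deviation can be estimated from pilot measurements; the values below are made-up placeholders:

```python
import numpy as np

# Hypothetical pilot measurements (placeholder values, not from the paper)
pilot = np.array([12.1, 9.8, 11.4, 10.9, 13.0, 10.2, 11.7, 12.5])

# ddof=1 gives the unbiased sample estimate normally plugged into
# sample-size formulas
print(f"pilot SD estimate: {pilot.std(ddof=1):.2f}")
```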

Minimum detectable difference

This is the expected difference or relationship between two independent samples, also known as the effect size.

Power

The difference between two groups in a study is explored in terms of the estimate of effect, an appropriate confidence interval, and a P value.

Withdrawals, missing data and losses to follow-up

The calculated sample size is the total number of subjects required for the final study analysis.

Sample size estimation with a single proportion

The sample size can be estimated using the following formula:

n = Z²_{1−α/2} × P(1 − P) / E²

where P is the prevalence or proportion of the event of interest for the study, and E is the precision (margin of error) with which the researcher wants to measure it.
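A direct translation of this prevalence formula, with an illustrative example (the 20 percent prevalence and 5 percent margin are assumptions, not the paper's numbers):

```python
import math
from scipy.stats import norm

def n_single_proportion(p, e, alpha=0.05):
    """Sample size to estimate a prevalence p within a margin of error e."""
    z = norm.ppf(1 - alpha / 2)
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)

# Hypothetical example: expected prevalence 20%, margin of error +/- 5%
print(n_single_proportion(0.20, 0.05))  # -> 246
```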

Sample size estimation with two proportions

In studies where the outcome is a proportion of events in two population groups, such as the percentage of complications, mortality, improvement, awareness, or surgical or medical outcomes, the sample size per group can be estimated as:

n = (Z_{1−α/2} + Z_{1−β})² × [P1(1 − P1) + P2(1 − P2)] / (P1 − P2)²
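The two-proportion formula can be sketched the same way; the example complication rates are illustrative assumptions:

```python
import math
from scipy.stats import norm

def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Sample size per group to detect a difference between p1 and p2."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Hypothetical example: complication rates of 30% vs 15%
print(n_per_group_two_proportions(0.30, 0.15))
```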



Statistical information and the fictitious results are shown for each study A through F in Figure 2, with the key information shown in bold italics. Although these six examples share the same study design, do not compare the made-up results across studies.

Figure 2: Six fictitious example studies that each examine whether a new app called StatMaster can help students learn statistical concepts better than traditional methods.

In Study A, the key element is the p-value, which is less than the alpha level, so the result is statistically significant. While the study is still at risk of making a Type I error, this result does not leave open the possibility of a Type II error. Said another way, the power is adequate to detect a difference because a statistically significant difference was in fact detected.

It does not matter that there is no power or sample size calculation when the p-value is less than alpha. In Study B, the summaries are the same except that the p-value is greater than alpha, so the result is not statistically significant. In this case, the criteria of the upper left box are met: there is no sample size or power calculation, and therefore the lack of a statistically significant difference may be due to inadequate power or to a true lack of difference, but we cannot exclude inadequate power.

We hit the upper left red STOP. Since inadequate power, that is, an excessive risk of Type II error, is a possibility, drawing a conclusion as to the effectiveness of StatMaster is not statistically possible. In Study C, again the p-value is greater than alpha, taking us back to the second main box. The ability to draw a statistical conclusion regarding StatMaster is hampered by the potentially unacceptably high risk of Type II error. That is a good thing. In Study E, the challenges are more complex.

With a p-value greater than alpha, we once again move to the middle large box to examine the potential of excessive or indeterminate Type II error. Second, a sample size calculation provides adequate power to detect an effect size that is at least as big as the desired effect size, but not smaller.

Reviewing the equation earlier in this manuscript provides the mathematical evidence of this concept. Therefore, we are left at the red STOP sign in the lower right corner. Note that, unlike the other red STOP signs, this example requires subjective judgment and is less objective than the other three paths to potentially exceeding acceptable Type II error.

Both activities are described for classes of about 20 students, but you can modify them as needed for smaller or larger classes or for classes in which you have fewer resources available.

Both of these activities involve tests of significance on a single population proportion, but the principles hold for nearly all tests of significance. In advance of the class, you should prepare 21 bags of poker chips or some other token that comes in more than one color. Each of the bags should have a different number of blue chips in it, ranging from 0 out of 200 to 200 out of 200, by 10s.

These bags represent populations with different proportions; label them by the proportion of blue chips in the bag: 0 percent, 5 percent, 10 percent, and so on up to 100 percent. Distribute one bag to each student. Then instruct them to shake their bags well and draw 20 chips at random. Have them count the number of blue chips out of the 20 that they observe in their sample and then perform a test of significance whose null hypothesis is that the bag contains 50 percent blue chips and whose alternative hypothesis is that it does not.

They are to record whether they rejected the null hypothesis or not, then replace the tokens, shake the bag, and repeat the simulation for a total of 25 times. When they are done, they should compute the proportion of their simulations that resulted in a rejection of the null hypothesis. Meanwhile, draw a pair of axes on the board. When they and you are done, students should come to the board and draw a point on the graph corresponding to the proportion of blue tokens in their bag and the proportion of their simulations that resulted in a rejection.

Figure 2 is an example of what the plot might look like. The lesson from this activity is that the power is affected by the magnitude of the difference between the hypothesized parameter value and its true value. Bigger discrepancies are easier to detect than smaller ones.
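The chip-drawing activity is easy to mirror in code; this sketch follows the setup described above (20 chips per draw, 25 repetitions per bag, alpha of 0.05; the random seed is arbitrary):

```python
import numpy as np
from scipy.stats import binomtest

rng = np.random.default_rng(0)

# One "bag" per true proportion of blue chips, 0% to 100% in 5% steps
for true_p in np.arange(0.0, 1.01, 0.05):
    rejections = 0
    for _ in range(25):                              # 25 repetitions per bag
        blues = rng.binomial(20, true_p)             # draw 20 chips
        if binomtest(blues, 20, 0.5).pvalue < 0.05:  # H0: bag is 50% blue
            rejections += 1
    print(f"true p = {true_p:.2f}: rejected H0 in {rejections}/25 draws")
```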

For this activity, prepare 11 paper bags, each containing 65 percent blue chips and 35 percent nonblue chips. The activity proceeds as did the last one. Below is an example of what the plot might look like.

The AP Statistics curriculum is designed primarily to help students understand statistical concepts and become critical consumers of information.

Being able to perform statistical computations is of, at most, secondary importance and, for some topics such as power, is not expected of students at all. Students should know what power means and what affects the power of a test of significance.

The activities described above can help students understand power better. Depending on the length of your class periods, you should spend one or at most two class days teaching power to your students.
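To return to the title question directly, the same simulation idea shows power rising with sample size; the 65 percent bag matches the second activity, while the sample sizes tried are illustrative:

```python
import numpy as np
from scipy.stats import binomtest

rng = np.random.default_rng(1)
TRUE_P = 0.65   # the second activity's bag composition
SIMS = 2000     # simulations per sample size

# Estimated power (rejection rate of H0: p = 0.5) for growing samples
for n in (10, 20, 40, 80, 160):
    rejections = sum(
        binomtest(rng.binomial(n, TRUE_P), n, 0.5).pvalue < 0.05
        for _ in range(SIMS)
    )
    print(f"n = {n:3d}: estimated power = {rejections / SIMS:.2f}")
```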



