## Assay design

Researchers typically think about applying statistics only after an experiment has run and the data have been collected. However, using statistical methodology before an assay is conducted gives the researcher some idea of whether what they are attempting to measure is feasible, and whether the planned sample size is suitable for the system studied. Without such consideration, both time and resources may be wasted.

Power calculations are based upon the statistical values α and β. The α of a test is the probability of a false positive, that is, the test indicates a significant difference where one does not in fact exist. This is typically set to 0.05, and a p-value below this is taken as evidence of a significant difference. β is the probability of the reverse scenario, a false negative, where the test indicates no significant difference when a real difference does occur. The power of a statistical test is 1 − β, and represents the ability of the test to detect a difference when it really does exist. As such, a high power indicates that the test is likely to detect a real difference if one exists, and a low power suggests that the test will not be capable of detecting a real difference. Unlike α, in experimental biology β is typically set to around 0.20 (a power of 0.80), as false negatives are typically deemed less detrimental than false positives. However, certain assays may require a lower β, so the context of the assay is a critical factor. For example, in clinical detection of disease, a false negative means that a patient has a disease but the disease is not diagnosed, and as such a low β and therefore a higher power would be essential (Bland, 2000; Ott and Longnecker, 2001).
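The relationship between the false positive rate (α), the false negative rate (β) and power can be made concrete with a short calculation. The sketch below is a minimal illustration that assumes a normal (z) approximation for a two-sided comparison of two equally sized groups with a common standard deviation. It is not the t-distribution calculation behind Table 6.1, and the function name `approximate_power` and its parameters are hypothetical, chosen only for this example.

```python
from statistics import NormalDist


def approximate_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the true difference between group means divided by
    the (assumed common) standard deviation.
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    # Expected value of the test statistic when the difference is real.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability that the statistic falls in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Illustrative inputs: a difference of one standard deviation, 16 samples per group.
power = approximate_power(effect_size=1.0, n_per_group=16)
beta = 1 - power  # probability of a false negative
print(f"power = {power:.2f}, beta = {beta:.2f}")
```

When the true effect size is zero, the function returns α itself, which is the false positive rate by definition. Increasing either the effect size or the sample size raises the power and lowers β.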

The power of a test may be estimated before conducting experiments, and this can provide researchers with an indication as to whether their experimental design is sufficiently robust to detect the differences in expression they are looking for. The worst case scenario is that experiments are conducted and no significant difference is found, as real differences may be present, but the researcher has insufficient evidence to detect them.

Table 6.1 Calculation of the sample size required to detect a given change in gene expression, using power calculations based upon a two-tailed t-distribution for specified degrees of biological variance (expressed as coefficient of variation). Calculations conducted using http://www.univie.ac.at/medstat.

(A) Power 0.80

| Fold change detectable | CV 10% | CV 20% | CV 30% | CV 40% | CV 50% |
|---|---|---|---|---|---|
| 1.1 | 8 | 32 | 71 | 126 | 197 |
| 1.2 | 2 | 8 | 18 | 32 | 50 |
| 1.3 | 1 | 4 | 8 | 14 | 22 |
| 1.4 | 1 | 2 | 5 | 8 | 13 |
| 1.5 | 1 | 2 | 3 | 6 | 8 |
| 1.6 | 1 | 1 | 2 | 4 | 6 |
| 1.7 | 1 | 1 | 2 | 3 | 5 |
| 1.8 | 1 | 1 | 2 | 2 | 4 |
| 1.9 | 1 | 1 | 1 | 2 | 3 |
| 2.0 | 1 | 1 | 1 | 2 | |

(B) Power 0.95

| Fold change detectable | CV 10% | CV 20% | CV 30% | CV 40% | CV 50% |
|---|---|---|---|---|---|
| 1.1 | 13 | 52 | 117 | 208 | 325 |
| 1.2 | 4 | 13 | 30 | 52 | 82 |
| 1.3 | 2 | 6 | 13 | 24 | 37 |
| 1.4 | 1 | 4 | 8 | 13 | 21 |
| 1.5 | 1 | 3 | 5 | 9 | 13 |
| 1.6 | 1 | 2 | 4 | 6 | 10 |
| 1.7 | 1 | 2 | 3 | 5 | 7 |
| 1.8 | 1 | 1 | 2 | 4 | 6 |
| 1.9 | 1 | 1 | 2 | 3 | 5 |
| 2.0 | 1 | 1 | 2 | 3 | 4 |

Table 6.1 can be used as a quick lookup chart when designing qPCR experiments. The calculations are based upon an unpaired t-test, but the values also provide a guideline to a suitable sample size for other test procedures. An estimate of biological variance must first be made, usually as a coefficient of variation (CV = standard deviation divided by the mean, usually expressed as a percentage). This may be based upon previous experiments using a similar tissue and/or protocol, conducted in house or reported in the scientific literature. If such information is unavailable an approximation may be made, and an estimate of around 30% is not unreasonable. Using the planned sample size, the table provides the fold change that can be detected with a power of 0.80 (A) or 0.95 (B). This should enable researchers to decide upon a suitable sample size for an assay, and to determine the level of resolution that should be possible with that sample size.
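The lookup described above can be sketched in a few lines. The example below is illustrative only: the pilot measurements are hypothetical values invented for the demonstration, the function names are not from any library, and the dictionary holds just the CV 30% column of Table 6.1 (A) (power 0.80). It first estimates the CV from replicate data, then finds the smallest tabulated fold change that the planned sample size can detect.

```python
from statistics import mean, stdev

# Required sample sizes transcribed from Table 6.1 (A), power 0.80,
# CV = 30% column, keyed by the fold change to be detected.
REQUIRED_N_CV30_POWER80 = {
    1.1: 71, 1.2: 18, 1.3: 8, 1.4: 5, 1.5: 3,
    1.6: 2, 1.7: 2, 1.8: 2, 1.9: 1, 2.0: 1,
}


def coefficient_of_variation(values):
    """CV as a percentage: sample standard deviation divided by the mean."""
    return 100.0 * stdev(values) / mean(values)


def smallest_detectable_fold_change(planned_n, required_n=REQUIRED_N_CV30_POWER80):
    """Return the smallest tabulated fold change whose required sample size
    does not exceed planned_n, or None if no tabulated value qualifies."""
    for fold_change in sorted(required_n):
        if required_n[fold_change] <= planned_n:
            return fold_change
    return None


# Hypothetical pilot expression values (arbitrary units), for illustration only.
pilot = [0.6, 1.4, 1.1, 0.9, 1.0]
cv = coefficient_of_variation(pilot)  # close to 30%, so the 30% column is used
print(f"Estimated CV: {cv:.1f}%")
print("Smallest detectable fold change with n = 6:",
      smallest_detectable_fold_change(6))
```

With a planned sample size of 6 and a CV near 30%, the lookup returns a fold change of 1.4, since a 1.3-fold change would require 8 samples according to the table. In practice the column matching the estimated CV, and the panel matching the desired power, would be substituted.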