## Example

Suppose 10 hypertensive subjects are treated with a novel antihypertensive drug. The subjects' blood pressure is measured at 8:00 a.m., just prior to the administration of the drug, and then again 1 hour later: the data are shown in Table 21.2.

The first and second rows of Table 21.2 give the diastolic blood pressure of subjects before and after treatment, respectively. The third row gives the change (Δ) in diastolic pressure (row 1 minus row 2). The mean of Δ, given in the last column, is 12.8 mmHg.

**Table 21.2**

| Subject | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Before treatment | 102 | 78 | 95 | 86 | 109 | 107 | 100 | 86 | 96 | 92 | 95.1 |
| After treatment | 75 | 82 | 80 | 81 | 76 | 93 | 92 | 80 | 90 | 74 | 82.3 |
| Difference (Δ) | 27 | −4 | 15 | 5 | 33 | 14 | 8 | 6 | 6 | 18 | 12.8 |
| Δ − mean(Δ) | 14.2 | −16.8 | 2.2 | −7.8 | 20.2 | 1.2 | −4.8 | −6.8 | −6.8 | 5.2 | 0 |
| [Δ − mean(Δ)]² | 201.64 | 282.24 | 4.84 | 60.84 | 408.04 | 1.44 | 23.04 | 46.24 | 46.24 | 27.04 | 110.16 |

On the face of it, 12.8 mmHg looks like an impressive effect. However, as we have discussed before, we cannot assess its significance without considering the inherent variability, the noise. Indeed, the values of Δ range from −4 to 33, a substantial range. To assess the variability of Δ, we calculate the deviations of the values of Δ about their mean, 12.8; these are given in the fourth row of the table. Naturally, since the mean lies somewhere in the middle of the data, some deviations are positive and others are negative. One property of the mean is that the sum of these deviations is always zero. The average (mean) of the deviations about the mean is therefore always zero, and so is not useful as a measure of variability. Instead, we calculate the mean of the squares of the deviations about the mean as a measure of variability. This measure is called the variance. The variance is an average of non-negative numbers and is therefore itself always non-negative. It is equal to 0 if and only if all the deviations are zero, meaning that all the measurements are the same and thus equal to their mean, i.e. there is no variability at all.

The standard deviation (SD), the most commonly used measure of variability, is the square root of the variance. In our case, SD = √110.16 = 10.50. The advantage of using the SD rather than the variance is that it is measured in the same units as the mean.

The mean does not represent the response to treatment of any particular individual. It does, though, give us an idea of the magnitude of the response to treatment produced by the drug. Can we conclude, then, that the drug is efficacious? If the drug is ineffective, the mean change of 12.8 mmHg is due entirely to chance. Statistical theory shows that the likelihood that a set of 10 numbers generated at random with a standard deviation of 10.5 would have a mean of 12.8 or larger, in either the positive or the negative direction, is less than 0.15%. Although this outcome is not impossible, it is highly unlikely. Thus, it is more prudent to conclude that the results of the experiment are due to the drug's effect rather than to chance.
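The mean, deviations, variance, and SD described above can be reproduced directly from the data in Table 21.2. The following is a minimal Python sketch; note that, following the text, the variance is the mean of the squared deviations (division by n, not n − 1):

```python
# Data from Table 21.2: diastolic blood pressure (mmHg).
before = [102, 78, 95, 86, 109, 107, 100, 86, 96, 92]
after = [75, 82, 80, 81, 76, 93, 92, 80, 90, 74]

# Change in diastolic pressure (row 1 minus row 2).
delta = [b - a for b, a in zip(before, after)]
mean_delta = sum(delta) / len(delta)              # 12.8 mmHg

# Deviations about the mean sum to zero by construction,
# so we average their squares instead.
deviations = [d - mean_delta for d in delta]
variance = sum(dev ** 2 for dev in deviations) / len(delta)  # 110.16
sd = variance ** 0.5                              # ~10.50 mmHg

print(mean_delta, round(variance, 2), round(sd, 2))
```

Dividing by n − 1 instead (the sample variance) would give a slightly larger estimate; the text works with the mean of the squared deviations, so the sketch does the same.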

The above example encapsulates many of the ideas and concepts behind the theory of statistical inference. The SD quantifies how widely a measurement is expected to deviate from the theoretical typical value of the variable being measured. In our example, the variable being measured is the change between pre- and post-treatment in a patient's diastolic blood pressure. If the drug is ineffective, any change is due entirely to chance, and therefore one would expect the change to be zero. This expected typical value is theoretical: in reality, blood pressure is affected by a variety of factors independent of the treatment, and therefore actual measurements will not necessarily be zero. The SD enables us to calculate the probability that a measurement will fall close to, or far away from, zero; e.g. the probability is 95% that a measurement will fall within ±2 SD. That is, assuming the drug is ineffective and the SD is 10.5, 95% of patients treated with the drug should have a change in diastolic blood pressure between −21 and +21 mmHg. This is a fairly large range, and indeed all but two of the measurements in our example fall within it. This observation does not contradict our previous conclusion that the drug is effective, because that conclusion was based on the mean of 10 measurements rather than on a single measurement. The mean change is associated with experimental error: if we calculated the mean change for another set of 10 measurements obtained from different patients, it is unlikely that the result would be 12.8. The variability associated with a mean, however, is smaller than that of a single measurement. The SD associated with the mean, also called the standard error of the mean (SEM), is smaller than the SD by a factor equal to the square root of the number of measurements used to calculate the mean. In our example, SEM = 10.5/√10 = 10.5/3.16 = 3.32.
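Both claims above, that all but two of the changes fall within ±2 SD of zero, and that the SEM is 3.32, can be checked with a few lines of Python (a minimal sketch using the Δ values and the SD of 10.5 from the example):

```python
import math

delta = [27, -4, 15, 5, 33, 14, 8, 6, 6, 18]  # changes from Table 21.2
sd = 10.5                                     # SD from the example
n = len(delta)

# If the drug is ineffective, ~95% of single measurements should
# fall within +/-2 SD of zero, i.e. between -21 and +21.
low, high = -2 * sd, 2 * sd
within = sum(low <= d <= high for d in delta)  # 8 of the 10 values

# The mean of n measurements varies less than a single measurement:
sem = sd / math.sqrt(n)                        # ~3.32
print(within, round(sem, 2))
```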
Thus, in our experiment the probability is 95% that, if the drug is ineffective, the mean change will fall between −6.64 and +6.64. The observed mean of 12.8 is well outside that range. In fact, 12.8/SEM = 3.85, and the probability of obtaining a ratio of 3.85 or larger is approximately 0.15%.
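The ratio of the observed mean to the SEM, and the corresponding two-sided tail probability, can be sketched as follows. This is an illustration under a normal approximation (the text does not specify which reference distribution it uses, so the exact probability may differ slightly from the one quoted):

```python
import math

mean_delta = 12.8
sem = 3.32

# How many standard errors the observed mean lies from zero.
z = mean_delta / sem                  # ~3.85

# Two-sided tail probability under a normal approximation:
# P(|Z| >= z) = erfc(z / sqrt(2)).
p = math.erfc(z / math.sqrt(2))
print(round(z, 2), p)                 # p is well below 0.0015 (0.15%)
```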

To summarize, statistical methods are not intended to establish a cause-and-effect relationship between treatment and the response of any individual subject; rather, they are intended to establish a cause-and-effect relationship for the aggregate response (e.g. the mean) of a population of subjects. The key to this is the fact that, by considering aggregates, one can control the variability of a measured quantity. By increasing the sample size, one can reduce the SEM to a level that makes it possible to determine whether a signal is likely or unlikely to be due to chance, and thus decide whether a causal relationship is likely or unlikely to exist.
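The point about sample size can be made concrete with the SD from the example: since SEM = SD/√n, quadrupling the number of subjects halves the SEM (the sample sizes below are illustrative, not from the text):

```python
import math

sd = 10.5  # per-measurement SD from the example

# SEM = SD / sqrt(n): each fourfold increase in n halves the SEM.
sems = {n: sd / math.sqrt(n) for n in (10, 40, 160)}
for n, sem in sems.items():
    print(n, round(sem, 2))
```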