Analysis of Measurement Errors

No review of microarray technology would be complete without a discussion of appropriate criteria for quality control. Microarray technology is essentially a highly parallel form of a dot blot. Correlations with assays by other methods, e.g. Northern blot, are surprisingly good [17, 18]. Unfortunately, even when most genes are accurately measured, data for individual genes may be discrepant. In one study, four of forty-six differences substantiated by Northern analysis (about 9%) were not demonstrable on the array [19]. This error may reflect the inability of a spot on an array to distinguish between related genes, splice forms, or alternative polyadenylation sites. Another problem is error due to "nonspecific" cross-hybridization. In one study in our lab, we found differential hybridization at a spot that turned out to carry an Alu-rich sequence. The gene spotted at this site was apparently not even present in the RNAs used to make the probes.

Even when arrays accurately detect a species of mRNA, spots that have very little hybridization will of necessity have a high variance due to fluctuation in the background and other sources of noise. Expression ratios based on such spots may be computed from two values of which only one, or neither, is statistically reliable. The resulting number may be meaningless (Fig. 8.4).

Fig. 8.4 Error in ratio as a function of expression level for typical array data. Error is expressed as the standard deviation of the ratio in 4 quadrants of the spot; expression level is expressed as the sum of the Cy3 and Cy5 signals. Since most ratios are near 1, the standard deviation * 100 is approximately equal to the percent error in the ratio. The image inset shows a portion of the array data from which this graph was derived. Two spots, corresponding to a highly expressed and a lowly expressed gene, are circled in red in both the image and the graph.
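The per-spot error measure described in the caption of Fig. 8.4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the input layout (four background-subtracted quadrant signals per channel), the Cy3/Cy5 direction of the ratio, and the use of the sample standard deviation are all choices made here, not details given in the text.

```python
from statistics import stdev


def spot_ratio_error(cy3_quadrants, cy5_quadrants):
    """Return (expression_level, ratio_sd) for a single spot.

    cy3_quadrants, cy5_quadrants: signal in each of the 4 quadrants of
    the spot (hypothetical input layout, assumed background-subtracted).
    expression_level is the sum of the Cy3 and Cy5 signals; ratio_sd is
    the sample standard deviation of the 4 per-quadrant Cy3/Cy5 ratios.
    """
    if len(cy3_quadrants) != 4 or len(cy5_quadrants) != 4:
        raise ValueError("expected 4 quadrant values per channel")
    if any(v <= 0 for v in list(cy3_quadrants) + list(cy5_quadrants)):
        raise ValueError("quadrant signals must be positive")

    # One ratio per quadrant (direction Cy3/Cy5 is an assumption).
    ratios = [c3 / c5 for c3, c5 in zip(cy3_quadrants, cy5_quadrants)]
    expression_level = sum(cy3_quadrants) + sum(cy5_quadrants)
    return expression_level, stdev(ratios)


# Made-up numbers for a bright spot and a dim spot, both with ratio near 1.
bright_level, bright_sd = spot_ratio_error(
    [1000, 1020, 980, 1010], [1000, 1000, 1000, 1000]
)
dim_level, dim_sd = spot_ratio_error([10, 25, 5, 18], [12, 9, 20, 11])

# When ratios are near 1, sd * 100 approximates the percent error.
print(f"bright: level={bright_level}, error ~{bright_sd * 100:.1f}%")
print(f"dim:    level={dim_level}, error ~{dim_sd * 100:.1f}%")
```

With these invented values the dim spot shows a far larger spread between quadrants than the bright one, which is the pattern the figure describes. Any cutoff used to discard low-signal spots would have to be chosen from the data at hand.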

A complementary issue is the assumption that the hybridization curves for any two samples are parallel at all spots. This is unlikely: most curves will be S-shaped at best, and some data may not even be monotonic. The most extreme example of this problem is recent evidence for differences in hybridization between fluorescent probes based on Cy3 and Cy5 [20]. Two types of discrepancy are observed. First, the Cy3/Cy5 ratio varies with intensity: if one takes aliquots of the same RNA, labels one with Cy3 and one with Cy5, and hybridizes them to an array, the observed ratio is often a non-linear function of intensity. Second, some spots on any given array appear to be consistently differentially expressed in one channel regardless of the samples applied. The exact cause of this effect has not yet been established. Correction for the Cy3/Cy5 error requires either that the array be reciprocally hybridized (e.g. a "flipped color" experiment) or that a universal standard be used for one "channel" of two-color arrays.
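One common way to combine a reciprocal (flipped color) pair of hybridizations is to average the two log ratios with opposite signs. The sketch below illustrates that idea and is not necessarily the procedure used in [20]. It assumes that the dye effect at a spot is additive on the log scale and identical in both hybridizations; the function and argument names are hypothetical.

```python
import math


def dye_swap_log_ratio(fwd_cy5, fwd_cy3, rev_cy5, rev_cy3):
    """Combine a reciprocal pair of hybridizations for one spot.

    Forward array: sample A labeled with Cy5, sample B with Cy3.
    Reverse array: sample A labeled with Cy3, sample B with Cy5.
    Returns (corrected_log2_ratio_A_over_B, estimated_log2_dye_bias).
    Assumes the dye effect adds a constant to the log2(Cy5/Cy3) ratio
    of this spot on both arrays.
    """
    m_fwd = math.log2(fwd_cy5 / fwd_cy3)  # log2(A/B) + dye bias
    m_rev = math.log2(rev_cy5 / rev_cy3)  # log2(B/A) + dye bias

    corrected = (m_fwd - m_rev) / 2  # the dye bias cancels
    dye_bias = (m_fwd + m_rev) / 2   # the biological signal cancels
    return corrected, dye_bias


# Made-up spot: A is truly 2-fold higher than B, and Cy5 reads 1.5-fold
# brighter than Cy3 at this spot regardless of sample.
corrected, bias = dye_swap_log_ratio(
    fwd_cy5=300.0, fwd_cy3=100.0,  # A in Cy5 (200 * 1.5), B in Cy3
    rev_cy5=150.0, rev_cy3=200.0,  # B in Cy5 (100 * 1.5), A in Cy3
)
print(f"corrected log2(A/B) = {corrected:.3f}, log2 dye bias = {bias:.3f}")
```

A spot with a large estimated dye bias in many unrelated experiments would correspond to the second type of discrepancy described above. This simple average does not address the intensity dependence of the ratio, which would need a separate correction.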
