The use of a PCR-dependent reference is by far the most attractive normalization option. In theory it is also the most straightforward method: both the target gene and the PCR-dependent reference are measured by real-time PCR, so all that is required for their measurement is additional PCR reactions. There are three types of PCR-dependent reference available:
Genomic DNA (gDNA) referencing measures the amount of gDNA co-extracted with the RNA. Because gDNA is quantified without reverse transcription, this reference cannot control for error introduced at the reverse-transcription step. A further problem is that most RNA extraction techniques try to eliminate co-extracted DNA as much as possible, since DNA can inhibit the RT-PCR assay. Nevertheless, gDNA PCR-dependent referencing has played an important role in research into early-stage embryonic development, as it can provide very accurate information about which division stage is being measured (Hartshorn et al., 2003).
Spiking the sample with a distinctive (alien) nucleic acid allows an assessment of extraction efficiency and can be used for normalization. The spike, as long as it is not found in the sample of interest, can be synthetic or from another organism. This method benefits from the fact that accurately defined amounts of the spike can be introduced prior to extraction, allowing good estimation of the error introduced through most of the stages of processing. If RNA spikes are used, the reverse-transcription reaction can also be controlled for, which is essential for fine measurements. The main criticism of using spikes is that, while they can be introduced prior to extraction, unlike the cellular RNAs they are not extracted from within the tissue. Consequently, there may be situations (e.g. if the samples differ histologically) in which the spike is not a good control for the extraction procedure. The concept of the RNA spike has been taken to its maximum level with RNA viral titer measurements from human plasma; here similar whole viruses (e.g. brome mosaic virus when measuring HIV) can be used as spikes, providing detailed information on extraction and reverse-transcriptase efficiency, controlling for PCR inhibition, and defining the linear limit of the dynamic range of the multistage process outlined in Figure 1.
Reference genes are by far the most common method for normalizing qRT-PCR data. This strategy targets RNAs that are, it is hoped/assumed/demonstrated, universally and constitutively expressed, and whose expression does not differ between the experimental and control groups, to report any variation that occurs due to experimental error. Theoretically, reference genes are ideal, as they are subject to all the variation that affects the gene of interest. The problem occurs when a reference gene is used without validation.
Reference genes (previously termed housekeeping genes) such as GAPDH (glyceraldehyde-3-phosphate dehydrogenase) are historical carry-overs from RNA measurement techniques that generate more qualitative results (northern blotting, RNase protection assay). These genes were found to be essential for cell metabolism and, importantly, always switched on. Since northern blot analysis is mainly concerned with gross up- or down-regulation, they were perfectly adequate as loading controls.
The development of qRT-PCR transformed RNA analysis from a qualitative and, at best, semi-quantitative assay into a quantitative, high-throughput one. Suddenly it became possible to measure far more than crude alterations in RNA expression: small changes in mRNA levels could be detected reproducibly and expressed as numerical values amenable to statistical analysis.
The increasing emphasis on quantification meant that ubiquitous expression was no longer a sufficient requirement for a reference gene: its expression also had to be stable and unaffected by the experimental design. At best, a poorly chosen reference gene would reduce the resolution of the assay by introducing additional noise; at worst, the reference gene would be directly affected by the experimental system, and the resulting directional shift would either hide a true change in the gene of interest or present a completely false result.
What has become apparent over recent years is that there is no single reference gene for all experimental systems (at least none has been discovered). Vandesompele et al. (2002) quantified the error related to the use of a single reference gene as more than three-fold in 25% and more than six-fold in 10% of samples. Tricarico et al. (2002) showed that VEGF mRNA levels could be made to appear increased, decreased, or unchanged between paired normal and cancer samples, depending only on the normalizer chosen. Today it is clear that reference genes must be carefully validated for each experimental situation and that new experimental conditions or different tissue samples require re-validation of the chosen reference genes (Schmittgen, 2000). There are two strategies that can be used to validate a reference gene:
(a) The first method is the whole error approach. This strategy directly measures the error in the raw data of candidate reference genes within the experimental system, as illustrated by Dheda et al. (2004). The measured error comprises both the experimental error and the biological fluctuation of the chosen reference gene. This method depends on good laboratory technique (so that technical error does not reduce the resolution), may require one of the other normalization methods discussed above to increase that resolution, and depends on choosing the correct reference. It scores over strategies that normalize against total RNA in that it defines the resolution of the individual assay, and so provides a measurable parameter for assessing the likelihood of a quantitative result being meaningful. The approach is simple and is ideal when resources are limited and when only larger differences (>five-fold) need to be measured. However, it is inappropriate for measuring more subtle changes in mRNA levels (<two-fold), because the initial validation must itself be normalized to another factor (e.g. total RNA) and so is limited by the resolution of that factor, which may not be accurately measurable.
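As a minimal sketch of the whole error approach, the spread in a candidate reference gene's raw quantification cycle (Cq) values across samples can be converted into an equivalent fold change, which then serves as the resolution floor of the assay. The function name and the Cq values below are hypothetical, and a perfect amplification efficiency of 2 is assumed:

```python
def fold_change_resolution(cq_values, efficiency=2.0):
    """Convert the spread of a candidate reference gene's raw Cq values,
    measured across replicate samples, into an equivalent fold change.

    A Cq difference of dCq corresponds to an efficiency**dCq fold
    difference, so the observed spread sets a floor on the changes the
    assay can reliably resolve.
    """
    spread = max(cq_values) - min(cq_values)  # total Cq range across samples
    return efficiency ** spread               # fold-change equivalent

# Hypothetical Cq values for one candidate reference gene across six samples.
cq = [22.1, 22.4, 21.9, 22.6, 22.2, 22.3]
print(round(fold_change_resolution(cq), 2))  # 1.62
```

Under these assumed numbers, fold changes smaller than about 1.6-fold would fall within the reference gene's own variation and could not be trusted.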
(b) The second method uses the error induced trends approach. This strategy also measures the error in the raw data of candidate reference genes within the experimental system, but it differs from the single validated reference gene approach by comparing the fluctuations between the respective reference genes. Consequently, should error introduced by a particular sample artificially increase a measurement, this will be observed across all the measured reference genes and can be compensated for. Furthermore, by using the geometric mean rather than the arithmetic mean, the influence of outliers can be greatly reduced (Vandesompele et al., 2002). As with the single validated gene approach, this method defines the assay resolution; however, it can give a far better estimation of that resolution because it is independent of technical error, and measurements as small as 0.5-fold have been reported (Depreter et al., 2002). The drawback to this method is that it requires the measurement of at least three reference genes and, for very fine resolution, may require as many as 10. This is hardly feasible when comparing expression patterns in numerous tissues, or when using several different treatment regimes. Furthermore, it may be necessary to revalidate reference genes when changing extraction procedures, or when using new enzymes or analysis procedures.
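The outlier-damping effect of the geometric mean is easy to illustrate. The sketch below uses hypothetical relative quantities for three reference genes in one sample, with the third gene artificially inflated (e.g. by a sample-specific artifact); the geometric mean is pulled far less than the arithmetic mean:

```python
import math

def geometric_mean(values):
    """Geometric mean: the n-th root of the product of n positive values,
    computed in log space for numerical stability."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical relative quantities of three reference genes in one sample;
# the third gene is an outlier (e.g. inflated by a sample-specific artifact).
quantities = [1.0, 1.1, 4.0]

arith = sum(quantities) / len(quantities)  # pulled strongly towards the outlier
geo = geometric_mean(quantities)           # dampens the outlier's influence

print(round(arith, 2), round(geo, 2))  # 2.03 1.64
```

With the two well-behaved genes sitting near 1.0, the geometric mean stays much closer to them than the arithmetic mean does, which is why Vandesompele et al. (2002) adopted it for the normalization factor.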
The two strategies constitute two extremes: one is applicable when measuring relatively large changes on a small budget (whole error approach), the other when the emphasis is more on measuring subtle changes in mRNA levels (error induced trends approach).
When choosing potential candidate reference genes, resolution can be increased by selecting genes whose functions are not involved in the experimental question. For example, there is no point in using GAPDH as a reference gene when analyzing glycolytic pathways: GAPDH is directly involved in glycolysis and so would be likely to be influenced by any study of sugar metabolism.
Data can also be supported by measuring a number of different reference genes, which can strengthen conclusions because the experiment can then be validated using the error induced trends approach on an individual experimental basis; this is of particular value when small differences need to be measured. Finally, intergroup comparisons must take into account any variability of the reference genes, as this defines the overall error and provides an estimate of the reliability of measurements of mRNA levels.
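One way to fold reference gene variability into intergroup comparisons is to compute a per-sample normalization factor as the geometric mean of the reference gene quantities (after Vandesompele et al., 2002) and treat the coefficient of variation of those factors as a rough floor on the fold changes the experiment can support. This is a sketch only; the function names and all quantities below are hypothetical:

```python
import math
import statistics

def normalization_factors(ref_quantities):
    """Per-sample normalization factor: the geometric mean of the
    reference gene quantities measured in that sample."""
    return [math.prod(sample) ** (1.0 / len(sample)) for sample in ref_quantities]

def reference_variability(ref_quantities):
    """Coefficient of variation of the normalization factors; treated
    here as a rough floor on resolvable fold changes."""
    nf = normalization_factors(ref_quantities)
    return statistics.stdev(nf) / statistics.mean(nf)

# Hypothetical relative quantities of two reference genes in three samples.
refs = [[1.0, 0.9], [1.1, 1.2], [0.95, 1.0]]
cv = reference_variability(refs)
# Measured differences smaller than the reference variability (here ~11%)
# should not be considered reliable.
print(f"reference CV: {cv:.2f}")
```

Reporting this figure alongside the expression data gives readers a concrete estimate of the reliability of the mRNA measurements, as the text above recommends.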