Wahlund Effect and F Statistics

We have just seen how yet another inbreeding coefficient based upon the concept of identity by descent enters into the population genetic literature, but this time as a measure of how the balance of drift and gene flow influences identity by descent and coalescent times within and between demes in a subdivided species. We also saw in Chapter 4 that genetic drift influences many genetic parameters besides identity by descent, including the variance of allele frequencies across isolated replicate demes. This aspect of drift motivates an alternative definition of Fst in terms of variances of allele frequencies across the local demes (Cockerham and Weir 1987).

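Consider first a model in which a species is subdivided into $n$ discrete demes, where $N_i$ is the size of the $i$th deme. Suppose further that the species is polymorphic at a single autosomal locus with two alleles ($A$ and $a$) and that each deme has a potentially different allele frequency (due to past drift in this neutral model). Let $p_i$ be the frequency of allele $A$ in deme $i$, and let $q_i = 1 - p_i$ be the frequency of allele $a$. Let $N$ be the total population size ($N = \sum_{i=1}^{n} N_i$) and $w_i$ the proportion of the total population that is in deme $i$ ($w_i = N_i/N$). For now, we assume random mating within each deme. Hence, the genotype frequencies in deme $i$ are

$$\begin{array}{lccc}
\text{Genotype} & AA & Aa & aa \\
\text{Frequency} & p_i^2 & 2p_iq_i & q_i^2
\end{array}$$

The frequency of $A$ in the total population is $p = \sum_{i=1}^{n} w_i p_i$, and that of $a$ is $q = 1 - p$. If there were no genetic subdivision (that is, all demes had identical gene pools), then with random mating the expected genotype frequencies in the total population would be

$$\begin{array}{lccc}
\text{Genotype} & AA & Aa & aa \\
\text{Frequency} & p^2 & 2pq & q^2
\end{array}$$

However, in the general case where the demes can have different allele frequencies, the actual genotype frequencies in the total population are the weighted averages of the within-deme frequencies:

$$\begin{aligned}
\text{Freq}(AA) &= \sum_{i=1}^{n} w_i p_i^2 \\
\text{Freq}(Aa) &= 2\sum_{i=1}^{n} w_i p_i q_i \\
\text{Freq}(aa) &= \sum_{i=1}^{n} w_i q_i^2
\end{aligned} \tag{6.21}$$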

By definition, the variance in allele frequency across demes is

$$\mathrm{Var}(p) = \sigma_p^2 = \sum_{i=1}^{n} w_i (p_i - p)^2 = \sum_{i=1}^{n} w_i p_i^2 - p^2 = \sum_{i=1}^{n} w_i q_i^2 - q^2 \tag{6.22}$$

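Substituting equation 6.22 into 6.21, the genotype frequencies in the total population can be expressed as

$$\begin{aligned}
\text{Freq}(AA) &= p^2 + \sigma_p^2 \\
\text{Freq}(Aa) &= 2pq - 2\sigma_p^2 \\
\text{Freq}(aa) &= q^2 + \sigma_p^2
\end{aligned} \tag{6.23}$$

By factoring out the term $2pq$ from the heterozygote frequency in equation 6.23, the observed frequency of heterozygotes in the total population can be expressed as

$$\text{Freq}(Aa) = 2pq\left(1 - \frac{\sigma_p^2}{pq}\right) = 2pq(1 - f_{st}) \tag{6.24}$$

where $f_{st} = \sigma_p^2/(pq)$. Hence, the genotype frequencies can now be expressed as

$$\begin{aligned}
\text{Freq}(AA) &= p^2 + pq f_{st} \\
\text{Freq}(Aa) &= 2pq(1 - f_{st}) \\
\text{Freq}(aa) &= q^2 + pq f_{st}
\end{aligned} \tag{6.25}$$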

Note the resemblance between equation 6.25 and equation 3.1 from Chapter 3. Equation 3.1 describes the deviations from Hardy-Weinberg genotype frequencies induced by system-of-mating inbreeding ($f$). Because a variance cannot be negative, $f_{st} \geq 0$, which implies that the subdivision of the population into genetically distinct demes causes deviations from Hardy-Weinberg that are identical in form to those caused by system-of-mating inbreeding within demes ($f > 0$): an excess of homozygotes and a deficiency of heterozygotes. This inbreeding coefficient is called $f_{st}$ because it refers to the deviation from Hardy-Weinberg at the total population level caused by allele frequency deviations in the subdivided demes from the total-population allele frequency. This deviation from Hardy-Weinberg genotype frequencies in the species as a whole that is caused by population subdivision is called the Wahlund effect, after the man who first identified this phenomenon.
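The Wahlund effect can be checked numerically. The following is a minimal sketch, assuming random mating within each deme as in the text; the function name `wahlund` and the three demes (sizes and allele frequencies) are hypothetical values chosen only for illustration. It pools the within-deme genotype frequencies and compares them with the frequencies predicted from $p$, $q$, and $f_{st} = \sigma_p^2/(pq)$.

```python
def wahlund(sizes, freqs):
    """Pooled genotype frequencies and fst for a two-allele locus.

    sizes[i] is the size of deme i and freqs[i] is the frequency of
    allele A in deme i. Random mating within each deme is assumed.
    """
    total = sum(sizes)
    w = [n / total for n in sizes]                    # w_i = N_i / N
    p = sum(wi * pi for wi, pi in zip(w, freqs))      # pooled frequency of A
    q = 1.0 - p
    var_p = sum(wi * (pi - p) ** 2 for wi, pi in zip(w, freqs))
    fst = var_p / (p * q)                             # standardized variance

    # Actual pooled frequencies: weighted averages of within-deme
    # Hardy-Weinberg frequencies.
    observed = {
        "AA": sum(wi * pi ** 2 for wi, pi in zip(w, freqs)),
        "Aa": sum(wi * 2 * pi * (1 - pi) for wi, pi in zip(w, freqs)),
        "aa": sum(wi * (1 - pi) ** 2 for wi, pi in zip(w, freqs)),
    }
    # The same frequencies written in terms of p, q, and fst.
    predicted = {
        "AA": p * p + p * q * fst,
        "Aa": 2 * p * q * (1 - fst),
        "aa": q * q + p * q * fst,
    }
    return p, fst, observed, predicted


# Hypothetical demes, for illustration only.
sizes = [100, 100, 200]
freqs = [0.2, 0.6, 0.5]
p, fst, observed, predicted = wahlund(sizes, freqs)
print(f"p = {p:.3f}, fst = {fst:.4f}")
print(f"observed Aa = {observed['Aa']:.3f}, Hardy-Weinberg 2pq = {2 * p * (1 - p):.3f}")
```

With these illustrative numbers, $p = 0.45$, $\sigma_p^2 = 0.0225$, and $f_{st} = 1/11 \approx 0.091$; the pooled heterozygote frequency is 0.45, below the Hardy-Weinberg expectation of $2pq = 0.495$.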

The parameter $f_{st}$ is a standardized variance of allele frequencies across demes. In the extreme case where there is no gene flow at all ($m = 0$), we know from Chapter 4 that drift will eventually cause all populations to either lose or fix the $A$ allele. Since drift has no direction, a proportion $p$ of the populations will be fixed for $A$, a proportion $q$ will be fixed for $a$, and the variance (equation 6.22) becomes $p(1 - p)^2 + q(0 - p)^2 = pq$. Therefore, $f_{st}$ is the ratio of the actual variance in allele frequencies across demes to the theoretical maximum when there is no gene flow at all.
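The limiting case can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses a simple Wright-Fisher sampling scheme (binomial resampling of $2N$ gene copies each generation) for isolated, equal-sized demes, and the function names, deme count, deme size, and seed are hypothetical choices, not values from the text. Once every deme has lost or fixed $A$, the variance across demes equals $pq$ for the realized pooled frequency, so the standardized variance is 1.

```python
import random


def drift_to_fixation(n_demes, n_diploid, p0, max_generations, seed):
    """Wright-Fisher drift in isolated demes (m = 0).

    Each generation, every deme draws 2N gene copies at random from its
    own current gene pool. Returns the final frequency of A in each deme.
    """
    rng = random.Random(seed)
    copies = 2 * n_diploid
    freqs = [p0] * n_demes
    for _ in range(max_generations):
        if all(f in (0.0, 1.0) for f in freqs):
            break  # every deme has lost or fixed the A allele
        freqs = [
            sum(rng.random() < f for _ in range(copies)) / copies
            for f in freqs
        ]
    return freqs


def fst_equal_demes(freqs):
    """Standardized variance of allele frequency for equal-sized demes."""
    p = sum(freqs) / len(freqs)
    var_p = sum((f - p) ** 2 for f in freqs) / len(freqs)
    return var_p / (p * (1.0 - p))


# Hypothetical parameters, for illustration only.
final = drift_to_fixation(n_demes=200, n_diploid=10, p0=0.5,
                          max_generations=2000, seed=1)
fixed_for_A = sum(1 for f in final if f == 1.0)
print(f"demes fixed for A: {fixed_for_A} of {len(final)}")
print(f"fst at fixation: {fst_equal_demes(final):.6f}")
```

Small demes are used so that fixation is reached quickly; larger demes would reach the same limit more slowly.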

From equation 6.24, an alternative expression for $f_{st}$ is given by

$$f_{st} = 1 - \frac{\text{Freq}(Aa)}{2pq} = \frac{2pq - \text{Freq}(Aa)}{2pq} = \frac{H_t - H_s}{H_t} \tag{6.26}$$

where $H_t = 2pq$ is the expected heterozygosity if the total population were mating at random and $H_s$ is the observed frequency of heterozygotes in the total population, which is the same as the average heterozygosity in the subpopulations (recall that random mating is assumed within each subpopulation). The definition of $f_{st}$ given by equation 6.26 is useful in extending the concept of $f_{st}$ to the case with multiple alleles, as expected and observed heterozygosities are easily calculated or measured regardless of the number of alleles per locus. Equation 6.26 is used more commonly in the literature to measure population structure than equation 6.17, but readers need to be wary, as many papers do not explicitly state which of the two definitions of $F_{st}$/$f_{st}$ is being used. This is unfortunate, because the distinction can sometimes be important.
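The heterozygosity form of equation 6.26 translates directly into a calculation that works for any number of alleles. The sketch below is illustrative only: the function name `fst_from_heterozygosity` and the input frequencies are hypothetical, and it assumes random mating within demes, so that the heterozygosity of a gene pool with allele frequencies $x_k$ is $1 - \sum_k x_k^2$.

```python
def fst_from_heterozygosity(sizes, allele_freqs):
    """fst = (Ht - Hs) / Ht for any number of alleles.

    sizes[i] is the size of deme i; allele_freqs[i][k] is the frequency
    of allele k in deme i (each row sums to 1). Random mating within
    each deme is assumed.
    """
    total = sum(sizes)
    w = [n / total for n in sizes]
    n_alleles = len(allele_freqs[0])

    # Pooled allele frequencies in the total population.
    pooled = [
        sum(wi * row[k] for wi, row in zip(w, allele_freqs))
        for k in range(n_alleles)
    ]
    # Ht: expected heterozygosity if the total population mated at random.
    h_t = 1.0 - sum(pk ** 2 for pk in pooled)
    # Hs: weighted average heterozygosity within demes.
    h_s = sum(
        wi * (1.0 - sum(x ** 2 for x in row))
        for wi, row in zip(w, allele_freqs)
    )
    return (h_t - h_s) / h_t


# Two alleles: agrees with the variance definition sigma_p^2 / (pq).
two_allele = fst_from_heterozygosity(
    [100, 100, 200], [[0.2, 0.8], [0.6, 0.4], [0.5, 0.5]]
)
# Three alleles in two equal-sized demes (hypothetical frequencies).
three_allele = fst_from_heterozygosity(
    [50, 50], [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3]]
)
print(f"two-allele fst = {two_allele:.4f}")
print(f"three-allele fst = {three_allele:.4f}")
```

For the two-allele input, $H_t = 0.495$ and $H_s = 0.45$, giving $f_{st} = 1/11$, the same value obtained from $\sigma_p^2/(pq)$ with $\sigma_p^2 = 0.0225$ and $pq = 0.2475$.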

Equation 6.10 shows that the Fst defined in terms of probability of identity by descent can be related to the amount of gene flow, m, under the island model of gene flow in an equilibrium population. In the island model, a species is subdivided into a large number of local demes of equal size and with each local deme receiving a fraction m of its genes per generation from the species at large (Figure 6.4). Under this model, the variance in allele frequency across demes for an autosomal locus with two alleles reaches an equilibrium between drift and gene flow of (Li 1955)

Because $f_{st} = \sigma_p^2/(pq)$ for a two-allele system, we have