## Resolved and Unresolved Harmonics

3.4.1 Defining Resolvability

It is known that the absolute bandwidth of the auditory filters increases with center frequency. Glasberg and Moore (1990) estimated that the equivalent rectangular bandwidth (ERB) of the auditory filter (in Hertz) is given by:

$$\mathrm{ERB} = 24.7\,(0.00437 f_c + 1) \tag{1.1}$$

where $f_c$ is the center frequency of the filter (in Hertz). For high frequencies the ERB is approximately proportional to center frequency. At 1000 Hz the ERB is about 130 Hz, that is, around 13% of the center frequency.
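As a quick numerical check of the figures quoted above, the following minimal sketch evaluates the ERB at a few center frequencies. It assumes the commonly cited Glasberg and Moore (1990) form, ERB = 24.7(0.00437 fc + 1) with fc in Hertz; the function name `erb_hz` and the chosen frequencies are illustrative only.

```python
def erb_hz(fc_hz: float) -> float:
    """ERB in Hz at center frequency fc_hz (Hz).

    Assumes the Glasberg and Moore (1990) formula:
    ERB = 24.7 * (0.00437 * fc + 1).
    """
    return 24.7 * (0.00437 * fc_hz + 1.0)


# Print the ERB and the ERB as a proportion of center frequency.
for fc in (100, 500, 1000, 4000):
    erb = erb_hz(fc)
    print(f"fc = {fc:5d} Hz   ERB = {erb:6.1f} Hz   ERB/fc = {erb / fc:.3f}")
```

Under this assumption the ERB at 1000 Hz comes out close to 133 Hz, in line with the "about 130 Hz" figure, and the ERB/fc ratio grows as the center frequency falls.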

Whereas the auditory filters become broader with frequency, the component spacing in a complex tone is usually constant (and equal to F0). It follows that the spacing between harmonics, in units of auditory-filter bandwidths, decreases with increasing harmonic number; lower harmonics are separated out in the cochlea (i.e., they excite distinct places on the basilar membrane) and are said to be "resolved," whereas the higher harmonics are not separated out by the cochlea and are said to be "unresolved." Figure 2.3 shows a simulated excitation pattern (the level of excitation on the basilar membrane as a function of center frequency) for a 100-Hz F0 complex with equal-amplitude harmonics (Glasberg and Moore 1990). It can be seen that the first few harmonics produce distinct peaks in the excitation pattern. As harmonic number is increased, the size of the peaks decreases relative to the troughs between them. For the high, unresolved harmonics, several harmonics interact at each place on the basilar membrane, and consequently there is little variation in excitation with center frequency around each harmonic. However, whereas places on the basilar membrane responding to the lower harmonics show a sinusoidal pattern of vibration at the frequency of the harmonic, places responding to (several) higher harmonics show a complex pattern of vibration that repeats at a rate corresponding to the spacing between the harmonics (which equals F0).

Spectral resolvability depends more on harmonic number than on frequency per se. For example, if the repetition rate of the complex is doubled then the harmonics are spaced twice as far apart. However, each harmonic is doubled in frequency and is therefore shifted to a place on the basilar membrane where the auditory filters are approximately twice as broad. These two effects tend to cancel out, so that the resolvability of a given harmonic number does not change substantially with F0, at least for F0s above about 100 Hz. This relationship would be exact if the bandwidth of the auditory filters were directly proportional to center frequency.
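The near-cancellation described above can be illustrated by expressing the harmonic spacing (F0) in units of the ERB at each harmonic's frequency. This is only a sketch: it assumes the Glasberg and Moore (1990) ERB formula, ERB = 24.7(0.00437 fc + 1), and the helper names `erb_hz` and `spacing_in_erbs` are hypothetical.

```python
def erb_hz(fc_hz: float) -> float:
    """ERB in Hz, assuming the Glasberg and Moore (1990) formula."""
    return 24.7 * (0.00437 * fc_hz + 1.0)


def spacing_in_erbs(harmonic_number: int, f0_hz: float) -> float:
    """Spacing between adjacent harmonics (F0) divided by the ERB
    of the auditory filter centered on the given harmonic."""
    return f0_hz / erb_hz(harmonic_number * f0_hz)


# Rows: harmonic number. Columns: F0 = 100, 200, 400 Hz.
for n in (2, 5, 10, 20):
    row = "  ".join(f"{spacing_in_erbs(n, f0):5.2f}" for f0 in (100, 200, 400))
    print(f"harmonic {n:2d}: {row}")
```

Within each row the values change only modestly as F0 doubles, whereas down each column the spacing in ERBs shrinks steadily with harmonic number. If the ERB were strictly proportional to center frequency, each row would be constant.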

The harmonic number at which the transition from resolved to unresolved occurs is a matter of some debate, and it depends on how resolvability is defined. Consider the excitation pattern plotted in Figure 2.3. At what harmonic number can it be said that the bump in the pattern is insufficient to constitute effective separation of the harmonic from the rest of the complex? Perhaps the most direct definition is based on perceptual separation: for a harmonic to be resolved, a trained listener must be able to "hear out" the harmonic as a pure tone with a distinct pitch. This can be measured by requiring the listener to make a frequency comparison between a pure tone and a harmonic in a complex tone. Most studies suggest that this comparison is possible for harmonics up to around number 5 to 8 (Plomp 1964; Plomp and Mimpen 1968; Moore and Ohgushi 1993), but recent results suggest that this may be possible for harmonics up to number 10 if attention is drawn to the harmonic by gating it on and off (Bernstein and Oxenham 2003).

A less direct definition was proposed by Shackleton and Carlyon (1994). When the harmonics in a complex are presented so that the positive-going zero crossings (the times at which the amplitude crosses zero between a trough and a subsequent peak in the sinusoidal waveform) of the individual harmonics are coincident, the harmonics are said to be in sine phase. The resulting waveform has an envelope that repeats at the F0. If, however, the harmonics are alternated between sine phase and cosine phase (so that the zero crossings of the odd-numbered harmonics are aligned with the peaks of the even-numbered harmonics) the resulting envelope has a repetition rate of twice the F0. This is known as alternating, or ALT, phase (see Fig. 2.4). It turns out that as the lowest harmonic number in the ALT complex is raised above number 10 or so, the periodicity pitch of the complex is an octave higher, corresponding to twice the F0. The implication is that when the harmonics are resolved, the phase relationship between them is irrelevant since the harmonics do not interact significantly in the cochlea. However, when three or more harmonics excite the same place on the basilar membrane (i.e., are unresolved), the resulting pattern of vibration will reflect the phase relationship between them. It is suggested that periodicity pitch is related to the repetition rate of the temporal envelope of these interacting harmonics, and not to F0 (see Flanagan and Guttman 1960 for earlier work manipulating the temporal envelope of harmonic complexes).

Figure 2.4. An illustration of a brief section of the waveforms of sine phase and alternating phase complexes, similar to those used by Shackleton and Carlyon (1994). These complexes have the same F0 (125 Hz) and the same harmonic numbers, but the pitch of the complex on the right is an octave higher than the pitch of the complex on the left. Both complexes were filtered between 3900 and 5400 Hz. Left panel: sine phase; right panel: alternating phase.
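The difference in envelope repetition rate between sine-phase and ALT-phase complexes can be checked numerically. The sketch below is illustrative and is not the stimulus-generation procedure of Shackleton and Carlyon (1994). It assumes a 48-kHz sample rate, takes the harmonics of a 125-Hz F0 that fall between 3900 and 5400 Hz (numbers 32 to 43) instead of applying a bandpass filter, and estimates the envelope rate as the strongest non-DC component of the squared Hilbert envelope. All function names are hypothetical.

```python
import numpy as np

FS = 48_000        # sample rate in Hz (assumed for illustration)
F0 = 125.0         # fundamental frequency in Hz, as in Figure 2.4
N_SAMPLES = 9_600  # 0.2 s, i.e. exactly 25 periods of F0 at 48 kHz


def harmonic_complex(harmonics, alternating):
    """Sum of equal-amplitude harmonics of F0.

    Sine phase: every harmonic starts in sine phase.
    ALT phase: odd harmonics in sine phase, even harmonics in cosine phase.
    """
    t = np.arange(N_SAMPLES) / FS
    x = np.zeros(N_SAMPLES)
    for n in harmonics:
        phase = np.pi / 2 if (alternating and n % 2 == 0) else 0.0
        x += np.sin(2 * np.pi * n * F0 * t + phase)
    return x


def squared_envelope(x):
    """Squared Hilbert envelope, computed from an FFT-based analytic signal."""
    n = len(x)  # assumed even
    weights = np.zeros(n)
    weights[0] = 1.0           # keep DC
    weights[1:n // 2] = 2.0    # double positive frequencies
    weights[n // 2] = 1.0      # keep the Nyquist bin
    analytic = np.fft.ifft(np.fft.fft(x) * weights)
    return np.abs(analytic) ** 2


def envelope_rate(x):
    """Frequency (Hz) of the strongest non-DC component of the squared envelope."""
    env = squared_envelope(x)
    magnitude = np.abs(np.fft.rfft(env - env.mean()))
    freqs = np.fft.rfftfreq(len(env), d=1 / FS)
    return freqs[np.argmax(magnitude)]


harmonics = range(32, 44)  # 4000 to 5375 Hz, inside the 3900-5400 Hz band
sine_complex = harmonic_complex(harmonics, alternating=False)
alt_complex = harmonic_complex(harmonics, alternating=True)

print("sine phase envelope rate:", envelope_rate(sine_complex), "Hz")
print("ALT phase envelope rate: ", envelope_rate(alt_complex), "Hz")
```

With these assumptions the sine-phase envelope is dominated by a component at F0 and the ALT-phase envelope by a component at twice F0, even though both waveforms share the same harmonic frequencies.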

Finally, resolvability can be defined in terms of F0 discrimination. It has been observed that it is much easier to discriminate the F0s of complexes containing low harmonics than the F0s of complexes containing just high harmonics. Houtsma and Smurzynski (1990) measured F0 discrimination for a group of 11 successive harmonics for an F0 of 200 Hz (see Fig. 2.5). As the number of the lowest harmonic was increased from 7 to 13 there was a dramatic increase in the relative F0DL from around 0.25% to around 2.5% of F0, with performance remaining roughly constant as the lowest harmonic number was increased above 13. It was argued that this jump in the F0DL reflects the transition from a complex containing some resolved harmonics to a complex containing no resolved harmonics. Similar experiments carried out by others using different F0s have confirmed that the deterioration in performance is due not to the increasing absolute frequency, but to the increase in the lowest harmonic number present (Carlyon and Shackleton 1994; Shackleton and Carlyon 1994; Kaernbach and Bering 2001; Bernstein and Oxenham 2003).

These experiments suggest that, for most F0s used experimentally, the harmonic number that marks the transition from resolved to unresolved is not less than 5 but no greater than 10. However, the transition point will depend on F0 to some extent. An inspection of Eq. (1.1) reveals that as center frequency is decreased, the ERB expressed as a proportion of center frequency increases. It follows that for low F0s (below around 100 Hz) the harmonics are not as well resolved in the excitation pattern as they are for higher F0s. The transition between resolved and unresolved will occur, therefore, at a lower harmonic number. In effect, resolvability may be defined in terms of the bandwidth of the auditory filter. For example, Moore and Ohgushi (1993) found that listeners could determine whether a pure-tone probe was higher or lower than a component in an inharmonic complex at around 75% correct when the spacing between the components was 1.25 times the ERB. Similarly, Shackleton and Carlyon (1994) estimated that harmonics are resolved when there are fewer than two within the 10-dB bandwidth of the auditory filter, as defined by Glasberg and Moore (1990), and unresolved when there are more than 3.25 within the 10-dB bandwidth of the auditory filter.

Figure 2.5. The results of Houtsma and Smurzynski (1990) showing the F0DL (as a percentage of F0) for a group of 11 successive harmonics with a nominal F0 of 200 Hz, as a function of the lowest harmonic number in the group. Harmonics were presented in either sine phase or in negative Schroeder phase, in which the phase relationships between harmonics were selected to produce a relatively flat envelope on the basilar membrane.
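To see how a bandwidth-based criterion makes the transition point depend on F0, the sketch below finds the highest harmonic whose spacing from its neighbors is at least 1.25 ERBs. Two things here are assumptions made purely for illustration: the ERB is taken from the Glasberg and Moore (1990) formula, ERB = 24.7(0.00437 fc + 1), and the 1.25-ERB value, which Moore and Ohgushi (1993) reported for an inharmonic complex, is applied as a hard threshold to harmonic complexes. The function names are hypothetical.

```python
def erb_hz(fc_hz: float) -> float:
    """ERB in Hz, assuming the Glasberg and Moore (1990) formula."""
    return 24.7 * (0.00437 * fc_hz + 1.0)


def highest_harmonic_meeting_criterion(f0_hz: float, criterion_erbs: float = 1.25) -> int:
    """Highest harmonic number n for which F0 / ERB(n * F0) >= criterion_erbs.

    Returns 0 if even the fundamental fails the criterion.
    The spacing in ERBs falls monotonically with n, so a simple scan suffices.
    """
    n = 0
    while f0_hz / erb_hz((n + 1) * f0_hz) >= criterion_erbs:
        n += 1
    return n


for f0 in (50, 100, 200, 400):
    print(f"F0 = {f0:3d} Hz -> highest harmonic meeting 1.25 ERB: "
          f"{highest_harmonic_meeting_criterion(f0)}")
```

Under these assumptions the limit is lower for a 50-Hz F0 than for F0s of 100 Hz and above, which matches the qualitative argument in the text. The exact numbers should not be read as estimates of the perceptual transition.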

From the results presented in Section 3.2 it can be seen that the region of harmonic resolvability may not coincide exactly with the region of dominance. However, it is true to say that resolved harmonics, when present, provide a greater contribution to the overall pitch than unresolved harmonics, at least for F0s of 100 Hz and above.

3.4.2 Is F0 Discrimination Dependent on Resolvability or Harmonic Number?

The previous section outlined a number of different measures that seem to converge on the idea that the first 5 to 10 harmonics may be peripherally resolved. The fact that this limit coincides well with a transition between good and poor F0 discrimination suggests that good F0 discrimination requires the presence of some resolved harmonics. Though they may be necessary, the question remains whether resolved harmonics are sufficient to produce good F0 discrimination. A recent study suggests not. Bernstein and Oxenham (2003) repeated part of Houtsma and Smurzynski's (1990) study, with the addition of a "dichotic" condition, in which the odd harmonics were presented to one ear and the even harmonics to the other. They first confirmed that the dichotic presentation doubled the number of harmonics that could be heard out individually, or resolved. As might be expected, because the frequency spacing between adjacent components in each ear was doubled, listeners were now able to hear out the first 15 to 20 harmonics of 100- and 200-Hz F0s. However, when these complexes were used to measure F0 discrimination as a function of the lowest harmonic present, performance was very similar to that found in the diotic condition, in which all components were presented to both ears (see Fig. 2.6). In other words, listeners were not able to make use of the additional resolved components to improve F0 discrimination. This shows that presenting higher components in such a way that they are also resolved does not improve performance. Similar results were found for two-component stimuli by Houtsma and Goldstein (1972; see Section 3.5.3) in normal-hearing listeners and by Arehart and Burns (1999) in hearing-impaired listeners (see Moore and Carlyon, Chapter 7).

The inability of higher harmonics to contribute to the pitch percept, even if they are peripherally resolved, has some interesting theoretical implications. From the perspective of spectral theories of pitch (de Cheveigné, Chapter 6) it suggests that harmonic templates, if they exist, are formed only of the lower harmonics, which are normally resolved. This is consistent with the idea that harmonic templates can build up through exposure to harmonic sounds (Terhardt 1974) or even to any broadband sounds (Shamma and Klein 2000). In both these cases, one requirement for such templates to emerge is that individual harmonics are normally spectrally resolved.
