Stable and unstable intervals
We further segment voiced regions into stable and unstable intervals. As mentioned in the introduction, the energy of a stable interval is quasi-constant, whereas the energy of an unstable interval rises or falls significantly. Using the mean energy of a frame as computed in section 9.2.2, we define a frame as stable if its mean energy deviates by no more than 50% from the mean energy of the previous frame and by no more than 50% from the mean energy of the next frame. Setting the threshold for the relative mean energy difference to 50% allows some tolerance for the energy differences between the frames of a stable interval. This tolerance is justified because speech signals are highly variable.
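The stability criterion can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in this chapter: the function names `is_stable` and `stable_intervals` are hypothetical, the relative difference is assumed to be measured against the neighbouring frame's energy, and a frame at the edge of the voiced region is assumed to be tested only against the neighbour it actually has. The input is the list of mean frame energies of one voiced region.

```python
from typing import List, Tuple


def is_stable(energies: List[float], i: int, threshold: float = 0.5) -> bool:
    """Return True if frame i is stable with respect to its existing neighbours."""

    def close(e: float, neighbour: float) -> bool:
        # Assumption: the deviation is taken relative to the neighbour's energy.
        if neighbour == 0.0:
            return e == 0.0
        return abs(e - neighbour) / neighbour <= threshold

    e = energies[i]
    # Assumption: a missing neighbour (first or last frame) imposes no constraint.
    prev_ok = i == 0 or close(e, energies[i - 1])
    next_ok = i == len(energies) - 1 or close(e, energies[i + 1])
    return prev_ok and next_ok


def stable_intervals(energies: List[float], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Group consecutive stable frames into (start_index, length_in_frames) runs."""
    intervals = []
    start = None
    for i in range(len(energies)):
        if is_stable(energies, i, threshold):
            if start is None:
                start = i  # a new stable run begins here
        elif start is not None:
            intervals.append((start, i - start))  # an unstable frame ends the run
            start = None
    if start is not None:
        intervals.append((start, len(energies) - start))
    return intervals


# Made-up mean energies for illustration only.
example = [1.0, 1.2, 1.1, 3.0, 3.2, 3.1, 1.0]
print(stable_intervals(example))
```

In this made-up example, the frames adjacent to the two energy jumps are classified as unstable, so a stable interval can be as short as a single frame.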
Figure 9.2 shows a voiced region of a speech segment with three stable intervals S1, S2 and S3. The figure also depicts the series of overlapping speech frames for processing.
Figure 9.2. Overlapping frames of a voiced region of a speech segment that contains the words “the north” uttered by a male speaker. Three stable intervals S1, S2 and S3 with lengths of 1, 9 and 27 frames are identified.