Updated 23 October 2003
J Acoust Soc Am
Regulating glottal airflow in phonation: application of the maximum power transfer theorem to a low dimensional phonation model
Ingo R. Titze
Department of Speech Pathology and Audiology and the National Center for Voice and Speech, The University of Iowa, USA
Yawning, singing and relaxation: the case of Anna D


Abstract: Two competing views of regulating glottal airflow for maximum vocal output are investigated theoretically. The maximum power transfer theorem is used as a guide. A wide epilarynx tube (laryngeal vestibule) matches well with low glottal resistance (believed to correspond to the "yawn-sigh" approach in voice therapy), whereas a narrow epilarynx tube matches well with a higher glottal resistance (believed to correspond to the "twang-belt" approach). A simulation model is used to calculate mean flows, peak flows, and oral radiated pressure for an impedance ratio between the vocal tract (the load) and the glottis (the source). Results show that when the impedance ratio approaches 1.0, maximum power is transferred and radiated from the mouth. A full update of the equations used for simulating driving pressures, glottal flow, and vocal tract input pressures is provided as a programming guide for those interested in model development.
Speech-language pathologists and singing teachers have generated two competing views (and accompanying behavioral strategies) about the management of airflow in phonation. On the one hand, there is the strategy of using a "sigh" to release air with the voice (Linklater, 1976; Colton and Casper, 1996; Brown, 1996), or using a flow phonation mode (Sundberg, 1987). This flow mode strategy helps to obtain maximum peak-to-peak glottal airflow.
On the other hand, there is the opposite strategy of increasing the adduction of the vocal folds, as in belt (Sullivan, 1985; Bestebreurtje and Schutte, 2000) and some country-western singing (Sundberg et al., 1999) to decrease both the average glottal flow and the peak flow for (perhaps) greater glottal efficiency. Even in some classical singing approaches, airflow reduction is sometimes encouraged by the mental image of "drinking in the air" rather than blowing out the air.
In this paper, a few data sets will be presented that simulate a "tight adduction" case and a "loose adduction" case with a computer model of phonation. One objective of the study is to show that both techniques can lead to an optimum acoustic output at the mouth, but the vocal tract configuration has to be matched to the glottal configuration. Tight adduction of the vocal folds requires a narrower supraglottal airway, whereas looser adduction requires a wider airway to maximize the output power. An underlying guiding principle is the maximum power transfer theorem in electric circuits and transmission systems, which states that the internal impedance of the source should match the impedance of the load for maximum power transfer.
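The theorem can be illustrated with a minimal lumped-circuit sketch. It assumes purely resistive source and load, with pressure playing the role of voltage and flow the role of current; it is not the simulation model used in the study, and the names `load_power` and `best_ratio` are hypothetical. Under those assumptions, the power delivered to the load peaks when the load-to-source resistance ratio equals 1.0.

```python
def load_power(source_pressure, source_resistance, load_resistance):
    """Power dissipated in a resistive load driven through a resistive source.

    Acoustic analogy (an assumption of this sketch): pressure ~ voltage,
    flow ~ current, glottis ~ source resistance, vocal tract ~ load resistance.
    """
    flow = source_pressure / (source_resistance + load_resistance)
    return flow ** 2 * load_resistance


def best_ratio(source_resistance, ratios, source_pressure=1.0):
    """Return the load/source resistance ratio that delivers the most power."""
    return max(
        ratios,
        key=lambda r: load_power(
            source_pressure, source_resistance, r * source_resistance
        ),
    )


# Sweep the impedance ratio from 0.1 to 5.0 in steps of 0.1.
ratios = [i / 10 for i in range(1, 51)]
print(best_ratio(2.0, ratios))
```

In this simplified picture the curve is fairly flat near the optimum, so a ratio somewhat above or below 1.0 still transfers most of the available power.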
A second objective of the study is to update the aerodynamic driving force equations for a low-dimensional model of vocal fold vibration in detail. Some changes have occurred since publication of the three-mass body-cover model (Story and Titze, 1995), particularly with regard to flow separation from the glottal wall and collision forces. In order to continue explorations with this low-dimensional body-cover model, it is important to provide the aerodynamic detail as a programming guide. This dual objective makes this paper a somewhat nontraditional mixture of model development and clinical application. But this mixture is justified by the fact that there is an unfortunate history of "modeling for modeling's sake," by this and other authors, with insufficient benefit to practitioners in voice and speech. This paper is an attempt to steer toward application while also maintaining a theoretical forward thrust.
Mean glottal airflow (or, alternatively, glottal resistance) has been a target for optimizing vocal output power in voice therapy and singing training. The current investigation suggests that the optimization process should involve both the vocal tract and the vocal folds. It appears that an impedance matching between the two might take place. In general, a wide epilarynx tube (from the ventricle to the laryngeal vestibule) requires a low glottal resistance for maximum power transfer. Conversely, a narrow epilarynx tube requires a high glottal resistance (more adduction) for maximum power transfer. What Sundberg (1987) has called the "flow mode" appears to be a condition where the vocal tract impedance is considerably smaller than the glottal impedance, making the glottis a flow source acoustically, as for steady-flow (aerodynamic) conditions.
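The flow-source condition can be written out in a simple series-circuit analogy. The symbols below are assumptions introduced for illustration and are not taken from the paper: $P_L$ is the driving (lung) pressure, $Z_g$ the glottal impedance, $Z_{vt}$ the vocal tract input impedance, and $U$ the glottal flow.

```latex
U = \frac{P_L}{Z_g + Z_{vt}}
\;\approx\; \frac{P_L}{Z_g}
\qquad \text{when } |Z_{vt}| \ll |Z_g| .
```

In that limit the flow is set almost entirely by the glottis and is nearly independent of the vocal tract load, which is what is meant by calling the glottis a flow source.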

Vocologists (those who habilitate voices) have some choices in guiding a speaker or singer. If the desired (or acquired) voice quality is to be bright and "twangy," as in some forms of belting, gospel singing, or some regional dialects, the vocal tract can be more narrow in the epilaryngeal and pharyngeal region (Estill et al., 1996; Story et al., 2001). For such a vocal tract configuration, a well-adducted pair of vocal folds, with relatively high glottal resistance, would be a good match. Because of this higher glottal resistance, lung pressures would likely also be on the high side. Conversely, if the desired (or acquired) voice quality is to be "yawny," as in crooning, sobbing, or a mellow speech dialect (Estill et al., 1996), the epilaryngeal and pharyngeal vocal tract can be wider. For this configuration, a lesser degree of adduction, with lower glottal resistance and probably lower lung pressure, is a good match.

It is already known that "yawn-sigh" is a good combination for voice therapy. Sigh involves a glottal posture with low glottal impedance that matches a "yawny" vocal tract. Less is known about the "twang-belt" combination in voice training and therapy. Here the voice is sometimes initiated with a creaky production, a tighter state of vocal fold adduction. This is a match for twang, a tighter vocal tract configuration. Some vocologists shy away from a twang-belt approach to voice therapy because they fear hyperfunction and excessive vocal fold collision. But since mean glottal flow is smaller, and hence presumably also the amplitude of vibration of the vocal folds, it is not clear that one or the other of these techniques is necessarily more healthy. For the time being, one must keep an open mind about high pressure, low flow production as a viable alternative to low pressure, high flow production. The choice depends to a large degree on the natural state of the vocal tract and the voice quality to be achieved with it.

As a future investigation, it would be worthwhile to examine if the subglottal (tracheal) impedance could assume a compliant characteristic to provide a true impedance match as a complex conjugate to the supraglottal impedance. It would also be instructive to test maximum power transfer for conditions where the fundamental frequency is at or above the first formant frequency. Research is presently ongoing in this area.
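For reference, the complex-conjugate form of the matching condition can be stated in general circuit terms. The notation is an assumption of this sketch rather than the paper's: a sinusoidal source of peak pressure amplitude $|p_s|$ and internal impedance $Z_s = R_s + jX_s$ drives a load $Z_l = R_l + jX_l$.

```latex
\bar{P}_l = \frac{1}{2}\,\frac{|p_s|^2 \, R_l}{(R_s + R_l)^2 + (X_s + X_l)^2},
\qquad
\bar{P}_l \text{ is maximal when } X_l = -X_s,\; R_l = R_s
\;\Longleftrightarrow\; Z_l = Z_s^{*},
\qquad
\bar{P}_{l,\max} = \frac{|p_s|^2}{8 R_s}.
```

An inertive impedance has positive reactance ($X = \omega I$) and a compliant one has negative reactance ($X = -1/(\omega C)$), so an inertive load on one side of the glottis could in principle be cancelled by a compliant impedance on the other.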

Source and filter adjustments affecting the perception of the vocal qualities twang and yawn
Titze IR, Bergan CC, Hunter EJ, Story B
Department of Speech Pathology and Audiology, National Center for Voice and Speech
The University of Iowa, Iowa City 52242, USA.
Logoped Phoniatr Vocol. 2003;28(4):147-155
Two vocal qualities, twang and yawn, were synthesized and rated perceptually. The stimuli consisted of synthesized vocal productions of a sentence-length utterance 'ya ya ya ya ya,' which had speech-like intonation. In a continuum transformation from normal to twang, the area in the pharynx was gradually decreased, along with vocal tract shortening and a decreased open quotient in the glottal airflow. In a continuum transformation toward yawn, the area in the pharynx was gradually increased, along with vocal tract lengthening and an increased open quotient. The normal (untransformed) vocal tract area was pre-determined by earlier studies involving MRI scans of a human subject's vocal tract. Listeners were asked to rate (on a scale from 1 to 10) the 'amount of twang' in one listening session and the 'amount of yawn' in another listening session. Overall, the perception of twang increased directly with pharyngeal area narrowing, vocal tract shortening, and decreased open quotient. The perception of yawn increased with pharyngeal area widening, vocal tract lengthening, and increased open quotient. Adjustments of one parameter alone yielded less significant perceptual changes than the above combinations, with open quotient showing the greatest effect in isolation. Listeners demonstrated variable perceptions in both continua, with poor inter-subject, intra-subject, and inter-group reliability.