Speech Motor Control: New developments in basic and applied research

Ben Maassen and Pascal van Lieshout

Print publication date: 2010

Print ISBN-13: 9780199235797

Published to Oxford Scholarship Online: March 2012

DOI: 10.1093/acprof:oso/9780199235797.001.0001



Control of movement precision in speech production


Chapter 3: Control of movement precision in speech production

Sazzad M. Nasir

David J. Ostry

Oxford University Press

Abstract and Keywords

This chapter reviews evidence that somatosensory precision in speech is also important to the nervous system and appears to be achieved through impedance or stiffness control. A robotic device was used to apply lateral loads to the jaw that altered the motion path and hence somatosensory feedback without affecting speech acoustics. The loads were designed to maximally affect the consonant or vowel-related portion of an utterance. With training subjects corrected for both vowel and consonant-related loads, such that the motion path and presumably the associated somatosensory input returned to that normally experienced under no-load conditions. A control study was run in which subjects first trained with vowel-related loads and then following adaptation the direction of load was reversed unexpectedly. The reversal of the load resulted in deflections that were comparable in magnitude to those observed at the end of adaptation. The findings indicate that even in the absence of any effect on speech acoustics, somatosensory precision is equally important for vowel-related movements and for consonants. The adaptation observed here was achieved by impedance control. The results are consistent with the idea that impedance control is used in attaining the precision requirements of orofacial movement in speech.

Keywords:   somatosensory precision, orofacial movement, nervous system, speech acoustics


Interest to date in precision in speech production has focused on kinematic adjustments related to auditory information. In this paper we review evidence that somatosensory precision in speech is also important to the nervous system and appears to be achieved through impedance or stiffness control. We used a robotic device to apply lateral loads to the jaw that altered the motion path and hence somatosensory feedback without affecting speech acoustics. The loads were designed to maximally affect the consonant or vowel-related portion of an utterance. We found that with training subjects corrected for both vowel and consonant-related loads, such that the motion path and presumably the associated somatosensory input returned to that normally experienced under no-load conditions. A control study was run in which subjects first trained with vowel-related loads and then following adaptation the direction of load was reversed unexpectedly. The reversal of the load resulted in deflections that were comparable in magnitude to those observed at the end of adaptation. Our findings indicate that even in the absence of any effect on speech acoustics, somatosensory precision is equally important for vowel-related movements and for consonants. We further conclude that the adaptation observed here was achieved by impedance control (since impedance control produces resistance in all directions). The results are thus consistent with the idea that impedance control is used in attaining the precision requirements of orofacial movement in speech.

3.1 Introduction

3.1.1 Some issues in speech motor control

Why are speech movements the way they are? What is special about them? Like limb movement, speech acquisition involves a sensorimotor learning task. Both auditory and somatosensory feedback contribute to speech production. Presumably, speech is dominated by auditory goals, much as limb movement is guided principally by vision. However, as in any motor function, somatosensory input, mediated by muscle and skin receptors, plays a crucial role in the control of speech movements. It is interesting to entertain the possibility that the integration of different sensory inputs gives rise to the characteristic pattern of each movement that we see. The differences that exist between speech and other movements may be attributed very generally to this process of multisensory integration. However, it should be noted that differences between limbs and speech articulators, and the different physiological properties of the muscles that move these two types of articulators, may also contribute to differences between limb and speech movements. A related issue, which is the source of considerable debate, is the extent to which speech is special: is there anything at all that non-speech movements can teach us about speech? The documented commonalities between speech and limb movement suggest that it would be worthwhile to consider the extent to which principles of motor control gleaned from studies of non-speech movements could be applied to speech production. Before moving on, a few characteristics of speech production should be noted that bear not only on speech control but may more broadly shed light on how sensorimotor regulation of movement is achieved by the nervous system.

Among healthy adults, speech movements occur predominantly in the sagittal plane. They involve very little lateral motion, unlike chewing movements, which occur mainly in the frontal plane. Figure 3.1 shows a frontal view of jaw movements in speech and chewing – two types of orofacial motions with different movement kinematics. Jaw movement was recorded during repetitions of the utterance straw (/stɹɔ/) and during repeated chewing of a piece of gum. Notice that during speech, jaw position hardly deviates from the mid-sagittal plane. One therefore wonders why, out of the many possibilities for sound production that could have been used for speech, it most frequently involves two-dimensional planar motion, and indeed why off-midline movements such as those of the tongue are generally symmetrical about the sagittal plane. The problem is reminiscent of Bernstein's formulation of the degrees of freedom problem in motor control: to make a reaching movement in three spatial dimensions, the nervous system opts for a particular reaching trajectory generated from a very selective combination of muscle forces out of infinitely many choices (Bernstein 1967). It is likely that mid-sagittal symmetry, along with the geometrically symmetric arrangement in the positions of the articulators, naturally facilitates speech movements in the sagittal plane, for example, perhaps by providing effective constriction-based airway channelling. Presumably, the nervous system takes into account the peripheral arrangement of the vocal apparatus to produce speech. However, it is less clear whether the nervous system is largely oblivious to this restriction of movement to the sagittal plane or whether instead it actively limits lateral movements in general during speech. If, for example, a speech movement were deflected laterally, would the nervous system make any corrections for such a perturbation?

Fig. 3.1 Frontal view of jaw position during speech (A) and chewing (B). Notice that jaw position during speech deviates little from the mid-sagittal plane.

Speech movements can require fairly precise positioning of the articulators with a high degree of coordination among them. For example, during production of the fricative consonant /s/, the jaw has to be raised and maintained in a rather precise elevated position in order to generate the turbulent airflow that gives rise to the /s/ sound. On the other hand, there appears to be greater latitude in the position of the jaw for the production of the vowel /æ/. This poses the question of whether all speech movements are created equal with regard to the complexity of articulator positioning or the degree of their coordination. It may not be surprising to see that the role of the articulators shifts to maintain varying degrees of precision depending on the class of speech sound produced. One may think of analogous examples in reaching movements, where the positional accuracy of the limb may vary from one part of the movement to another. The precision requirement near the target is presumably the highest. A question thus arises: to what extent is the varying degree of precision required within a single goal-directed movement centrally planned, and to what extent can it be attributed to mechanical or kinetic properties of the limb or articulator executing that movement at the periphery?

Even when producing similar sounds, precision requirements may differ along different spatial dimensions. Figure 3.2A shows the formant frequencies during production of the vowels and diphthongs of the words saw (/sɔ/), say (/seI/), sas (/sæs/), and sane (/seIn/). Figure 3.2B gives the corresponding position of the jaw in the sagittal plane – protrusion and elevation. As can be seen, jaw position is less variable in the protrusion–retraction direction than in the raising–lowering direction (even though the full range of forward–backward motion is on the order of 10 mm). In general, kinematic measures with larger amplitudes are more variable. This seemingly innocuous fact does not have to be true a priori, but it reveals a deeper fact: noise in biological signals scales with amplitude. One of the beautiful elucidations of this principle is Fitts's law, which was formulated in the context of reaching movements to targets of varying sizes (Fitts 1954; Harris and Wolpert 1998). This law at its core embodies the idea that reaching to a smaller target requires more time. A pertinent question is whether the fundamentals of speech movements are subject to these same principles of motor control and whether a great deal about speech production can be learnt from the general rules of motor control. It should be noted that although the tongue plays a primary role in the production of speech sounds, the jaw nevertheless plays an important part in helping to achieve a stable position for the tongue. Understanding the role of individual articulators may enable us to understand the role of the other articulators in speech production.

Fig. 3.2 Example of acoustical precision (A) and variability (B) during speech. The kinematic precision of the jaw in the horizontal direction is different from that in the vertical direction. Variability differences in formant frequencies can likewise be noted.

3.1.2 Role of somatosensory input in speech control

Speech production is heavily influenced by acoustical feedback, but like any other motor act it is also driven by somatosensory input and, hence, also has somatosensory goals (Nasir and Ostry 2008; Tremblay et al., 2003). It is of interest to determine how these two sensory goals interact in speech motor control and, more specifically, what role somatosensory input plays in guiding speech movement. In recent years, robotic devices have found wide-ranging applications in studies of limb and postural control that bear on the role of somatosensory feedback (see, e.g., Shadmehr and Mussa-Ivaldi 1994). In particular, with the use of these devices, one now has better insights into different limb control mechanisms, consolidation of motor memories, and the basis of variability, to mention just a few. A robotic device enables one to design loads that perturb the limb in a specific manner, such as subjecting the limb to a predictable force-field during learning, and so to assess the input–output properties of the nervous system through the response of the limb to such perturbations. Although mechanical perturbations have been used for many years to study properties of oral cavity function, the use of computer-controlled robotic devices to understand speech movements is a fairly recent development. Indeed, it was perturbation of the jaw's movement path while talking that remarkably demonstrated an independent somatosensory goal in speech (Tremblay et al., 2003). This was done by dissociating auditory and somatosensory feedback: the robot deflected the jaw and, hence, altered proprioception, without affecting speech acoustics. Similar techniques have been used here to answer some of the questions of speech motor control that we raised above (Nasir and Ostry 2006).
In the experiments to be reviewed in this chapter, the loads were designed to target either the consonant or vowel-related portion of an utterance, since these are the major sound categories in speech, and were applied in a lateral direction to address the question of whether the lateral position of the jaw is actively regulated. In this way we wanted to determine somatosensory precision requirements for both classes of speech sounds. In addition, we wanted to gather experimental evidence on the mechanism that provides for somatosensory precision in speech production. It should be noted that though the loads were applied to the jaw during speech, our findings may help to understand orofacial control in other goal-directed tasks, such as chewing.

Altering sensory feedback and looking at the motor consequences is a time-honoured method for gauging how the nervous system controls movements (see Abbs and Gracco 1983 for an application in speech). There are also studies that look at the effect on speech movements as auditory feedback is altered (Houde and Jordan 1998; Jones and Munhall 2005). Here, we elaborate on a similar theme, where the sensory feedback altered during speech is proprioception.

3.1.3 Role of impedance control

In a typical force-field learning task, two types of control strategy are usually considered in order to understand the response of the nervous system to the load. In one strategy, termed feed-forward mapping, a precise mapping of the force-field is learned. In this case, muscles generate forces that exactly compensate for the applied load. Alternatively, a mechanism based on impedance control may be used as a means to deal with a field that interferes with the stability of the limb. In cases where maintaining a precise limb position is required or the nature of the force is uncertain, impedance control offers the best strategy. On the other hand, a predictable force-field is probably best dealt with using a feed-forward approach. Both of these strategies may of course contribute to varying degrees in any given movement.

For limb movements, impedance control – resistance to movement – has been invoked as a potential means to aid in the achievement of precision (Burdet et al., 2001; Hogan 1985). Impedance control is achieved by coactivating antagonist muscles to increase the stiffness of the limb. Anything that resists movement contributes to the impedance of the limb. Inertia, viscous damping, and stiffness, which gives rise to the spring-like properties of the limb, are all components of impedance. Impedance control is thought to provide an effective mechanism for attaining precision in limb movement.

According to Hooke's law, the stiffness (spring constant), $K$, of an ideal spring (no inertia and no damping) is defined by:

$$F_{\mathrm{ext}} = K x$$
where the spring is displaced by an amount $x$ by an external perturbation $F_{\mathrm{ext}}$ and $K$ denotes the resistance to displacement. If the spring system is also given damping and inertia, then the previous equation reads:
$$F_{\mathrm{ext}} = K x + b \frac{dx}{dt} + m \frac{d^2 x}{dt^2}$$
where $b$ and $m$ are the damping coefficient and the inertia, respectively, and $dx/dt$ and $d^2x/dt^2$ are velocity and acceleration. Transformed into Fourier space, Hooke's law is written as:
$$F_{\mathrm{ext}}(\omega) = Z(\omega)\, x(\omega)$$
where impedance, $Z$, is defined as:
$$Z(\omega) = K + i \omega b - \omega^2 m$$
Notice that, when written as above, impedance contains the stiffness term. In general, for any dynamical system, a generalized Hooke's law can be defined with a suitable definition of impedance.

For a biological articulator, impedance can be actively modulated by the nervous system. This aspect of active impedance control provides a plausible biological basis to achieve stable positioning of articulators. If an ideal spring has infinite stiffness then from Hooke's law, for any finite perturbation, the resulting displacement is zero. Thus increasing the stiffness, or impedance in general, enables the nervous system to achieve a stable articulator position. In doing so the nervous system may endeavour to increase impedance globally or can change it only to the extent necessary to offset the perturbation. For example, consider a moving limb, which is being pushed off the plane by an external force. The nervous system can choose to resist the perturbation by increasing the limb impedance in all directions. Or it can offset the perturbation by increasing impedance in the direction to which the perturbation is applied. The capacity for directional change in stiffness lets the nervous system maintain stable articulator positioning only in the dimensions demanded by the task.
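The difference between a global and a directional increase in stiffness can be illustrated with a small numerical sketch. It treats the articulator as a static spring (damping and inertia ignored) and solves K x = F for the displacement. The function name, the two-axis layout, and all stiffness and force values are hypothetical choices made for illustration, not measurements from the study.

```python
import numpy as np

def static_deflection(stiffness, force):
    """Steady-state displacement x of a spring-like articulator, from K x = F.

    stiffness: (2, 2) stiffness matrix in N/mm, axes ordered (lateral, vertical)
    force:     (2,) external force in N
    """
    K = np.asarray(stiffness, dtype=float)
    F = np.asarray(force, dtype=float)
    return np.linalg.solve(K, F)

# Hypothetical 2 N push in the lateral direction only.
force = np.array([2.0, 0.0])

# Hypothetical stiffness settings (N/mm).
baseline = np.diag([1.0, 1.0])          # no extra co-contraction
global_increase = np.diag([4.0, 4.0])   # impedance raised in all directions
lateral_increase = np.diag([4.0, 1.0])  # impedance raised only along the load

for name, K in [("baseline", baseline),
                ("global increase", global_increase),
                ("lateral increase", lateral_increase)]:
    x = static_deflection(K, force)
    print(f"{name:>16}: lateral deflection = {x[0]:.2f} mm")
```

Under these assumed values, both the global and the directional strategy reduce the lateral deflection by the same factor; they differ only in how stiff the articulator becomes along the unperturbed axis.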

Plausibly, impedance control could play a prominent role during certain phases of speech, due to different requirements for stability and, hence, for various degrees of precision. By looking at the response of speech articulators to perturbations that are delivered in conjunction with either vowel- or consonant-related movement, one may hope to gain insights into the control mechanism underlying speech movements.

In this chapter we will review experiments aimed at addressing the problems noted above (see Nasir and Ostry 2006 for more details). The organization of the chapter is as follows. It starts off with a description of the experimental protocol using a robot to perturb speech movements. This is followed by a presentation of the main findings illustrating adaptation in a force-field learning task and acoustical effects that may arise due to learning. The patterns of kinematic variability of jaw movements in the production of consonants and vowels are then discussed. The role of impedance control in achieving precision in speech production is presented afterwards. The chapter closes with discussion and a brief conclusion.

3.2 Experimental paradigm

We assessed the strategies involved in the achievement of precision in speech using the following experimental setup. The subject was seated comfortably in a chair with the head resting against a cushion. The subject wore a custom-made acrylic-metal dental appliance (Fig. 3.3A). The subject's head was restrained during the experiment by connecting a second dental appliance, which was glued to the maxillary teeth, to an external frame consisting of a set of articulated metal arms. A computer-controlled robotic device was connected to the lower appliance and was used to deliver a load to the jaw. A force/torque sensor was mounted at the tip of the robotic device to measure the restoring force applied by the subject in opposition to the load. Three-dimensional jaw movements were recorded by encoders in the robot arm. The subject's voice was recorded using a directional microphone (Nasir and Ostry 2006; 2008; Tremblay et al., 2003).

Fig. 3.3 Schematic of experimental methods and typical patterns of adaptation. A. A robotic device was used to deliver loads to the jaw. B. A lateral force was used that scaled with the elevation of the jaw. C. An example of adaptation, where the jaw path is deflected upon the load's introduction (light grey), but returns to the baseline level at the end of training (black).

The robotic device applied a lateral load to the jaw while the subject repeatedly produced a test utterance. The goal of the manipulation was to perturb the jaw in a systematic manner and to assess the response of the nervous system in such a learning task. As previously noted, an analogous paradigm, in which the limb is subjected to a force-field, has been used widely to study properties of limb movement control (Shadmehr and Mussa-Ivaldi 1994).

The loads were applied during production of the vowel or consonant part of a test-utterance and were, thus, able to perturb the somatosensory feedback during these phases of movement. Figure 3.3B shows an example of force application during consonant production. The top panel shows the vertical position of the jaw during repetitions of the test utterance straw. The second panel shows the raw speech waveform. The bottom panel shows the commanded force to the jaw. Lateral loads were applied to the jaw either during the closing or opening phase of the movement (shaded part in Fig. 3.3B shows an example of load applied during the closing phase of the movement). For a closing-related load, the load came on at the mid-point of jaw closing phase and stayed on until the mid-point of jaw lowering. For opening-related loads, the load came on midway through jaw opening and stayed on until mid-way through jaw closing. The load pushed the jaw laterally in proportion to jaw elevation such that the load was at its peak when the jaw was either fully closed or fully open. The loads thus had a destabilizing effect on the movements of the jaw and served to reduce positioning accuracy in speech movements.

The experiment was carried out in blocks and each block consisted of 15 repetitions of the test utterance. No load was applied during the first three blocks which constituted the ‘null-field’ condition of the experiment and provided a baseline pattern of jaw movement in the absence of any applied load. The subject was subsequently trained with the load on for the next twenty to twenty-five ‘training’ blocks, consisting of approximately 300 repetitions of the test utterance. Following training, the load was unexpectedly turned off. In this ‘after-effect’ block, ‘catch’ trials were recorded in the absence of load. The after-effect block gives clues to the kind of control mechanism employed by the nervous system to learn the force-field task.

The test utterance for consonant-related perturbations was chosen to have a consonant or consonant-cluster at the beginning and a vowel at the end. The consonant-cluster in straw was chosen with the goal of achieving a level of movement complexity that was sufficient to prompt adaptation to the load. The vowel ending in the test utterance was chosen to provide for large amplitude movement of the jaw. It should be noted that the consonant part of the test utterances corresponds to a jaw position near to closure, whereas the vowel part of the word corresponds to maximum jaw opening. Thus the consonant and vowel sounds were largely confined to the two movement extremes, where, possibly, a high degree of movement precision is required by virtue of the jaw having to maintain a relatively stable position.

Subjects were divided into a vowel group, a consonant group, and one impedance control group. Each subject experienced either vowel- or consonant-related loads. The vowel group subjects (seven in total) experienced the load during jaw opening; the consonant group subjects (four in total) experienced the load during closing. In a separate impedance control group (four subjects in total), where a vowel-related load was applied, the after-effect block was replaced by an unexpected 180° switch in the direction of load and no after-effect block was recorded.

3.3 Results

3.3.1 Adaptation patterns

How does the load affect the movement path of the jaw? The load initially affects the curvature of the jaw path, but with training subjects adapt to the load by reducing the curvature. The maximum perpendicular distance from the movement path to a straight line from movement start to end was taken as a measure of curvature. The curvature was computed for each repetition of the test utterance. The raising segment, which began with the jaw fully open and ended with the jaw fully closed, was used for the analyses. The start and end of the movement were scored at 20% of peak vertical velocity. Adaptation was assessed by computing the mean curvature for the first 35% and the last 35% of the force-field training trials. Null-field performance was based on curvature measured during the three null-field familiarization blocks.
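The curvature measure described above can be sketched in a few lines. This is an illustrative reading of the description, not the authors' analysis code: it assumes each raising segment has already been cut out (for example at the 20% peak-velocity criterion) and is stored as an array of 3D positions. The function names and the rounding rule used to pick the first and last 35% of trials are assumptions.

```python
import numpy as np

def path_curvature(path):
    """Maximum perpendicular distance from a movement path to the straight
    line joining its first and last samples.

    path: (n_samples, 3) array of jaw positions for one raising segment.
    """
    path = np.asarray(path, dtype=float)
    start, end = path[0], path[-1]
    chord = end - start
    length = np.linalg.norm(chord)
    if length == 0:
        raise ValueError("start and end of the movement coincide")
    unit = chord / length
    rel = path - start
    # Remove the component along the chord; the remainder is perpendicular to it.
    perp = rel - np.outer(rel @ unit, unit)
    return np.linalg.norm(perp, axis=1).max()

def early_late_means(curvatures, fraction=0.35):
    """Mean curvature over the first and the last `fraction` of training trials."""
    c = np.asarray(curvatures, dtype=float)
    n = max(1, int(round(fraction * len(c))))  # assumed rounding rule
    return c[:n].mean(), c[-n:].mean()

# Toy example: a 10 mm vertical movement that bows 1 mm sideways at mid-path.
toy_path = [[0.0, 0.0, 0.0], [1.0, 0.0, 5.0], [0.0, 0.0, 10.0]]
print(path_curvature(toy_path))
```

Comparing the two values returned by `early_late_means` then gives the early-versus-late contrast used to assess adaptation.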

A frontal plane view of jaw movement is shown in Fig. 3.3C. During null-field trials, movements are initially straight (grey). With the load's onset the path is deflected laterally (light grey); with training the curvature decreases as the nervous system responds to the movement error introduced by the load (black); following unexpected removal of the load there is no after-effect (pale grey). Subjects differed in their degree of adaptation. Figure 3.3C shows an example of complete adaptation; more typically, however, there is a significant decrease in curvature relative to the beginning of training, but performance never returns to baseline levels (Nasir and Ostry 2006).

Adaptation was observed for vowel- and consonant-related loads (Fig. 3.4). Null-field curvatures are shown in grey. When the load is introduced, the jaw deviates significantly (light grey). After training there was a reduction in curvature (black). For vowel-related loads, six of seven subjects showed adaptation (Fig. 3.4A), as indicated by a significant decrease in curvature over the course of training (p < 0.01). For consonant-related loads, all four subjects tested showed significant adaptation (Fig. 3.4B).

Fig. 3.4 Observed adaptation patterns for consonants and vowels. Curvature increases with the introduction of load (light grey) relative to no-load conditions (grey). Adaptation is observed following training (black). Stars designate significant adaptation (p < 0.01).

The amount of adaptation was further assessed on a per subject basis by computing the reduction in curvature over the course of training as a proportion of the curvature due to the introduction of load. A value of 1.0 indicates complete adaptation. For vowel-related loads, the amount of adaptation averaged across subjects was 0.46 ± 0.09 (mean ± SE). For consonants, the mean adaptation was 0.35 ± 0.05. Thus there was comparable adaptation when loads coincided with either vowel or consonant production (p > 0.5). This suggests that precision requirements are similar for both kinds of speech movements.
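One way to read the per-subject adaptation measure is sketched below. The text does not spell out the denominator; this sketch assumes that "the curvature due to the introduction of load" means the increase in curvature over the null-field baseline, which is the reading under which a full return to baseline gives exactly 1.0. The function names and the numbers in the example are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def adaptation_index(null_curv, initial_curv, final_curv):
    """Reduction in curvature over training, as a proportion of the
    load-induced increase in curvature (assumed reading of the measure).

    1.0 means curvature returned to the null-field level; 0.0 means no change.
    """
    load_effect = initial_curv - null_curv
    if load_effect <= 0:
        raise ValueError("load did not increase curvature; index is undefined")
    return (initial_curv - final_curv) / load_effect

def mean_and_se(values):
    """Mean and standard error of the mean across subjects."""
    return mean(values), stdev(values) / sqrt(len(values))

# Hypothetical per-subject curvatures in mm: (null field, early training, late training).
subjects = [(0.5, 2.5, 1.5), (0.4, 2.0, 1.2), (0.6, 3.0, 2.4)]
indices = [adaptation_index(*s) for s in subjects]
print(mean_and_se(indices))
```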

3.3.2 Acoustical effects

What role does the somatosensory input play in mediating the adaptation? Clearly, the load affects the movement path of the jaw and, thus, directly alters somatosensory input. The load may also affect the acoustics by altering the vocal tract shape. Any systematic changes in the acoustics due to the load would suggest a role for auditory input in mediating the observed adaptation.

Acoustical effects were assessed by computing spectral measures related to the application of the load. For the consonant-related loads, we manually selected a window that contained the /s/ in straw. For the vowel, the selected window contained the /a/ in straw. The spectral measures were the first and second formant frequencies for vowels and the centroid frequency for /s/ (Fig. 3.5). Notice that the first two formants correspond to the elevation and protrusion dimensions of the oral cavity. The spectral measures in the null-field condition are indicated in white, the introduction of load is in light grey, and after-training is in black. Across subjects, there were no differences found in any of the acoustical measures due to the introduction of load (p > 0.05). Moreover, there were no differences in the acoustics from the start to end of training (p > 0.05) (Nasir and Ostry 2006). The absence of any measurable acoustical effect suggests that somatosensory input is the primary drive in mediating the adaptation observed in these experiments.

3.3.3 Kinematic variability

It was an intriguing finding that subjects showed comparable adaptation during vowel- and consonant-related loads, whereas the movements associated with their production had, on the surface, different variability patterns. If error tolerance of a goal-directed movement is reflected in its variability, then movements with greater variability may lead to a lesser degree of adaptation in a force-field learning task, because for those movements subjects would have to correct less for movement errors. The production of the fricative consonant /s/ presumably requires more precise positioning of the articulators than the production of vowel-like sounds such as /a/ and, hence, may show greater sensitivity to error in a learning task. If this is the case, then consonant-related movements associated with /s/ should have resulted in more adaptation. As this clearly was not the case, we wanted to determine whether the kinematic precision of vowel and consonant movements is indeed different.

Fig. 3.5 Effect on the acoustics. The first and second formants of vowels, and the centroid frequency for /s/ were computed under no load conditions (white), at the introduction of the load (light grey) and at the end of training (black).

Fig. 3.6 Precision of jaw position during vowel and consonant production. A. Frontal plane view of jaw position during consonants (filled grey triangles) and vowels (black circles) for the utterance straw. B. Variability in jaw position across subjects. C. CV for consonants and vowels. CV is a measure of normalized variation. Once differences in variability due to differences in movement amplitude are accounted for, differences in kinematic variability are eliminated.

Figure 3.6A shows a representative sample of jaw positions in the frontal plane during repetitions of the word straw. The movement extremes during opening or closing were used to obtain estimates of the variability of the vowel- or consonant-related utterance. The figure gives jaw position during both consonant and vowel production. As can be seen, jaw positions during the consonant phase are more tightly clustered than during vowel production, suggesting a greater kinematic precision for consonants. Figure 3.6B further summarizes in box-plots the variability of jaw position across subjects during consonant and vowel productions for the same utterance.

The precision of the jaw position was computed using two measures of jaw variability, which were calculated on a per subject basis: the first was variability relative to the mean jaw position at maximum opening or maximum closing; the second was the coefficient of variation (CV), which is the standard deviation of jaw position divided by the mean jaw position. The rationale for using this measure is, as we have already noted, that in many biological signals variability is proportional to amplitude. That is, larger values are naturally more variable. The CV normalizes variability with respect to amplitude and hence enables one to test whether there are differences in variability between vowels and consonants once differences in movement amplitude are factored out. Figures 3.6B and 3.6C show in box-plots the variability and CV of the jaw position during consonant and vowel production for the test utterances. Reliable differences in variability of jaw position in the vertical direction were observed between vowels and consonants for both test words (p < 0.01), whereas the two other spatial directions, protrusion and lateral, did not show any differences. When we tested for differences using the CV, no reliable differences in variability were found in any of the three spatial directions between vowels and consonants for either of the test words. Note that the CV provides normalization in that variability is simply linearly adjusted for differences in amplitude. This simple correction, nevertheless, can account for differences in kinematic precision between vowels and consonants.
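The two variability measures can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes jaw position along one axis is expressed relative to a fixed reference so that the mean is non-zero, and the sample values are invented to show how two distributions with different raw variability can share the same CV.

```python
import numpy as np

def variability_and_cv(positions):
    """Standard deviation and coefficient of variation (SD / |mean|) of jaw
    position along one axis, measured at the movement extremes.

    positions: 1-D sequence, one value per repetition, relative to a fixed
               reference (an assumption; the CV depends on this origin).
    """
    p = np.asarray(positions, dtype=float)
    sd = p.std(ddof=1)  # sample standard deviation
    mean = p.mean()
    if mean == 0:
        raise ValueError("CV is undefined when the mean position is zero")
    return sd, sd / abs(mean)

# Invented vertical jaw openings in mm: small for the consonant, large for the vowel.
consonant = [1.8, 2.0, 2.2]
vowel = [9.0, 10.0, 11.0]

for label, data in [("consonant", consonant), ("vowel", vowel)]:
    sd, cv = variability_and_cv(data)
    print(f"{label:>9}: SD = {sd:.2f} mm, CV = {cv:.2f}")
```

In this toy case the vowel has five times the raw variability of the consonant, yet the CV is identical, which is the pattern the normalization is meant to expose.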

Notice that comparable kinematic precision was observed in the lateral position of the jaw during vowel and consonant production. On this basis one would expect to see no difference in the adaptation patterns for consonants and vowels for a lateral load, as was the case in the present study. It is therefore not surprising that consonant- and vowel-related movements showed similar adaptation, since they did not have different precision requirements, as measured by the CV, to begin with. It should be recalled, however, that the adaptation measure was computed with reference to the jaw trajectory in three dimensions and, hence, compensation to a lateral load as measured by this 3D measure presumably reflects compensation in all three dimensions. This, taken together with the finding of comparable patterns of normalized variation for consonant- and vowel-related movements, points to the conclusion that the nervous system assigns equal weight to controlling their precision.

3.3.4 Impedance control

We explored the way in which subjects achieved adaptation in the present force-field learning task that involved destabilizing loads, with their maximum at movement ends. What kind of control strategy could be employed by the nervous system in order to achieve the observed adaptation? We will provide several lines of evidence to suggest that impedance control was chiefly used to achieve the adaptation in the present studies.

One signature that distinguishes impedance control from feed-forward control is the response to catch trials, when the force-field is turned off unexpectedly following learning. If a feed-forward strategy has been employed to compensate for the applied force, then during a catch trial the kinematic response should mirror that observed at the load's introduction, producing what has been termed an after-effect. For impedance learning, on the other hand, there is no after-effect during catch trials, and the absence of an after-effect is a defining characteristic of this control strategy. If the jaw were to behave like a spring with infinite stiffness, then according to Hooke's law any finite external perturbation would produce zero displacement. Therefore, increasing stiffness is a good control strategy for the nervous system in order to offset sudden and uncertain external perturbations. In such a case, during catch trials, the trajectory would be no different from the trajectory at the end of a force-field learning task, since muscle coactivation, which changes impedance, results in no change to net torque and hence no change in movement. Figure 3.3C shows examples of after-effect trials recorded at the end of training (pale grey). In this case the movement path is no different from that observed under null-field conditions (grey). As noted before, had adaptation involved a feed-forward control strategy to offset the external load, one would have expected a negative after-effect with a curvature approaching that of initial-exposure trials.
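The stiff-spring argument can be written as a one-line static relation. This is only a sketch under the assumption of a linear spring; the symbols F (external load), k (stiffness) and x (displacement) are introduced here for illustration and do not appear in the original text.

```latex
F = k\,x
\quad\Longrightarrow\quad
x = \frac{F}{k},
\qquad
\lim_{k \to \infty} \frac{F}{k} = 0 \quad \text{for any finite } F .
```

Under the same assumption, removing the load (F = 0) gives x = 0 whatever the value of k, which is why a pure stiffness increase predicts no after-effect on catch trials.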


Fig. 3.7 Control of jaw impedance. A. Frontal view of the movement path under conditions of force-field reversal. When the direction of the load was reversed unexpectedly after training, the movement path (thin light grey) was the mirror image of the path at the end of training (black). The black arrow indicates the direction of the training load. B. All but one subject showed adaptation in the reversal experiment.

We directly tested the idea that subjects use impedance control to achieve precision at movement endpoints. We ran four new subjects for whom, following adaptation, the direction of the force-field was reversed unexpectedly rather than switched off completely as in the case of catch trials. On the basis of the analogy to a highly stiff spring, we reasoned that if an impedance-based control strategy was being employed to achieve adaptation, then subjects' performance following force-field reversal would not differ from that observed at the end of training. Figure 3.7A shows a frontal view of performance in the force-field reversal study. The test word was straw and the load was applied during the vowel. Null-field conditions are in grey. A large lateral deflection is observed with the introduction of load (light grey); substantial adaptation occurs following training (black). When the direction of the load is unexpectedly reversed, the movement path is a mirror image of that observed at the end of training (thin light grey). Figure 3.7B shows significant adaptation to load by all but one subject (p < 0.01). Light grey denotes curvature at the introduction of load and black denotes curvature after training. When the load is reversed, curvature (thin light grey) does not differ from that observed at the end of training (p > 0.05 for all subjects), consistent with the idea that adaptation under these conditions is based on impedance control.
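The contrasting predictions for catch and reversal trials can be sketched with a toy static model. It assumes the jaw behaves as a linear spring and that a feed-forward learner simply adds a force that cancels the training load; the function, constants and units are hypothetical and are not the authors' analysis.

```python
def displacement(external_force, stiffness, feedforward_force=0.0):
    """Static lateral displacement of a linear spring under the net force."""
    return (external_force - feedforward_force) / stiffness

LOAD = 1.0     # hypothetical training load (arbitrary units)
K_BASE = 1.0   # hypothetical baseline stiffness
K_HIGH = 10.0  # hypothetical stiffness after impedance learning

# External force applied in each type of trial after learning.
conditions = {"training": LOAD, "catch": 0.0, "reversal": -LOAD}

# Feed-forward learner: baseline stiffness plus a learned force cancelling the load.
feedforward = {name: displacement(force, K_BASE, feedforward_force=LOAD)
               for name, force in conditions.items()}

# Impedance learner: no learned force, only increased stiffness.
impedance = {name: displacement(force, K_HIGH)
             for name, force in conditions.items()}
```

In this sketch the feed-forward learner shows a large deflection opposite to the load on catch trials and an even larger one on reversal, whereas the impedance learner shows no deflection on catch trials and, on reversal, a small deflection equal in size and opposite in sign to its residual deflection at the end of training.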

We further quantified impedance over the course of learning for each of our subjects and for both test utterances, in order to assess whether subjects indeed increased their impedance with learning. For this, a coefficient of impedance was computed by subdividing each movement into two parts, one in which commanded force was zero and the other in which force increased as a result of displacement of the jaw. We recorded a 3D sensed force vector for each of the two segments and took the magnitude of the vector difference as the measure of force change. The curvature of the movement path was taken as a measure of position change. The impedance coefficient was defined as the ratio between the difference in force and the difference in curvature. This measure captures information about impedance that is suitable for the assessment of impedance change over learning.
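The impedance coefficient described above can be sketched as follows. How curvature is assigned to each of the two movement segments is not specified in the text, so the scalar curvature arguments, the use of an absolute difference, and all names and values here are assumptions made for illustration.

```python
import math

def impedance_coefficient(force_zero_segment, force_loaded_segment,
                          curvature_zero_segment, curvature_loaded_segment):
    """Ratio of force change to curvature change between two movement segments.

    Forces are 3D sensed force vectors (x, y, z); curvatures are scalars.
    """
    # Magnitude of the vector difference between the two sensed forces.
    force_change = math.dist(force_loaded_segment, force_zero_segment)
    curvature_change = abs(curvature_loaded_segment - curvature_zero_segment)
    if curvature_change == 0:
        raise ValueError("curvature change is zero; coefficient is undefined")
    return force_change / curvature_change

# Hypothetical single trial: force in newtons, curvature in arbitrary units.
coefficient = impedance_coefficient((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), 0.0, 0.5)
```

A larger coefficient means that a given change in force is accompanied by a smaller change in path curvature, which is the sense in which the measure tracks impedance.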


Fig. 3.8 Quantification of impedance. A. Impedance progressively increases over the course of training while curvature decreases with adaptation. B. A linear relation is observed between impedance and curvature. Subjects with greater amounts of adaptation show greater impedance.

Figure 3.8A shows patterns of impedance and associated movement curvature pooled over subjects, test words, and vowel- versus consonant-related loads. Movement curvature is low under null-field conditions (white), increases following the introduction of load (light grey) and decreases significantly with adaptation (black) (p < 0.01). In contrast, jaw impedance shows a steadily increasing pattern such that the impedance coefficient is low initially and progressively increases with learning to result in a significantly higher impedance coefficient at the end of training (p < 0.01). This suggests that subjects achieved adaptation by increasing impedance to help reduce movement curvature.

The relationship between impedance and movement curvature was also assessed quantitatively by computing impedance change and curvature change on a per subject basis over the course of learning. Figure 3.8B shows data for all participants. The abscissa shows the ratio between curvature during initial force-field exposure and curvature at the end of training. Values greater than 1.0 indicate adaptation whereas values less than 1.0 denote lack of adaptation. Larger values indicate greater curvature reduction. The ordinate of the plot shows the ratio of the impedance coefficient at the end of training to that observed with the initial introduction of load. As can be seen, the impedance ratio correlates well with the amount of adaptation (r = 0.8) (Nasir and Ostry 2006). Thus, as expected, subjects who showed a greater increase in impedance over training showed greater adaptation.
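The two ratios plotted in Figure 3.8B, and their correlation across subjects, can be sketched as below. The per-subject numbers are invented for illustration and are not the study's data; the Pearson coefficient is assumed here as the correlation measure, since the text reports only r.

```python
import math
import statistics

def adaptation_ratio(initial_curvature, final_curvature):
    """Curvature at initial exposure over curvature at end of training (>1 = adaptation)."""
    return initial_curvature / final_curvature

def impedance_ratio(initial_impedance, final_impedance):
    """Impedance coefficient at end of training over that at initial exposure."""
    return final_impedance / initial_impedance

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical subjects: (initial curvature, final curvature,
#                         initial impedance, final impedance)
subjects = [
    (4.0, 2.0, 1.0, 2.0),
    (3.0, 2.0, 1.0, 1.4),
    (2.0, 2.5, 1.0, 1.0),  # ratio below 1.0: no adaptation
    (6.0, 2.0, 1.0, 2.5),
]

adaptation = [adaptation_ratio(ci, cf) for ci, cf, _, _ in subjects]
impedance_change = [impedance_ratio(zi, zf) for _, _, zi, zf in subjects]
r = pearson_r(adaptation, impedance_change)
```

Expressing both quantities as ratios makes them dimensionless, so subjects with different baseline curvature or impedance can be placed on the same axes.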

In short, subjects in the present study compensate for destabilizing loads at the extremes of speech movements by increasing impedance to reduce displacement of the jaw. These results are in line with other demonstrations that impedance control plays a significant role in speech production (Shiller et al. 2002, 2005). Moreover, they are consistent with observed changes in impedance in comparable situations of human limb movement.

3.4 Discussion

Some aspects of the observed adaptation are somewhat surprising when compared to what is observed in studies of limb movement. In speech tasks, not all subjects adapt to the load. Moreover, for subjects who do adapt, the adaptation is seldom complete, as opposed to the near-complete adaptation typically observed in limb control studies. This may point to a larger tolerance of error in speech movements. Below we elaborate on the differences between the adaptation patterns of speech and limb movements.

In response to the application of load, subjects typically showed one of two patterns. Most subjects adapted, but some did not. Subjects tested with altered auditory feedback show a comparable trend, where approximately 15% of subjects show no sign of adaptation (Purcell and Munhall 2006). It would be worthwhile to determine whether these two failures are related. It can be envisaged that individuals who fail to adapt to altered mechanical environments may be more sensitive to auditory than to somatosensory feedback, and that those who fail with auditory perturbations may rely more heavily on proprioception. In contrast, in force-field learning tasks involving limb movements, typically all subjects show adaptation. Another intriguing difference between adaptation studies with limb and jaw movement is that adaptation is rarely complete in speech. It could be that multisensory integration involving proprioceptive and auditory feedback provides an increased tolerance to movement error. However, incomplete adaptation may also reflect a broader target range in speech production and hence greater ease in bringing movements within the target zone without full adaptation.

The extent to which the phenomena reported here are restricted to speech movements per se is unknown. Indeed, the phenomena may more generally characterize orofacial motor learning and not be restricted to speech. The presence of compensation for loads in a lateral direction, which is thought to play a limited role in speech production, may indicate that impedance control, and presumably the observed adaptation as well, is a general feature of orofacial movement. Nevertheless, these compensations are observed here in the presence of speech movements, suggesting that the nervous system actively regulates the lateral position of the jaw in speech production. A restriction on motion out of the sagittal plane, in combination with a peripheral geometrical preference for midsagittal plane movement, provides a solution to Bernstein's problem in the context of speech production. Moreover, we can recall at this point that speech movement hardly deviates from the sagittal plane as compared to chewing motion. A priori it could have been the case that speech movements have little lateral deviation and accordingly there is little need for correction by the nervous system. Alternatively, an active control mechanism, such as that suggested by our study, keeps speech movement within the sagittal plane. Note also that for both vowels and consonants, the CV is no different in the lateral direction than for movements in the sagittal plane. This implies that precision requirements in the lateral direction are as great as those in the sagittal plane.

3.5 Conclusion

In summary, we have assessed sensorimotor adaptation in speech production, by using mechanical loads to perturb the movement path of the jaw during consonant or vowel production. The load initially produced a lateral deviation of the jaw. With training, for those subjects that adapted to the load, the deviation was reduced. The extent of adaptation was comparable for vowel- and consonant-related loads. Likewise, variability of jaw position, as measured by the CV, was comparable for the vowel and consonant portion of the movements. This would suggest that the two classes of speech sounds – vowels and consonants – have similar precision requirements for jaw positioning. The loads were small enough not to have any noticeable acoustical effects, thereby assigning a prominent role for somatosensory input in mediating the observed adaptation. As the movement extremes of the jaw were perturbed, impedance control, as expected, was found to be the primary means by which the nervous system achieves stability.

To conclude, the application of robotic devices to speech motor control helps to elucidate commonalities and differences between speech production and the control of limb movement. Whether the same neural principles of motor control underlie movement in both systems is yet to be determined. Indeed, multisensory integration may produce different movement outcomes in speech and limb movement even if common neural principles underlie the two. For example, the sensitivity to movement error may be dictated by the tolerance for error in the sensory signals that guide the movement. Movements of the limb that are principally guided by vision may have different precision requirements than those principally guided by audition. It is likely that the speech control system will prove to be fertile ground not only for understanding speech motor production per se, but also for uncovering general principles of motor control that underlie any movement.


Bibliography references:

Abbs, J.H. and Gracco, V.L. (1983). Sensorimotor actions in the control of multi-movement speech gestures. Trends Neurosci. 6, 391–5.

Bernstein, N.A. (1967). The coordination and regulation of movements. Pergamon, London.

Burdet, E., Osu, R., Franklin, D.W., Milner, T.E., and Kawato, M. (2001). The central nervous system stabilizes unstable dynamics by learning optimal impedance. Nature 414, 446–9.

Fitts, P.M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. J. Exp. Psychol. 47, 381–91.

Harris, C.M. and Wolpert, D.M. (1998). Signal-dependent noise determines motor planning. Nature 394, 780–4.

Hogan, N. (1985). The mechanics of multi-joint posture and movement control. Biol. Cybern. 52, 315–51.

Houde, J.F. and Jordan, M.I. (1998). Sensorimotor adaptation in speech production. Science 279, 1213–16.

Jones, J.A. and Munhall, K.G. (2005). Remapping auditory-motor representations in voice production. Curr. Biol. 15, 1768–72.

Nasir, S.M. and Ostry, D.J. (2006). Somatosensory precision in speech production. Curr. Biol. 16, 1918–23.

Nasir, S.M. and Ostry, D.J. (2008). Speech motor learning in profoundly deaf adults. Nat. Neurosci. 11, 1217–22.

Purcell, D.W. and Munhall, K.G. (2006). Adaptive control of vowel formant frequency: Evidence from real-time formant manipulation. J. Acoust. Soc. Am. 120, 966–77.

Shadmehr, R. and Mussa-Ivaldi, F.A. (1994). Adaptive representation of dynamics during learning of a motor task. J. Neurosci. 14, 3208–24.

Shiller, D.M., Ostry, D.J., and Laboissiere, R. (2002). The relationship between jaw stiffness and kinematic variability in speech. J. Neurophysiol. 88, 2329–40.

Shiller, D.M., Houle, G., and Ostry, D.J. (2005). Voluntary control of human jaw stiffness. J. Neurophysiol. 94, 2207–17.

Tremblay, S., Shiller, D.M., and Ostry, D.J. (2003). Somatosensory basis of speech production. Nature 423, 866–9.