Psychophysiology, 38 (2001), Cambridge University Press. Printed in the USA. Copyright 2001 Society for Psychophysiological Research

PRESIDENTIAL ADDRESS, 1999

The perception of speech sounds by the human brain as reflected by the mismatch negativity (MMN) and its magnetic equivalent (MMNm)

RISTO NÄÄTÄNEN
Cognitive Brain Research Unit (CBRU), Department of Psychology, University of Helsinki, Finland
BioMag Laboratory, Helsinki University Central Hospital, Helsinki, Finland

Abstract

The present article outlines the contribution of the mismatch negativity (MMN), and of its magnetic equivalent MMNm, to our understanding of the perception of speech sounds in the human brain. MMN data indicate that each sound, both speech and nonspeech, develops its neural representation, corresponding to the percept of this sound, in the neurophysiological substrate of auditory sensory memory. The accuracy of this representation, which determines the accuracy of the discrimination between different sounds, can be probed with the MMN separately for any auditory feature (e.g., frequency or duration) or stimulus type such as phonemes. Furthermore, MMN data show that the perception of phonemes, and probably also of larger linguistic units (syllables and words), is based on language-specific phonetic traces developed in the posterior part of the left-hemisphere auditory cortex. These traces serve as recognition models for the corresponding speech sounds in listening to speech. MMN studies further suggest that these language-specific traces for the mother tongue develop during the first few months of life. Moreover, the MMN can also index the development of such traces for a foreign language learned later in life. MMN data have also revealed the existence of neuronal populations in the human brain that can encode acoustic invariances specific to each speech sound, which could explain correct speech perception irrespective of the acoustic variation between different speakers and word contexts.

Descriptors: Mismatch negativity (MMN), Magnetic MMN (MMNm), Event-related potentials (ERP), Speech sounds, Speech perception, Phonemes, Categorical processing, Central auditory processing, Central sound representation

I thank Kimmo Alho, Judy Ford, Teija Kujala, Walter Ritter, Mari Tervaniemi, and Istvan Winkler for their very helpful comments on the previous version of this manuscript. Address reprint requests to: Risto Näätänen, P.O. Box 13 (Meritullinkatu 1), University of Helsinki, Finland.

MMN as an Index of the Central Sound Representation (CSR)

The mismatch negativity (MMN) (Figure 1) is a frontocentrally negative component of the auditory event-related potential (ERP), usually peaking at about 150-250 ms from stimulus onset, that is elicited by any discriminable change in some repetitive aspect of the ongoing auditory stimulation, irrespective of the direction of the subject's attention or task (Näätänen, Gaillard, & Mäntysalo, 1978; for reviews, see Kraus & Cheour, 2000; Kraus, McGee, Carrell, & Sharma, 1995a; Näätänen, 1990, 1995; Näätänen & Alho, 1995, 1997; Picton, Alain, Otten, Ritter, & Achim, 2000). The fact that the MMN (and its magnetic equivalent MMNm) can be elicited even in the absence of attention makes it a unique measure of auditory discrimination accuracy, with no comparable measure provided by even the more recent brain-imaging technologies, such as positron emission tomography (PET) or functional magnetic resonance imaging (fMRI).
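As a concrete illustration of the kind of passive oddball stimulation used throughout the studies reviewed here, the following Python sketch generates a sequence of frequent "standard" tones with occasional "deviants". The 1000-Hz standard matches Figure 1 below and the 10% deviant probability matches the Figure 5 experiment, but the deviant frequency, sequence length, and the no-consecutive-deviants constraint are illustrative assumptions rather than parameters of any particular study.

import random

def oddball_sequence(n_trials=600, deviant_prob=0.1,
                     standard_hz=1000.0, deviant_hz=1032.0, seed=0):
    # Generate a pseudo-random oddball sequence of tone frequencies.
    # The deviant frequency and the rule forbidding two deviants in a row
    # are illustrative design choices, not taken from the original papers.
    rng = random.Random(seed)
    seq = [standard_hz]  # begin with a standard so a memory trace can form
    for _ in range(n_trials - 1):
        if rng.random() < deviant_prob and seq[-1] != deviant_hz:
            seq.append(deviant_hz)   # rare deviant tone
        else:
            seq.append(standard_hz)  # frequent standard tone
    return seq

if __name__ == "__main__":
    seq = oddball_sequence()
    print("proportion of deviants:", seq.count(1032.0) / len(seq))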
Although some studies (Näätänen, Paavilainen, Tiitinen, Jiang, & Alho, 1993a; Woldorff, Hackley, & Hillyard, 1991; Woldorff, Hillyard, Gallen, Hampson, & Bloom, 1993) showed that the MMN amplitude can be modulated by strongly focused attention in dichotic selective-listening conditions, no data suggest that the withdrawal of attention can totally eliminate an MMN that would otherwise be elicited (for a review, see Näätänen, 1991). As an account of this data pattern, Ritter, Deacon, Gomes, Javitt, and Vaughan (1995) proposed that the MMN generator per se is fully automatic, the MMN-amplitude attenuation associated with the withdrawal of attention being caused by reduced afferent input to the MMN generator mechanism. Perhaps the most convincing evidence for the automaticity of the MMN generator is provided by MMNs recorded in coma patients (where the MMN strongly predicts the return of consciousness within a week; Kane, Butler, & Simpson, 2000; Kane, Curry, Butler, & Cummins, 1993; Kane et al., 1996; see also Fischer et al., 1999; Fischer, Morlet, & Giard, 2000; Morlet, Bouchet, & Fischer, 2000), in sleeping subjects during stage-2 and REM sleep (Campbell, Bell, & Bastien, 1991; Sallinen, Kaartinen, & Lyytinen, 1994, 1996), and in anesthetized cats (Csépe, Karmos, & Molnár, 1989), guinea pigs (Kraus et al., 1994a; Kraus, McGee, Littman, Nicol, & King, 1994b), and rats (Ruusuvirta, Penttonen, & Korhonen, 1998). In these conditions, however, the MMN amplitude is lower than normal, which is consistent with data (Lang et al., 1995; May, Tiitinen, Sinkkonen, & Näätänen, 1994) suggesting a reduction of the MMN amplitude with decreased vigilance and increased drowsiness.

Figure 1. Left: Frontal (Fz) event-related potentials (ERPs; averaged across subjects) to the 1000-Hz standard (thin line) and to deviant (thick line) stimuli of different frequencies, as indicated on the left side. Right: The difference waves obtained by subtracting the standard-stimulus ERP from that to the deviant stimulus, separately for the different deviant stimuli. MMN = mismatch negativity. Adapted from Sams et al. (1985). Copyright 1985 Elsevier Science Publishers BV (Biomedical Division).

As already mentioned, any discriminable auditory change elicits an MMN. The MMN is thus elicited, for example, by a change in a simple sound such as a sinusoidal tone (Näätänen et al., 1978; Sams, Paavilainen, Alho, & Näätänen, 1985; see Figure 1), or in a complex sound such as a phoneme (Aaltonen, Niemi, Nyrke, & Tuhkanen, 1987) or a complex spectrotemporal pattern (Näätänen et al., 1993b; Schröger et al., 1994). Importantly, the repetitive (standard) stimulus element does not have to be acoustically constant for an MMN to be elicited, as long as some pattern or rule is shared by the standards. An MMN is then elicited by stimuli violating this pattern or rule (Paavilainen, Jaramillo, Näätänen, & Winkler, 1999; Paavilainen, Saarinen, Tervaniemi, & Näätänen, 1995; Paavilainen, Simola, Jaramillo, Näätänen, & Winkler, in press; Saarinen, Paavilainen, Schröger, Tervaniemi, & Näätänen, 1992; Tervaniemi, Rytkönen, Schröger, Ilmoniemi, & Näätänen, in preparation). Furthermore, MMN elicitation tolerates some range of standard-stimulus variation (Gomes, Ritter, & Vaughan, 1995; Huotilainen et al., 1993; Winkler et al., 1990).
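The MMN itself is quantified from the difference wave described in the Figure 1 caption: the ERP averaged over standard trials is subtracted from the ERP averaged over deviant trials. Below is a minimal Python sketch of that subtraction, assuming single-trial epochs from one electrode (e.g., Fz) have already been segmented, filtered, and baseline-corrected into NumPy arrays of shape (trials, samples); the function name and the toy data are placeholders, not material from the original studies.

import numpy as np

def mmn_difference_wave(standard_epochs, deviant_epochs):
    # standard_epochs, deviant_epochs: arrays of shape (n_trials, n_samples)
    # holding baseline-corrected single-trial epochs from one electrode.
    # Averaging over trials gives the standard and deviant ERPs; the MMN
    # appears as a frontocentral negativity in their difference.
    standard_erp = standard_epochs.mean(axis=0)
    deviant_erp = deviant_epochs.mean(axis=0)
    return deviant_erp - standard_erp

# Toy usage with made-up noise epochs (500 samples per epoch), not real data:
rng = np.random.default_rng(0)
standards = rng.normal(0.0, 1.0, size=(400, 500))
deviants = rng.normal(0.0, 1.0, size=(50, 500))
difference = mmn_difference_wave(standards, deviants)
peak_sample = int(np.argmin(difference))  # index of the most negative point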
MMN elicitation is based on the presence of a short-term (sensory-memory) trace, formed in the auditory cortex, representing the repetitive aspect or element of the stimulation; this element usually has to be repeated at least once (with a short enough interval) before a deviant event can elicit an MMN (Cowan, Winkler, Teder, & Näätänen, 1993; Näätänen, 1984; Winkler, Cowan, Csépe, Czigler, & Näätänen, 1996). The trace underlying MMN elicitation usually fades within 5-10 s (Näätänen, Paavilainen, Alho, Reinikainen, & Sams, 1987; Sams, Hari, Rif, & Knuutila, 1993); thereafter no MMN can be elicited, irrespective of how wide the stimulus deviation is. (For long-term auditory traces probed with the MMN, see below.) The results of several studies (e.g., Winkler & Näätänen, 1992; Winkler, Reinikainen, & Näätänen, 1993) using the MMN in the backward-masking paradigm also suggested that the MMN reflects echoic memory (the traces underlying this memory being probed by presenting deviant stimuli). Reviewing a large number of converging MMN studies, however, Näätänen and Winkler (1999) extended this notion to involve the central sound representation (CSR) of the brain, thus linking sound perception and sensory memory tightly together (see also Kraus & Cheour, 2000; Kraus et al., 1995a; Näätänen, 1995). They suggested that sound perception occurs during the phase of the fast emergence of the CSR, on its rising slope until completion, which is then followed by a latent, slowly decaying phase underlying sensory (echoic) memory (Figure 2). MMN (Winkler & Näätänen, 1992; Winkler et al., 1993; Yabe, Tervaniemi, Reinikainen, & Näätänen, 1997; Yabe et al., 1998) and behavioral (Foyle & Watson, 1984; Hawkins & Presson, 1986; Massaro, Cohen, & Idson, 1976; Scharf & Houtsma, 1986; Winkler et al., 1993) data suggest that the CSR, encoded as a memory trace (or as a set of interlinked memory traces), is ready by 200 ms from stimulus onset. These 200 ms are needed for the afferent processes and for the subsequent temporal and feature integration of their outcomes. The duration of the temporal window of integration can also be estimated on the basis of MMN data (Winkler & Näätänen, 1992; Winkler et al., 1993; Yabe et al., 1997, 1998). According to Näätänen and Winkler (1999), these integration processes are essential in the formation of a unitary sound percept.

Figure 2. A schematic illustration of the emergence and decay of the central sound representation (CSR) elicited by a brief sound. First, the different attributes of the sound are rapidly mapped onto the respective separate feature analyzers (feature analysis), whose outputs are subsequently mapped onto the neurophysiological mechanisms of sensory memory so that the basis for unitary sound perception emerges (feature integration). Temporal integration (such as loudness summation or masking) also occurs here. In this phase, the emerging sound representation is provided with the time dimension, sounds being represented as events in time rather than merely as their static features. The abrupt emergence of the CSR provides the specific information content for the sound percept (which may be conscious or preconscious, depending on whether the input also succeeds in exceeding the attention-trigger threshold controlled by separate mechanisms; see Näätänen, 1992, p. 383). The subsequent slowly decaying phase represents the (echoic) sensory-memory trace of the sound. Adapted from Näätänen and Winkler (1999). Copyright 1999 by the American Psychological Association.
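The time course sketched in Figure 2 can be caricatured with a toy function in Python: a rapid build-up of the representation over roughly the first 200 ms mentioned above, followed by a slow decay within the 5-10 s over which the MMN-related trace is reported to fade. The linear rise, the exponential decay, and the particular time constant are arbitrary illustrative assumptions for this sketch, not a model proposed by Näätänen and Winkler (1999).

import math

def trace_strength(t_ms, rise_ms=200.0, decay_s=7.5):
    # Toy caricature of the Figure 2 time course: a linear build-up over the
    # first ~200 ms (the completion time of the CSR cited above) followed by
    # an exponential decay whose time constant (here 7.5 s) is chosen only to
    # fall inside the 5-10 s fading range reported for the MMN-related trace.
    if t_ms < 0:
        return 0.0
    if t_ms < rise_ms:
        return t_ms / rise_ms                                 # perception phase
    return math.exp(-(t_ms - rise_ms) / (decay_s * 1000.0))   # echoic-memory phase

# With these toy parameters the trace has decayed to roughly 27% of its peak
# 10 s after sound onset, i.e., well into the range where no MMN is expected:
print(round(trace_strength(10_000.0), 2))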
That the sensory information carried by the sensory-memory traces underlying MMN generation indeed corresponds to sound perception (and thus provides the CSR), rather than just to the acoustic elements composing the stimulus, was demonstrated by Winkler et al. (1995), who found an MMN to a change of the missing fundamental, both the standards and the deviants being composed of varying combinations of the same simple tonal elements. Moreover, Gomes, Bernstein, Ritter, Vaughan, and Miller (1997; see also Takegata, Paavilainen, Näätänen, & Winkler, 1999) found a conjunction MMN, one elicited by a deviant stimulus (p = .10) that had the frequency of one standard stimulus and the intensity of one of the two other standard stimuli used in the same stimulus block (p of each = .30). Thus, according to Näätänen and Winkler (1999), the CSR, corresponding to the information content of sound perception and sensory memory, emerges at the stage when the outputs of the central afferent processes are integrated and mapped onto the neurophysiological substrate of sensory memory. These authors further maintained that only then are the temporal aspects of stimulation fully represented; that is, from this point on in the sound-processing stream, sound is represented as an auditory event in time rather than as a set of fragmentary stimulus features without temporal coordinates. Furthermore, this temporal organization of the stimulus representation is maintained as time passes; for example, it is possible to estimate the duration, and to reconstruct the order, of unattended sounds that occurred a moment ago.

The CSR should not be understood as being carried by any neuroanatomically unitary aftereffect or consequence of sound stimulation, however, as each sound corresponds to a set of levels in different acoustic dimensions, such as frequency and intensity, which are encoded by at least partially different neuronal populations of the central auditory system. This conclusion was implied by the results of Paavilainen, Alho, Reinikainen, Sams, and Näätänen (1991), who found that the polarity-reversal ratios (when the nose was used as a reference) for frequency, intensity, and duration MMNs differed, suggesting at least partially different generators in the auditory cortex for these three types of change. Corroborating results were obtained by Giard et al. (1995), who modeled these three generators and found them to differ from each other in locus and/or orientation. Corroborating data for other auditory dimensions were provided by numerous other studies (Alho et al., 1996; Levänen, Ahonen, Hari, McEvoy, & Sams, 1996; Levänen, Hari, McEvoy, & Sams, 1993; Schröger, 1995; Tervaniemi et al., 1999a). Consequently, the MMN might provide a means of mapping the representation of any sound stimulus in the auditory cortex as a set of stimulus aftereffects distributed in the neural basis of sensory memory. This would confront us with the binding problem, however, as we perceive unitary sounds rather than just auditory features. Fortunately, some of the studies reviewed above (Gomes et al., 1997; Takegata et al., 1999; see also Ritter et al., 1995) also found a conjunction MMN, one elicited when the frequently occurring levels of different sound features are combined in an infrequent way. These results suggest that, in addition to the single-feature representations, the neurophysiological substrate of the CSR also includes the connecting elements necessary for a unitary percept. This binding function probably has its substrate in higher-order neurons encoding the combinations of the parallel activations of the simple, first-order neurons for the different simple stimulus features. A sound is not represented only by these higher-order, complex neurons, however, as there are parallel MMNs to violations of a single feature and of a feature combination (Takegata et al., 1999).
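The logic of the conjunction MMN described above can be made concrete with a small counting sketch in Python: in a block modeled loosely on the Gomes et al. (1997) design, three standards each occur with p = .30 and a deviant with p = .10 recombines the frequency of one standard with the intensity of another, so every single feature value of the deviant is frequent while the feature combination is rare. The feature labels and trial counts below are invented for illustration.

from collections import Counter

# Trial counts modeled loosely on the design described above: three standards
# with p = .30 each and a conjunction deviant with p = .10 that pairs the
# frequency of standard A with the intensity of standard B.
trials = (
    [("freq_A", "int_A")] * 30    # standard A
    + [("freq_B", "int_B")] * 30  # standard B
    + [("freq_C", "int_C")] * 30  # standard C
    + [("freq_A", "int_B")] * 10  # conjunction deviant
)

n = len(trials)
freq_counts = Counter(f for f, _ in trials)
int_counts = Counter(i for _, i in trials)
pair_counts = Counter(trials)

deviant = ("freq_A", "int_B")
print("p(frequency of deviant) =", freq_counts[deviant[0]] / n)  # 0.4
print("p(intensity of deviant) =", int_counts[deviant[1]] / n)   # 0.4
print("p(feature combination)  =", pair_counts[deviant] / n)     # 0.1
# Each single feature value of the deviant is frequent, yet the combination
# is rare; only a representation that binds the features registers the change.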
Table 1 summarizes the central properties of the proposed CSR.

Table 1. Properties of the Central Sound Representation (CSR)
- Underlies, and corresponds to, perception, providing its sensory information content (in feature- and temporally integrated form)
- Sound enters long-term memory in the form of the CSR (if at all)
- Represents sound as an auditory event in time rather than as a static set of certain levels of different stimulus features
- Formed when the outputs of central afferent processing enter sensory memory, this transient phase corresponding to perception and the subsequent slowly decaying phase to sensory (echoic) memory
- Mental operations are possible with it (e.g., imaging, rehearsal)
- Determines auditory performance, that is, the accuracy of (a) perception, (b) sensory memory, (c) recognition, and (d) discrimination

Figure 3. An illustration of the concept of the representational width (Rw) along a sensory dimension, here tone frequency. Rw is the range around the standard stimulus for deviant-stimulus levels that are too near to the standard-stimulus level to elicit a mismatch negativity (MMN) or to cause behavioral change discrimination. Outside this range, an MMN is elicited at an amplitude that is larger for wider deviations in either direction, and behavioral change detection usually occurs. From Näätänen and Alho (1997). Copyright by S. Karger AG, Basel.

MMN as an Index of Sound-Discrimination Accuracy

As already mentioned, the MMN, being elicited by any discriminable auditory change, provides an objective measure of discrimination accuracy separately for practically any dimension of auditory stimulation. Ultimately, individual discrimination accuracy must depend on the informational sharpness of the CSR on the auditory dimension involved. Näätänen and Alho (1997) proposed the concept of the representational width (Rw) of a sound; this measure is illustrated for auditory frequency in Figure 3. The narrower the Rw, the more discriminable a different stimulus is from the stimulus represented by this Rw, that is, the better the system's resolution. This better resolution is reflected in a lower threshold of MMN elicitation and in a larger and earlier MMN elicited by suprathreshold changes.

The first study showing that the MMN amplitude indeed provides an index of behavioral discrimination accuracy was reported by Lang et al. (1990). They found that the accuracy of the behavioral discrimination of the frequency difference between two successively presented tone stimuli correlated strongly with the MMN amplitude for minor frequency changes (recorded in a separate session in which the subjects were reading). These results are illustrated in Figure 4.
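The brain-behavior relationship reported by Lang et al. (1990) is, at its core, an across-subject correlation between MMN amplitude and discrimination performance. The Python sketch below computes such a Pearson correlation; the subject values are entirely fabricated placeholders, and the negative sign of r simply reflects that a larger MMN is a more negative voltage.

import numpy as np

def mmn_behavior_correlation(mmn_amplitudes_uv, hit_rates):
    # Pearson correlation between per-subject MMN amplitude (in microvolts,
    # negative values) and behavioral discrimination accuracy. A strong
    # brain-behavior relationship appears as a negative r, because larger
    # MMNs are more negative voltages.
    mmn = np.asarray(mmn_amplitudes_uv, dtype=float)
    acc = np.asarray(hit_rates, dtype=float)
    return float(np.corrcoef(mmn, acc)[0, 1])

# Entirely made-up values for eight hypothetical subjects, shown only to
# illustrate the call; these are not data from Lang et al. (1990):
mmn_uv = [-0.6, -1.0, -1.9, -2.3, -2.8, -3.6, -4.1, -4.9]
hits = [0.55, 0.53, 0.64, 0.72, 0.70, 0.83, 0.87, 0.94]
print("r = %.2f" % mmn_behavior_correlation(mmn_uv, hits))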
Corroborating data were obtained in numerous subsequent studies. For example, Näätänen et al. (1993b) found that subjects who were able to behaviorally discriminate a slight deviation in a spectrotemporal stimulus pattern showed an MMN to this deviant stimulus in a subsequent passive condition, whereas no MMN was elicited in subjects who were not able to perform this discrimination. However, after the subjects who were initially unable to perform the discrimination learned the task during the subsequent behavioral discrimination blocks of the same session, the deviant pattern did elicit an MMN in them (Figure 5).

Figure 4. Grand-average event-related potentials (ERPs) for subgroups of subjects who were good, moderate, or poor discriminators (in a separate behavioral pitch-discrimination session). ERPs to standard tones of 698 Hz (dashed lines) and to infrequent deviant tones (solid lines), which were 12, 19, 25, 53, or 99 Hz higher in frequency than the standard tones, are overlaid. Note the between-group differences in the mismatch-negativity (MMN; shaded areas) amplitudes for the different frequency deviations. Adapted from Lang et al. (1990). Copyright 1990 Tilburg University Press.

Figure 5. Grand-average event-related potentials (ERPs) of seven subjects at the central (Cz) electrode location to standard sound patterns (thin line) and to deviant sound patterns (thick line) occurring among the standard patterns with a probability of .1. The standard and deviant sound patterns, illustrated at the bottom, consisted of eight sinusoidal segments of different frequencies, their only difference being that in the deviant patterns the frequency of the sixth segment (indicated by the arrow) was higher than in the standard patterns. ERPs were recorded during three (early, middle, and late) reading conditions, each succeeded by a discrimination condition in which the subject's task was to press a button to each deviant stimulus. The performance of the subjects belonging to this group improved considerably during the session in this discrimination task. The mismatch negativity (MMN), which in these subjects first emerged and then increased in amplitude during the session, is indicated by the dotted area. Adapted from Näätänen et al. (1993b). Copyright 1993 by Lippincott Williams & Wilkins.

An analogous result with phonetic stimuli was obtained by Winkler et al. (1999a). They found that Hungarians who did not know any Finnish could not discriminate the Finnish /e/ and /ä/ vowels and showed no MMN to this contrast, whereas Hungarians who had learned Finnish could discriminate these two vowels and showed a distinct MMN to this contrast. An analogous discrimination-training effect with phonetic stimuli was obtained by Kraus, McGee, Carrell, King, and Tremblay (1995b). These authors used different within-category var…