Risto J. Ilmoniemi et al.
The human central auditory system has a remarkable ability to establish memory traces for invariant features in the acoustic environment despite continual acoustic variations in the sounds heard. By recording the memory-related mismatch negativity (MMN) component of the auditory electric and magnetic brain responses as well as behavioral performance, we investigated how subjects learn to discriminate changes in a melodic pattern presented at several frequency levels. In addition, we explored whether musical expertise facilitates this learning. Our data show that, in particular, musicians who perform music primarily without a score readily learn to detect contour changes in a melodic pattern presented at variable frequency levels. After learning, their auditory cortex detects these changes even when their attention is directed away from the sounds. The present results thus show that, after perceptual learning during attentive listening has taken place, changes in a highly complex auditory pattern can be detected automatically by the human auditory cortex and, further, that this process is facilitated by musical expertise.
For correct interpretation of natural acoustic input such as speech and music, it is of critical importance that the central auditory system is able to extract invariant features from the continually varying sounds. Spoken, played, and sung phrases are recognized even though they are presented by a great variety of speakers, instruments, or singers in different acoustic environments. Moreover, even when no conscious attention is paid to the surrounding sounds, changes in their regularity can cause the listener to redirect his or her attention toward the sounds.
During the past two decades, event-related potential (ERP) recordings have brought new insight into the neuronal events behind auditory change detection. Components P300, N400, and P600 (named after their positive or negative polarity at the vertex and their peak latency after sound onset) are elicited when the subject attends to the sounds. These components reflect the conscious detection of a physical, semantic, or syntactic deviation from the expected sound (for review, see Rugg and Coles 1995). However, in group-comparison designs, intrinsic group differences in motivational, attentional, and/or vigilance factors might contaminate the ERP recordings.
In addition, ERP recordings allow one to probe the neural processes that precede the involvement of the attentional mechanisms. In such studies, the subject is asked to concentrate on a task unrelated to the sounds heard. These studies have revealed that automatically formed cortical memory traces for the recent acoustic input represent basic sound features such as tone frequency and the formant structure of speech sounds (for reviews, see Näätänen 1992, 2001). Moreover, ERPs have been recorded that reflect memory traces representing sounds composed of several simultaneous or successive tonal elements (Schröger et al. 1992; Alain et al. 1994; Alho et al. 1996).
The results described above were obtained with the mismatch negativity (MMN) paradigm, in which an infrequently presented sound (the “deviant”) among frequently occurring stimuli (the “standards”) elicits the MMN. Its presence implies that the invariant parameters of the standard sound were encoded neurally and found to differ from those of the deviant sound. The MMN can be recorded even when the subject is performing a task unrelated to the stimulation of interest, such as reading a book or playing a computer game. Thus, the MMN offers a direct measure of the similarity of the neural codes for different sounds, unaffected by differences in, for instance, the attentional or motivational involvement of the subject. Importantly, several studies have indicated that the MMN parameters correlate closely with the subject's behaviorally determined perceptual accuracy. For instance, the MMN amplitude and latency reflect discrimination accuracy as determined by musicality tests (Tervaniemi et al. 1997) and by hit-rate or reaction-time measurements (e.g., Tiitinen et al. 1994; Kraus et al. 1996; Tremblay et al. 1998).
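The logic of such an oddball sequence can be illustrated with a short sketch. The specific melody, the 10% deviant probability, the semitone transposition range, and the rule that deviants never occur back-to-back are illustrative assumptions for this example, not the exact stimulus parameters of the study:

```python
import random

# Illustrative MMN oddball sequence: a short melodic pattern ("standard")
# is presented repeatedly at random transpositions; occasionally a
# contour-changed version ("deviant") is inserted.
# All specific values below are assumptions for illustration only.
STANDARD = [0, 2, 4, 2, 0]   # melodic contour as semitone offsets
DEVIANT = [0, 2, 4, 2, 5]    # same opening, final tone changes the contour

def make_sequence(n_trials, p_deviant=0.1, seed=0):
    """Return a list of (melody, is_deviant) trials."""
    rng = random.Random(seed)
    trials = []
    prev_deviant = True      # assumed rule: never start with a deviant
    for _ in range(n_trials):
        # assumed rule: no two deviants in a row
        is_deviant = (not prev_deviant) and rng.random() < p_deviant
        pattern = DEVIANT if is_deviant else STANDARD
        # random frequency level: transpose the whole pattern in semitones
        transpose = rng.choice(range(-6, 7))
        trials.append(([tone + transpose for tone in pattern], is_deviant))
        prev_deviant = is_deviant
    return trials

seq = make_sequence(500)
```

Note that transposition shifts every tone by the same amount, so the interval pattern (the contour) is preserved; only the deviant's changed final interval distinguishes it from a standard, which is exactly the invariance the listener must extract.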
However, not all subjects benefited equally from the training during the experimental session. These inter-individual differences in the readiness to discriminate transposed melodies could be explained by differences in long-term musical expertise. While all Accurate subjects were professional musicians, the Inaccurate group comprised five musicians in addition to seven nonmusicians. Thus, none of the nonmusicians learned the discrimination task, whereas most of the musicians did.
It is noteworthy that the musicians in the “Accurate” and “Inaccurate” groups differed markedly in the type of their musical expertise. The musicians in the Inaccurate group had received most of their training in classical music, in which the musical score is used regularly during learning and occasionally during public performance, as in orchestral and chamber music, for instance. In contrast, six of the eight musicians in the Accurate group were engaged in musical genres in which musical information is transferred from one musician to another by playing and singing (e.g., pop and jazz), and their performances often include improvisation. In other words, in their musical communication they rely more on auditory than on visual information (i.e., the musical score). Thus, the subjects' readiness to process highly complex musical information, both attentively and pre-attentively, was influenced not merely by the presence or absence of musical expertise (musician vs. nonmusician) but also by the type of that expertise (Sloboda 1985; Deliège and Sloboda 1996).