WO1999034345A1

WO1999034345A1 - Method and apparatus for training auditory skills

Info

Publication number: WO1999034345A1
Application number: PCT/US1998/027849
Authority: WO
Inventors: Gal A. Cohen; Anton Krukowski; Charles Boatwright
Original assignee: Cohen Gal A; Anton Krukowski; Charles Boatwright
Priority date: 1997-12-30
Filing date: 1998-12-30
Publication date: 1999-07-08
Also published as: AU2098899A

Abstract

The present invention describes an ear training method and associated set of devices, designed to improve the ability to identify and match harmonic elements in a complicated auditory environment. The present invention seeks to improve the ability to discern absolute and relative pitch, to identify the pitch intervals between two or more notes, and to improve the memory of a previously heard pitch or interval that is masked by a distracting sound. Further uses of this invention include training the ability to sing or play an instrument in tune and/or in harmony with a concurrent musical background. This invention also has applicability towards general language skill training, including reducing spoken accents and teaching foreign languages or dialects.

Description

METHOD AND APPARATUS FOR TRAINING AUDITORY SKILLS

CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority to and incorporates by reference the U.S. Provisional Patent Application entitled MUSIC AND LANGUAGE SKILLS TRAINING TASKS AND ASSOCIATED DEVICES, Serial No. 60/068,978, filed December 30, 1997, and invented by Gal Cohen, Anton Krukowski, and Charles Boatwright.

BACKGROUND OF THE INVENTION

Field of the Invention

The present invention relates to a method and apparatus for training auditory skills. More particularly, the present invention relates to a method and apparatus for training music and language skills.

Description of Related Art

Training auditory skills in a music context includes training the ability to identify and match notes and intervals in a complex acoustic environment. This training is often performed using simple trial and error methods, where a note or interval is played and the student is asked to identify it. This technique does not use adaptive difficulty.

Training auditory skills in a language context includes training the ability to discriminate between the elements of speech which make up words. This training has been performed by stretching speech in the time domain, in other words slowing down the speech. However, sounds sound fundamentally different when they are synchronous than when they are separate. Many individuals have good timing discrimination but still have bad accents.

Training auditory skills in a language context also includes training the ability to reproduce words without an accent. This training is often performed using simple playback and repetition methods, where an unaccented word or phrase is played and the user attempts to repeat the word or phrase as just played. However, there is no feedback and no adaptive difficulty.

What is needed is a method and apparatus for training auditory skills with microtonal gradations in the frequency domain.

SUMMARY OF THE INVENTION

The present invention is directed towards a method for generating an auditory pattern for training auditory skills. This includes receiving an auditory pattern having at least one subpattern; receiving at least one response from a user, the response representing a user perception of the auditory pattern; and modifying a frequency of at least one element based on at least one response from the user.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGURES 1A-1C show examples of non-ideal and ideal training tasks and shortcomings of non-ideal training tasks.

FIGURE 1C shows one embodiment of the present method for an ideal training task.

FIGURES 2A-2B show one embodiment of a method for adjusting task difficulty.

FIGURE 2C shows a flow process diagram of one embodiment for training auditory skills of the present invention.

FIGURE 3A shows one embodiment of an apparatus capable of generating auditory patterns for use with the method of the present invention.

FIGURES 3B-3D show a method for generating auditory patterns suitable for use with the method of the present invention.

FIGURE 3E shows one embodiment of an apparatus capable of generating auditory patterns for use with the method of the present invention.

FIGURE 4A shows one embodiment of a method of the present invention for generating complex auditory patterns from basic auditory patterns.

FIGURE 4B shows one embodiment of a method of the present invention for generating auditory patterns.

FIGURES 5A-5H show graphical representations of the auditory patterns described in the examples.

DETAILED DESCRIPTION OF THE INVENTION

The present invention describes an ear training method and associated set of devices, designed to improve the ability to identify and match harmonic elements in a complicated auditory environment. The present invention seeks to improve the ability to discern absolute and relative pitch, to identify the pitch intervals between two or more notes, and to improve the memory of a previously heard pitch or interval that is masked by a distracting sound. Further uses of this invention include training the ability to sing or play an instrument in tune and/or in harmony with a concurrent musical background. This invention also has applicability towards general language skill training, including reducing spoken accents and teaching foreign languages or dialects.

The ear training method of the present invention involves a series of tasks, in which a user is asked to categorize an auditory pattern or element thereof, or to categorize the difference between two auditory patterns. Task difficulty is decreased until the user can consistently perform the categorization, and then increased.

For some tasks where the user is asked to categorize a difference, the auditory patterns presented to the user are derived from an initial set of two auditory patterns. A continuum is defined based on the difference in frequency characteristics of the two patterns. Derived auditory patterns are generated each trial such that they lie on this continuum, and such that the difference between them is not less than a certain value. This minimum difference of frequency characteristics is increased until the user can consistently discriminate between the two derived patterns, and then decreased.

Other embodiments of the method of the present invention employed to modulate task difficulty include frequency shifting, frequency modulation, and amplitude modulation of elements of auditory patterns. Also, the number of elements per auditory category, and the number of categories as well as the difference between categories, may vary from trial to trial to increase or decrease task difficulty. This training methodology seeks to train sharper tuning of auditory classification by continuous adaptation of task difficulty to maintain optimal learning.

Music and language training have in common the underlying goal of teaching the user to differentiate between categories of sound patterns. A typical training method presents auditory patterns and asks the user to differentiate between them or to categorize them. Through enough repetition, it is hoped that the user will eventually improve their discrimination between different patterns. FIGURE 1A shows the "good" results of a typical approach, for the case where the user is being trained to discriminate between the notes of a musical scale. In this case, individual notes are repeated over and over again, until eventually the user can differentiate between them. However, this approach is non-optimal for several reasons.

Firstly, there are known auditory processing centers in the human brain which are involved in this type of auditory discrimination task. Within these processing centers, networks of neurons can be trained, or tuned, to recognize certain auditory patterns, and to reject other auditory patterns. These auditory patterns can be thought to lie on a continuum. In the case of the task shown in FIGURE 1A, the continuum is one-dimensional, and can be described by the underlying frequency of a given auditory pattern. The different neural networks which are involved in this recognition task will exhibit tuning to certain bandwidths of frequency. The ideal goal of this task should be to tune the receptive fields as narrowly as possible, and to center the receptive fields directly at the desired frequency, as shown in the second column. The more usual result, shown in the first column, is to differentiate the receptive fields to the point where they include the desired note, but where they are broad and ill-centered. An expected consequence of this non-ideal result would be that the user could still sing significantly off-key when trying to match a given note, because his singing would still fall into the correct receptive field, and would thus sound correct.

A second problem with standard training protocols is shown in FIGURE 1B. In this case, a very simple training task is being performed: the user is being trained to differentiate between a "C" and a "D". However, for this user, the current state of their receptive field tuning is such that a single broad receptive field includes both these notes. Thus, when "D" is played to the user, it sounds just like a "C", because the user cannot discriminate between these two sounds. In this case, it may be possible to repeat the task endlessly, and the user may still not learn it. Frequently, music and language learning is characterized by "plateaus," in which the user seems to be stuck.

The approach of the present invention to this problem, shown in the second column, would be to modify the discrimination task, such that the two patterns fall into two different receptive fields. In this case, the frequency of the "D" is increased (signified by "D+"), to the point where it lies outside the receptive field of the "C." Over time, the frequency of the "D" can be brought back to its normative frequency, as the receptive fields of the user are trained, and adapt and sharpen. This increase of the difference between two auditory patterns, and then the subsequent reduction of the difference, is one aspect of the present invention. Furthermore, the change in the difference between the two patterns can be made in very small gradations. The ability to create microtonal gradations, with respect to the frequency characteristics of the patterns, is an advantage of the present invention.
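By way of illustration only, the widening of the "D" to "D+" and its gradual return to the normative frequency can be sketched numerically. The sketch below assumes equal temperament with A4 = 440 Hz and expresses the shift in cents (hundredths of a semitone); these conventions, the particular offsets, and the names `shift_cents`, `cents_between`, and `schedule` are assumptions made for the example and do not appear in the original disclosure.

```python
import math

# Assumed reference values (equal temperament, A4 = 440 Hz); not from the source.
C4 = 261.63  # Hz, approximate
D4 = 293.66  # Hz, approximate


def shift_cents(freq_hz, cents):
    """Return freq_hz shifted by `cents` (100 cents = one semitone)."""
    return freq_hz * 2.0 ** (cents / 1200.0)


def cents_between(f_low, f_high):
    """Signed distance from f_low to f_high in cents."""
    return 1200.0 * math.log2(f_high / f_low)


# Start with an exaggerated "D+" and step back toward the normative D
# in microtonal (sub-semitone) gradations. The offsets are illustrative.
schedule = [shift_cents(D4, offset) for offset in (150, 100, 50, 25, 0)]
```

Here the first entry of `schedule` is a "D+" lying a further 150 cents above the normative D, and the last entry is the normative D itself.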

In the realm of language training, an approach that has met with some success is to "stretch" the timing of fast elements of speech in the time domain so that they can be parsed by the user. This approach is limited in two respects. Firstly, sounds sound fundamentally different when they are synchronous than when they are separate. As an example, many people who have good timing discrimination have bad accents. Thus, micro-tuning along a continuum in the frequency domain is required. A more ideal approach, which is utilized in the present invention, would be to adaptively create categories for discrimination through combining accented and unaccented speech. This approach would train the user to discriminate along a spectrum of sounds which includes those which might be encountered in normal experience. The present invention includes provisions to increase the differentiation between two auditory patterns by stretching the difference between the frequency, rather than the timing, characteristics of the two auditory patterns, until the user can correctly categorize the difference between the two patterns. The difference is then reduced.

An additional factor to consider in designing an optimal auditory training methodology is masking. Many people have the capability to sing along with a song, as long as only one note is heard at a time. However, once harmony (one or more notes concurrent to the original note) is introduced, the singer is not able to match the original note. The additional notes which are concurrent with the original note mask, or obscure, the original note. Thus, receptive fields may be tuned differently as a function of other concurrent sounds. Using the strategy of differentiating between two patterns until they fall into different receptive fields, and then narrowing the difference over time, we come to the requirement that categories of patterns need to be re-adapted in difficulty any time the complexity of the pattern changes.

What is needed is a means of simplifying complex elements of speech and harmony down to simpler elements, and then increasing complexity, as a function of the user's ability to identify and discriminate between these elements.

In addition, one can decrease the effect of one element of a pattern masking another element, by increasing the difference between the two elements on axes/continuums besides frequency shift. These axes include frequency modulation and amplitude modulation. This invention includes provisions for increasing the frequency modulation or amplitude modulation differences between elements of a pattern, until a user can differentiate between them, and then reducing this difference.

The present invention describes an ear training method and associated set of devices, designed to improve the ability to identify and match harmonic elements in a complicated auditory environment. This invention is designed to improve the ability to discern absolute and relative pitch, to identify the pitch intervals between a plurality of notes, and to improve the memory of a previously heard pitch or interval that is masked by a distracting sound. Further uses of this invention include training the ability to sing or play an instrument in tune and/or in harmony with a concurrent musical background. This invention also has applicability towards general language skill training, including reducing spoken accents and teaching foreign languages or dialects.

The ear training method described involves a series of tasks, in which a user is asked to categorize an auditory pattern or element thereof, or to categorize the difference between a plurality of auditory patterns or a plurality of pattern elements. Task difficulty is decreased until the user can consistently perform the categorization, and then increased. The gradations in task difficulty can be of any granularity. For some tasks where the user is asked to categorize a difference between a plurality of patterns or a plurality of characteristics of patterns, the auditory patterns presented to the user are derived from an initial set of two auditory patterns. A continuum is defined based on the difference in frequency characteristics of the two patterns. Derived auditory patterns are generated each trial such that they lie on this continuum, and such that the difference between them is not less than a certain value. This minimum difference of frequency characteristics is increased until the user can consistently discriminate between the two derived patterns, and then decreased.

Other embodiments employed in the present invention to modulate task difficulty include frequency shifting, frequency modulation, and amplitude modulation of elements of auditory patterns. Also, the number of elements per auditory category, and number of categories as well as difference between categories, can vary from trial to trial to increase or decrease task difficulty. This training methodology seeks to train sharper tuning of auditory classification by continuous adaptation of task difficulty to maintain optimal learning. Again, the approach is to increase the difference between a plurality of patterns or pattern characteristics until the user can consistently discriminate between them, and then to reduce the difference.

A hardware implementation, and a software emulation of the hardware implementation, are described. A series of tasks, which could be included as part of the device, are described below. Note that different versions of the basic device may be built by including different combinations of the tasks below, as well as other tasks.

FIGURE 1A shows a flow process diagram of one embodiment of a method for auditory training 100 of the present invention. The method includes: (1) receiving an auditory pattern having at least one element (block 110); (2) receiving at least one response from a user, the response representing a user perception of the auditory pattern (block 120); and (3) modifying a characteristic of at least one element based on at least one response from the user (block 130).

Receiving an auditory pattern having at least one element (block 110). Typically, this is an auditory pattern which has been presented to a user. An auditory pattern may be sampled, synthesized, or recorded, or a combination. An auditory pattern may be made up of other auditory patterns or subpatterns. Each subpattern includes at least one element. Elements include musical notes and voicings. Elements may also include components of speech, such as consonants, vowels, words, and phonemes.

Each element has at least one characteristic. Characteristics may include a frequency, an amplitude, a frequency modulation, a frequency interval, an amplitude modulation, an accented or non-normative pronunciation, a difference between two characteristics, a difference between two elements, and a difference between two auditory patterns.

Receiving at least one response from a user, the response representing a user perception of the auditory pattern (block 120). The response from the user may be an identification, comparison, or matching of the auditory pattern. The response may be an identification or matching of absolute or relative pitches and/or frequency intervals of the auditory pattern. However, the response is not limited to the frequency domain.

The response may be in the form of a delayed-hold, two-alternative forced choice, or multiple-alternative forced choice. The response may also be a pressed button or a vocal response. The response may include an identification of frequencies and intervals, an identification of absolute pitch (perfect pitch training), an identification of absolute interval, a comparison of two intervals, an identification of an isolated note with a note in an interval, and a comparison of a note with a note in an interval. The intervals may be concurrent or staggered, one note in an interval may be a stack, notes or elements of stacks may have differing envelopes, amplitude and/or frequency modulation, and notes in an interval may be of differing amplitude.

The response may include: sharp, flat, same, different, sharper than, flatter than; musical notes, including A, B, C, D, E, F, and G; musical intervals, including first, second, third, fourth, etc.; more, less, more accented than, less accented than; no input; the set of words, phonemes, or elements; a vocal response.

Modifying a frequency of at least one element based on at least one response from the user (block 130). Overall task difficulty may be modulated from trial to trial based on the performance of the user. One way to do this is to use a staircase procedure. In the staircase procedure, the task increases in difficulty if the user responds correctly n trials in a row. Traditionally, n is equal to 3, although this may vary. Also, the task gets easier if the user responds incorrectly m trials in a row. Traditionally, m is picked to equal 1. In this case, each time the user responds incorrectly to a trial, the task gets easier. The staircase procedure enforces that the task is in a regime where learning will take place, and dynamically adjusts the difficulty of the task to maintain it in the current optimal learning range.
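By way of illustration only, the staircase procedure described above can be sketched as follows. The sketch assumes that task difficulty is represented by a single difference value (larger differences being easier) and that the value changes by a fixed step; the class name `Staircase` and the parameters `step` and `min_epsilon` are hypothetical and do not appear in the original disclosure.

```python
class Staircase:
    """n-correct-harder / m-wrong-easier staircase over a difference value."""

    def __init__(self, epsilon, step, n_up=3, m_down=1, min_epsilon=0.0):
        self.epsilon = epsilon          # current difference; larger = easier
        self.step = step                # assumed fixed step size
        self.n_up = n_up                # correct answers in a row before harder
        self.m_down = m_down            # wrong answers in a row before easier
        self.min_epsilon = min_epsilon  # assumed floor on the difference
        self._correct_run = 0
        self._wrong_run = 0

    def update(self, correct):
        """Record one trial outcome and return the new difference value."""
        if correct:
            self._correct_run += 1
            self._wrong_run = 0
            if self._correct_run >= self.n_up:
                # Harder task: shrink the difference.
                self.epsilon = max(self.min_epsilon, self.epsilon - self.step)
                self._correct_run = 0
        else:
            self._wrong_run += 1
            self._correct_run = 0
            if self._wrong_run >= self.m_down:
                # Easier task: widen the difference.
                self.epsilon += self.step
                self._wrong_run = 0
        return self.epsilon
```

With the default n = 3 and m = 1, three consecutive correct responses reduce the difference by one step, and any single incorrect response increases it by one step.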

The trials may be structured using strategies in which the user must make at least one choice in each trial. One such procedure is the multi-alternative forced-choice procedure, in which the user chooses between several possible defined answers. Another such procedure is the delayed-hold procedure, in which the user must keep pressing a button until the stimulus changes. Note that the minimum difference between patterns, e, may be changed from trial to trial in steps which can be microtonal, not limited to the half steps which characterize Western musical scales. If the response is wrong five times in a row, then the task may be temporarily halted, and the user given remedial instructions on how to perform the task.

The degree of pitch or interval discrimination which is required to complete a task correctly varies from trial to trial. For example, music training may be provided by teaching identification of individual notes as well as intervals between notes.

However, the auditory patterns are not limited to the musical scale, and microtonal differences may also be presented, as when a note is played sharp or flat. The method of the present invention may be used with pitch differences of any size.

In addition, other parameters such as amplitude, AM and FM envelopes, and complexity of tone, may also be varied from trial to trial.

Typically, the modified auditory pattern is presented to the user again and the process repeated.

FIGURES 1A-1C show examples of non-ideal and ideal training tasks and shortcomings of non-ideal training tasks. FIGURE 1A shows typical music training tasks where the goal is to identify musical notes on a scale. While the notes lie on a frequency continuum, the non-ideal task does not seek to train sharp differentiation on the frequency scale. Rather, broad categorization on the frequency scale is the goal. FIGURE 1B shows an example of a non-ideal implementation, where the user is asked to differentiate between the notes C and D. These notes are repeated; however, since the user cannot differentiate between them, no progress is made in this task. FIGURE 1C shows the present invention applied to the problem shown in FIGURE 1B. When the user is presented with two patterns which he cannot tell apart, the difference is increased until the user can differentiate between the patterns. This difference can be reduced over time.

FIGURE 2A shows one example of a method for changing task difficulty. This is also referred to as a staircase. If the last task was wrong, task difficulty is decreased. If a plurality of tasks were correct, task difficulty is increased. Otherwise, task difficulty is maintained. This change in task difficulty leads to a change in allowable ranges for characteristics, elements, and patterns of the auditory patterns in the next task.

FIGURE 2B shows another embodiment of a method for changing task difficulty. If the task is too easy, based on a staircase, the method may increase or not change the number of patterns, characteristics, or elements of the auditory patterns. In addition, differences between the patterns may be narrowed or not changed. If the task is at the correct level of difficulty, there is no change in the number of patterns, characteristics, and elements, and no change in the difference between two patterns presented from task to task. If the task was too difficult, there may be a decrease or no change in the number of patterns, characteristics, and elements presented in the next trial, and there may be an increase or no change in the difference or differences between the patterns.

FIGURE 2C shows a flow process diagram of one embodiment for training auditory skills of the present invention. An auditory pattern is presented to the user. The user returns an input which represents a categorization of a pattern, an element, and/or a characteristic, or a difference between two or more patterns, elements, or characteristics. The correctness of the user response is analyzed and task difficulty is adjusted, for instance, with a staircase. Parameters for generating auditory patterns in the next task are set as a function of whether task difficulty needs to be adjusted. Based on the parameters, the next auditory pattern is generated and presented to the user.
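By way of illustration only, the decision rule of FIGURE 2A, the parameter update of FIGURE 2B, and the loop of FIGURE 2C can be sketched together as follows. The parameter dictionary with keys `difference` and `num_elements`, the step size, the threshold of three correct answers, the clearing of the history after a difficulty increase, and the `respond` callback standing in for the user are all assumptions made for the example.

```python
def adjust_difficulty(history, n_correct=3):
    """FIGURE 2A-style rule: return 'easier', 'harder', or 'same'."""
    if history and not history[-1]:
        return "easier"                      # last task was wrong
    if len(history) >= n_correct and all(history[-n_correct:]):
        return "harder"                      # a plurality of tasks were correct
    return "same"


def next_parameters(params, decision, step=5.0):
    """FIGURE 2B-style update of a hypothetical parameter dictionary."""
    new = dict(params)
    if decision == "harder":
        new["difference"] = max(0.0, params["difference"] - step)
        new["num_elements"] = params["num_elements"] + 1
    elif decision == "easier":
        new["difference"] = params["difference"] + step
        new["num_elements"] = max(1, params["num_elements"] - 1)
    return new


def run_session(params, respond, trials):
    """FIGURE 2C-style loop: present, score, adjust, regenerate."""
    history = []
    for _ in range(trials):
        correct = respond(params)            # present pattern, score the response
        history.append(correct)
        decision = adjust_difficulty(history)
        if decision == "harder":
            history = []                     # assumed: restart the correct-run count
        params = next_parameters(params, decision)
    return params
```

In an actual device, `respond` would generate and play the auditory pattern from `params`, collect the user input, and return whether the categorization was correct.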

FIGURE 3A shows one embodiment of an apparatus capable of generating auditory patterns for use with the method of the present invention. A sinusoidal tone generator can present one or more sinusoids whose spectral content is specified. Various frequency bands can be shifted independently relative to one another or modulated by a frequency modulation. The amplitude of these bands can then be modulated. Other filters are also possible. This tone may be combined with a sampled or stored complex tone which is also described spectrally, modulated in frequency and amplitude, and possibly by other filters. These are then combined to form an element of music or speech. Elements can also be combined to form an auditory pattern.
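By way of illustration only, a software emulation of a single sinusoidal band with optional frequency modulation and amplitude modulation, and the combination of bands into an element, can be sketched as follows. The function names, the sample rate, and the sinusoidal shape of the modulators are assumptions made for the example; the original disclosure does not specify them.

```python
import math


def synth_tone(freq_hz, duration_s, sample_rate=8000,
               fm_depth_hz=0.0, fm_rate_hz=0.0,
               am_depth=0.0, am_rate_hz=0.0):
    """Return a list of samples for one sinusoid with optional FM and AM."""
    samples = []
    phase = 0.0
    n = int(duration_s * sample_rate)
    for i in range(n):
        t = i / sample_rate
        # Instantaneous frequency wobbles around freq_hz by up to fm_depth_hz.
        inst_freq = freq_hz + fm_depth_hz * math.sin(2 * math.pi * fm_rate_hz * t)
        phase += 2 * math.pi * inst_freq / sample_rate
        # Amplitude envelope stays within [1 - am_depth, 1].
        amp = 1.0 - am_depth * 0.5 * (1.0 + math.sin(2 * math.pi * am_rate_hz * t))
        samples.append(amp * math.sin(phase))
    return samples


def mix(*tones):
    """Sum equal-length tones sample by sample to form one element."""
    return [sum(vals) for vals in zip(*tones)]
```

For example, `mix(synth_tone(440.0, 0.5), synth_tone(660.0, 0.5, fm_depth_hz=5.0, fm_rate_hz=6.0))` combines a steady band with a second band carrying a slight vibrato.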

FIGURE 3B shows the frequency characteristics of two patterns, P1 being considered a normative pattern, P2 being considered a non-normative pattern. P2 is different from P1 because an element of P2 has been shifted in frequency relative to P1. These patterns can be combined to create a new pattern in at least two possible ways, shown in FIGURES 3C and 3D. In FIGURE 3C, a linear addition is shown of the frequency characteristics of P1 and P2. In FIGURE 3D, a morphing combination of P1 and P2 is shown, creating a new pattern which has a morphed element.

FIGURE 3E shows one embodiment of an apparatus capable of generating auditory patterns for use with the method of the present invention.
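By way of illustration only, the difference between the linear addition of FIGURE 3C and the morphing combination of FIGURE 3D can be sketched as follows. The sketch assumes each pattern is a list of (frequency in Hz, amplitude) components with elements corresponding by position, and that a single weight `w` controls the mix; this representation and the numeric values are assumptions made for the example, not the representation of the original disclosure.

```python
def linear_addition(p1, p2, w=0.5):
    """Superpose both spectra: every component of P1 and of P2 is kept,
    scaled by (1 - w) and w respectively."""
    return ([(f, a * (1.0 - w)) for f, a in p1] +
            [(f, a * w) for f, a in p2])


def morph(p1, p2, w=0.5):
    """Interpolate corresponding components: each pair collapses into one
    component whose frequency lies between the two originals."""
    return [(f1 + w * (f2 - f1), a1 + w * (a2 - a1))
            for (f1, a1), (f2, a2) in zip(p1, p2)]


P1 = [(200.0, 1.0), (400.0, 0.5)]   # normative pattern (illustrative values)
P2 = [(200.0, 1.0), (480.0, 0.5)]   # second element shifted in frequency
```

With these values, `linear_addition(P1, P2)` contains both the 400 Hz and the 480 Hz components at reduced amplitude, whereas `morph(P1, P2)` contains a single morphed element at 440 Hz.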

FIGURE 4A shows one method for deriving patterns from a continuum defined by two initial patterns. The two initial patterns may be, for example, a normative pattern and a non-normative pattern of language, although any two auditory patterns may be used as inputs. In this case, the normative pattern is the word "think", and the non-normative pattern is a word which sounds like "sink". The normative and non-normative patterns are decomposed into frequency versus time elements. A correspondence is assigned between elements of the normative and non-normative patterns. This correspondence may include a one-to-one correspondence, a one-to-many correspondence, and a one-to-none or null element correspondence. A continuum is derived on which the normative and non-normative patterns lie. Each point on this continuum is defined by a specific set of values describing the correspondence, or correspondence weights. By varying the weights, various patterns can be generated from the continuum which have different degrees of normative versus non-normative content. Note that the continuum also allows creation of patterns which are super-normative or super-non-normative.

FIGURE 4B shows one embodiment of a method of the present invention for generating auditory patterns. Differences between patterns along the continuum are defined, in this case, by a difference in weights termed epsilon. Two patterns are derived from the continuum satisfying the requirement that the absolute value of the difference in weights is greater than or equal to epsilon. These patterns are presented to the user. The user responds with an answer categorizing their perception of the difference between the patterns. Based on a history of answers, for instance using a staircase, epsilon is modified to increase, decrease, or maintain task difficulty. For example, epsilon may be increased until the user can correctly categorize the difference between the patterns. Then epsilon can be decreased to increase task difficulty.
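By way of illustration only, the selection of two patterns from the continuum subject to the epsilon constraint can be sketched as follows. The sketch assumes a single scalar weight per pattern, with 0 denoting the normative pattern and 1 the non-normative pattern, and an arbitrary range extending beyond both ends to cover the super-normative and super-non-normative cases; the function name `draw_pair` and the range limits are hypothetical.

```python
import random


def draw_pair(epsilon, rng, w_min=-0.5, w_max=1.5):
    """Pick two weights on the continuum with |w1 - w2| >= epsilon.

    w = 0 is the normative pattern and w = 1 the non-normative one;
    values outside [0, 1] are the "super" patterns. The range limits
    are illustrative assumptions.
    """
    span = w_max - w_min
    if epsilon > span:
        raise ValueError("epsilon exceeds the width of the continuum")
    gap = rng.uniform(epsilon, span)        # separation, at least epsilon
    w1 = rng.uniform(w_min, w_max - gap)    # leave room for the gap
    w2 = w1 + gap
    # Randomize which of the two patterns is presented first.
    return (w1, w2) if rng.random() < 0.5 else (w2, w1)
```

Each returned weight would then be passed to whatever routine generates a pattern from the continuum, and epsilon would be raised or lowered between trials according to the history of answers.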

FIGURE 4C shows methods for deriving a non-normative pattern of FIGURE 4A. A normative pattern may be presented to the user, and the user's pronunciation of the normative pattern may be recorded. The user's pronunciation can be used as the non-normative word. Another case involves presenting the non-normative pattern to the user, and the user responds with the user's pronunciation of the non-normative pattern.

The accent can be extracted from the user's pronunciation of the normative pattern. For instance, a set of frequency differences may be defined, representing the difference between the user's pronunciation of the normative pattern and the initial normative pattern. This accent may be applied to the normative word to derive a non-normative word. Another possibility is to use prerecorded normative and non-normative pairs. Yet another possibility is to synthesize or derive normative and non-normative pairs.
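By way of illustration only, extracting an accent as a set of frequency differences and applying it to a normative word can be sketched as follows. The sketch assumes each pronunciation has already been reduced to a list of element frequencies in Hz with a one-to-one correspondence between elements; the function names, the `weight` parameter, and the numeric values in the usage note are hypothetical.

```python
def extract_accent(user_freqs, normative_freqs):
    """Per-element frequency differences (user minus normative), in Hz."""
    return [u - n for u, n in zip(user_freqs, normative_freqs)]


def apply_accent(normative_freqs, accent, weight=1.0):
    """Shift a normative pattern by a scaled accent.

    weight = 0 returns the normative pattern, weight = 1 reproduces the
    accented pattern, and weight > 1 exaggerates the accent.
    """
    return [n + weight * d for n, d in zip(normative_freqs, accent)]
```

For instance, with a normative pattern of `[300.0, 2200.0]` and a user pronunciation of `[320.0, 2000.0]`, the extracted accent is `[20.0, -200.0]`, and applying it with a weight of 0.5 yields a pattern halfway between the two.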

The method described above and associated tasks may be implemented on a wide variety of apparatus, such as a computer, a portable dedicated hardware device, or another device. Such an apparatus may be implemented in part or wholly in hardware, or it could also be emulated in software, on a computer, and/or as part of a video game.

EXAMPLE 1

The user is presented with a set of two pure tones of different frequencies and the same or variable duration and amplitude. FIGURE 5A shows a graphic representation of this auditory pattern. The user is asked to select the tone with the higher frequency.

1. Receive an auditory pattern having at least one subpattern (block 110). A first set of two pure tones, Tone 1 and Tone 2, is played one after another. Tone 1 has a single frequency f₁, an amplitude a₁, and a duration d₁. Tone 2 has a single frequency f₂, an amplitude a₂, and a duration d₂. A time interval Δt may be inserted between the two tones.

The absolute value of the difference between f₁ and f₂ (Δf) may be equal to zero, or may be equal to or greater than an amount termed e.

2. Receive at least one response from a user, the response representing a user perception of the auditory pattern (block 120). The user may respond that Tone 2 was higher than, lower than, or the same as, Tone 1. The user may be told whether the response was correct.

3. Modify a frequency of at least one element based on at least one response from the user (block 130). A second set of two pure tones, Tone 3 and Tone 4, is selected. Tone 3 has a single frequency f₃, an amplitude a₃, and a duration d₃. Tone 4 has a single frequency f₄, an amplitude a₄, and a duration d₄. Tone 3 may be the same as Tone 1. Tone 4 is selected based on whether the last response and other previous responses were correct. A staircase scheme may be used to adjust e and modulate task difficulty. For example, e is increased for wrong answers to make the task easier, and decreased for correct answers to make the task more difficult.
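By way of illustration only, the generation of one trial of Example 1 and the adjustment of e can be sketched as follows. The sketch assumes that e is expressed in Hz, that the second tone differs from the first by between e and 2e when it differs at all, and that e changes by a fixed step after every answer; these choices and the function names are assumptions made for the example.

```python
import random


def make_trial(f1, epsilon, rng, allow_same=True):
    """Return (f1, f2, answer) with |f2 - f1| either 0 or >= epsilon."""
    choices = ["higher", "lower"] + (["same"] if allow_same else [])
    answer = rng.choice(choices)
    if answer == "same":
        return f1, f1, answer
    # Assumed upper bound of 2 * epsilon on the difference.
    delta = epsilon * rng.uniform(1.0, 2.0)
    f2 = f1 + delta if answer == "higher" else f1 - delta
    return f1, f2, answer


def update_epsilon(epsilon, correct, step, floor=0.0):
    """Decrease e after a correct answer, increase it after a wrong one."""
    if correct:
        return max(floor, epsilon - step)
    return epsilon + step
```

A trial is scored by comparing the user's response ("higher", "lower", or "same") with the returned `answer`, and the result is passed to `update_epsilon` before the next pair of tones is generated.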

EXAMPLE 2

The user is presented with a single tone or a series of tones played in sequence. FIGURE 5B shows a graphical representation of this auditory pattern. The user is asked to identify and match a single tone or a series of tones.

1. Receive an auditory pattern having at least one subpattern (block 110). The auditory pattern includes a single tone f₁ or a series of tones f₁, f₂, ..., fₙ played in sequence.

2. Receive at least one response from a user, the response representing a user perception of the auditory pattern (block 120).

The user may identify the tone or series of tones by the absolute pitch of each tone. The user may identify the tone by frequency, e.g. 660 Hz, or by name, e.g. G. The user may identify the tone or series of tones through a user interface such as a button. In a simple task, only one button would be available to the user. For example, one button corresponding to a tone of 660 Hz is available to the user. After the user hears a tone, the user activates the button if the user perceives the tone to have a frequency of 660 Hz. In a more difficult task, multiple buttons would be available to the user. For example, three buttons corresponding to 660, 1000, and 1400 Hz are available to the user.

After the user hears a tone, the user activates the corresponding button if the user perceives the tone to have a frequency of 660, 1000, or 1400 Hz.

3. Modify a frequency of at least one element based on at least one response from the user (block 130).

A second tone or series of tones is selected. To increase difficulty, Δf between the buttons may be reduced; in other words, the number of possible responses available to the user is increased. A staircase scheme may be used to adjust e and modulate task difficulty. For example, e is increased for wrong answers to make the task easier, and decreased for correct answers to make the task more difficult. Three right answers in a row lead to a decrease of e. A variation of this task presents the user with a tone of a single pitch and asks the user to select one of three or more buttons, which correspond to one or more specific frequencies or notes together with a sharp button and a flat button. For example, the user may respond that the tone is middle C, or sharp or flat of middle C by an amount equal to or greater than e, where e is modulated by a staircase scheme.
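The rule that three right answers in a row decrease e, while a single wrong answer increases it, could be tracked with a small state object. The sketch below is one possible reading; the class name, the step factor of 2, and the bounds are assumptions, and the unit of e (Hz or cents) is left open as in the text.

```python
class ThreeDownOneUp:
    """Staircase: e shrinks after run_length consecutive correct answers
    and grows after any wrong answer. All numeric defaults are assumptions."""

    def __init__(self, e, step=2.0, e_min=0.5, e_max=100.0, run_length=3):
        self.e = e
        self.step = step
        self.e_min = e_min
        self.e_max = e_max
        self.run_length = run_length
        self.streak = 0  # consecutive correct answers so far

    def record(self, correct):
        """Update e from one trial outcome and return the new value."""
        if correct:
            self.streak += 1
            if self.streak == self.run_length:
                self.e = max(self.e_min, self.e / self.step)
                self.streak = 0
        else:
            self.streak = 0
            self.e = min(self.e_max, self.e * self.step)
        return self.e

staircase = ThreeDownOneUp(16.0)
history = [staircase.record(c) for c in (True, True, True, False)]
```

Resetting the streak after each decrease means the user must produce a fresh run of correct answers before the task gets harder again.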

EXAMPLE 3

The user is presented with a set of two tones of different frequencies and the same or variable duration and amplitude. FIGURE 5C shows a graphical representation of this auditory pattern. The user is asked to identify the frequency interval between the tones.

1. Receive an auditory pattern having at least one subpattern (block 110). A first set of two tones, Tone 1 and Tone 2, is played simultaneously or in a staggered fashion.

The user indicates the frequency interval between the two tones. For example, the user may respond that the frequency interval was a fifth.

A second set of two tones, Tone 3 and Tone 4, is selected. A staircase scheme may be used to modulate task difficulty. The user must press a button to indicate which tones they heard. In a simple task, the user would have only one button to press, because the interval is always a fifth. In a more difficult task, more buttons would be added; with three buttons, for example, the user must choose between a flat second, a fourth, and a sharp sixth. Also, to increase difficulty, Δf between the buttons may be reduced.
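To make the interval stimuli concrete, the following sketch computes the two tone frequencies for a named interval above a root note. It assumes twelve-tone equal temperament, which the text does not specify; the table of semitone counts and the `detune_cents` parameter (for intervals that are deliberately sharp or flat) are likewise illustrative.

```python
# Semitone counts for a few intervals in equal temperament (an assumption).
INTERVAL_SEMITONES = {"fourth": 5, "fifth": 7, "octave": 12}

def interval_tones(root_hz, semitones, detune_cents=0.0):
    """Return (low, high) frequencies for an interval above root_hz.
    detune_cents > 0 makes the interval sharp, < 0 makes it flat."""
    ratio = 2.0 ** ((semitones + detune_cents / 100.0) / 12.0)
    return root_hz, root_hz * ratio

low, high = interval_tones(440.0, INTERVAL_SEMITONES["fifth"])
sharp_low, sharp_high = interval_tones(440.0, INTERVAL_SEMITONES["fifth"], 20.0)
```

Just intonation (a fifth as a 3:2 ratio) would be an equally valid choice; only the ratio computation would change.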

A variation of this example is a task where the user hears one interval and has to select one of three buttons: a given interval, a sharp button, or a flat button. Thus, in each trial, the interval could be a fourth, or sharp or flat of a fourth by an amount equal to or greater than e, where e is modulated by the staircase. As in Example 1, if the answer to a given trial is wrong, then e is increased to make the task easier. Three right answers in a row lead to a decrease of e.

Another variation of this example is to use a delayed hold procedure. The user presses a button to start the trial. A series of intervals is then heard. The user must keep the button down until an interval different from the others is heard, and must then react within a certain time window. False positives and false negatives both count as wrong answers and can be used to drive the staircase algorithm. An interval may start at a different root note each time it is played. The staircase may also be used to modulate the duration and amplitude of each tone, as well as the Δt between tones.
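The delayed hold procedure can be scored from two timestamps. The sketch below assumes that "reacting" means releasing the button, that a release later than the window counts as a false negative, and that a one-second window is used; none of these details are fixed by the text, and the function and outcome names are hypothetical.

```python
def score_delayed_hold(odd_onset_s, release_s, window_s=1.0):
    """Classify one delayed-hold trial.

    odd_onset_s: time the differing interval began, or None if none was played.
    release_s:   time the user released the button, or None if never released.
    """
    if release_s is None:
        # Holding to the end is right only when nothing differed.
        return "correct_hold" if odd_onset_s is None else "false_negative"
    if odd_onset_s is None or release_s < odd_onset_s:
        return "false_positive"  # released before any differing interval
    if release_s - odd_onset_s <= window_s:
        return "hit"
    return "false_negative"      # reacted, but outside the time window

def is_wrong(outcome):
    """Both error types count as wrong answers for the staircase."""
    return outcome in ("false_positive", "false_negative")

outcome = score_delayed_hold(odd_onset_s=4.0, release_s=4.6)
```

The boolean from `is_wrong` is the only value a staircase would need from each trial.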

EXAMPLE 4

This task asks the user to compare two intervals.

1. Receive an auditory pattern having at least one subpattern (block 110). The auditory pattern includes two intervals, played one after the other. FIGURE 5D shows a graphical representation of this auditory pattern.

The user must indicate whether the second interval was the same as, sharper than, or flatter than, the first interval.

A staircase scheme may be used to modulate task difficulty. The durations, amplitudes, and amplitude and frequency modulation of a given tone may be modulated. The time between the two intervals may be modulated. Also, the range of the root notes, as well as the difference in interval between the two intervals, may be modulated. Note that having roots which are close to the same frequency for the two intervals may make the task easier, rather than harder, in some cases, and the staircase should account for this.
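One way to derive the expected answer for a two-interval comparison is to measure each interval in cents and compare the sizes, which makes the result independent of the root notes. In this sketch, "sharper" is assumed to mean that the second interval is wider than the first, and the tolerance for "same" is an arbitrary illustrative value.

```python
import math

def interval_cents(low_hz, high_hz):
    """Size of an interval in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(high_hz / low_hz)

def expected_answer(first, second, tolerance_cents=1.0):
    """Compare two (low_hz, high_hz) intervals.
    Returns 'same', 'sharper', or 'flatter' for the second interval."""
    diff = interval_cents(*second) - interval_cents(*first)
    if abs(diff) <= tolerance_cents:
        return "same"
    return "sharper" if diff > 0 else "flatter"

# Same interval size on different roots, then a widened second interval.
answer_same = expected_answer((440.0, 660.0), (220.0, 330.0))
answer_sharp = expected_answer((440.0, 660.0), (440.0, 670.0))
```

Because the comparison uses frequency ratios, the range of root notes can be varied by the staircase without changing the scoring logic.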

One variation of this example is where the two intervals overlap in time, but have a different envelope, or voicing, which the user can utilize to differentiate between them. The staircase can be used to modulate the temporal overlap, and the similarity between the two voicings.

EXAMPLE 5

This task asks the user to identify intervals.

1. Receive an auditory pattern having at least one subpattern (block 110). The auditory pattern includes several intervals, played one after the other. FIGURE 5E shows a graphical representation of this auditory pattern.

The user must indicate the interval.

The staircase is used to modulate task difficulty. In the easiest task, all intervals would be the same, and the user would have only one button to choose. The task gets harder as more choices are added. The interval steps between the buttons can be reduced to increase difficulty. Note that the buttons need not be restricted to "pure" intervals like fourths or sixths; a flat second, for instance, could be one of the choices.
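Adding and removing response choices can be driven by the same right/wrong signal as the staircase. The following is a minimal sketch; the ordered list of interval names, the one-at-a-time step, and the function names are assumptions introduced only for illustration.

```python
# Hypothetical pool of choices, ordered from first-introduced to last.
ALL_CHOICES = ["fifth", "fourth", "major third", "octave", "minor third", "major sixth"]

def next_choice_count(count, correct, minimum=1, maximum=len(ALL_CHOICES)):
    """Add a button after a correct answer, remove one after a wrong answer."""
    count = count + 1 if correct else count - 1
    return max(minimum, min(maximum, count))

def active_buttons(count):
    """The buttons shown to the user for the current difficulty level."""
    return ALL_CHOICES[:count]

count = 1
for correct in (True, True, False):
    count = next_choice_count(count, correct)
buttons = active_buttons(count)
```

Ordering the pool so that easily confused intervals arrive late is one way to let the button count alone control difficulty.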

The durations, amplitudes, and amplitude and frequency modulation of a given tone which is an element of an interval may be changed from trial to trial. The time between the two intervals may be modulated.

Also, the range of the root notes, as well as the difference in interval between the two intervals, may be modulated. Note that having roots which are close to the same frequency for the two intervals may make the task easier, rather than harder, in some cases, and the staircase should account for this.

One variation of this example is where the user presses a button, or other controller, indicating which interval they would like to hear. They then have to decide whether the interval that they actually heard was the same as, or different from, the interval they asked for.

EXAMPLE 6

This task asks the user to identify one or more elements of an interval.

1. Receive an auditory pattern having at least one subpattern (block 110). The auditory pattern includes an interval or chord, followed or preceded by an individual tone. If the chord only consists of two tones, then the user will have two buttons available to press, corresponding to the two elements of the chord.

The user must indicate which element of the chord the individual tone matches.

3. Modify a frequency of at least one element based on at least one response from the user (block 130).

A staircase scheme may be used to modulate task difficulty. The interval steps between the elements can be changed, as can the number of elements in the chord. The durations, amplitudes, and amplitude and frequency modulation of a given tone which is an element of an interval may be changed from trial to trial. The temporal separation between the two intervals may be modulated. Also, the range of the root notes, as well as the difference in interval between the two intervals, may be modulated. Note that having roots which are close to the same frequency for the two intervals may make the task easier, rather than harder, in some cases, and the staircase should account for this.

EXAMPLE 7

This task asks the user to identify an element in a complex acoustic environment.

1. Receive an auditory pattern having at least one subpattern (block 110). An interval or chord is played, followed or preceded by an individual tone. FIGURE 5F shows a graphical representation of this auditory pattern.

The user is asked whether the individual tone is the same as, sharper than, or flatter than one of the elements of the chord.

3. Modify a frequency of at least one element based on at least one response from the user (block 130). A staircase may be used to modulate task difficulty. The interval steps between the elements can be changed, as can the number of elements in the chord. Also, if the individual tone is sharp or flat from the element to which it is being compared, that difference, termed epsilon, may be modulated by the staircase. The durations, amplitudes, and amplitude and frequency modulation of a given tone which is an element of an interval may be changed from trial to trial. The time between the two intervals may be modulated. Also, the range of the root notes, as well as the difference in interval between the two intervals, may be modulated. Note that having roots which are close to the same frequency for the two intervals may make the task easier, rather than harder, in some cases, and the staircase should account for this.

EXAMPLE 8

This task asks the user to identify a voicing in a complex acoustic environment.

1. Receive an auditory pattern having at least one subpattern (block 110). An interval, chord, or set of complex acoustic sounds is played, followed by, preceded by, or including, an individual tone or complex sound. Each element of the interval, chord, or set of sounds has a different voicing, and these elements can be synchronous, or staggered. This difference in voicing can arise from amplitude and/or frequency modulation, or other envelopes. Each element may have different AM and/or FM modulation and/or envelope in each frequency band. FIGURE 5G shows a graphical representation of this auditory pattern.

The individual sound will match one of the elements in terms of its envelopes and/or AM and/or FM modulations. As an example, assume a chord with two elements. Element 1 has frequency f₁ and amplitude modulation pattern a₁. Element 2 has frequency f₂ and amplitude modulation pattern a₂. The individual tone has frequency f₁ but amplitude modulation pattern a₂. The user must press a button indicating whether the individual tone matched the amplitude modulation pattern of element 1 or element 2.
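A probe tone whose amplitude modulation matches one chord element could be generated as follows. This sketch assumes sinusoidal amplitude modulation characterized by a single modulation rate, which is only one of many possible "amplitude modulation patterns"; the sample rate, modulation depth, and function names are illustrative choices.

```python
import math

def am_tone(freq_hz, mod_hz, depth=0.5, duration_s=0.5, sample_rate=8000):
    """Sine carrier with sinusoidal amplitude modulation, scaled to [-1, 1]."""
    n = int(duration_s * sample_rate)
    samples = []
    for i in range(n):
        t = i / sample_rate
        envelope = 1.0 + depth * math.sin(2 * math.pi * mod_hz * t)
        carrier = math.sin(2 * math.pi * freq_hz * t)
        samples.append(envelope * carrier / (1.0 + depth))
    return samples

def matching_element(probe_mod_hz, element_mod_hz):
    """Index of the chord element whose modulation rate equals the probe's."""
    return element_mod_hz.index(probe_mod_hz)

# Two chord elements with different modulation rates, and a probe that
# borrows the carrier frequency of element 0 but the modulation of element 1.
element_mods = [4.0, 8.0]
chord = [am_tone(440.0, element_mods[0]), am_tone(660.0, element_mods[1])]
probe = am_tone(440.0, element_mods[1])
answer = matching_element(8.0, element_mods)
```

Here the correct response is the second element (index 1), even though the probe's carrier frequency equals that of the first.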

A staircase scheme may be used to modulate task difficulty. The interval steps between the elements can be changed, as can the number of elements in the chord. The durations, amplitudes, envelopes, and amplitude and frequency modulation of a given tone which is an element of an interval may be changed from trial to trial. The time between the two intervals may be modulated. Also, the range of the root notes, as well as the difference in interval between the two intervals, may be modulated. Note that having roots which are close to the same frequency for the two intervals may make the task easier, rather than harder, in some cases, and the staircase should account for this.

EXAMPLE 9

This task asks the user to identify language spoken without accent. It is possible, when learning a foreign language, or a new dialect of a non-foreign language, that the student mis-hears the phrasing that they are supposed to learn, because their auditory processing isn't tuned to pick up these new phrasings.

1. Receive an auditory pattern having at least one subpattern (block 110). The auditory pattern includes versions of a word or word fragment, one of which has an undesired accent. FIGURE 5H shows a graphical representation of this auditory pattern.

2. Receive at least one response from a user, the response representing a user perception of the auditory pattern (block 120). The user must pick which word had the accent, or alternately, which word was correct.

3. Modify a frequency of at least one element based on at least one response from the user (block 130).

A staircase is used to modulate the amount of accent which is present in the word with an accent. This could also be used to teach the difference between two similar-sounding, but different, unaccented words or word fragments. Several possible methods could be used to vary the amount of accent or distortion of the accented word.

a. The spectrogram (frequency vs. time pattern) of the accented word may be a weighted average of a fully accented word and an unaccented word.

b. The accented and normative words' spectrograms could be reduced to a set of bands limited to specific frequency and time windows, with specific envelopes. In some cases, a band for the accented word will have a corresponding band in the normative word. However, these two corresponding bands may be transformed from one another. Consider a case where the normative band is exactly the same as the accented band but is shifted in frequency by 100 Hz. Under method a, a word that was the average of the two words would have two bands, each of half normal amplitude, with one shifted 100 Hz from the other. Under this method, the "half accented" word would instead have only one band, shifted by 50 Hz from the normative band, and this band would be at full amplitude. For unmatched bands, method a could be used.

c. Amplify those bands which are different between the accented and normative words.

d. "Stretch" in the frequency domain the difference between the normative and the accented word.

e. If e is the difference between the accented and normative word/phoneme, then one can define the "anti-accented word" as the normative word/phoneme minus e. The user must pick whether they heard the normative, accented, or anti-accented word. e can change as a function of the staircase.

f. The user is asked to read a list of words. These recorded words form the accented set of words.

g. This task may involve the presentation of a digital display of the spelling of a given word.
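The difference between items a, b, and e above can be illustrated with toy data. In this sketch a spectrogram is a list of rows of magnitudes and a band is a dictionary with hypothetical keys `center_hz` and `amplitude`; these representations, and the function names, are assumptions chosen for brevity rather than anything prescribed by the text.

```python
def blend_spectrograms(normative, accented, weight):
    """Item a: cell-by-cell weighted average.
    weight = 0 gives the normative word, weight = 1 the fully accented word."""
    return [[(1 - weight) * n + weight * a for n, a in zip(n_row, a_row)]
            for n_row, a_row in zip(normative, accented)]

def morph_band(normative_band, accented_band, weight):
    """Item b: interpolate the parameters of one matched band, so the result
    is a single shifted band rather than two half-amplitude bands."""
    return {key: (1 - weight) * normative_band[key] + weight * accented_band[key]
            for key in normative_band}

def anti_accented(normative, accented):
    """Item e: normative minus e, where e = accented - normative.
    Negative values are not clipped in this sketch."""
    return [[2 * n - a for n, a in zip(n_row, a_row)]
            for n_row, a_row in zip(normative, accented)]

# The 100 Hz example: a half-accented band lands 50 Hz from the normative band.
normative_band = {"center_hz": 1000.0, "amplitude": 1.0}
accented_band = {"center_hz": 1100.0, "amplitude": 1.0}
half_accented = morph_band(normative_band, accented_band, 0.5)
```

With these inputs `half_accented` has a center frequency of 1050 Hz at full amplitude, whereas `blend_spectrograms` applied to the same two words would leave energy at both 1000 Hz and 1100 Hz.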

Each of the examples given above may have variations based on complex tones, envelopes, and masks.

Complex tones, as opposed to simple tones, may be stored and/or filtered samples of a violin playing a note, or a sample of a human voice singing that note. Or, the tones could be a baritone and a tenor voice. Thus, each tone could really consist of a concurrent stack of individual frequencies, in which each frequency band might have a differing envelope. The staircase could be used to change the difference between the two voices. Envelopes may be amplitude-modulated and/or frequency-modulated. The staircase could be used to control the amount of envelope modulation on the tones.
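A complex tone built as a stack of frequencies, each with its own envelope, might be synthesized as in the sketch below. The partial list, the exponential decay envelopes, and the sample rate are illustrative assumptions; a sampled violin or voice would of course not be generated this way.

```python
import math

def decay(rate):
    """Envelope factory: exponential decay over normalized time 0..1."""
    return lambda x: math.exp(-rate * x)

def complex_tone(partials, duration_s=0.25, sample_rate=8000):
    """Sum a stack of partials.
    partials: list of (freq_hz, amplitude, envelope), where envelope maps
    normalized time in [0, 1) to a gain."""
    n = int(duration_s * sample_rate)
    samples = []
    for i in range(n):
        t = i / sample_rate   # time in seconds
        x = i / n             # normalized position within the tone
        samples.append(sum(amp * env(x) * math.sin(2 * math.pi * freq * t)
                           for freq, amp, env in partials))
    return samples

# Fundamental plus two harmonics; higher partials fade faster.
tone = complex_tone([
    (220.0, 1.0, decay(1.0)),
    (440.0, 0.5, decay(3.0)),
    (660.0, 0.25, decay(6.0)),
])
```

A staircase could vary the decay rates or partial amplitudes to bring two such voicings closer together or further apart.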

Masks are a set of distracting background tones. These tones may be concurrent, in a stack, and/or they may be staggered in time. A mask may have an amplitude-modulated or frequency-modulated envelope. To change the difficulty of the task, the amplitude, duration, or envelope of the mask may be varied from trial to trial. For instance, the staircase may be used to modulate these parameters.

The foregoing description of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations will be apparent. It is intended that the scope of the invention be defined by the following claims and their equivalents.

Claims

What is claimed is:

1. A method for generating an auditory pattern for training auditory skills, comprising: receiving an auditory pattern having at least one subpattern, each subpattern having at least one element; receiving a response from a user, the response representing a user perception of the subpattern; and modifying a frequency of at least one element based on at least one response from the user.

2. The method of claim 1, wherein the auditory pattern has a variable frequency.

3. The method of claim 2, wherein the frequency is modulated by a time-varying envelope.

4. The method of claim 1, wherein the auditory pattern has a variable amplitude.

5. The method of claim 4, wherein the amplitude is modulated by a time-varying envelope.

6. The method of claim 1, wherein the auditory pattern has a variable duration.

7. The method of claim 1, wherein the auditory pattern is sampled, synthesized, or recorded.

8. The method of claim 1, wherein the auditory pattern includes at least one tone.

9. The method of claim 1, wherein the auditory pattern includes at least one musical note.

10. The method of claim 1, wherein the auditory pattern includes elements of speech.

11. The method of claim 10, wherein the elements of speech include words.

12. The method of claim 10, wherein the elements of speech include phonemes.

13. The method of claim 10, wherein the elements of speech are accented.

14. The method of claim 10, wherein the elements of speech include a combination of accented and unaccented elements of speech.

15. The method of claim 14, wherein the combination is non-linear.

16. The method of claim 1, wherein the auditory pattern is generated randomly.

17. The method of claim 1, wherein the auditory pattern is pre-set.

18. The method of claim 1, wherein the auditory pattern is selected based on a history of user responses.

19. The method of claim 1, wherein at least one element is a mask.

20. The method of claim 1, wherein the response is a delayed-hold answer.

21. The method of claim 1, wherein the response is sharp, flat, same, different, sharper than, or flatter than.

22. The method of claim 1, wherein the response is musical notes, including A, B, C, D, E, F, and G.

23. The method of claim 1, wherein the response is musical intervals including first, second, third, fourth, and fifth.

24. The method of claim 1, wherein the response is more, less, more accented than, or less accented than.

25. The method of claim 1, wherein the response is no response.

26. The method of claim 1, wherein the response includes a set of words, phonemes, or other elements of speech.

27. The method of claim 26, wherein the response is spoken.

28. The method of claim 1, wherein the characteristic is a frequency of at least one element.

29. The method of claim 1, wherein the characteristic is a modulation of a frequency of at least one element.

30. The method of claim 1, wherein the characteristic is an amplitude of at least one element.

31. The method of claim 1, wherein the characteristic is a modulation of an amplitude of at least one element.

32. The method of claim 1, wherein the characteristic is a frequency interval between a plurality of elements.

33. A method for training auditory skills, comprising: presenting an auditory pattern to a user, the auditory pattern having at least one element, each subpattern having at least one element; receiving a response from the user, the response representing a user perception of the auditory pattern; modifying a frequency of at least one element based on at least one response from the user; and presenting the modified auditory pattern to the user.

34. The method of claim 33, wherein the modified auditory pattern is presented in a random order of subpatterns.

35. A method for generating an auditory pattern for training auditory skills, comprising: receiving an auditory pattern having a plurality of subpatterns, each subpattern having at least one element; receiving a response from a user, the response representing a user perception of a difference between the subpatterns; and modifying a frequency of at least one element based on at least one response from the user.

36. The method of claim 35, wherein modifying includes decreasing the frequency difference between the subpatterns when the response is correct.

37. The method of claim 35, wherein modifying includes increasing the frequency difference when the response is incorrect.

38. A method for training auditory skills, comprising: presenting an auditory pattern to a user, the auditory pattern having at least one subpattern, each subpattern having at least one element; presenting a plurality of categories to the user, the categories representing characterizations of the auditory pattern; receiving a selected category from the user, the selected category representing a user perception of the auditory pattern; modifying a frequency of at least one element based on the selected category from the user; and presenting the modified auditory pattern to the user.

39. The method of claim 38, wherein modifying includes decreasing the frequency difference between the subpatterns when the response is correct.

40. The method of claim 38, wherein modifying includes increasing the frequency difference when the response is incorrect.

41. A method for training auditory skills, comprising: presenting an auditory pattern to a user, the auditory pattern having at least one subpattern, each subpattern having at least one element; presenting a plurality of categories to the user, the categories representing characterizations of the auditory pattern; receiving a selected category from the user, the selected category representing a user perception of the auditory pattern; changing a quantity of categories based on at least the selected category; and presenting the plurality of categories to the user.

42. The method of claim 41, wherein changing includes increasing the quantity when the selected category is correct.

43. The method of claim 41, wherein changing includes decreasing the quantity when the selected category is incorrect.

44. A method for generating an auditory pattern for training auditory skills, comprising:

(a) presenting a first and second subpatterns, the first and second subpatterns each having at least one element;

(b) receiving a response from a user, the response representing a user perception of a difference between the first and second subpatterns; and

(c) increasing the difference between the first and second subpatterns;

(d) repeating (a) - (c) until the user identifies the difference between the first and second subpatterns;

(e) reducing the difference between the first and second subpatterns.

45. A method for training auditory skills, comprising: presenting a first auditory pattern to a user, the first auditory pattern having no rapid temporal changes; receiving a response from the user, the response representing a user categorization of the first auditory pattern; selecting a second auditory pattern based on the response and a user history of responses, the second auditory pattern differing from the first auditory pattern in the frequency domain, the second auditory pattern having no rapid temporal changes; and presenting the second auditory pattern to the user.

46. A method for training auditory skills, comprising: presenting a first auditory pattern to a user; receiving a response from the user, the response representing a user categorization of the first auditory pattern; selecting a second auditory pattern based on the response and a user history of responses, the second auditory pattern differing from the first auditory pattern in the frequency domain, the second auditory pattern having no rapid temporal changes; and presenting the second auditory pattern to the user.

47. A method of teaching an individual to discriminate between the characteristics of a sound stimulus, the method comprising:

(a) providing a sound stimulus having no rapid temporal changes in frequency content;

(b) emphasizing the energy level of a portion of the sound stimulus;

(c) repeating the sound stimulus until the individual is able to distinguish the characteristics of the sound stimulus;

(d) reducing the emphasis of the energy level of the portion of the sound stimulus; and

(e) repeating step (c) until the individual is able to discriminate between the characteristics of a sound stimulus.

48. A method for training a user to discriminate between a normative auditory pattern and a non-normative auditory pattern, comprising: receiving a normative auditory pattern having at least one element of speech, each element having normative characteristics; receiving a non-normative auditory pattern, the non-normative auditory pattern differing from the normative auditory pattern by having at least one non-normative element of speech, each non-normative element having at least one non-normative characteristic; generating a combined auditory pattern by mixing the first and second auditory patterns using a first and second plurality of weights, the first plurality of weights being associated with the characteristics of the first auditory pattern, the second plurality of weights being associated with the characteristics of the second auditory pattern, the combined auditory pattern having at least one combined characteristic; presenting the combined auditory pattern to a user; and receiving a response, the response representing a user categorization of the combined auditory pattern.

49. The method of claim 48, wherein the auditory pattern is a word.

50. The method of claim 48, wherein the auditory pattern is a portion of a word.

51. The method of claim 48, wherein the categorization is an identification of at least one combined characteristic of the combined auditory pattern.

52. The method of claim 48, wherein the categorization is a discrimination between the normative and non-normative characteristics.

53. The method of claim 48, wherein the categorization is a discrimination between a plurality of normative characteristics.

54. The method of claim 48, wherein the categorization is a discrimination between a plurality of non-normative characteristics.

55. The method of claim 48, wherein the categorization is a discrimination between a normative and a combined characteristic.

56. The method of claim 48, wherein the categorization is a discrimination between a non-normative and a combined characteristic.

57. The method of claim 48, wherein the categorization is a discrimination between a plurality of combined characteristics.

58. The method of claim 48, wherein the characteristic is a frequency.

59. A method for training a user to discriminate between a normative auditory pattern and a non-normative auditory pattern, comprising: presenting an auditory pattern to a user, the auditory pattern having at least one element of speech; presenting a first plurality of categories to the user, the first plurality of categories representing possible classifications of the auditory pattern; receiving a response from the user, the response representing a user perception of the auditory pattern, the response selected from the first plurality of categories; and modifying the first plurality of categories based on a history of responses.

60. The method of claim 59, wherein modifying is also based on a second plurality of categories.

61. A method for training a user to discriminate between a first and second auditory patterns, comprising: receiving a first auditory pattern having at least one element of music, each element of music having at least one characteristic; receiving a second auditory pattern having at least one element of music, each element of music having at least one characteristic, the second auditory pattern differing from the first auditory pattern by at least one characteristic; generating a combined auditory pattern by mixing the first and second auditory patterns using a first and second plurality of weights, the first plurality of weights being associated with the characteristics of the first auditory pattern, the second plurality of weights being associated with the characteristics of the second auditory pattern, the combined auditory pattern having at least one combined characteristic; presenting the combined auditory pattern to a user; and receiving a response, the response representing a user categorization of the combined auditory pattern.

62. A method for training a user to discriminate between a first and second auditory patterns, comprising: presenting an auditory pattern to a user, the auditory pattern having at least one element of music; presenting a first plurality of categories to the user, the first plurality of categories representing possible classifications of the auditory pattern; receiving a response from the user, the response representing a user perception of the auditory pattern, the response selected from the first plurality of categories; and modifying the first plurality of categories based on a history of responses.

63. A method for generating an auditory pattern for training the identification of different elements of music within a complicated musical environment, and for training the identification of the relationship or difference between these elements, comprising:

(a) receiving a first auditory pattern having at least one element of music, each element having a characteristic;

(b) modifying the first auditory pattern;

(c) presenting the first modified auditory pattern to a user;

(d) modifying a number of categories by which the user may identify the first auditory pattern; and

(e) repeating steps (a) - (d) until the user is able to identify at least one characteristic of the modified auditory pattern.

64. The method of claim 63, wherein the characteristic is a frequency.

65. The method of claim 64, wherein modifying the first auditory pattern includes changing the frequency of the element at at least one point in time.

66. The method of claim 63, wherein the characteristic is a frequency difference between a plurality of elements.

67. The method of claim 66, wherein modifying the first auditory pattern includes changing the frequency difference between the plurality of elements at at least one point in time.

68. The method of claim 63, wherein the characteristic is an amplitude.

69. The method of claim 68, wherein modifying the first auditory pattern includes changing the amplitude of an element at at least one point in time.

70. The method of claim 63, wherein the characteristic is an amplitude difference between a plurality of elements.

71. The method of claim 70, wherein modifying the first auditory pattern includes changing the amplitude difference between the plurality of elements at at least one point in time.

72. The method of claim 63, wherein modifying the first auditory pattern includes changing a number of characteristics of the pattern at at least one point in time.

73. The method of claim 63, wherein modifying a number of categories includes no change in the number.

74. A method for training auditory skills, comprising:

(a) receiving a first auditory pattern having at least two elements of music, each element having a characteristic;

(b) modifying the first auditory pattern;

(c) presenting the first modified auditory pattern to a user; and

(d) repeating steps (a) - (c) until the user is able to identify between at least two characteristics of the modified auditory pattern.

75. The method of claim 74, wherein the first and second weights change from trial to trial.

76. The method of claim 74, wherein the first and second weights are modulated.

77. The method of claim 74, wherein mixing is performed non-linearly.

78. The method of claim 74, wherein combining includes morphing.

79. The method of claim 78, wherein morphing includes creating a continuum between the normative and the non-normative auditory patterns and selecting a point along the continuum.

80. The method of claim 74, wherein modifying includes calculating a linear weighted combination of the Fourier components of each element over time.

81. The method of claim 74, wherein modifying includes calculating using positive or negative weighting.

82. The method of claim 74, wherein modifying includes calculating using different weights at different frequency bands.

83. The method of claim 74, wherein modifying includes calculating using amplitude or frequency modulation of different bands.

84. The method of claim 74, wherein the combination of accented and unaccented elements also is combined with an auditory mask which can vary over time.

85. The method of claim 74, wherein the speech includes sampled, recorded, synthesized, or filtered speech.