US3676595A - Voiced sound display - Google Patents

Voiced sound display

Info

Publication number
US3676595A
Authority
US
United States
Prior art keywords
display
pulse
pitch
output
sweep
Prior art date
Legal status
Expired - Lifetime
Application number
US30123A
Inventor
Ladislav Dolansky
Nathan D Phillips
Current Assignee
Research Corp
Original Assignee
Research Corp
Priority date
Filing date
Publication date
Application filed by Research Corp

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/90: Pitch determination of speech signals
    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B: EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00: Teaching not covered by other main groups of this subclass
    • G09B19/04: Speaking

Definitions

  • The invention relates to cathode ray display equipment and in particular to circuitry for automatic display of patterns derived from voiced utterances.
  • Prior Art relates to the visual display of voiced utterances and related sounds.
  • Gruenz and Schott described a method for extraction and portrayal of pitch of speech sounds. (See Gruenz, O.O., Jr., and Schott, L.O., "Extraction and Portrayal of Pitch of Speech Sounds," B.T.L. Mono 3-1698. Also: J. Acous. Soc. Amer., Sept. 1949.) They claimed reliable pitch extraction for frequencies from I to 600 Hz. In order to reduce errors in producing the basic pitch-period indicating pulses, an input filter was used to limit the pass-band for the speech signals.
  • the invention makes use of visual display techniques utilizing the fundamental frequency of the signal under study (e.g. the microphone signal for voiced utterances of speech).
  • a visual pattern which represents the fundamental frequency of the signal under study as a function of time is displayed on the face of a storage oscilloscope. Because the intonation pattern is presented as a function of time, temporal aspects of the signal can be studied.
  • the intensity of the signal is displayed by means of lights.
  • the invention comprehends a complete intensity display and intonation display system capable of producing and storing intonation patterns for extended periods of time and suitable for the quantitative study and teaching of intonation patterns as well as for other frequency versus time displays of variable frequency signals.
  • One aspect of the invention includes an error suppression scheme based on (a) the total permissible frequency range and (b) the permissible rate of change of the fundamental frequency of the signal.
  • Another aspect of the invention comprehends a specific improved method of conversion of the frequency describing voltage which approximates the desired logarithmic conversion to a high degree of accuracy.
  • Still another aspect of the invention comprehends a system of automatic and semiautomatic modes of operation permitting a great flexibility in the desired procedures for the study and teaching of intonation patterns.
  • the equipment for (a) an intensity display, using the lights, (b) an intonation contour display, using the storage-oscilloscope screen display, (c) both displays simultaneously.
  • the system of the invention is a welcome aid to show differences in intonation in a most straightforward and quantitative way. This may be of special importance in languages in which the meaning is changed when the intonation pattern changes.
  • FIG. 1 is a front view of the external physical layout of a system embodying the invention.
  • FIG. 2 is a graph in which the pitch frequency is plotted against time
  • FIG. 3 is a graph showing the pulsed output of the pitch detector and illustrating regions of error suppression
  • FIG. 4 is a block diagram of a circuit embodying the invention.
  • FIG. 5 is a circuit diagram showing the preamplifier, mixer and pitch detector used in the system of the invention.
  • FIG. 6 is a circuit diagram of the pulse rate to analogous log pitch frequency converter used in the system of the invention.
  • FIG. 7 is a circuit diagram showing the logarithmic conversion circuit and the sample and hold circuits of the system of the invention.
  • FIG. 8 is a circuit diagram of the ΔT comparator of the system of the invention.
  • FIG. 9 is a circuit diagram of the tri-level indicator or frequency range comparator of the system of the invention.
  • FIG. 10 is a multiple graph showing the wave forms of the various portions of the P-A analog circuit
  • FIG. 11 is a circuit diagram showing the horizontal sweep control and presentation sequencer of the system of the invention.
  • FIG. 12 is a circuit diagram showing a trigger generator for use with the horizontal sweep control and presentation sequencer of FIG. 11;
  • FIG. 13 is a circuit diagram of a sweep generator suitable for use with the horizontal sweep control and presentation sequencer of FIG. 11;
  • FIG. 14 is a multiple graph showing the timing diagram for delay and sequencing logic in the MFRT mode as employed with the horizontal sweep control and presentation sequencer of FIG. 11;
  • FIG. 15 is a circuit diagram showing a clock and channel switch suitable for use in the system of the invention.
  • A display screen 1 displays an intonation pattern representing the pitch frequency of a human voice which it is desired to analyze.
  • Input signals from a microphone into which the person whose voice is to be analyzed speaks, are connected either to microphone jacks 2 or to the auxiliary input terminals 3 via an external pre-amplifier.
  • the auxiliary input terminal may be used to receive signals from the output of a tape recorder rather than from the output of a microphone preamplifier.
  • An earphone jack 4 provides an output for an earphone.
  • the control for this purpose may most conveniently be provided by an auxiliary plug-in trace width control 6 which can be connected to the side of the oscilloscope unit.
  • A loudness display represented by four light bulbs 7,8,9 and 10 is activated by depressing a loudness switch 11.
  • the pitch display 1 may be activated by depressing a pitch control switch 12.
  • a mode selector switch 13 provides means for selecting the various modes of operation.
  • a reset button 14 is provided for erasing the intonation pattern on the screen and where applicable to return the circuits into the initial state preceding a new display sequence.
  • While this reset button 14 is a single switch, it includes two illuminated captions 15 and 16; 15 being labeled "START" and 16 being labeled "CONT." (i.e., "continue").
  • A mirror 17 permits monitoring the subject's facial movements as well as those of the teacher. The mirror 17 may be adjusted in such a way that the face of the teacher, the face of the student, and the display screen 1 can all be seen by the student without moving his head.
  • a power switch 18 is used to switch the unit on and off.
  • CFFT: Useful in situations where the subject is monitoring continuous speech.
  • MFFT: Useful when it is desired to display and study a complete phrase pattern or sentence.
  • MFST: Suitable for automated programming, or rigidly sequenced lessons.
  • MFRT: This mode can also be used for automated programming; however, a greater flexibility in the number of responses per stimulus is possible.
  • MFSD: Useful when the teacher and the subject practice simultaneously, which is sometimes desirable for greater motivation. This mode can also be used in place of MFFT, but in this case a variable length of time to study the patterns is available to the subject.
  • an intensity display is also provided. This may consist of the four lights 7,8,9 and 10. With increasing intensity of the speech signal, more lights are progressively turned on. Thus, for example, in order to operate the equipment the intensity should generally be at least sufficient to turn on light number 7. Somewhat greater intensity will turn on both lights 7 and 8; still greater intensity will turn on lights 7,8 and 9 and still greater intensity will turn on all the lights 7,8,9 and 10.
  • This intensity display may be used either separately or simultaneously with the intonation display, since it may be activated independently of the intonation display by means of the loudness switch marked LOUD and indicated by reference numeral 11.
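  • The progressive lighting of lamps 7 to 10 can be sketched as a simple threshold count. This is only an illustration: the function name and the threshold values are hypothetical, since the text gives no switching levels.

```python
def lights_on(intensity, thresholds=(1.0, 2.0, 3.0, 4.0)):
    """Return the reference numerals of the lamps that are lit.

    `thresholds` are hypothetical, ascending switching levels for
    lamps 7, 8, 9 and 10; the source gives no numeric values.
    """
    lamps = (7, 8, 9, 10)
    # A lamp is lit once the intensity reaches its own level, so lamps
    # come on progressively as the intensity increases.
    return [lamp for lamp, level in zip(lamps, thresholds) if intensity >= level]
```

  • With these assumed levels, an intensity just above the second threshold lights lamps 7 and 8 only, which mirrors the progression described for the four lights.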
  • a pitch detector is employed of a type which disregards the future wave form of speech and bases the process of establishing the value of the present pitch frequency on the past values of the wave form alone.
  • Use of this type of pitch detector permits simple equipment and avoids excessive requirements for storage elements.
  • An error suppression scheme is provided based on (a) the total permissible frequency range and (b) permissible rate of change of the fundamental frequency of the signal. It is necessary to have such a system for eliminating erroneous pitch period indications, owing to the very nature of the manner in which pitch periods are identified.
  • the pitch period detector must operate from a signal which has a relatively complex wave form, but which has certain regularly recurring characteristics which permit the establishment of a so-called pitch frequency.
  • The sound possesses certain random characteristics which give rise to occasional erroneous indications.
  • the elimination in a simple manner of erroneous pitch period indications is therefore of great importance. It has been established that certain pitch periods cannot occur from the sound of the human voice and the invention makes use of these facts in establishing the error elimination circuit.
  • the first parameter used by the invention is the pitch frequency range. Many estimates have been made of the total range of pitch frequencies. These estimates vary greatly and frequencies as low as 33 (creaking) and as high as 3.1 kilohertz have been reported. However, in accordance with the invention the system is designed to accommodate only the frequency range of normal speech plus any additional range which might occur in the speech of the deaf. This frequency range is approximately 70 to 600 hertz (i.e. cycles per second).
  • the second parameter utilized by the invention to detect errors is the maximum rate of change of pitch frequency.
  • This phenomenon may perhaps best be understood by reference to FIG. 2 wherein an example of the variation of pitch frequency plotted against time is shown. It is seen that in this case the maximum rate of change of pitch frequency occurred at the beginning. In general, this maximum rate of change of pitch frequency cannot exceed a certain value due to the physical limitations of the speech producing mechanism. Therefore, in apparatus constructed in accordance with the invention, a rate of change of pitch frequency in excess of ±10 to 20 percent of the reference frequency is regarded as an error and is eliminated.
  • the system embodying the invention includes a logic design activated by any signal outside the limited range of the abovementioned parameters.
  • f(R) designates reference pitch frequency
  • T(HF) represents the period of highest possible pitch frequency
  • T(LF) represents the period of lowest possible pitch frequency
  • T(IR) represents the period of pitch frequency obtained with the highest possible positive rate of change, when starting from the pitch frequency of the preceding pitch period
  • T(DR) represents the period of pitch frequency obtained with the highest possible negative rate of change, when starting from the pitch frequency of the preceding pitch period.
  • T(R) = 1/f(R)
  • the logical decisions to be made about the acceptability of the next pitch-period indicating pulse are based on the duration of the first pitch period.
  • Certain quantities should first be defined as follows:
  • T(HF) is the pitch-period duration for the highest possible pitch frequency (e.g. 600 hertz), while T(LF) corresponds to the lowest possible pitch frequency (e.g. 70 hertz).
  • A pitch-period indicating pulse which occurs before T(HF) is terminated must be in error, and is therefore always suppressed.
  • The next condition that a new pitch-period indicating pulse must meet in order not to be disqualified as an erroneous pulse is that it must occur within the interval ΔT.
  • ΔT is the interval between T(IR) and T(DR).
  • T(IR) and T(DR) are related to the maximum permissible rate of change of the pitch frequency (e.g.
  • The visual display is suppressed until a sequence of three acceptable pitch periods develops. If, however, an expected indicating pulse does not occur within the interval ΔT or between T(DR) and T(LF), the voiced utterance is considered to be terminated, and the circuit settings return to their initial state, suitable for the beginning of a new voiced utterance.
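  • The acceptance rules described above can be sketched in software as follows. This is a simplified reading of the logic, not the circuit itself: the function names are hypothetical, the 15 percent figure is an assumed value inside the stated 10 to 20 percent range, and a pulse arriving after T(DR) is treated here as the start of a new utterance.

```python
T_HF = 1.0 / 600.0   # period of the highest permissible pitch frequency (600 Hz)
T_LF = 1.0 / 70.0    # period of the lowest permissible pitch frequency (70 Hz)
MAX_CHANGE = 0.15    # assumed value inside the stated 10 to 20 percent range


def classify_pulse(elapsed, previous_period=None):
    """Classify a pulse arriving `elapsed` seconds after the last accepted one.

    Returns "suppress" (pulse came too early), "accept", or "reset"
    (pulse came too late, so the utterance is treated as terminated).
    """
    if elapsed < T_HF:
        return "suppress"
    if elapsed > T_LF:
        return "reset"
    if previous_period is None:
        return "accept"
    t_ir = previous_period / (1.0 + MAX_CHANGE)  # fastest permissible rise in pitch
    t_dr = previous_period / (1.0 - MAX_CHANGE)  # fastest permissible fall in pitch
    if elapsed < t_ir:
        return "suppress"
    if elapsed > t_dr:
        return "reset"
    return "accept"


def display_trace(pulse_times):
    """Return, per pulse, whether the display would be enabled."""
    shown = []
    last = None           # time of the last accepted pulse
    prev_period = None    # duration of the last accepted pitch period
    good = 0              # count of consecutive acceptable periods
    for t in pulse_times:
        if last is None:
            last = t
            shown.append(False)
            continue
        verdict = classify_pulse(t - last, prev_period)
        if verdict == "suppress":
            shown.append(False)      # pulse ignored, timing reference unchanged
            continue
        if verdict == "reset":
            last, prev_period, good = t, None, 0
            shown.append(False)
            continue
        prev_period = t - last
        last = t
        good += 1
        shown.append(good >= 3)      # display only after three acceptable periods
    return shown
```

  • For a steady 200 hertz pulse train, the sketch keeps the display off for the first three pulses and enables it from the fourth, and a stray pulse 1 millisecond after an accepted one is ignored without disturbing the timing reference.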
  • the invention also comprehends means for eliminating a very short part of the initial portion of the intonation display in order to avoid confusion.
  • the interfering noise pulses cause the pitch indicator to indicate a frequency which is one-half of the correct frequency (i.e. skipping every other pitch-indicating pulse).
  • a special circuit searches for the correct pitch-indicating frequency, starting from the highest permissible frequency.
  • the operation of the pitch/intensity display system is best explained with the help of the block diagram shown in FIG. 4.
  • the microphone signals are amplified by means of either of the two microphone pre-amplifiers 19,20 and fed to the corresponding mixers 21,22 which can also accept an auxiliary input signal, for example from a tape recorder.
  • Each of these two mixers 21,22 feeds a separate pitch detection and pulse-to-analog (P-A) circuit 23,24.
  • The outputs of these two P-A circuits 23,24 are fed to a vertical amplifier 25 via a common switch 26.
  • the output of the vertical amplifier 25 is used as the vertical deflection voltage for a cathode-ray tube.
  • the horizontal deflection voltage for the cathode-ray tube is generated in the sweep generator 27, and amplified in the horizontal amplifier 28.
  • the mode switch 29, indicated by dashed lines in the diagram of FIG. 4, is used to obtain the various sequencing and display operations of the system by adjusting the parameters in the blocks indicated by arrows on the dashed lines.
  • The outputs of the mixers 21,22 of channel 1 and channel 2 are used to activate a trigger 30 which generates trigger signals which activate a delay and sequencer circuit 31 and a switch control 32 for the channel-selecting switch 26.
  • the switch 26 is under the control of a clock 33.
  • This preamplifier-pitch-detector circuit includes a microphone preamplifier 34.
  • Another input, e.g. from a tape recorder, can be fed into the terminal "AUX. INPUT", identified by the reference numeral 35. Either of these two input signals passes through an amplifier-mixer stage 36 and a 4 kHz low-pass filter 37. This filter 37 is included to reduce the occurrence of erroneous indications for speech sounds such as [z]; these speech sounds have a considerable high-frequency content.
  • The filter output is fed into the cascaded pitch-period indicating stages 38,39, which operate in a manner similar to the stages of the Instantaneous Pitch-period Indicator of Dolansky described in the publication "An Instantaneous Pitch-period Indicator," Journal of the Acoustical Society of America, Volume 27, No. 1, pp. 67-72 (1955) by L.O. Dolansky.
  • The output pulse at 40 serves as the input signal to the pulse-to-analog circuit of FIG. 6.
  • Referring to FIG. 6, therein is shown a pulse-rate-to-analogous-log-pitch-frequency converter, suitable for use as the P-A circuit shown at 23 and 24 in FIG. 4.
  • The pitch detector circuit previously described in connection with FIG. 5 generates a pulse output wherein pulses occur at a frequency corresponding to the pitch frequency of the voice being analyzed.
  • This sequence of pulses then forms an input at 4
  • The converter of FIG. 6 operates to deliver an output at a P-A analog output terminal 42, which output is a voltage signal the magnitude whereof is proportional to the logarithm of the input pulse frequency.
  • The system includes a specific improved method of conversion of the frequency-describing voltage which approximates the desired logarithmic conversion to a high degree of accuracy.
  • This particular aspect of the invention is shown in FIG. 7 in detail and is shown in FIG. 6 as the current switch 43, RC circuits 44 and summation amplifier 45. These three items convert the incoming pulses to a wave form consisting of a sequence of recurring peaks corresponding to the pulses but followed by an exponential decay. The magnitude of this voltage signal from the summation amplifier 45 just before each pulse is proportional to a high degree of accuracy to the logarithm of the pulse frequency.
  • A bridge switch 46 is provided as a sample circuit to select only that portion of the output of the summation amplifier 45 which exists just prior to the succeeding pulse.
  • the bridge switch 46 delivers the sampled output to a follower 47 having a capacitor 48 (FIG. 7) which holds the signal sampled by the bridge switch 46 until the next signal issues therefrom.
  • The P-A analog output 42 steadily delivers a voltage corresponding to the pitch frequency represented by the period between the two latest pulses received by the apparatus, and this voltage remains the same until a new pulse comes in, thereby establishing a new interpulse period.
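  • The sample-and-hold behaviour described above can be illustrated with an idealised model. The sketch below assumes an exact logarithm in place of the analog approximation, and the function name and the `volts_per_decade` scale are hypothetical.

```python
import math


def held_outputs(pulse_times, volts_per_decade=1.0):
    """Return the value held at the output after each pulse.

    The held value is proportional to log10 of the pulse frequency implied
    by the two latest pulses; it is None until two pulses have arrived.
    """
    held = []
    last = None
    value = None
    for t in pulse_times:
        if last is not None:
            # A new interpulse period replaces the previously held value.
            value = volts_per_decade * math.log10(1.0 / (t - last))
        held.append(value)
        last = t
    return held
```

  • With pulses 10 milliseconds apart the held value stays constant, and it steps only when a pulse arrives after a different interval.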
  • a gate 49 is provided between the pulse input 41 and the logarithmic conversion circuits 43,44,45,46 so that erroneous pulses may be suppressed.
  • multivibrators 50,51 are provided for converting the actual pulse input into cleaner pulses which are easier to handle in the logic circuits.
  • the pulses produced by the multivibrators 50,51 are consistently of the same duration and amplitude.
  • the pulse in the first multivibrator 50 is initiated at the time of the initiation of the incoming pulse whereas the pulse of the second multivibrator 51 is produced simultaneously with the conclusion of the pulse issued by the first multivibrator 50.
  • these pulses may be designated as an early pulse and a late pulse.
  • the duration of these pulses is approximately 30 microseconds while the time interval between pulses is of the order of milliseconds.
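  • The timing of the early and late pulses can be sketched as two back-to-back intervals. The function name is hypothetical, and both pulses are assumed to last exactly 30 microseconds, whereas the text gives that figure only as approximate.

```python
PULSE_WIDTH = 30e-6  # seconds; approximate duration given in the text


def early_late(trigger_time, width=PULSE_WIDTH):
    """Return (start, end) intervals for the early and the late pulse."""
    early = (trigger_time, trigger_time + width)
    # The late pulse begins at the instant the early pulse ends.
    late = (early[1], early[1] + width)
    return early, late
```

  • Since pitch periods are of the order of milliseconds, both pulses together occupy only a small fraction of the interval between successive input pulses.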
  • the logic circuit of FIG. 6 uses DC logic, so that events occur depending upon whether the various logic elements are in one state or another.
  • All gates are of the NAND type.
  • The not excited condition is represented by the numeral 0 and corresponds to a voltage of minus 12.6 volts; the excited state is represented by the numeral 1 and represents a voltage of zero volts.
  • the gate 49 encountered by the incoming pulse is designed to prevent passage of the pulse if a signal for this purpose is delivered either by the frequency range comparator 52 or the rate of change comparator 53.
  • The output from this gate 49 will be 0 if all its inputs are 1, and this output of 0 is changed by an inverter 54 to a signal of 1 which serves to trigger the multivibrators 50,51.
  • The states of the multivibrators 50,51 when the output of the gate 49 is 1 are such that the bridge switch 46 does not sample the output of the summing amplifier 45 and no input is delivered to the current switches 43; and this is the case if any of the inputs to the gate 49 are 0. Consequently, if no pulse comes in, or if the frequency range comparator 52 delivers a 0 to the gate 49, or if the rate of change comparator 53 delivers a 0 to the gate 49, there is no input to the logarithmic conversion circuits 43, 44, 45, 46. In this way, erroneous signals lying outside the frequency range specified or representing a greater rate of change of pitch frequency than that specified will temporarily cut off further incoming signals, and other portions of the circuit of FIG. 6, to be described hereinafter, will suppress the display on the screen 1.
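  • The gating just described reduces to a NAND gate followed by an inverter. The sketch below is a simplified model with three inputs (incoming pulse, frequency range comparator, rate of change comparator); the function names are hypothetical and the inverter is modelled as a single-input NAND.

```python
def nand(*inputs):
    """NAND of logic levels 0/1: the output is 0 only when every input is 1."""
    return 0 if all(x == 1 for x in inputs) else 1


def inverter(x):
    """Inverter built as a single-input NAND gate."""
    return nand(x)


def trigger(pulse, range_ok, rate_ok):
    """Return 1 when the multivibrators would be triggered, else 0."""
    gate_output = nand(pulse, range_ok, rate_ok)   # role of gate 49
    return inverter(gate_output)                   # role of inverter 54
```

  • A trigger is produced only when a pulse is present and neither comparator is signalling an error; any single 0 input blocks it.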
  • the lower portion of FIG. 6 represents the logic circuit which suppresses the display, in the event of an erroneous input signal, in response to a control signal generated by the frequency range comparator 52 or the rate of change comparator 53, and also at the start of the utterance. However, at the start of the utterance, the display is additionally suppressed by a delay circuit forming part of the follower 47.
  • the function of the frequency range comparator 52 and rate of change comparator 53 is to detect an input signal outside the permissible range, which input signal is presumptively erroneous, and in response to that input signal to prevent further excitation of the display for the next three input pulses or so.
  • the frequency range comparator 52 delivers its signal to the suppression circuit via a multivibrator 55 which produces a clean signal pulse from the relatively unclean pulse out of the frequency range comparator 52.
  • the relatively unclean output from the rate of change comparator 53 is delivered to the suppression circuit via multivibrator 56. Whether or not these signal pulses will have an effect in the suppression circuit is determined by whether or not they occur simultaneously with the input pulses. For this purpose the early pulse is delivered to certain portions of the suppression circuit and the late pulse is delivered to other portions of the suppression circuit.
  • the pitch period indicating pulse from the pitch detector enters gate input terminal 57 via diode 58.
  • Diode 58 is only required in the event certain conventional gating circuits are used.
  • Gate input terminal 59 is 1, it being assumed that the output terminal 60 of control-gate 61 is 1, which assumption is true if either input 62 or 63 of control-gate 61 is 0.
  • Control-gate input 63 is the same as inverter-output 63, and this is shown as item g in FIG. 10, from which it appears that inverter-output 63 (and hence control-gate input 63) is 0 when the incoming pulse starts.
  • The incoming pulse gets through the gate 49, is inverted once in the gate 49 and again in the inverter 54, so that it enters the first multivibrator 50 as a state 1 trigger pulse.
  • one output 64 of the first multivibrator 50 delivers a pulse of one polarity to the bridge switch 46 and to an input 65 of a gate 66, while the corresponding pulse of opposite polarity is delivered from another output 67 of the first multivibrator 50 to the bridge switch 46 and also to the second multivibrator 51.
  • the pulse of the second multivibrator 51 starts when the pulse of the first multivibrator 50 stops.
  • the pulse of one polarity of the second multivibrator 51 is delivered to the current switches 43 while the pulse of opposite polarity is delivered to the suppression circuit so as to determine whether or not certain activity therein will take place.
  • the late pulse charges the RC circuits 71,72 to maximum potential and then the potential of these circuits declines exponentially giving the output signal of the summing amplifier 45.
  • the voltage output of the summing amplifier 45 returns to its maximum value and each time it decays over the same exponential path, so that the lowest value achieved by the exponential pattern is a measure of the distance between pulses.
  • the pulse which charges the RC circuits 71,72 is the late pulse, while the pulse which causes the diode bridge switch 73,74,75,76 to take a sample is the early pulse.
  • the diode bridge switch 73,74,75,76 thus acts as a sampler of that portion of the exponential curve which occurs just prior to the next pulse.
  • the sampled output of the switch, 73,74,75,76 which is a logarithmic representation of the pitch frequency, is fed to a holding circuit 77.
  • The holding circuit 77 is of conventional design and includes a capacitor 48 which receives the sampled output and a follower 47 which has a very high input impedance; the sampled output signal causes this capacitor 48 to be charged to a voltage virtually identical to the voltage of the output of the summing amplifier 45 at the moment of sampling.
  • The follower 47 delivers a corresponding voltage signal of the same magnitude and this is the P-A analog output at 42.
  • a circuit which is used, during the initial portion of a voiced utterance, to bypass the holding capacitor 48 and thereby cause the apparatus to deliver the same type of signal as would be delivered from an excessive rate of change signal.
  • Such a signal has the effect of preventing display and thus display is prevented during the first several pulses of any voiced utterance.
  • At the start of any voiced utterance, the frequency range comparator 52 will be delivering the negative signal characteristic of an excessively long pitch period, since the time since the last pulse will have been essentially infinite. See FIG. 10c. This means that under starting conditions the transistor 78 is on so as to connect the holding capacitor 48 to a (-12.6)-volt voltage source 79 via a 5.1 kilo-ohm resistor 80.
  • RC circuit 71 comprises a capacitor C1 which is charged by the late pulse and which discharges between pulses through a resistor R1.
  • RC circuit 72 comprises a capacitor C2 which is also charged by the late pulse and which discharges between pulses through a resistor R2.
  • The resistors R1, R2 are connected at the ends thereof remote from the respective capacitors C1, C2 so that the current discharges from the capacitors through the resistors are added and their sum delivered to the summing amplifier 45.
  • These RC circuits 71,72 must be so designed that R1C1 is approximately 0.0113 sec., R2C2 is approximately 0.0019 sec., and R2 is approximately 2R1 for circuits which are to display voiced utterances.
  • R1 might be 50 kilo-ohms
  • R2 might be 100 kilo-ohms
  • C1 might be 0.226 microfarads
  • C2 might be 0.019 microfarads.
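  • The claim that the summed decay of two RC circuits approximates a logarithm can be checked numerically. The sketch below uses the two time constants quoted above and assumes that both capacitors are charged to the same potential, so that the discharge currents combine with weights of 2 to 1 (one resistor being about twice the other). The function names are hypothetical.

```python
import math

TAU1 = 0.0113  # seconds, the longer time constant quoted in the text
TAU2 = 0.0019  # seconds, the shorter time constant quoted in the text


def summed_decay(t):
    """Normalised sum of the two discharge currents t seconds after a pulse.

    The 1 : 0.5 weighting is an assumption based on one resistor being
    about twice the other with equal initial capacitor voltages.
    """
    return math.exp(-t / TAU1) + 0.5 * math.exp(-t / TAU2)


def max_log_error(f_lo=70.0, f_hi=600.0, steps=200):
    """Largest deviation from a straight line in log(f), as a fraction of span.

    The straight line is drawn through the two end points of the range.
    """
    v_lo = summed_decay(1.0 / f_lo)
    v_hi = summed_decay(1.0 / f_hi)
    slope = (v_hi - v_lo) / (math.log(f_hi) - math.log(f_lo))
    worst = 0.0
    for i in range(steps + 1):
        f = f_lo * (f_hi / f_lo) ** (i / steps)   # logarithmically spaced samples
        line = v_lo + slope * (math.log(f) - math.log(f_lo))
        worst = max(worst, abs(summed_decay(1.0 / f) - line))
    return worst / (v_hi - v_lo)
```

  • Calling `max_log_error()` gives the worst-case departure from a true logarithmic law over 70 to 600 hertz under these assumptions, which allows the quality of the approximation to be judged for other choices of time constants as well.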
  • The ΔT comparator 53 used to compare the durations of consecutive pitch periods is shown in detail in FIG. 8.
  • the output from the summing amplifier 45 and the signal at the P-A analog output 42 are compared in a difference amplifier 83.
  • The output of the difference amplifier 83 is 12 volts. This result comes about through the use of the diode bridge 84,85,86,87.
  • The voltage output of the summing amplifier 45 will be about 12 volts.
  • The voltage at the P-A Analog Output 42 (which represents the sample at the end of the previous summing amplifier pulse) is at 3 volts.
  • The left hand terminal 88 of the bridge 84,85,86,87 is at a lower voltage than that of the right hand terminal 89 of the bridge 84,85,86,87 and therefore current flows through diode 86 from the right hand terminal 89 through a resistance 90 to one input 91 of the difference amplifier 83, thus creating a voltage at this point of about 3 volts.
  • The potential of the other input 92 to the amplifier 83 will correspond to the 12 volts at the left hand terminal 88, which is transmitted through the diode 85 to that input 92.
  • The absolute value of the voltage difference exceeds 0.3 volts and the output of amplifier 83 is -12 volts.
  • Transistor 93 is cut off as the output voltage of amplifier 83 rises through 6 volts, and this cutoff of transistor 93 is aided by a positive feedback via inverter output 64 which feeds a signal through an RC circuit 98 and a diode 99. This positive feedback causes the emitter of transistor 93 to go negative, thereby supplementing the action of its base in becoming more positive.
  • The ΔT comparator 53 causes state 1 to appear at ΔT output 95 during that period of time when the summing amplifier 45 has a voltage output in the vicinity of the sampled output of the preceding pulse.
  • The circuit is so designed that if the next late pulse does not exist at the time when the state of ΔT output 95 reverts to 0, a reset pulse or error signal will be delivered.
  • The time interval during which ΔT output 95 is in state 1 is longer when a longer period between input pulses is involved, so that the size of the time interval within which the next input pulse can exist while ΔT output 95 is in state 1 (and hence exist without causing a reset pulse) is roughly a percentage of the length of the preceding period. Previous studies have shown that this percentage may be in the vicinity of ±10 to 20 percent, and for the circuits shown this result is achieved by setting the difference amplifier 83 so as to react to input voltage differences greater than 0.3 volts.
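  • How a fixed 0.3-volt threshold in comparator 53 turns into a time window that scales with the preceding period can be illustrated numerically. The sketch below is a model under stated assumptions, not the circuit: it uses the two time constants given for the RC circuits, assumes a 2 to 1 weighting of the two decays, and assumes a peak magnitude of 12 volts at the instant of recharge. All function names are hypothetical.

```python
import math

TAU1, TAU2 = 0.0113, 0.0019   # seconds, time constants given for the RC circuits
PEAK_VOLTS = 12.0             # assumed magnitude at the instant of recharge
THRESHOLD = 0.3               # volts, comparator sensitivity given in the text


def decay(t):
    """Assumed magnitude of the summing-amplifier output t seconds after a pulse."""
    return PEAK_VOLTS * (math.exp(-t / TAU1) + 0.5 * math.exp(-t / TAU2)) / 1.5


def time_at(level, lo=0.0, hi=0.1):
    """Bisect for the time at which the monotonically falling decay reaches `level`."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if decay(mid) > level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def window_fraction(previous_period):
    """Return (early, late) half-widths of the window as fractions of the period."""
    held = decay(previous_period)          # value sampled at the preceding pulse
    opens = time_at(held + THRESHOLD)      # decay enters the 0.3 V band
    closes = time_at(held - THRESHOLD)     # decay leaves the 0.3 V band
    return ((previous_period - opens) / previous_period,
            (closes - previous_period) / previous_period)
```

  • Under these assumptions the window is longer in absolute time for a 10 millisecond period than for a 5 millisecond period, which is the qualitative behaviour described above; the exact percentages depend on the assumed voltage scaling.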
  • a tri-level indicator is used to exclude pulses which would correspond to pitch frequencies outside of the permissible range.
  • This tri-level indicator or T(LF)/T(HF) comparator is the frequency range comparator 52 of FIG. 6 and is shown in detail in FIG. 9. Referring thereto, the limits of this range are determined by the settings of two 1-kilo-ohm adjustable resistors 100,101. The voltage at the anode of a diode 102 connected to one adjustable resistor 100 determines the magnitude of T(LF), while the voltage at the cathode of another diode 103 connected to the other adjustable resistor 101 determines the magnitude of T(HF). (See FIG. 3).
  • the cathode of diode 102 is connected to the anode of diode 103 by a pair of equal resistors 104,105 the junction 106 of which is connected to one input 107 of a difference amplifier 108.
  • The voltage at input 107 is equal to the mean of the two voltages determined by the setting of the 1-kilo-ohm resistors.
  • the voltage of the other input 109 of the difference amplifier 108 is also adjusted to the same value by the potentiometer 110.
  • the output 111 of the difference amplifier 108 is equal to zero volts.
  • any subsequent pulse will be rejected if it occurs before the magnitude of the summing amplifier output has fallen below the high voltage end of this permissible range which might be, for example, a (negative) voltage having a magnitude of 8 volts.
  • An error signal will also result. This might occur, for example, if the output of the summing amplifier were a (negative) voltage having a magnitude below 1 volt. This of course always occurs after the end of the voiced utterance.
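  • The three outcomes of the frequency range comparison can be sketched as a classification of the summing-amplifier output magnitude at the moment a pulse arrives. The 8-volt and 1-volt limits are the example values mentioned above; the function name and the returned labels are hypothetical.

```python
def tri_level(magnitude, high=8.0, low=1.0):
    """Classify the summing-amplifier output magnitude (in volts) at a pulse.

    A large magnitude means little decay has occurred (pulse too early);
    a small magnitude means the decay ran too long (pulse too late).
    """
    if magnitude > high:
        return "too early"   # period shorter than T(HF)
    if magnitude < low:
        return "too late"    # period longer than T(LF)
    return "in range"
```

  • Only the middle outcome leaves the pulse eligible for acceptance; the other two correspond to the error conditions at the high-frequency and low-frequency ends of the permissible range.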
  • an interface circuit 117,118 is used.
  • the collector voltage of 118 is at ground potential (state 1) as long as the voltage at the amplifier output 111 does not exceed +3 volts.
  • transistor 118 saturates and the input 116 to gate 49 goes to state 0. As appears from the diagram of FIG. 6, this signal only occurs when the low frequency end of the range is exceeded such as at the end of the voiced utterance.
  • the clock pulses for the logic are the early and late pulses generated respectively by means of the first and second multivibrators 50,51 and are presented in FIG. 10b,c. First the mechanism of the initial delay in the display (at the beginning of an utterance) will be described.
  • the first late pulse (FIG. 10c-1) generates a decreasing exponential at the output of the summing amplifier (FIG. 10d). Assuming that the following pulses occur at a frequency f_min or greater, the exponentials which restart at each pulse will never reach the lower dashed line in FIG. 10d; thus the T(LF)/T(HF) output will be 0 V or higher (FIG. 10). Consequently the transistor 78 in FIG. 7 exponentially approaches pinch off. As a result, in about 40 msec, a staircase waveform results at the P-A analog output (FIG. 10f). During this 40 msec interval, reset pulses (FIG. 10h) are generated at inverter output 119.
  • AT comparator 53 causes the interface output to be in state 0 when an improper amount of time has elapsed since the last pulse. This corresponds to state 1 at inverter output 64, and it is inverter output 64 which is designated "AT output" at FIG. 10g.
  • AT comparator 53 does this by causing AT output to go to state 0 whenever the summing amplifier output is near the P-A analog output. Initially, the summing amplifier output is near the P-A analog output only during delivery of each pulse to the holding capacitor: the P-A analog output then rises and passes through the voltage being put out by the summing amplifier. However, during this initial period the AT output reverts to state 1 after the next late pulse has finished.
  • the output of the second multivibrator 51 which is delivered to the AT error gate is in state 1 when the output of the AT error multivibrator 56 which is delivered to the AT error gate 120 reverts to state 1.
  • a reset pulse (FIG. 10h) occurs at inverter output 119 and the AT comparator 53 and the visual display 1 are disabled. If the following pulses are within the permissible range, the AT comparator 53 returns to its normal operating state at pulse (N+1) (FIG. 10i), and the visual display reappears on the screen at the time of pulse (N+3) (FIG. 10j). The net result is that the visual display is interrupted for a time no longer than four pitch periods but no erroneous indication appears on the screen.
  • the horizontal sweep control 125 and presentation sequencer 31 are shown in FIG. 11.
  • One function of this circuit is to start the horizontal sweep across the display upon receipt of a trigger initiated by the start of the voiced utterance.
  • the gates 126,127,128 determine whether or not trigger signals initiated by the voiced utterance will be operative to start the sweep. It will be recalled that one of the principal differences between the various modes is whether signals from channel 1 or channel 2 will be operative and when. Consequently, the mode switch 29 is also shown in FIG. 11. Depending upon the position of the mode switch 29 the ability of a trigger from channel 1 or channel 2 to initiate a sweep is controlled.
  • a first flip-flop 129 will always respond to an incoming trigger so as to initiate a horizontal sweep.
  • the signal to the first flip-flop 129 which initiates the sweep takes the form of state 0 at the output of the final gate 128. This can occur only if both inputs to the final gate 128 are in state 1. Thus, in order for state 0 to appear at the output of the final gate 128, the input at 130 must be 1 for all signals.
  • the intermediate gate 127 acts simply as an inverter, since it will have its output in state 1 if either input to the intermediate gate 127 is in state 0, and this occurs for any trigger coming from channel 2 and will also occur for a trigger coming from channel 1 provided the input 131 to the channel-1 gate 126 is in state 1.
  • the function of the second flip-flop 132 is to prevent initiation of a second trace until it permits this to happen. It accomplishes this by delivering a pulse at its output 133 which is transmitted through switch SBB to the input 130 of the final gate 128.
  • the incoming signal which causes the second flip-flop 132 to again permit a trigger to initiate a sweep arrives at the second flip-flop 132 from switch SCB.
  • the time of this signal is caused to be a certain number of seconds after substantial completion of the sweep by means of the clocking mechanism.
  • an interface 134 delivers a signal to a first multivibrator 135 which then produces a shaped pulse that is delivered to a clock 136 via the switch SDA.
  • the clock 136 then initiates operation of various flip-flops 137,138,139 whose outputs occur at varying times after receipt of the input signal. These timed outputs of flip-flops 137,138,139 are also delivered to the elements at the left-hand side of FIG. 11, some of which serve to open channel 1 and some of which serve to cause automatic erasing of the display.
  • The details of the circuit for producing the trigger inputs to the sequencer of FIG. 11 are shown in FIG. 12.
  • the circuit for the sweep generator of FIG. 11 is shown in FIG. 13.
  • transistor 141 is just at the cutoff point; i.e. the base of this transistor is at approximately 6v.
  • This adjustment is made by means of the potentiometer 142 in the input connection.
  • the envelope-detection circuit 144 derives a voltage from the output of transistor 143.
  • the resulting voltage turns transistor 145 on.
  • inverter output 146 is in state 1.
  • inverter output 147 is in state 1.
  • gate output 148 is in state 1 and transistor 149 is turned on.
  • transistor 149 is disabled by opening SW3.
  • the purpose of the sweep generator (FIG. 13) is to produce the required sweep voltage for the visual display.
  • the sweep-control signal (in the form of the complement of waveform CONT. shown in FIG. 14) is obtained from the first flip-flop 129 (FIG. 11).
  • transistor 151 (FIG. 13) is on, and the voltage across the 25 µF capacitor 152 is kept at 0v.
  • With the input voltage at 150 at -12v, transistor 151 is cut off and the capacitor 152 charges with a constant charging current. The magnitude of this charging current is determined by the voltage across transistor 153 and the setting of adjustable 50-kilo-ohm resistor 154.
  • Stages 155,156 and 157 are used to isolate the load from the sweep-voltage-producing capacitor 152, to improve the linearity of the sweep, and to provide a low-impedance sweep-voltage source.
  • a voltage derived from the sweep output by means of a Zener diode-resistance divider 134 (interface output) is used to reset a number of voltages in the logic circuits so that conditions required for the next sweep are obtained. This occurs when the "interface output" 158 reaches about 6v.
  • the other output of the first flip-flop 129 goes to state 0, thereby initiating the sweep and delivering a pulse to the clock 136 at the input 162 to gate 163.
  • the output 133 of the second flip-flop 132 causes the input 130 to the final gate 128 to become state 0, thus turning off the START lamp and preventing further inputs in either channel from triggering the sweep: in other words, the final gate 128 is closed. In the CFFI mode this action of the output 133 of the second flip-flop 132 is delivered directly to the final gate 128. In all other modes, this output 133 of the second flip-flop 132 is delivered to the final gate 128 via gate 164 and inverter 165. Gate 164 also inverts the output of multivibrator 189.
  • the resulting output at 166 from the first multivibrator 135 resets the first flip-flop 129.
  • the resulting change in state at output 161 of the first flip-flop 129 turns the "CONT." light 16 off, and the resulting change in state at output 150 of the first flip-flop 129 causes the beam to return to its initial position.
  • the other output 167 of the first multivibrator 135 via switch SDA triggers the clock 136 which generates a delay sequence.
  • the first delay equals 2 seconds and is caused by the output of the fifth flip-flop 138 while subsequent delays are 1 second and are produced by the output of the sixth flip-flop 139.
  • the original rate of the clock pulses is shown in FIG. 14 and is represented by the state of the clock output at 168. This output changes state every half second so that the full cycle is one second. This rate is reduced by a factor of 2 in each of the three cascaded flip-flop stages numbered 6, 5 and 4 (139, 138 and 137).
  • the outputs of these flip-flops or binary counters are used to set the second flip-flop 132 via the switch SCB.
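
The divide-by-two cascade described in the last items can be illustrated with a short simulation. This is only a sketch: it assumes ideal toggle flip-flops that change state on a falling edge of the preceding stage (the text does not state the triggering edge), and the names `ripple_divider` and `period_seconds` are hypothetical.

```python
def ripple_divider(clock, stages=3):
    """Return the output waveform of each cascaded toggle stage."""
    outputs = []
    source = list(clock)
    for _ in range(stages):
        state, prev, out = 0, source[0], []
        for level in source:
            if prev == 1 and level == 0:  # assumed falling-edge trigger
                state ^= 1
            out.append(state)
            prev = level
        outputs.append(out)
        source = out  # each stage is clocked by the one before it
    return outputs


def period_seconds(wave, dt):
    """Spacing between the first two rising edges, in seconds."""
    rising = [i for i in range(1, len(wave)) if wave[i - 1] == 0 and wave[i] == 1]
    return (rising[1] - rising[0]) * dt


DT = 0.5             # the clock output changes state every half second
clock = [1, 0] * 16  # 16 seconds of clock output, one sample per half second
stage6, stage5, stage4 = ripple_divider(clock)
periods = [period_seconds(w, DT) for w in (clock, stage6, stage5, stage4)]
```

Under these assumptions the full cycle doubles at each stage, and the half-periods of stages 6 and 5 come out at 1 and 2 seconds, which is consistent with the delays quoted for flip-flops 139 and 138.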

Abstract

Voiced sounds are displayed by means of a pitch period indicating circuit including circuits for the suppression of those errors which arise from a minimal number of parameters. Automatic sequencing is provided in a compact setting to provide equipment ideally suited for instruction of young students.

Description

United States Patent
Dolansky et al.
[45] July 11, 1972

[54] VOICED SOUND DISPLAY
[72] Inventors: Ladislav Dolansky, Weston; Nathan D. Phillips, Stoneham, both of Mass.
[73] Assignee: Research Corporation, New York
[22] Filed: Apr. 10, 1970
[21] Appl. No.: 30,123
[52] U.S. Cl.: 179/1 VS
[51] Int. Cl.: G10l 1/14
[58] Field of Search: 179/1 VS, 1 SA; 324/77 B; 35/35 C
[56] References Cited
UNITED STATES PATENTS
2,915,589 12/1959 Plant 179/1 VS
3,165,586 1/1965 Campanella 324/77 B
3,109,142 10/1963 McDonald 179/1 SA
2,416,353 2/1947 Shipman 179/1 VS
OTHER PUBLICATIONS
F. Anderson, "An Experimental Pitch Indicator for Training Deaf Scholars," J.A.S.A., vol. 32, 8/1960, pp. 1065-1074
Primary Examiner: William C. Cooper
Assistant Examiner: Jon Bradford Leaheey
Attorney: Russell and Nields
8 Claims, 15 Drawing Figures

[Drawing sheets 1 to 11 (FIGS. 1 to 15) are not reproduced here. Legible captions include FIG. 2, "Pitch frequency f vs. time t". Inventors: Ladislav Dolansky, Nathan D. Phillips.]

VOICED SOUND DISPLAY
BACKGROUND OF THE INVENTION 1. Field of the Invention The invention relates to cathode ray display equipment and in particular to circuitry for automatic display of patterns derived from voiced utterances.
2. Prior Art The invention relates to the visual display of voiced utterances and related sounds.
Previous pertinent developments fall into two problem areas: (A) pitch-signal extraction, (B) visual display.
A. PITCH EXTRACTION Several methods of extracting a meaningful pitch signal have been developed in the recent decades. Of these, the signal envelope detection type appears to be most suited for the present application: accordingly, the previous developments in pitch-extraction are limited to the most essential developments in this area.
In 1949, Gruenz and Schott described a method for extraction and portrayal of pitch of speech sounds. (See Gruenz, O.O., Jr., and Schott, L.O. "Extraction and Portrayal of Pitch of Speech Sounds," B.T.L. Mono 3-1698. Also: J. Acous. Soc. Amer. Sept. 1949). They claimed reliable pitch extraction for frequencies from I to 600 Hz. In order to reduce errors in producing the basic pitch-period indicating pulses, an input filter was used to limit the pass-band for the speech signals.
The need to obtain pitch-period indicating pulses with almost no delay precluded the use of such filters and led to the development of the "instantaneous pitch-period indicator (IPPI)." (See Dolansky, L. "An Instantaneous Pitch-Period Indicator," J. Acous. Soc. Amer., Vol. 27, No. 1, pp. 67-72 (1955).) Anderson, who wanted to use a pitch indicator for deaf children, used the basic circuit of the IPPI but incorporated a voltage-holding circuit, used between pitch-indicating pulses, and also included a logarithmic transformation, which made a more desirable display possible. (See Anderson, F. "An Experimental Pitch Indicator for Training Deaf Scholars," J. Acous. Soc. Amer., Vol. 32, No. 3, pp. 1,065-1,074 (Aug. 1960).) In Sweden, Tjernlund also developed a similar pitch extractor based on the envelope-detection process. (See Tjernlund, P.A. "A Pitch Extractor with Larynx Pick-up," Quarterly Progress and Status Report, Speech Transmission Laboratory, Royal Institute of Technology, Stockholm, Sweden, pp. 32-34 (Oct. 15, 1964).) Other developments of pitch-extraction methods and devices are not considered to be essential for inclusion in this brief summary of pertinent developments. However, an excellent summary of such efforts is to be found in McKinney's report on laryngeal frequency analysis, and shorter summaries elsewhere. (See McKinney, N.P. Laryngeal Frequency Analysis for Linguistic Research, Report No. 14, Contract Nonr 1224(22), NR 049/122, The University of Michigan, Ann Arbor, Mich. (September, 1965); Dolansky, L. Aids for the Deaf and Some Related Problems, Final Report, VRS and USBC Fellowship Grant, (PL 87/256, Fulbright-Hayes Act), Northeastern University, Boston, Mass. (Sept. 30, 1967); Dolansky, L. and Manley, H. Speech Analysis (A Survey Report), Sylvania Elec. Sys., Applied Res. Lab., Waltham, Mass. Proj. 72-40] (Oct. 1, 1960).)
B. VISUAL DISPLAY From the point of view of practical usefulness, when an intonation display is used in any kind of teaching, it is essential to be able to keep the display as long as desired, and to discard it instantly when it is desired to continue with the next task. During the early attempts only long-persistence CRT displays and photographic reproduction were available. (See Gruenz, supra.) The former did not have sufficient duration to permit, say, a comparison of the student's and the teacher's pattern. Photography was not a satisfactory solution for a rapid sequence of teaching tasks. Later, attempts were made to use a mechanically rotating long-persistence CRT. (See Anderson, supra; Loos, R. Ein Tonhöheschreiber für die Sprecherziehung von Gehörlosen, Diplomarbeit, Inst. f. Elektrotechnik der Johannes Gutenberg Universität in Mainz (September, 1965).) However, control of the pattern retention time was still not available.
In 1965, with the storage oscilloscope then available, the Northeastern University speech research group embarked on an extended research effort in the area of teaching intonation to the deaf. (See Dolansky, L., Ferullo, R. J., O'Donnell, M.C. and Phillips, N.D. Teaching of Intonation and Inflections to the Deaf, Final Report, Cooperative Research Project No. 8-28 I, Northeastern University (1965); Dolansky, L., Karis, (3., Phillips, N. and Pronovost, W. Teaching Vocal Pitch Patterns Using Visual Feedback from the Instantaneous Pitch-period Indicator for Self-monitoring, Part I, VRA Project No. 1907-S, Northeastern University (Dec. 31, 1966); Dolansky, L. and Phillips, N.D. Teaching Vocal Pitch Patterns Using Visual Feedback from the Instantaneous Pitch-period Indicator for Self-monitoring, Final Report, Part II, VRA Project No. 1907-S, Northeastern University (Oct. 31, 1966); Dolansky, L., Pronovost, W.L., Anderson, D.C., Bass, S.D. and Phillips, N.D. Teaching of Intonation Patterns to the Deaf Using the Instantaneous Pitch-period Indicator, Final Report, VRA Grant 2360-8, Northeastern University (Feb. 28, 1969); Phillips, N.D., Remillard, W. 1., Bass, S. and Pronovost, W. "Teaching of Intonation to the Deaf by Visual Pattern Matching," American Annals of the Deaf, Vol. 113, No. 2, pp. 239-346 (Mar., 1968).) The essential instrument in this study was the IPPI.
SUMMARY The invention makes use of visual display techniques utilizing the fundamental frequency of the signal under study (e.g. the microphone signal for voiced utterances of speech). In accordance with the invention a visual pattern which represents the fundamental frequency of the signal under study as a function of time is displayed on the face of a storage oscilloscope. Because the intonation pattern is presented as a function of time, temporal aspects of the signal can be studied. In addition, the intensity of the signal is displayed by means of lights. The invention comprehends a complete intensity display and intonation display system capable of producing and storing intonation patterns for extended periods of time and suitable for the quantitative study and teaching of intonation patterns as well as for other frequency versus time displays of variable frequency signals.
One aspect of the invention includes an error suppression scheme based on (a) the total permissible frequency range and (b) the permissible rate of change of the fundamental frequency of the signal.
For signals that are as complex and varied as speech signals, it is virtually impossible to guarantee that occasional errors will not occur. The errors that do occur are usually in the form of an extraneous or missing pitch-period indicating pulse, and result in momentary displays of twice or one-half of the correct frequency. Since considerable knowledge is available about the total range of the pitch frequency and the rate of change of this frequency, it is possible to devise schemes which would prevent such erroneous indications from actually appearing on the screen in most cases.
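The effect of a missing or extraneous pulse can be checked with a few lines of arithmetic. The sketch below is illustrative only: the 100 Hz pulse train and the function name `pulse_frequencies` are assumptions, and the spurious pulse is placed exactly midway between two true pulses, which is the case that yields twice the correct frequency.

```python
def pulse_frequencies(pulse_times_ms):
    """Instantaneous pitch frequency (Hz) from consecutive pulse times (ms)."""
    return [1000 / (b - a) for a, b in zip(pulse_times_ms, pulse_times_ms[1:])]


# Hypothetical 100 Hz voice: one pitch-period pulse every 10 ms.
clean = [0, 10, 20, 30, 40]
missing = [0, 10, 30, 40]        # the pulse at 20 ms was not detected
extra = [0, 10, 15, 20, 30, 40]  # a spurious pulse appeared at 15 ms

clean_f = pulse_frequencies(clean)
missing_f = pulse_frequencies(missing)
extra_f = pulse_frequencies(extra)
```

A single lost pulse produces one reading at half the true frequency, and a single midway spurious pulse produces two readings at twice the true frequency. Both fall far outside any plausible change between adjacent pitch periods.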
Another aspect of the invention comprehends a specific improved method of conversion of the frequency describing voltage which approximates the desired logarithmic conversion to a high degree of accuracy.
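As an illustration of why a logarithmic conversion is desired, the idealized mapping can be written down directly. This sketch shows only the target function, not the circuit that approximates it; the reference frequency `ref_hz` and the function name are assumptions.

```python
import math


def log_pitch_position(freq_hz, ref_hz=100.0):
    """Vertical display position in octaves relative to an assumed reference."""
    return math.log2(freq_hz / ref_hz)


# Equal frequency ratios map to equal vertical steps on the display.
step_low = log_pitch_position(200.0) - log_pitch_position(100.0)
step_high = log_pitch_position(400.0) - log_pitch_position(200.0)
```

With such a mapping the same musical interval occupies the same vertical distance whether it is produced by a low or a high voice.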
Still another aspect of the invention comprehends a system of automatic and semiautomatic modes of operation permitting a great flexibility in the desired procedures for the study and teaching of intonation patterns.
Since it is highly desirable to develop adequate voice volume before it is attempted to obtain meaningful intonation patterns, an intensity display in the form of four lights has been incorporated into one embodiment of the invention.
Thus it is possible to use the equipment for (a) an intensity display, using the lights, (b) an intonation contour display, using the storage-oscilloscope screen display, (c) both displays simultaneously.
When it is used as an intonation-contour display, it may be desired to produce patterns continuously, and have them fade at a constant rate, or it may be desirable to produce either one or two patterns, keep them unchanged for observation for a brief time, and then have the patterns automatically erased by the built-in sequencing program. Alternatively, it may be desired to perform repeated attempts, and subsequently erase all patterns by actuating a push-button switch. While the system of the invention can actually produce two intonation patterns (one from either of its complete pitch-extraction circuits) at the same time, in most cases it is desirable to produce only one pattern at a time (e.g. the teacher's) and when the circuits become ready to process another signal, to so indicate and at the same time unblock the other input channel, so that the new pattern can be properly produced and lined up against the first pattern.
A few of the possible applications of this apparatus are given below:
A. Auditory Feedback Replacement Especially in the case of profoundly deaf persons, the absence of the usual auditory feedback makes it difficult for these individuals to acquire natural intonation in their speech. If, however, the visual intonation pattern is available to them, they can learn to imitate the teacher's intonation pattern, in which case their intonation also sounds quite natural to other persons. This use was the primary reason for developing this equipment, and encouraging results have been obtained in the experimental research with previous, less developed models.
B. Speech Correction In attempts to correct speech defects in speech-correction clinics, the system of the invention may be quite helpful if the corrective process is concerned with pitch, intonation, intensity, or duration aspects of the utterances.
C. Teaching of Music (Voice) Such aspects as vibrato and other attributes of pitch can quite clearly be seen and studied in a quantitative way, both with respect to frequency and duration.
D. Teaching of Languages It is often difficult to teach proper intonation in a foreign language. The system of the invention is a welcome aid to show differences in intonation in a most straightforward and quantitative way. This may be of special importance in languages in which the meaning is changed when the intonation pattern changes.
BRIEF DESCRIPTION OF THE DRAWINGS The invention may best be understood from the following detailed description thereof, having reference to the accompanying drawings, in which:
FIG. 1 is a front view of the external physical layout of a system embodying the invention;
FIG. 2 is a graph in which the pitch frequency is plotted against time;
FIG. 3 is a graph showing the pulsed output of the pitch detector and illustrating regions of error suppression;
FIG. 4 is a block diagram of a circuit embodying the invention;
FIG. 5 is a circuit diagram showing the preamplifier, mixer and pitch detector used in the system of the invention;
FIG. 6 is a circuit diagram of the pulse rate to analogous log pitch frequency converter used in the system of the invention;
FIG. 7 is a circuit diagram showing the logarithmic conversion circuit and the sample and hold circuits of the system of the invention;
FIG. 8 is a circuit diagram of the AT comparator of the system of the invention;
FIG. 9 is a circuit diagram of the tri-level indicator or frequency range comparator of the system of the invention;
FIG. 10 is a multiple graph showing the wave forms of the various portions of the P-A analog circuit;
FIG. 11 is a circuit diagram showing the horizontal sweep control and presentation sequencer of the system of the invention;
FIG. 12 is a circuit diagram showing a trigger generator for use with the horizontal sweep control and presentation sequencer of FIG. 11;
FIG. 13 is a circuit diagram of a sweep generator suitable for use with the horizontal sweep control and presentation sequencer of FIG. 11;
FIG. 14 is a multiple graph showing the timing diagram for delay and sequencing logic in the MFRT mode as employed with the horizontal sweep control and presentation sequencer of FIG. 11; and
FIG. 15 is a circuit diagram showing a clock and channel switch suitable for use in the system of the invention.
DESCRIPTION OF THE PREFERRED EMBODIMENT Referring to the drawings, and first to FIG. 1 thereof, a display screen 1 displays an intonation pattern representing the pitch frequency of a human voice which it is desired to analyze. Input signals from a microphone, into which the person whose voice is to be analyzed speaks, are connected either to microphone jacks 2 or to the auxiliary input terminals 3 via an external pre-amplifier. The auxiliary input terminal may be used to receive signals from the output of a tape recorder rather than from the output of a microphone preamplifier. An earphone jack 4 provides an output for an earphone. Two habitual pitch level controls 5, one for each channel, make it possible to shift the intonation pattern on the screen 1 up and down as needed. If it is desired to replace the ordinary intonation pattern by a double line pattern indicating the limits of acceptability of the response, the control for this purpose may most conveniently be provided by an auxiliary plug-in trace width control 6 which can be connected to the side of the oscilloscope unit. A loudness display represented by four light bulbs 7,8,9 and 10 is activated by depressing a loudness switch 11. The pitch display 1 may be activated by depressing a pitch control switch 12. A mode selector switch 13 provides means for selecting the various modes of operation. A reset button 14 is provided for erasing the intonation pattern on the screen and, where applicable, for returning the circuits to the initial state preceding a new display sequence. While this reset button 14 is a single switch it includes two illuminated captions 15 and 16; 15 being labeled "START" and 16 being labeled "CONT." (i.e., "continue"). A mirror 17 permits monitoring the subject's facial movements as well as those of the teacher.
The mirror 17 may be adjusted in such a way that the face of the teacher, the face of the student, and the display screen 1 can all be seen by the student without moving his head. A power switch 18 is used to switch the unit on and off.
In accordance with the invention a system of automatic and semiautomatic modes of operation is provided so as to permit a great flexibility in the desired procedures for the study and teaching of intonation patterns. Representative modes are listed in Table I.
TABLE I. Sequence of Events (entries given in the order CFFI, MFFT, MFST, MFRT, MFSD)
1. Triggering channel: 1 or 2; 1 or 2; 2; 2; 1 or 2
2. Displayed channel: triggering; triggering; 2; 2; 1 and 2
3. Storage time: fades continuously; 4 sec.; 2 sec.; 2 sec.; variable
4. Erasure: MFFT automatically; MFSD manual, go to 1; MFST and MFRT, if manual, go to 1, otherwise continue
5. Triggering channel (MFST, MFRT): 1 or 2; 1 or 2
6. Displayed channel (MFST, MFRT): triggering; triggering
7. Storage time (MFST, MFRT): 4 sec.; 1 sec.
8. Erasure: MFST automatic; MFRT, if manual, go to 1, otherwise go to 5
9. Ready for new sequence
Notes: The delay time is counted after the pattern has been produced. Channel 2 can produce a double-line pattern with variable spacing between the two lines. Manual erasure is by pressing the START-CONT button (see FIG. 1).
Referring to FIG. 1 and to Table I, operation in any of the listed modes presumes that the "PITCH" switch has been activated, so that the light under the pushbutton of the switch 12 is on. When the "START" light 15 is on, the cathode-ray beam is on the left side of the screen 1 and the equipment is ready to receive and display a pattern. As soon as a sound signal of sufficient intensity is received by the equipment, the sweep is triggered, the "START" light 15 is turned off, and the continue light "CONT." 16 is turned on. The latter stays on for the duration of the sweep, even if the speech signal ceases; in that case, however, the beam is suppressed, so that during the moments of silence no trace is visible on the screen. The sequence of events for a particular mode of operation, which is set by the MODES switch 13 in FIG. 1, is best understood by following the pertinent column in Table I. For example, when operating in Mode MFST, the sweep is triggered by the speech signal of Channel 2 (usually used by the teacher), and subsequently the pattern derived from this speech signal is displayed on the screen 1 (event 2). After the pattern on the screen 1 is completed, the equipment is not ready to receive any additional pattern for a period of 2 seconds during which neither the "START" light 15 nor the "CONT." light 16 is activated (event 3). This time interval of 2 seconds is usually needed to study the teacher's pattern in preparation for an imitation attempt, which the student makes whenever he is ready (event 5). The pattern derived from the student's speech signal is then displayed on the screen 1 (event 6), and both patterns remain displayed for a period of 4 seconds (event 7). Subsequently, both patterns are automatically erased, and the equipment is ready for a new sequence as soon as the "START" light 15 comes on.
The various modes used in the operation of the equipment are intended to serve the following purposes (the meaning of the mode designations is evident from Table I):
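The MFST sequence just described can be laid out as a simple timeline. The sketch below is only illustrative: the 2-second and 4-second intervals come from the text, while the sweep durations, the student's waiting time, and the name `mfst_timeline` are hypothetical.

```python
def mfst_timeline(teacher_sweep_s, student_wait_s, student_sweep_s):
    """Return (start_time_s, description) pairs for one MFST sequence."""
    steps = [
        ("Channel 2 triggers the sweep; teacher's pattern is drawn", teacher_sweep_s),
        ("Delay: no further pattern accepted", 2.0),     # 2 s, from the text
        ("Equipment waits for the student", student_wait_s),
        ("Student's pattern is drawn", student_sweep_s),
        ("Both patterns remain displayed", 4.0),         # 4 s, from the text
        ("Automatic erasure; ready for new sequence", 0.0),
    ]
    timeline, t = [], 0.0
    for description, duration in steps:
        timeline.append((t, description))
        t += duration
    return timeline


# Hypothetical durations: 3 s sweeps and a 1.5 s pause before the student speaks.
timeline = mfst_timeline(3.0, 1.5, 3.0)
```

With these assumed durations the automatic erasure falls 13.5 seconds after the teacher's utterance first triggers the sweep.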
CFFI: Useful in situations where the subject is monitoring continuous speech.
MFFT: Useful when it is desired to display and study a complete phrase pattern or sentence.
MFST: Suitable for automated programming, or rigidly sequenced lessons.
MFRT: This mode can also be used for automated programming; however, a greater flexibility in the number of responses per stimulus is possible.
MFSD: Useful when the teacher and the subject practice simultaneously, which is sometimes desirable for greater motivation. This mode can also be used in place of MFFT, but in this case a variable length of time to study the patterns is available to the subject.
Since the speech volume of many subjects is too low, and also because it is often desirable to show to the subject the difference between intensity and intonation, an intensity display is also provided. This may consist of the four lights 7,8,9 and 10. With increasing intensity of the speech signal, more lights are progressively turned on. Thus, for example, in order to operate the equipment the intensity should generally be at least sufficient to turn on light number 7. Somewhat greater intensity will turn on both lights 7 and 8; still greater intensity will turn on lights 7,8 and 9 and still greater intensity will turn on all the lights 7,8,9 and 10. This intensity display may be used either separately or simultaneously with intonation display since it may be separately activated independently of the activation of the intonation display by means of the loudness switch marked LOUD and indicated by reference numeral 11.
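The progressive lighting of the four lamps amounts to a comparison against ascending thresholds. In the sketch below the threshold values and their units are purely hypothetical, as is the function name; only the lamp numerals 7 to 10 and the cumulative behavior come from the text.

```python
def lights_on(intensity, thresholds=(1.0, 2.0, 3.0, 4.0)):
    """Return the reference numerals of the lamps lit at a given intensity.

    thresholds: hypothetical ascending levels for lamps 7, 8, 9 and 10.
    """
    lamps = (7, 8, 9, 10)
    return [lamp for lamp, level in zip(lamps, thresholds) if intensity >= level]
```

For example, `lights_on(2.5)` returns `[7, 8]` with the assumed thresholds, matching the description in which a somewhat greater intensity lights both lamps 7 and 8.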
Although the invention is not limited to any particular type of pitch detector design, in a preferred embodiment of the invention a pitch detector is employed of a type which disregards the future wave form of speech and bases the process of establishing the value of the present pitch frequency on the past values of the wave form alone. Use of this type of pitch detector permits simple equipment and avoids excessive requirements for storage elements.
In accordance with the invention an error suppression scheme is provided based on (a) the total permissible frequency range and (b) permissible rate of change of the fundamental frequency of the signal. It is necessary to have such a system for eliminating erroneous pitch period indications, owing to the very nature of the manner in which pitch periods are identified. The pitch period detector must operate from a signal which has a relatively complex wave form, but which has certain regularly recurring characteristics which permit the establishment of a so-called pitch frequency. However, the sound possesses certain random characteristics which give rise to occasional erroneous indications. The elimination in a simple manner of erroneous pitch period indications is therefore of great importance. It has been established that certain pitch periods cannot occur from the sound of the human voice and the invention makes use of these facts in establishing the error elimination circuit.
The first parameter used by the invention is the pitch frequency range. Many estimates have been made of the total range of pitch frequencies. These estimates vary greatly and frequencies as low as 33 (creaking) and as high as 3.1 kilohertz have been reported. However, in accordance with the invention the system is designed to accommodate only the frequency range of normal speech plus any additional range which might occur in the speech of the deaf. This frequency range is approximately 70 to 600 hertz (i.e. cycles per second).
The second parameter utilized by the invention to detect errors is the maximum rate of change of pitch frequency. This phenomenon may perhaps best be understood by reference to FIG. 2 wherein an example of the variation of pitch frequency plotted against time is shown. It is seen that in this case the maximum rate of change of pitch frequency occurred at the beginning. In general, this maximum rate of change of pitch frequency cannot exceed a certain value due to the physical limitations of the speech producing mechanism. Therefore, in apparatus constructed in accordance with the invention, a rate of change of pitch frequency in excess of ±10 to 20 percent of the reference frequency is regarded as an error and is eliminated.
The system embodying the invention includes a logic design activated by any signal outside the limited range of the abovementioned parameters. An understanding of this operation may be had by looking at FIG. 3 wherein f(R) designates the reference pitch frequency; T(HF) represents the period of the highest possible pitch frequency; T(LF) represents the period of the lowest possible pitch frequency; T(IR) represents the period of pitch frequency obtained with the highest possible positive rate of change, when starting from the pitch frequency of the preceding pitch period; and T(DR) represents the period of pitch frequency obtained with the highest possible negative rate of change, when starting from the pitch frequency of the preceding pitch period.
Referring now to FIG. 3, therein is shown a sequence of pulses, identifying the beginnings of pitch periods in time. The first two pulses define a pitch period of a certain duration T = 1/f(R). The logical decisions to be made about the acceptability of the next pitch-period indicating pulse are based on the duration of the first pitch period. In developing the mechanism for these logical decisions, certain quantities should first be defined as follows:
T(HF) is the pitch-period duration for the highest possible pitch frequency (e.g. 600 hertz), while T(LF) corresponds to the lowest possible pitch frequency (e.g. 70 hertz). A pitch-period indicating pulse which occurs before T(HF) is terminated must be in error, and is therefore always suppressed. The next condition that a new pitch-period indicating pulse must meet in order not to be disqualified as an erroneous pulse is that it must occur within the interval AT. As can be seen from FIG. 3, AT is the interval between T(IR) and T(DR). T(IR) and T(DR) are related to the maximum permissible rate of change of the pitch frequency (e.g. +10 to 20 percent and -10 to 20 percent respectively) and to the pitch frequency corresponding to the just preceding (acceptable) pitch period as defined in FIG. 3. This test of acceptability is applied only if an acceptable pitch period preceded immediately. The limits of acceptability for the rate of change were obtained from a previous study in which an effort was made to measure the maximum rate of change of pitch frequency that the subjects were capable of performing. If a pitch-period indicating pulse does not occur during the entire duration of T(LF), it is concluded that phonation has ceased and the next pitch-period indicating pulse will be considered to be the beginning of a new voiced utterance.
If an expected pitch-period indicating pulse does not occur within the interval AT, the visual display is suppressed until a sequence of three acceptable pitch periods develops. If, however, an expected indicating pulse does not occur within the interval AT or between T(DR) and T(LF), the voiced utterance is considered to be terminated, and the circuit settings return to their initial state, suitable for the beginning of a new voiced utterance.
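The acceptance tests described above may be illustrated by the following minimal sketch. It is not the circuit of the invention; the function name, the returned labels, and the 15 percent rate limit (one value inside the 10 to 20 percent range given in the text) are assumptions chosen for illustration only.

```python
# Illustrative sketch only: names and the 15 percent figure are assumptions.
T_HF = 1.0 / 600.0   # period of the highest permissible pitch frequency (s)
T_LF = 1.0 / 70.0    # period of the lowest permissible pitch frequency (s)


def classify_pulse(elapsed, prev_period=None, max_rate=0.15):
    """Classify a pulse arriving `elapsed` seconds after the last accepted pulse.

    prev_period is the duration of the immediately preceding acceptable
    pitch period, or None when no such period exists.
    """
    if elapsed < T_HF:
        return "suppress"        # arrives before T(HF) has terminated
    if elapsed > T_LF:
        return "new_utterance"   # phonation is taken to have ceased
    if prev_period is None:
        return "accept"          # rate test needs a preceding acceptable period
    f_ref = 1.0 / prev_period                   # reference pitch frequency f(R)
    t_ir = 1.0 / (f_ref * (1.0 + max_rate))     # T(IR): fastest permitted rise
    t_dr = 1.0 / (f_ref * (1.0 - max_rate))     # T(DR): fastest permitted fall
    if t_ir <= elapsed <= t_dr:
        return "accept"          # pulse falls inside the interval AT
    return "rate_error"          # pulse falls outside the interval AT
```

For example, with a preceding period of 5 milliseconds (200 hertz), a pulse arriving after 5 milliseconds is accepted, while one arriving after 8 milliseconds is flagged as a rate error.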
Since various randomly occurring noises are usually present before the beginning of a voiced utterance, and since such noises give rise to erroneous pitch-period indicating pulses, the invention also comprehends means for eliminating a very short part of the initial portion of the intonation display in order to avoid confusion. Occasionally, the interfering noise pulses cause the pitch indicator to indicate a frequency which is one-half of the correct frequency (i.e. skipping every other pitch-indicating pulse). In order to prevent the perpetuation of this incorrect indication throughout the utterance, a special circuit searches for the correct pitch-indicating frequency, starting from the highest permissible frequency.
The operation of the pitch/intensity display system is best explained with the help of the block diagram shown in FIG. 4. The microphone signals are amplified by means of either of the two microphone pre-amplifiers 19,20 and fed to the corresponding mixers 21,22 which can also accept an auxiliary input signal, for example from a tape recorder. Each of these two mixers 21,22 feeds a separate pitch detection and pulse-to-analog (P-A) circuit 23,24. The outputs of these two P-A circuits 23,24 are fed to a vertical amplifier 25 via a common switch 26. The output of the vertical amplifier 25 is used as the vertical deflection voltage for a cathode-ray tube. The horizontal deflection voltage for the cathode-ray tube is generated in the sweep generator 27, and amplified in the horizontal amplifier 28. The mode switch 29, indicated by dashed lines in the diagram of FIG. 4, is used to obtain the various sequencing and display operations of the system by adjusting the parameters in the blocks indicated by arrows on the dashed lines. The outputs of the mixers 21,22 of channel 1 and channel 2 are used to activate a trigger 30 which generates trigger signals which activate a delay and sequencer circuit 31 and a switch control 32 for the channel-selecting switch 26. In the MFSD mode the switch 26 is under the control of a clock 33.
Referring now to FIG. 5, therein is shown a preferred circuit for use as the pre-amplifier 19 (or 20), mixer 21 (or 22), and pitch detector (see 23 and 24) shown in FIG. 4. This preamplifier-pitch-detector circuit includes a microphone preamplifier 34. In addition, another input, e.g. from a tape recorder, can be fed into the terminal "AUX. INPUT", identified by the reference numeral 35. Either of these two input signals passes through an amplifier-mixer stage 36 and a 4 kHz low-pass filter 37. This filter 37 is included to reduce the occurrence of erroneous indications for the speech sounds [z], .t 1; these speech sounds have a considerable high-frequency content. The filter output is fed into the cascaded pitch-period indicating stages 38,39, which operate in a manner similar to the stages of the Instantaneous Pitch-period Indicator of Dolansky described in the publication "An Instantaneous Pitch-period Indicator," Journal of the Acoustical Society of America, Volume 27, No. 1, pp. 67-72 (1955) by L. O. Dolansky. The output pulse at 40 serves as the input signal to the pulse-to-analog circuit of FIG. 6.
Referring now to FIG. 6, therein is shown a pulse-rate-to-analogous-log-pitch-frequency converter, suitable for use as the P-A circuit shown at 23 and 24 in FIG. 4. The pitch detector circuit previously described in connection with FIG. 5 generates a pulse output wherein pulses occur at a frequency corresponding to the pitch frequency of the voice being analyzed. This sequence of pulses then forms an input at 41 to the converter shown in FIG. 6. During those periods wherein the logic circuits of the converter of FIG. 6 permit complete measurement and display of the pulse frequency, the converter of FIG. 6 operates to deliver an output at a PA analog output terminal 42, which output is a voltage signal the magnitude whereof is proportional to the logarithm of the input pulse frequency. In accordance with the invention the system includes a specific improved method of conversion of the frequency-describing voltage which approximates the desired logarithmic conversion to a high degree of accuracy. This particular aspect of the invention is shown in FIG. 7 in detail and is shown in FIG. 6 as the current switch 43, RC circuits 44 and summation amplifier 45. These three items convert the incoming pulses to a wave form consisting of a sequence of recurring peaks corresponding to the pulses, each followed by an exponential decay. The magnitude of this voltage signal from the summation amplifier 45 just before each pulse is proportional to a high degree of accuracy to the logarithm of the pulse frequency. It can be seen readily that for a relatively long pitch period the exponential decay curve will reach a relatively low value corresponding to a relatively low frequency, whereas at higher frequencies the pitch period will be shorter and the amount of decay will be less, thereby generating a relatively higher voltage magnitude.
A bridge switch 46 is provided as a sample circuit to select only that portion of the output of the summation amplifier 45 which exists just prior to the succeeding pulse. The bridge switch 46 delivers the sampled output to a follower 47 having a capacitor 48 (FIG. 7) which holds the signal sampled by the bridge switch 46 until the next signal issues therefrom. Thus, the PA analog output 42 steadily delivers a voltage corresponding to the pitch frequency represented by the period between the two latest pulses received by the apparatus, and this voltage remains the same until a new pulse comes in, thereby establishing a new interpulse period.
A gate 49 is provided between the pulse input 41 and the logarithmic conversion circuits 43,44,45,46 so that erroneous pulses may be suppressed. In addition, multivibrators 50,51 are provided for converting the actual pulse input into cleaner pulses which are easier to handle in the logic circuits. The pulses produced by the multivibrators 50,51 are consistently of the same duration and amplitude. The pulse in the first multivibrator 50 is initiated at the time of the initiation of the incoming pulse whereas the pulse of the second multivibrator 51 is produced simultaneously with the conclusion of the pulse issued by the first multivibrator 50. In the discussion which follows, these pulses may be designated as an early pulse and a late pulse. The duration of these pulses is approximately 30 microseconds while the time interval between pulses is of the order of milliseconds. The logic circuit of FIG. 6 uses DC logic, so that events occur depending upon whether the various logic elements are in one state or another.
All gates are of the NAND type. The not excited condition is represented by the numeral 0 and corresponds to a voltage of minus 12.6 volts; the excited state is represented by the numeral 1 and represents a voltage of zero volts. The gate 49 encountered by the incoming pulse is designed to prevent passage of the pulse if a signal for this purpose is delivered either by the frequency range comparator 52 or the rate of change comparator 53. The output from this gate 49 will be 0 if all its inputs are 1, and this output of 0 is changed by an inverter 54 to a signal of 1 which serves to trigger the multivibrators 50,51. The states of the multivibrators 50,51 when the output of the gate 49 is 1 are such that the bridge switch 46 does not sample the output of the summing amplifier 45 and no input is delivered to the current switches 43; and this is the case if any of the inputs to the gate 49 are 0. Consequently, if no pulse comes in, or if the frequency range comparator 52 delivers a 0 to the gate 49, or if the rate of change comparator 53 delivers a 0 to the gate 49, there is no input to the logarithmic conversion circuits 43, 44, 45, 46. In this way, erroneous signals lying outside the frequency range specified or representing a greater rate of change of pitch frequency than that specified will temporarily cut off further incoming signals, and other portions of the circuit of FIG. 6, to be described hereinafter, will suppress the display on the screen 1.
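The gating rule just stated may be sketched in a few lines. This is a simplified model under stated assumptions: the gate 49 is reduced to three inputs (pulse, frequency range comparator, rate of change comparator), the actual wiring of FIG. 6 is not reproduced, and the function names are hypothetical.

```python
def nand(*inputs):
    """NAND on logic states 0 and 1: the output is 0 only if every input is 1."""
    return 0 if all(i == 1 for i in inputs) else 1


def multivibrator_trigger(pulse, range_ok, rate_ok):
    """State after the inverter 54; state 1 triggers the multivibrators.

    Assumed inputs: pulse is 1 while an incoming pulse is present, and each
    comparator input is 1 when it raises no objection.
    """
    gate_49 = nand(pulse, range_ok, rate_ok)
    return nand(gate_49)   # a one-input NAND acts as the inverter


# The trigger appears only when all three inputs are in state 1.
for inputs in [(1, 1, 1), (1, 0, 1), (1, 1, 0), (0, 1, 1)]:
    print(inputs, multivibrator_trigger(*inputs))
```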
The lower portion of FIG. 6 represents the logic circuit which suppresses the display, in the event of an erroneous input signal, in response to a control signal generated by the frequency range comparator 52 or the rate of change comparator 53, and also at the start of the utterance. However, at the start of the utterance, the display is additionally suppressed by a delay circuit forming part of the follower 47. The function of the frequency range comparator 52 and rate of change comparator 53 is to detect an input signal outside the permissible range, which input signal is presumptively erroneous, and in response to that input signal to prevent further excitation of the display for the next three input pulses or so. The frequency range comparator 52 delivers its signal to the suppression circuit via a multivibrator 55 which produces a clean signal pulse from the relatively unclean pulse out of the frequency range comparator 52. In a similar fashion the relatively unclean output from the rate of change comparator 53 is delivered to the suppression circuit via multivibrator 56. Whether or not these signal pulses will have an effect in the suppression circuit is determined by whether or not they occur simultaneously with the input pulses. For this purpose the early pulse is delivered to certain portions of the suppression circuit and the late pulse is delivered to other portions of the suppression circuit.
The operation of the PA circuit of FIG. 6 will now be explained with reference to FIGS. 6, 7, 8, 9 and 10. The pitch period indicating pulse from the pitch detector enters gate input terminal 57 via diode 58. Diode 58 is only required in the event certain conventional gating circuits are used. Initially gate input terminal 59 is 1, it being assumed that the output terminal 60 of control-gate 61 is 1, which assumption is true if either input 62 or 63 of control-gate 61 is 0. Control-gate input 63 is the same as inverter-output 63, and this is shown as item g in FIG. 10, from which it appears that inverter-output 63 (and hence control-gate input 63) is 0 when the incoming pulse starts. The incoming pulse gets through the gate 49, is inverted once in the gate 49 and again in the inverter 54, so that it enters the first multivibrator 50 as a state 1 trigger pulse. When the input to the first multivibrator 50 thus goes to state 1, one output 64 of the first multivibrator 50 delivers a pulse of one polarity to the bridge switch 46 and to an input 65 of a gate 66, while the corresponding pulse of opposite polarity is delivered from another output 67 of the first multivibrator 50 to the bridge switch 46 and also to the second multivibrator 51. Thus the pulse of the second multivibrator 51 starts when the pulse of the first multivibrator 50 stops. The pulse of one polarity of the second multivibrator 51 is delivered to the current switches 43 while the pulse of opposite polarity is delivered to the suppression circuit so as to determine whether or not certain activity therein will take place.
So far, we have traced the incoming pulse to the production of a pair of early pulses and a pair of late pulses all of which are well shaped. One of the late pulses goes into the current switches 43 which then activate the logarithmic signal portion of the invention. This part of the invention is shown in more detail in FIG. 7. The late pulse arrives at input 68 and turns on two current switches 69,70 each of which charges an RC circuit 71,72 with a different time constant. The exponential voltages are added in summing amplifier 45, the output of which is fed to a diode bridge switch 73,74,75,76. The output of the summing amplifier 45 is shown in FIG. 10d from which it is seen that the late pulse charges the RC circuits 71,72 to maximum potential and then the potential of these circuits declines exponentially giving the output signal of the summing amplifier 45. As each subsequent pulse from the second multivibrator 51 arrives the voltage output of the summing amplifier 45 returns to its maximum value and each time it decays over the same exponential path, so that the lowest value achieved by the exponential pattern is a measure of the distance between pulses. Now the pulse which charges the RC circuits 71,72 is the late pulse, while the pulse which causes the diode bridge switch 73,74,75,76 to take a sample is the early pulse. The diode bridge switch 73,74,75,76 thus acts as a sampler of that portion of the exponential curve which occurs just prior to the next pulse. The sampled output of the switch 73,74,75,76, which is a logarithmic representation of the pitch frequency, is fed to a holding circuit 77. The holding circuit 77 is of conventional design and includes a capacitor 48 which receives the sampled output and a follower 47 which has a very high input impedance; the sampled output signal causes this capacitor 48 to be charged to a voltage virtually identical to the voltage of the output of the summing amplifier 45 at the moment of sampling.
The follower 47 delivers a corresponding voltage signal of the same magnitude and this is the PA analog output at 42.
Associated with the follower 47 is a circuit which is used, during the initial portion of a voiced utterance, to bypass the holding capacitor 48 and thereby cause the apparatus to deliver the same type of signal as would be delivered from an excessive rate of change signal. Such a signal has the effect of preventing display and thus display is prevented during the first several pulses of any voiced utterance. At the start of any voiced utterance, the frequency range comparator 52 will be delivering the negative signal characteristic of an excessively long pitch period since the time since the last pulse will have been essentially infinite. See FIG. 10c. This means that under starting conditions the transistor 78 is on so as to connect the holding capacitor 48 to a (-12.6)-volt voltage source 79 via a 5.1 kilo-ohm resistor 80. This prevents the holding capacitor 48 from acting as such and the charge between samples will leak off through this resistor 80, thereby insuring an excessive rate of change signal from the rate of change comparator 53. After the voiced utterance commences, however, the frequency range comparator 52 no longer delivers a negative signal, so that gradually the charge on the capacitor 81 of an RC load on the collector of a transistor 82 leaks off until it attains a value sufficient to cut off the transistor 78. The result is shown in FIG. 10f wherein it is seen that the first pulse delivered to the holding capacitor 48 leaks off and the same is true of the next two pulses. Thereafter, as the transistor 78 reaches cutoff the holding capacitor 48 becomes able to serve its function of preserving the sampled voltage until the next pulse.
RC circuit 71 comprises a capacitor C1 which is charged by the late pulse and which discharges between pulses through a resistor R1. RC circuit 72 comprises a capacitor C2 which is also charged by the late pulse and which discharges between pulses through a resistor R2. The resistors R1, R2 are connected at the ends thereof remote from the respective capacitors C1, C2 so that the current discharges from the capacitors through the resistors are added and their sum delivered to the summing amplifier 45. In accordance with the invention these RC circuits 71,72 must be so designed that R1C1 is approximately 0.0113 sec., R2C2 is approximately 0.0019 sec., and R2 is approximately 2R1 for circuits which are to display voiced utterances. In a representative circuit R1 might be 50 kilo-ohms, R2 might be 100 kilo-ohms, C1 might be 0.226 microfarads, and C2 might be 0.019 microfarads.
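The way in which a sum of two exponential decays can stand in for a logarithm may be checked numerically with the short sketch below. The two time constants are those given in the text; the 2:1 weighting of the two decays is an assumption (both capacitors charged to the same potential, with the discharge currents scaling as 1/R and one resistor about twice the other), and all names are hypothetical.

```python
import math

TAU_1 = 0.0113        # first RC time constant in seconds (from the text)
TAU_2 = 0.0019        # second RC time constant in seconds (from the text)
W_1, W_2 = 2.0, 1.0   # assumed relative weights of the two discharge currents


def summed_decay(t):
    """Relative summing-amplifier output t seconds after a charging pulse."""
    return W_1 * math.exp(-t / TAU_1) + W_2 * math.exp(-t / TAU_2)


def sampled_value(freq_hz):
    """Value reached just before the next pulse at a steady pitch frequency."""
    return summed_decay(1.0 / freq_hz)


def max_log_deviation(f_lo=70.0, f_hi=600.0, steps=200):
    """Largest departure from a straight line in log(frequency),
    expressed as a fraction of the output span between f_lo and f_hi."""
    v_lo, v_hi = sampled_value(f_lo), sampled_value(f_hi)
    slope = (v_hi - v_lo) / math.log(f_hi / f_lo)
    worst = 0.0
    for k in range(steps + 1):
        f = f_lo * (f_hi / f_lo) ** (k / steps)      # log-spaced frequencies
        line = v_lo + slope * math.log(f / f_lo)     # ideal logarithmic law
        worst = max(worst, abs(sampled_value(f) - line))
    return worst / (v_hi - v_lo)


print(max_log_deviation())
```

Under these assumptions the sampled value departs from a straight line in log frequency by only a few percent of the output span over the 70 to 600 hertz range.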
The AT comparator 53 used to compare the durations of consecutive pitch periods is shown in detail in FIG. 8. The output from the summing amplifier 45 and the signal at the P-A analog output 42 are compared in a difference amplifier 83. When the absolute value of the voltage difference between these two outputs exceeds 0.3 volts the output of the difference amplifier 83 is -12 volts. This result comes about through the use of the diode bridge 84,85,86,87. At the peak of the summing amplifier pulse the voltage output of the summing amplifier 45 will be about -12 volts. Let it be assumed that the voltage at the P-A Analog Output 42 (which represents the sample at the end of the previous summing amplifier pulse) is at -3 volts. This being the state of affairs the left hand terminal 88 of the bridge 84,85,86,87 is at a lower voltage than that of the right hand terminal 89 of the bridge 84,85,86,87 and therefore current flows through diode 86 from the right hand terminal 89 through a resistance 90 to one input 91 of the difference amplifier 83, thus creating a voltage at this point of about -3 volts. The potential of the other input 92 to the amplifier 83 will correspond to the -12 volts at the left hand terminal 88, which is transmitted through the diode 85 to that input 92. Thus the absolute value of the voltage difference exceeds 0.3 volts and the output of amplifier 83 is -12 volts. Used as the input to the interface circuit 93,94 it causes the output of the latter to go to -12.6 volts (i.e. state 0), as follows: The -12-volt output of amplifier 83 causes transistor 93 to be conductive and hence the collector of transistor 94 is grounded which causes transistor 94 to become conductive and puts a pulse of -12.6 volts at the output 95 of the interface circuit, which is the input to inverter 96. This pulse corresponds to state 0.
As the output of the summing amplifier 45 approaches the voltage at the P-A Analog Output 42 it eventually comes sufficiently close thereto that the diodes 84,85,86,87 in the diode bridge do not conduct; thus the voltage difference between the inputs 92 and 91 becomes negligible. This causes the output of difference amplifier 83 to rise from its value of -12 volts to approximately zero volts. When it reaches -6 volts this causes the base of transistor 93 to have a potential equal to that of its emitter, which is maintained at -6 volts by a Zener diode 97. At this point transistor 93 becomes non-conducting; the base of transistor 94 goes to a potential of -12.6 volts and transistor 94 becomes non-conducting so that state 1 results at the output 95.
In order to reduce extraneous pulses which might arise due to minor noise present in the signal and the large amplification present in the circuit, transistor 93 is cut off as the output voltage of amplifier 83 rises through -6 volts, and this cutoff of transistor 93 is aided by a positive feedback via inverter output 64 which feeds a signal through an RC circuit 98 and a diode 99. This positive feedback causes the emitter of transistor 93 to go negative, thereby supplementing the action of its base in becoming more positive.
Thus the AT comparator 53 causes state 1 to appear at AT output 95 during that period of time when the summing amplifier 45 has a voltage output in the vicinity of the sampled output of the preceding pulse. The circuit is so designed that if the next late pulse does not exist at the time when the state of AT output 95 reverts to 0, a reset pulse or error signal will be delivered. It can be seen that the time interval during which AT output 95 is in state 1 is longer when a longer period between input pulses is involved, so that the size of the time interval within which the next input pulse can exist while AT output 95 is in state 1 (and hence exist without causing a reset pulse) is roughly a percentage of the length of the preceding period. Previous studies have shown that this percentage may be in the vicinity of ±10 to 20 percent, and for the circuits shown this result is achieved by setting the difference amplifier 83 so as to react to input voltage differences greater than 0.3 volts.
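The relation between the 0.3-volt threshold and the percentage window can be explored with the following sketch. It rests on several assumptions that are not stated as such in the text: the decay is modelled as a 2:1 weighted sum of the two exponentials, scaled so that its peak magnitude is 12 volts, and only magnitudes are considered. The function names are hypothetical.

```python
import math

PEAK_VOLTS = 12.0                # approximate peak magnitude (from the text)
THRESHOLD_VOLTS = 0.3            # difference amplifier threshold (from the text)
TAU_1, TAU_2 = 0.0113, 0.0019    # RC time constants in seconds (from the text)


def magnitude(t):
    """Assumed output magnitude t seconds after a pulse (2:1 weighting assumed)."""
    return PEAK_VOLTS * (2.0 * math.exp(-t / TAU_1) + math.exp(-t / TAU_2)) / 3.0


def time_at(volts, lo=0.0, hi=0.1):
    """Invert the monotonically decaying magnitude by bisection."""
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if magnitude(mid) > volts:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def acceptance_window(prev_period):
    """Earliest and latest acceptable arrival times, as fractions of the
    preceding period, for a held value sampled at prev_period."""
    held = magnitude(prev_period)              # value held at the analog output
    early = time_at(held + THRESHOLD_VOLTS)    # decay comes within 0.3 V of it
    late = time_at(held - THRESHOLD_VOLTS)     # decay falls 0.3 V below it
    return early / prev_period, late / prev_period


print(acceptance_window(0.005))   # preceding period of 5 ms, i.e. 200 hertz
```

Under these assumptions a preceding period of 5 milliseconds gives a window of roughly ten percent on either side, which is in line with the percentage range quoted above.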
A tri-level indicator is used to exclude pulses which would correspond to pitch frequencies outside of the permissible range. This tri-level indicator or T(LF)/T(HF) comparator is the frequency range comparator 52 of FIG. 6 and is shown in detail in FIG. 9. Referring thereto, the limits of this range are determined by the settings of two 1-kilo-ohm adjustable resistors 100,101. The voltage at the anode of a diode 102 connected to one adjustable resistor 100 determines the magnitude of T(LF), while the voltage at the cathode of another diode 103 connected to the other adjustable resistor 101 determines the magnitude of T(HF). (See FIG. 3). The cathode of diode 102 is connected to the anode of diode 103 by a pair of equal resistors 104,105 the junction 106 of which is connected to one input 107 of a difference amplifier 108. As long as the input signal is between the two permissible limits the voltage at input 107 is equal to the mean of the two voltages determined by the setting of the 1-kilo-ohm resistors. The voltage of the other input 109 of the difference amplifier 108 is also adjusted to the same value by the potentiometer 110. Thus, under these conditions, the output 111 of the difference amplifier 108 is equal to zero volts. In other words, there is a current path from a -12.6 volt source 112 through the two 1-kilo-ohm resistors 100,101 to ground. Diodes 102 and 103 provide an additional path in parallel with the one just mentioned through the two series connected resistors 104,105. As indicated, this provides zero volts at the output 111 of the difference amplifier 108. As long as the output of the summing amplifier 45 is between the two voltages mentioned, diodes 113 and 114 block the output from the summing amplifier 45 from having any effect. As soon as the output from the summing amplifier 45 exceeds either of these voltages, a signal is delivered from the difference amplifier 108.
If the output voltage from the summing amplifier 45 is below the cathode potential of diode 103, diode 114 conducts and a positive output is observed at amplifier output 111.
Thus, as the magnitude of the output of the summing amplifier 45 falls after each pulse, any subsequent pulse will be rejected if it occurs before the magnitude of the summing amplifier output has fallen below the high voltage end of this permissible range which might be, for example, a (negative) voltage having a magnitude of 8 volts. Again, if the period is so long that the output of the summing amplifier 45 falls below the low voltage end of this permissible range before the subsequent pulse is delivered, an error signal will also result. This might occur, for example, if the output of the summing amplifier were a (negative) voltage having a magnitude below 1 volt. This of course always occurs after the end of the voiced utterance.
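The three-way decision made by the frequency range comparator can be summarized in a brief sketch. The two limits are the example magnitudes given in the text; the function name and the returned labels are hypothetical, and the diode and amplifier behavior of FIG. 9 is not modelled.

```python
HIGH_LIMIT_VOLTS = 8.0   # example magnitude for the high voltage end, i.e. T(HF)
LOW_LIMIT_VOLTS = 1.0    # example magnitude for the low voltage end, i.e. T(LF)


def range_check(magnitude_volts):
    """Classify the summing-amplifier output magnitude at a given instant."""
    if magnitude_volts > HIGH_LIMIT_VOLTS:
        return "too_early"   # a pulse now would imply too high a pitch frequency
    if magnitude_volts < LOW_LIMIT_VOLTS:
        return "too_late"    # period exceeds that of the lowest pitch frequency
    return "in_range"        # a pulse now lies within the permissible range


for volts in (10.0, 5.0, 0.5):
    print(volts, range_check(volts))
```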
Saturation of the differential amplifier 108 causes sluggish subsequent response. In order to permit a rapid response, saturation of amplifier 108 is avoided by the use of a diode bridge limiting circuit 115. This insures that amplifier 108 stays within the region of its linear operation.
In order to provide the appropriate voltage for the input 116 of the logic gate 49, an interface circuit 117,118 is used. The collector voltage of transistor 118 is at ground potential (state 1) as long as the voltage at the amplifier output 111 does not exceed +3 volts. For larger voltages, transistor 118 saturates and the input 116 to gate 49 goes to state 0. As appears from the diagram of FIG. 6, this signal only occurs when the low frequency end of the range is exceeded such as at the end of the voiced utterance.
The operation of the remaining parts of the pulse-to-analog converter will now be explained with the help of FIGS. 6 and 10, and Table II.
TABLE II Truth Table for Gate Control Storage FFI FF2 FFZI Clock Com- Pulse ments l2l2ll2l22l32l42l52l62l72l82l9 220 In itial I 0 0 0 0 I 0 l 0 0 state bl 0 l 0 0 0 0 l 0 I 0 0 cl I I 0 0 0 0 I l 0 0 b2 I I I l 0 0 0 0 I 0 0 c 2 I l I l 0 I 0 I 0 0 0 b 3 l l I I 0 O O 0 l 0 O" r 3 I I I I 0 I 0 l 0 0 0 I24 I I I l 0 O O 0 l 0 0 cd I I l I 0 l 0 l 0 0 0 b 5 I I 1 I 0 0 0 0 1 0 0 c5 I l I I 0 I 0 l 0 0 0 b6 1 I I I 0 0 0 0 l 0 0 (6 l I l I O I 0 1 0 0 0 b 7 I I 0 (I 0 I 1 l 0 0 0 c 7 I O 0 O 0 I I I O I 0 Dis- P y 128 I 0 I l 0 l l 0 l on at! l 0 I l 0 1 0 l 0 I I No change b(N-I) I 0 1 I 0 I 0 l O l l r-(NI) l 0 I I 0 I Dis- P y I 0 0 0 0 0 l O l l 0 0H b N 1 0 0 0 0 0 l 0 I l I c N I l 0 0 0 0 I 0 I 0 0 b(N+l) I I l I 0 0 0 0 I 0 0 c(N+I) l' I I l 0 I 0 I 0 0 0 b(N+2) l I O 0 0 l I I 0 0 0 e(N+2) l 0 O 0 0 l I Dis P y b(N+3) l 0 l I 0 I l l 0 l l on '(N+3) I 0 I I 0 I I l 0 I I No change [PM I O I I 0 l l l 0 l I (M l 0 I I 0 l l I 0 I I Dis- P I 0 O 0 O O I 0 I l 0 off In: 0 1 O O O O I Timing corresponth to FIG. 10 12,0 Inverter output 119 reset due to exceeding AT ""lnverter output 1 I9 and 22I reset due to exceeding T( LF)
These circuits are used to suppress the visual display for a brief period at the beginning of an utterance, or when permissible values of certain parameters are exceeded, in which case a brief interruption in the visual display occurs until conditions are normal again. In addition, the effect of the limitation in terms of the rate of change of the pitch frequency (as manifested by the output of the AT comparator 53) is delayed. A specific example of operation will now be explained with the help of the waveforms of FIG. 10.
The clock pulses for the logic are the early and late pulses generated respectively by means of the first and second multivibrators 50,51 and are presented in FIG. 10b,c. First the mechanism of the initial delay in the display (at the beginning of an utterance) will be described.
The first late pulse (FIG. 10c-1) generates a decreasing exponential at the output of the summing amplifier (FIG. 10d). Assuming that the following pulses occur at the lowest permissible pitch frequency or greater, the exponentials which restart at each pulse will never reach the lower dashed line in FIG. 10d; thus the T(LF)/T(HF) output will be 0 V or higher (FIG. 10). Consequently the transistor 78 in FIG. 7 exponentially approaches pinch off. As a result, in about 40 msec, a staircase waveform results at the P-A analog output (FIG. 10f). During this 40 msec interval, reset pulses (FIG. 10h) are generated at inverter output 119.
As described hereinabove, AT comparator 53 causes the interface output to be in state 0 when an improper amount of time has elapsed since the last pulse. This corresponds to state 1 at inverter output 64, and it is inverter output 64 which is designated "AT output" at FIG. 10g. However, AT comparator 53 does this by causing AT output to go to state 0 whenever the summing amplifier output is near the P-A analog output. Initially, the summing amplifier output is near the P-A analog output only during delivery of each pulse to the holding capacitor: the P-A analog output then rises and passes through the voltage being put out by the summing amplifier. However, during this initial period the AT output reverts to state 1 after the next late pulse has finished. Hence the output of the second multivibrator 51 which is delivered to the AT error gate is in state 1 when the output of the AT error multivibrator 56 which is delivered to the AT error gate 120 reverts to state 1. This results in a pulse to state 0 at the output of AT error gate 120 for the duration of the output pulse of multivibrator 56. This causes a similar momentary transition of inverter output 119 to state 0, which constitutes the reset pulse.
These reset pulses delay the progression of the input 121 to the first flip-flop 122, as it progresses through the first, second, and third flip-flops 122,123,124. For example (see Table II), state 1 does not remain permanently at the output of the second flip-flop 123 until pulse 6. Similarly the output of the third flip-flop 124 does not remain in state 1 until pulse 7. Therefore, the pulses occurring after pulse 6 are affected by the AT limitation. Similarly, the display begins to be presented after pulse 7 (see Table II, column 220).
An example of a suppression of an erroneous indication in the middle of an utterance is presented in the center part of FIG. 10. The circumstances shown are typical of the situation when the magnitude of the permissible rate of change of pitch frequency is temporarily exceeded. As a result of this excessive rate, a reset pulse (FIG. 10h) occurs at inverter output 119 and the AT comparator 53 and the visual display 1 are disabled. If the following pulses are within the permissible range, the AT comparator 53 returns to its normal operating state at pulse (N+1) (FIG. 10i), and the visual display reappears on the screen at the time of pulse (N+3) (FIG. 10j). The net result is that the visual display is interrupted for a time no longer than four pitch periods but no erroneous indication appears on the screen.
At the end of a voiced utterance AT may be exceeded. In that case, the first and second flip-flops 122, 123 (FIG. 6) are reset (see also FIG. 10i). However, whether AT is exceeded or not, eventually T(I..F) is exceeded. At this time, the entire logic is reset and thus becomes ready for a new utterance.
The horizontal sweep control 125 and presentation sequencer 31 are shown in FIG. 11. One function of this circuit is to start the horizontal sweep across the display upon receipt of a trigger initiated by the start of the voiced utterance. The gates 126, 127, 128 determine whether or not trigger signals initiated by the voiced utterance will be operative to start the sweep. It will be recalled that one of the principal differences between the various modes is whether signals from channel 1 or channel 2 will be operative and when. Consequently, the mode switch 29 is also shown in FIG. 11. The position of the mode switch 29 controls the ability of a trigger from channel 1 or channel 2 to initiate a sweep.
A first flip-flop 129 will always respond to an incoming trigger so as to initiate a horizontal sweep. The signal to the first flip-flop 129 which initiates the sweep takes the form of state 0 at the output of the final gate 128. This can occur only if both inputs to the final gate 128 are in state 1. Thus, in order for state 0 to appear at the output of the final gate 128, the input at 130 must be 1 for all signals. The intermediate gate 127 acts simply as an inverter, since it will have its output in state 1 if either input to the intermediate gate 127 is in state 0, and this occurs for any trigger coming from channel 2 and will also occur for a trigger coming from channel 1 provided the input 131 to the channel-1 gate 126 is in state 1. Thus, when it is desired to block channel 1, input 131 to channel-1 gate 126 is in state 0. This never happens in the CFFT or MFFT modes, as is apparent from an inspection of the pertinent position of switch AA, but it is the case at the beginning of the cycle in the MFST and MFRT modes.
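The gating just described can be sketched as a small truth-table model. The following is an illustration only: it assumes that gates 126, 127 and 128 behave as two-input NAND gates and that a channel-2 trigger reaches gate 127 through an inverter, which is inferred from the behavior stated in the text rather than taken from the drawings. The function names are hypothetical.

```python
def nand(a, b):
    """Two-input NAND: the output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


def final_gate_output(ch1_trigger, ch2_trigger, input_131, input_130):
    """Return the state at the output of final gate 128.

    State 0 is the condition that sets the first flip-flop 129 and so
    starts a sweep; state 1 means no sweep is initiated.
    """
    gate_126 = nand(ch1_trigger, input_131)  # 0 only for an unblocked channel-1 trigger
    ch2_inverted = 1 - ch2_trigger           # 0 for any channel-2 trigger (assumed inverter)
    gate_127 = nand(gate_126, ch2_inverted)  # 1 if either of its inputs is 0
    return nand(gate_127, input_130)         # 0 only if both of its inputs are 1


if __name__ == "__main__":
    # Channel 1 blocked (input 131 = 0), final gate open (input 130 = 1).
    print(final_gate_output(ch1_trigger=1, ch2_trigger=0, input_131=0, input_130=1))
    print(final_gate_output(ch1_trigger=0, ch2_trigger=1, input_131=0, input_130=1))
```

With channel 1 blocked, only the channel-2 trigger drives the final gate to state 0. With input 130 in state 0 the final gate stays at state 1 for every combination, which corresponds to the gate being closed.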
The function of the second flip-flop 132 is to prevent initiation of a second trace until it permits this to happen. It accomplishes this by delivering a pulse at its output 133 which is transmitted through switch SBB to the input 130 of the final gate 128. The incoming signal which causes the second flip-flop 132 to again permit a trigger to initiate a sweep arrives at the second flip-flop 132 from switch SCB. The time of this signal is caused to be a certain number of seconds after substantial completion of the sweep by means of the clocking mechanism. When the sweep is substantially completed an interface 134 delivers a signal to a first multivibrator 135 which then produces a shaped pulse that is delivered to a clock 136 via the switch SDA. The clock 136 then initiates operation of various flip-flops 137, 138, 139 whose outputs occur at varying times after receipt of the input signal. These timed outputs of flip-flops 137, 138, 139 are also delivered to the elements at the left-hand side of FIG. 11, some of which serve to open channel 1 and some of which serve to cause automatic erasing of the display.
The details of the circuit for producing the trigger inputs to the sequencer of FIG. 11 are shown in FIG. 12. The circuit for the sweep generator of FIG. 11 is shown in FIG. 13.
The operation of the horizontal trigger and sweep circuits is as follows. Referring now to FIG. 12, with the input terminal 140 at ground potential, transistor 141 is just at the cutoff point; i.e., the base of this transistor is at approximately 6v. This adjustment is made by means of the potentiometer 142 in the input connection. In the presence of an audio signal, its positive values make this transistor 141 conduct, with the result that transistor 143 is turned on. The envelope-detection circuit 144 derives a voltage from the output of transistor 143. The resulting voltage turns transistor 145 on. Thus in the presence of an audio signal, inverter output 146 is in state 1. Similarly, if an audio signal is present at the "Channel 2 Audio" terminal, inverter output 147 is in state 1. When an audio signal is present in either or both of the inputs, gate output 148 is in state 1 and transistor 149 is turned on. When an intensity display is not desired, transistor 149 is disabled by opening SW3.
The purpose of the sweep generator (FIG. 13) is to produce the required sweep voltage for the visual display. The sweep-control signal (in the form of the complement of waveform CONT. shown in FIG. 14) is obtained from the first flip-flop 129 (FIG. 11). When output 150 of the flip-flop 129 is in state 1, transistor 151 (FIG. 13) is on, and the voltage across the 25 µF capacitor 152 is kept at 0v. With the input voltage at 150 at -12v, transistor 151 is cut off and the capacitor 152 charges with a constant charging current. The magnitude of this charging current is determined by the voltage across transistor 153 and the setting of adjustable 50-kilo-ohm resistor 154. Stages 155, 156 and 157 are used to isolate the load from the sweep-voltage-producing capacitor 152, to improve the linearity of the sweep, and to provide a low-impedance sweep-voltage source. A voltage derived from the sweep output by means of a Zener diode-resistance divider 134 (interface output) is used to reset a number of voltages in the logic circuits so that conditions required for the next sweep are obtained. This occurs when the "interface output" 158 reaches about 6v.
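Because the capacitor is charged by a constant current, the sweep voltage is a linear ramp, V = I·t / C. The sketch below works this out numerically, taking the capacitor value as 25 µF. The charging current of 100 µA is a hypothetical figure chosen only for illustration (the text gives no value for it), the function names are assumptions, and magnitudes are used throughout without regard to polarity.

```python
SWEEP_CAPACITANCE_F = 25e-6  # capacitor 152, read as 25 microfarads


def sweep_voltage(t_seconds, current_amps, capacitance=SWEEP_CAPACITANCE_F):
    """Magnitude of the ramp voltage after charging for t_seconds."""
    return current_amps * t_seconds / capacitance


def time_to_reach(voltage, current_amps, capacitance=SWEEP_CAPACITANCE_F):
    """Time for the ramp to reach a given voltage magnitude."""
    return capacitance * voltage / current_amps


if __name__ == "__main__":
    hypothetical_current = 100e-6  # amperes; illustrative value, not from the text
    duration = time_to_reach(6.0, hypothetical_current)
    print(f"ramp reaches 6 v after {duration:.2f} s")
    print(f"voltage at half that time: {sweep_voltage(duration / 2, hypothetical_current):.2f} v")
```

Under these assumed values, adjusting the charging current through resistor 154 scales the sweep duration inversely: halving the current doubles the time needed to reach the reset level.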
The operation of the horizontal sweep control and presentation sequencer (FIG. 11) will be explained for the MFRT mode, used as an example. In the beginning of the sequence, the input 131 to the channel-1 gate 126 is in state 0 and the input to the final gate 128 is in state 1. Thus channel 1 input 159 is blocked from having any effect and the "START" light 15 is on. With a trigger in channel 2 the input to inverter 160 is in state 1 and this causes the output of the final gate 128 to be in state 0. This 0 signal sets the first flip-flop 129 and resets the second flip-flop 132. As a result, one output 161 of the first flip-flop 129 becomes state 1, thereby turning on the "CONT." light 16 and delivering an unblank output. In addition, the other output of the first flip-flop 129 goes to state 0, thereby initiating the sweep and delivering a pulse to the clock 136 at the input 162 to gate 163. In addition, the output 133 of the second flip-flop 132 causes the input 130 to the final gate 128 to become state 0, thus turning off the "START" lamp and preventing further inputs in either channel from triggering the sweep: in other words, the final gate 128 is closed. In the CFFT mode this action of the output 133 of the second flip-flop 132 is delivered directly to the final gate 128. In all other modes, this output 133 of the second flip-flop 132 is delivered to the final gate 128 via gate 164 and inverter 165. Gate 164 also inverts the output of multivibrator 189.
When the interface output 158 to the first multivibrator 135 (see also FIG. 13) increases in magnitude above -6 volts, the resulting output at 166 from the first multivibrator 135 resets the first flip-flop 129. The resulting change in state at output 161 of the first flip-flop 129 turns the "CONT." light 16 off, and the resulting change in state at output 150 of the first flip-flop 129 causes the beam to return to its initial position. In addition, the other output 167 of the first multivibrator 135 via switch SDA triggers the clock 136, which generates a delay sequence. In the MFRT mode the first delay equals 2 seconds and is caused by the output of the fifth flip-flop 138, while subsequent delays are 1 second and are produced by the output of the sixth flip-flop 139. The original rate of the clock pulses is shown in FIG. 14 and is represented by the state of the clock output at 168. This output changes state every half second so that the full cycle is one second. This rate is reduced by a factor of 2 in each of the three cascaded flip-flop stages numbered 6, 5 and 4 (139, 138 and 137). The outputs of these flip-flops or binary counters are used to set the second flip-flop 132 via the switch SCB.
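The relation between the half-second clock and the 1-second and 2-second delays can be checked with a short simulation of the divide-by-two chain. This is a sketch under assumptions not stated in the text: all stages start in state 0, each stage toggles on the 1-to-0 transition of the stage before it, and the function name `first_toggle_times` is hypothetical.

```python
def first_toggle_times(n_stages=3, clock_half_period=0.5, n_ticks=16):
    """Simulate cascaded divide-by-two stages driven by the clock.

    Returns, for each stage in cascade order (stage 6, then 5, then 4),
    the time in seconds at which that stage first changes state.
    """
    clock = 0
    stages = [0] * n_stages
    first = [None] * n_stages
    for tick in range(1, n_ticks + 1):
        t = tick * clock_half_period
        clock ^= 1                    # the clock output changes state every half period
        carry = (clock == 0)          # a falling clock edge drives the first stage
        for i in range(n_stages):
            if not carry:
                break
            stages[i] ^= 1            # the stage toggles on the incoming falling edge
            if first[i] is None:
                first[i] = t
            carry = (stages[i] == 0)  # its own falling edge ripples to the next stage
    return first


if __name__ == "__main__":
    for name, t in zip(("stage 6 (139)", "stage 5 (138)", "stage 4 (137)"), first_toggle_times()):
        print(f"{name}: first change of state after {t} s")
```

Under these assumptions the sixth flip-flop changes state after 1 second and the fifth after 2 seconds, in line with the delays quoted for the MFRT mode.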
In the first part of the MFRT mode input 169 to gate 170 is in state 0 and input 171 to gate 172 is in state 1. Under these

Claims (8)

1. Apparatus for displaying patterns derived from voiced sounds comprising in combination means for producing a sequence of pulses whose frequency is that of the pitch of the voiced sound to be displayed; a cathode-ray variable-persistence display having gating means to block or deliver the display on the cathode-ray screen, a beam-sweep means adapted to start the sweep across the display in response to the start of the voiced utterance, and means for deflecting the beam in a direction transverse to the direction of sweep at a rate slow enough to display pitch variation within the utterances; a P-A Analog Circuit for producing an analog voltage the magnitude whereof is proportional to the logarithm of the frequency of said pulses; means for applying said voltage to said beam-deflecting means of said cathode-ray display; and means for producing a control signal to actuate said gating means to block the display until a certain time interval after completion of a sweep.
2. Apparatus according to claim 1, wherein said means for producing a signal also includes means to produce a signal blocking said pulse whenever the time interval between this pulse and the immediately preceding pulse corresponds to a pulse repetition rate greater than a certain fixed rate.
3. Apparatus according to claim 1, wherein said means for producing a signal also includes means to produce a signal blocking said display whenever the time interval between immediately successive pulses is more than a certain time interval.
4. Apparatus according to claim 1, wherein said variable-persistence display includes means for storing said display, and wherein said means for producing a control signal includes means to actuate said gating means, after said gating means has blocked the display, to deliver the display during at least one additional sweep of said beam and thereafter to block the display again until a certain time interval after completion of said additional sweep.
5. Apparatus for displaying patterns derived from voiced sounds comprising in combination a means for producing a sequence of pulses whose frequency is that of the pitch of the voiced sound to be displayed; cathode-ray variable-persistence display having gating means to block or deliver the display on the cathode-ray screen, a beam sweep means adapted to start the sweep across the display in response to the start of the voiced utterance, and means for deflecting the beam in a direction transverse to the direction of sweep at a rate slow enough to display pitch variation within the utterances; an exponential-decay circuit for producing a voltage output which decays exponentially from a fixed voltage to which each pulse restores it, so that a sample of said output at a fixed time interval prior to each succeeding pulse is a measure of the logarithm of the pulse frequency; means for storing said sample until production of the following sample, thereby producing an analog voltage the magnitude whereof is proportional to the logarithm of the frequency of said pulses; means for applying said voltage to said beam-deflecting means of said cathode-ray display; and means for producing a control signal to actuate said gating means to block the display until a certain time interval after completion of a sweep.
6. Apparatus according to claim 5, wherein said means for producing a signal comprises means to compare each sample with the stored preceding sample, and means to produce a signal whenever the samples thus compared differ by more than a predetermined amount.
7. Apparatus according to claim 5, wherein said exponential-decay circuit comprises two RC circuits each adapted to receive pulses at the junction of its resistance and capacitance, said resistances being joined at the respective ends thereof remote from said junction so as to deliver a current between pulses which is the sum of the discharge currents from the capacitances.
8. Apparatus according to claim 7, wherein said exponential decay circuit comprises a first RC circuit and a second RC circuit, wherein the RC constant of said first RC circuit is 0.0113 sec., wherein the RC constant of said second RC circuit is 0.0019 sec., and wherein the magnitude of said second resistance is twice that of said first resistance.
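The way a sum of two exponential decays can approximate a logarithmic measure of pitch can be illustrated numerically with the time constants recited in claim 8. The following is a sketch only, under assumptions the claims do not state: both capacitors are taken to be restored to the same voltage by each pulse, so that the second branch (twice the resistance) contributes half the current of the first; the fixed sampling offset before each pulse is ignored; and the function names are hypothetical. If the output were exactly proportional to the logarithm of frequency, every doubling of the pitch period would change it by the same amount.

```python
import math

TAU_1 = 0.0113  # RC constant of the first circuit, seconds (claim 8)
TAU_2 = 0.0019  # RC constant of the second circuit, seconds (claim 8)
WEIGHT_2 = 0.5  # second resistance is twice the first, so half the current (assumed equal voltages)


def decay_output(t):
    """Normalized sum of the two discharge currents, t seconds after a pulse."""
    return math.exp(-t / TAU_1) + WEIGHT_2 * math.exp(-t / TAU_2)


def octave_steps(periods):
    """Drop in output between successive pitch periods in the list."""
    values = [decay_output(t) for t in periods]
    return [a - b for a, b in zip(values, values[1:])]


if __name__ == "__main__":
    # Pitch periods of 2.5, 5, 10 and 20 ms, i.e. 400, 200, 100 and 50 Hz.
    for step in octave_steps([0.0025, 0.005, 0.010, 0.020]):
        print(f"output change per octave: {step:.3f}")
```

Over this range the three per-octave steps come out within about ten percent of one another, which is the sense in which the summed decay behaves approximately logarithmically; a single exponential would not give nearly equal steps.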
US30123A 1970-04-20 1970-04-20 Voiced sound display Expired - Lifetime US3676595A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US3012370A 1970-04-20 1970-04-20

Publications (1)

Publication Number Publication Date
US3676595A true US3676595A (en) 1972-07-11

Family

ID=21852630

Family Applications (1)

Application Number Title Priority Date Filing Date
US30123A Expired - Lifetime US3676595A (en) 1970-04-20 1970-04-20 Voiced sound display

Country Status (1)

Country Link
US (1) US3676595A (en)
