vosim

vosim — Simple vocal simulation based on glottal pulses with formant characteristics.

Description

This opcode produces a simple vocal simulation based on glottal pulses with formant characteristics. Output is a series of sound events, where each event is composed of a burst of squared sine pulses followed by silence. The VOSIM (VOcal SIMulation) synthesis method was developed by Kaegi and Tempelaars in the 1970s.

Syntax

ar vosim kamp, kFund, kForm, kDecay, kPulseCount, kPulseFactor, ifn [, iskip]

Initialization

ifn - a sound table, normally containing half a period of a sinewave, squared (see notes below).

iskip - (optional) Skip initialization, for tied notes.

Performance

ar - output signal. Note that the output is usually unipolar - positive only.

kamp - output amplitude, the peak amplitude of the first pulse in each burst.

kFund - fundamental pitch, in Hertz. Each event is 1/kFund seconds long.

kForm - formant center frequency. Length of each pulse in the burst is 1/kForm seconds.

kDecay - a damping factor from pulse to pulse. This value is subtracted from the amplitude at each new pulse.

kPulseCount - number of pulses in the burst part of each event.

kPulseFactor - the pulse width is multiplied by this value at each new pulse. This results in formant sweeping. If the factor is < 1.0, the formant sweeps up; if it is > 1.0, each new pulse is longer, so the formant sweeps down. The final pitch of the formant is kForm * pow(kPulseFactor, kPulseCount)

The output of vosim is a series of sound events, where each event is composed of a burst of squared sine pulses followed by silence. The total duration of each event determines the fundamental frequency. The length of each single pulse in the squared-sine bursts produces a formant frequency band. The width of the formant is determined by the ratio of silence to pulses (see below). The final result is also shaped by the damping factor from pulse to pulse.
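As a rough illustration of the timing described above, the following init-time sketch works through one event. The instrument number 30 and the values (100 Hz fundamental, 500 Hz formant, 3 pulses) are hypothetical, and it assumes kPulseFactor = 1 so that every pulse has the same length.

```csound
instr 30  ; hypothetical instrument number, init-time arithmetic only
    iFund    = 100                 ; fundamental in Hz (assumed value)
    iForm    = 500                 ; formant center in Hz (assumed value)
    iCount   = 3                   ; pulses per burst (assumed value)
    iEvent   = 1 / iFund           ; event length: 0.010 s
    iPulse   = 1 / iForm           ; pulse length: 0.002 s
    iBurst   = iCount * iPulse     ; burst length: 0.006 s
    iSilence = iEvent - iBurst     ; remaining silence: 0.004 s
    print iEvent, iPulse, iBurst, iSilence
endin
```

A zero-duration score note such as `i 30 0 0` should run it as a single init pass and print the four values once.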

A small practical problem in using this opcode is that no GEN function will create a squared sine wave out of the box. Something like the following can be used to create the appropriate table from the score.

; use GEN09 to create half a sine in table 17
f 17 time size 9  0.5  1 0
; run instr 101 on table 17 for a single init-pass
i 101 0 0 17

These score lines rely on the following orchestra instrument, which squares the table in place:

; square each point in table #p4. This should be run as init-only, just once in the performance.
instr 101
    index tableng p4
    index = index - 1  ; start from last point
loop:
    ival table index, p4
    ival = ival * ival
    tableiw ival, index, p4
    index = index - 1
    if index < 0 igoto endloop
            igoto loop
endloop:
endin
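
Because sin^2(pi*x) = 0.5 - 0.5*cos(2*pi*x), a similar table might also be produced in a single score line with GEN19, which takes a phase (in degrees) and a DC offset for each partial. The line below is a sketch under that assumption and is not part of the original example; table number 18 and the table size are arbitrary choices, and the result is worth comparing against the squared GEN09 table before relying on it.

```csound
; hypothetical table 18: partial 1, strength 0.5, phase 270 degrees, DC offset 0.5
; 0.5 * sin(2*pi*x + 270 deg) + 0.5  =  0.5 - 0.5 * cos(2*pi*x)  =  sin^2(pi*x)
f 18 0 32768 19  1 0.5 270 0.5
```

With this variant no init-only instrument is needed, at the cost of depending on GEN19's phase and offset arguments.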

Note: Parameter Limits

The count of pulses multiplied by the pulse width should fit in the event length (1/kFund). If this is not fulfilled, the algorithm does not break; it simply does not start any pulse that would outlast the event. This might introduce silence at the end of an event even if none was intended. Consequently, kForm should be higher than kFund; otherwise only silence is output.
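
The opcode already skips pulses that do not fit, so capping the requested count beforehand only makes the effective count explicit. The following is a minimal sketch of such a cap, assuming kPulseFactor = 1 (equal-length pulses) and a squared half-sine already present in table 17; the instrument number and the parameter values are hypothetical.

```csound
instr 20  ; hypothetical instrument number
    kFund  = 110                        ; fundamental in Hz (assumed value)
    kForm  = 600                        ; formant center in Hz (assumed value)
    kCount = 8                          ; requested pulses per burst (assumed value)
    ; whole pulses of 1/kForm s that fit in one event of 1/kFund s
    kMax   = int(kForm / kFund)         ; 5 for the values above
    if (kCount > kMax) then
        kCount = kMax                   ; keep only the pulses that fit in the event
    endif
    asig vosim 0.5, kFund, kForm, 0.03, kCount, 1, 17
    out asig * 20000                    ; scaling as in the example below (assumes default 0dbfs)
endin
```

If kPulseFactor differs from 1, the pulse lengths are no longer equal and this simple division no longer gives the exact count.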

Vosim was created to emulate voice sounds using a model of the glottal pulse. Rich sounds can be created by combining several instances of vosim with different parameters. One drawback is that the signal is not band-limited, but as the authors point out, high-frequency components are attenuated by 60 dB at 6 times the fundamental frequency. The signal can also be altered by changing the source signal in the lookup table. The technique has historical interest, and can produce rich sound very cheaply (each sample requires only a table lookup and a single multiplication for attenuation).

As stated, formant bandwidth depends on the ratio between pulse burst and silence in an event. But this is not an independent parameter: the fundamental decides the event length, and the formant center defines the pulse length. It is therefore impossible to guarantee a specific burst/silence ratio, since the burst length has to be an integer multiple of the pulse length. The decay of pulses can be used to smooth the transition from N to N±1 pulses, but there will still be steps in the spectral profile of the output. The example code below shows one approach to this.

All input parameters are k-rate. The input parameters are only used to set up each new event (or grain). Event amplitude is fixed for each event at initialization. In normal parameter ranges, when ksmps < 500, the k-rate parameters are updated more often than events are created. In any case, no wide-band noise will be injected into the system due to k-rate inputs being updated less often than they are read, but some other artefacts could be created.

The opcode should behave reasonably in the face of all user inputs. Some details:

  1. kFund < 0: This is forced to positive - no point in "reversed" events.

  2. kFund == 0: This leads to an event of "infinite" length, i.e. a pulse burst followed by a very long, indefinite silence.

  3. kForm == 0: This leads to infinite length pulse, so no pulses are generated (i.e. silence).

  4. kForm < 0: Table is read backward. If the table is symmetric, kForm and -kForm should give bit-identical outputs.

  5. kPulseFactor == 0: Second pulse onwards is zero. See item 3.

  6. kPulseFactor < 0: Pulses alternately read table forward and reversed.

With an asymmetric pulse table there may be some use for negative kForm or negative kPulseFactor.

Examples

Here is an example of the vosim opcode. It uses the file vosim.csd.

Example 1015. Example of the vosim opcode.

See the sections Real-time Audio and Command Line Flags for more information on using command line flags.

<CsoundSynthesizer>
<CsOptions>
; Select audio/midi flags here according to platform
; Audio out   Audio in
-odac           -iadc    ;;;RT audio I/O
; For non-realtime output leave only the line below:
; -o vosim.wav -W ;;; for file output any platform
</CsOptions>
<CsInstruments>
sr     = 44100
ksmps  = 100
nchnls = 1

;#################################################
; By Rasmus Ekman 2008

; Square each point in table #p4. This should only be run once in the performance.
instr 10

	index tableng p4
	index = index - 1  ; start from last point
loop:
	ival table index, p4
	ival = ival * ival
	tableiw ival, index, p4
	index = index - 1
	if index < 0 igoto endloop
		igoto loop
endloop: 
endin

;#################################################

; Main vosim instrument. Sweeps from a fund1/form1 to fund2/form2,
; trying for narrowest formant bandwidth (still quite wide by the looks of it)
; p4:     amp
; p5, p6: fund beg-end
; p7, p8: form beg-end
; p9:     amp decay (ignored)
; p10:    pulse count (ignored - calc internally)
; p11:    pulse length mod
; p12:    skip (for tied events)
; p13:    don't fade out (if followed by tied note)
instr 1
    kamp  init  p4
    ; freq start, end
    kfund  line  p5, p3, p6
    ; formant start, end
    kform  line  p7, p3, p8

	; Try for constant ratio burst/silence, and narrowest formant bandwidth
	kPulseCount  = (kform / kfund)  ;init p10
	; Attempt to smooth steps between formant bandwidths,
	; increasing decay before we are forced to a lower pulse count
	kDecay = kPulseCount/(kform % kfund)  ; init p9
	if (kDecay * kPulseCount) > kamp then
		kDecay = kamp / kPulseCount
	endif
	kDecay = 0.3 * kDecay

	kPulseFactor init p11
	
;  ar	vosim	kamp, kFund, kForm, kDecay, kPulseCount, kPulseFactor, ifn [, iskip]
    ar1	vosim 	kamp, kfund, kform, kDecay, kPulseCount, kPulseFactor, 17, p12

    ; scale amplitude for 16-bit files, with quick fade out
    amp init 20000
    if (p13 != 0) goto nofade
	amp linseg 20000, p3-.02, 20000, .02, 0
nofade:
	out ar1 * amp
endin


</CsInstruments>
<CsScore>

f1       0  32768    9  1    1  0   ; sine wave
f17      0  32768    9  0.5  1  0   ; half sine wave
i10 0 0 17 ; init run only, square table 17

; Vosim score

; Picking some formants from the table in Csound manual

;      p4=amp  fund     form      decay pulses pulsemod [skip] nofade
; tenor a -> e
i1 0  .5  .5   280 240  650  400   .03   5      1
i1 .  .   .3   .   .    1080 1700  .03   5      .
i1 .  .   .2   .   .    2650 2600  .03   5      .
i1 .  .   .15  .   .    2900 3200  .03   5      .

; tenor a -> o
i1 0.6 .2  .5  300 210  650  400   .03   5      1      0      1
i1 .   .   .3  .   .    1080 800   .03   5      .      .      .
i1 .   .   .2  .   .    2650 2600  .03   5      .      .      .
i1 .   .   .15 .   .    2900 2800  .03   5      .      .      .
; tenor o -> aah
i1 .8  .3  .5  210 180  400  650   .03   5      1      1      1
i1 .   .   .3  .   .    800  1080  .03   5      .      .      .
i1 .   .   .2  .   .    2600 2650  .03   5      .      .      .
i1 .   .   .15 .   .    2800 2900  .03   5      .      .      .
; tenor aa -> i
i1 1.1 .2  .5  180 250  650  290   .03   5      1      1      1
i1 .   .   .3  .   .    1080 1870  .03   5      .      .      .
i1 .   .   .2  .   .    2650 2800  .03   5      .      .      .
i1 .   .   .15 .   .    2900 3250  .03   5      .      .      .
; tenor i -> u
i1 1.3 .3  .5  250 270  290  350   .03   5      1      1      0
i1 .   .   .3  .   .    1870 600   .03   5      .      .      .
i1 .   .   .2  .   .    2800 2700  .03   5      .      .      .
i1 .   .   .15 .   .    3250 2900  .03   5      .      .      .

e

</CsScore>
</CsoundSynthesizer>


See also

fof, fof2

Credits

Author: Rasmus Ekman
March 2008