__del__( self )

Eaten by the Python.

Python, Pitch Shifting, and the Pianoputer

| Comments

Record a sound, change its pitch 50 times and assign each new sound to a key of your computer keyboard. You get a Pianoputer !

A sound can be encoded as an array (or list) of values, like this:

To make this sound play twice faster, we remove every second value in the array:

By doing so we didn’t only halved the sound’s duration, we also doubled its frequency, making it higher-pitched than the original.

If on the contrary we repeat each value of the array twice, we produce a sound that is slower, with a longer period, and therefore lower-pitched:

Here is a simple Python function that can change the speed of a sound by any factor:

1
2
3
4
5
6
7
import numpy as np

def speedx(sound_array, factor):
    """ Multiplies the sound's speed by some `factor` """
    indices = np.round( np.arange(0, len(sound_array), factor) )
    indices = indices[indices < len(sound_array)].astype(int)
    return sound_array[ indices.astype(int) ]

What is more difficult to do is to change the duration of a sound while preserving its pitch (sound stretching), or change the pitch of a sound while preserving its duration (pitch shifting).

Sound stretching

Sound stretching can be done using the classical phase vocoder method. You first break the sound into overlapping bits, and you rearrange these bits so that they will overlap even more (if you want to shorten the sound) or less (if you want to stretch the sound), like in this figure:

The difficulty is that the reorganized bits can interfer badly with one another, and some phase-transformation is necessary so that this won’t happen. Here is the Python code, freely rewritten from there:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
def stretch(sound_array, f, window_size, h):
    """ Stretches the sound by a factor `f` """

    phase  = np.zeros(window_size)
    hanning_window = np.hanning(window_size)
    result = np.zeros( len(sound_array) /f + window_size)

    for i in np.arange(0, len(sound_array)-(window_size+h), h*f):

        # two potentially overlapping subarrays
        a1 = sound_array[i: i + window_size]
        a2 = sound_array[i + h: i + window_size + h]

        # resynchronize the second array on the first
        s1 =  np.fft.fft(hanning_window * a1)
        s2 =  np.fft.fft(hanning_window * a2)
        phase = (phase + np.angle(s2/s1)) % 2*np.pi
        a2_rephased = np.fft.ifft(np.abs(s2)*np.exp(1j*phase))

        # add to result
        i2 = int(i/f)
        result[i2 : i2 + window_size] += hanning_window*a2_rephased

    result = ((2**(16-4)) * result/result.max()) # normalize (16bit)

    return result.astype('int16')

Pitch shifting

Pitch-shifting is easy once you have sound stretching. If you want a higer pitch, you first stretch the sound while conserving the pitch, then you speed up the result, such that the final sound has the same duration as the initial one, but a higher pitch due to the speed change.

Doubling the frequency of a sound increases the pitch of one octave, which is 12 musical semitones. Therefore to increase the pitch by $n$ semitones we must multiply the frequency by a factor $2^{\frac{n}{12}}$:

1
2
3
4
5
def pitchshift(snd_array, n, window_size=2**13, h=2**11):
    """ Changes the pitch of a sound by ``n`` semitones. """
    factor = 2**(1.0 * n / 12.0)
    stretched = stretch(snd_array, 1.0/factor, window_size, h)
    return speedx(stretched[window_size:], factor)

Application: the Pianoputer

Let us play around with our new pitch-shifter. We first strike a bowl:

Then we create 50 pitch-shifted derivatives of that sound, ranging from the very low to the very high:

1
2
3
4
5
from scipy.io import wavfile

fps, bowl_sound = wavfile.read("bowl.wav")
tones = range(-25,25)
transposed = [pitchshift(bowl_sound, n) for n in tones]

We will assign each sound to a key of the computer keyboard, following the order in this file, which organizes the keyboard like this:

We simply tell the computer to play the corresponding sound when a key is pressed, and stop the sound when the key is released:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
import pygame

pygame.mixer.init(fps, -16, 1, 512) # so flexible ;)
screen = pygame.display.set_mode((640,480)) # for the focus

# Get a list of the order of the keys of the keyboard in right order.
# ``keys`` is like ['Q','W','E','R' ...] 
keys = open('typewriter.kb').read().split('\n')

sounds = map(pygame.sndarray.make_sound, transposed)
key_sound = dict( zip(keys, sounds) )
is_playing = {k: False for k in keys}

while True:

    event =  pygame.event.wait()

    if event.type in (pygame.KEYDOWN, pygame.KEYUP):
        key = pygame.key.name(event.key)

    if event.type == pygame.KEYDOWN:

        if (key in key_sound.keys()) and (not is_playing[key]):
            key_sound[key].play(fade_ms=50)
            is_playing[key] = True

        elif event.key == pygame.K_ESCAPE:
            pygame.quit()
            raise KeyboardInterrupt

    elif event.type == pygame.KEYUP and key in key_sound.keys():

        key_sound[key].fadeout(50) # stops with 50ms fadeout
        is_playing[key] = False

And we have turned our computer into a piano ! Now, to thank you for reading until there, let me play a little turkish song for you:

Here are all the files you need if you want to try this at home. Since not everyone uses Python, I also coded a pianoputer in Javascript/HTML5 (here) but it is very far from good. It would be really great if an experienced HTML5/JS/elm developer improved it, or rewrote it from scratch.

What next ?

On a more general note, I find that computers have been under-used to produce performance music. I get it that it is easier to use a piano keyboard or record from an instrument directly, but look at what you can do with just a bowl and 60 lines of Python !

Even a cheap computer has so many controls that would make it a proper music station: you can sing to the microphone, make gestures to the webcam, modulate stuff using the mouse, and control the rest from your keyboard. So many ways to express yourself, and there is a Python package for each of them… Any artistic hacker wanting to make steps in that direction ?

Comments