Read and write video frames in Python using FFMPEG

This article shows how easy it is to read or write video frames with a few lines of Python, by calling the external software FFMPEG through pipes. If you want a battle-tested and more sophisticated version, check out my module MoviePy. See also this other article for the same with audio files.

Before we start, you must have FFMPEG installed on your computer and you must know the name (or path) of the FFMPEG binary. It should be one of the following:

FFMPEG_BIN = "ffmpeg" # on Linux ans Mac OS
FFMPEG_BIN = "ffmpeg.exe" # on Windows

Reading

To read the frames of the video “myHolidays.mp4” we first ask FFMPEG to open this file and to direct its output to Python:

import subprocess as sp
command = [ FFMPEG_BIN,
            '-i', 'myHolidays.mp4',
            '-f', 'image2pipe',
            '-pix_fmt', 'rgb24',
            '-vcodec', 'rawvideo', '-']
pipe = sp.Popen(command, stdout = sp.PIPE, bufsize=10**8)

In the code above -i myHolidays.mp4 indicates the input file, while rawvideo/rgb24 asks for a raw RGB output. The format image2pipe and the - at the end tell FFMPEG that it is being used with a pipe by another program. In sp.Popen, the bufsize parameter must be bigger than the size of one frame (see below). It can be omitted most of the time in Python 2 but not in Python 3 where its default value is pretty small.

Now we just have to read the output of FFMPEG. If the video has a size of 420x320 pixels, then the first 420x360x3 bytes outputed by FFMPEG will give the RGB values of the pixels of the first frame, line by line, top to bottom. The next 420x360x3 bytes afer that will represent the second frame, etc. In the next lines we extract one frame and reshape it as a 420x360x3 Numpy array:

import numpy
# read 420*360*3 bytes (= 1 frame)
raw_image = pipe.stdout.read(420*360*3)
# transform the byte read into a numpy array
image =  numpy.fromstring(raw_image, dtype='uint8')
image = image.reshape((360,420,3))
# throw away the data in the pipe's buffer.
pipe.stdout.flush()

You can now view the image with for instance Pylab’s imshow( image ). By repeating the two lines above you can read all the frames of the video one after the other. Reading one frame with this method takes 2 milliseconds on my computer.

What if you want to read the frame that is at time 01h00 in the video ? You could do as above: open the pipe, and read all the frames of the video one by one until you reach that corresponding to t=01h00. But this may be VERY long. A better solution is to call FFMPEG with arguments telling it to start reading “myHolidays.mp4” at time 01h00:

command = [FFMPEG_BIN,
            '-ss', '00:59;59',
            '-i', 'myHolidays.mp4',
            '-ss', '1',
            '-f', 'image2pipe',
            '-pix_fmt', 'rgb24',
            '-vcodec','rawvideo', '-']
pipe = sp.Popen(command, stdout=sp.PIPE, bufsize=10**8)

In the code above we ask FFMPEG to quickly (and imprecisely) reach 00:59:59, then to skip 1 second of movie with precision (-ss 1), so that it will effectively start at 01:00:00 sharp (see this page for more infos).Then you can start reading frames as previously shown. Seeking a frame with this method takes at most 0.1 second on my computer.

You can also get informations on a file (frames size, number of frames per second, etc.) by calling

command = [FFMPEG_BINARY,'-i', 'my_video.mp4', '-']
pipe = sp.Popen(command, stdout=sp.PIPE stderr=sp.PIPE)
pipe.stdout.readline()
pipe.terminate()
infos = proc.stderr.read()

Now infos contains a text describing the file, that you would need to parse to obtain the relevant informations. See the last section for a link to an implementation.

Writing

To write a series of frames of size 460x360 into the file 'my_output_videofile.mp4', we open FFMPEG and indicate that raw RGB data is going to be piped in:

command = [ FFMPEG_BIN,
        '-y', # (optional) overwrite output file if it exists
        '-f', 'rawvideo',
        '-vcodec','rawvideo',
        '-s', '420x360', # size of one frame
        '-pix_fmt', 'rgb24',
        '-r', '24', # frames per second
        '-i', '-', # The imput comes from a pipe
        '-an', # Tells FFMPEG not to expect any audio
        '-vcodec', 'mpeg'",
        'my_output_videofile.mp4' ]

pipe = sp.Popen( command, stdin=sp.PIPE, stderr=sp.PIPE)

The codec of the output video can be any valid FFMPEG codec but for many codecs you will need to provide the bitrate as an additional argument (for instance -bitrate 3000k). Now we can write raw frames one after another in the file. These will be raw frames, like the ones outputed by FFMPEG in the previous section: they should be strings of the form “RGBRGBRGB…” where R,G,B are caracters that represent a number between 0 and 255. If our frame is represented as a Numpy array, we simply write:

pipe.proc.stdin.write( image_array.tostring() )

Going further

I tried to keep the code as simple as possible here. With a few more lines you can make useful classes to manipulate video files, like FFMPEG_VideoReader and FFMPEG_VideoWriter that I wrote for my video editing software. In these files in particular how to parse the information on the video, how to save/load pictures using FFMPEG, etc.

del( self )

Eaten by the Python.

Read and Write Video Frames in Python Using FFMPEG

Reading

Writing

Going further

Comments