madmom.audio.ffmpeg

This module contains functionality for handling audio via ffmpeg.

madmom.audio.ffmpeg.decode_to_disk(infile, fmt='f32le', sample_rate=None, num_channels=1, skip=None, max_len=None, outfile=None, tmp_dir=None, tmp_suffix=None, cmd='ffmpeg')

Decodes the given audio file, optionally down-mixes it to mono and writes it to another file as a sequence of samples. Returns the file name of the output file.

Parameters:

infile : str

Name of the audio file to decode.

fmt : {‘f32le’, ‘s16le’}, optional

Format of the samples:
- ‘f32le’ for float32, little-endian,
- ‘s16le’ for signed 16-bit int, little-endian.

sample_rate : int, optional

Sample rate to re-sample the signal to (if set) [Hz].

num_channels : int, optional

Number of channels to reduce the signal to.

skip : float, optional

Number of seconds to skip at beginning of file.

max_len : float, optional

Maximum length in seconds to decode.

outfile : str, optional

The file to write the decoded samples to; if not given, a temporary file is created.

tmp_dir : str, optional

The directory to create the temporary file in (if no outfile is given).

tmp_suffix : str, optional

The file suffix for the temporary file if no outfile is given, e.g. ‘.pcm’ (including the dot).

cmd : {‘ffmpeg’, ‘avconv’}, optional

Decoding command (defaults to ffmpeg, alternatively supports avconv).

Returns:

outfile : str

The output file name.
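The file written by decode_to_disk can be read back with the standard library alone. The following is a minimal sketch, assuming the output holds raw, headerless samples in the requested fmt; the helper name read_raw_samples and the TYPE_CODES table are illustrative and not part of madmom.

```python
import sys
from array import array

# Map the documented sample formats to `array` type codes (assumed layout:
# 'f' is a 4-byte float, 'h' a 2-byte signed integer on common platforms).
TYPE_CODES = {"f32le": "f", "s16le": "h"}


def read_raw_samples(path, fmt="f32le"):
    """Read a headerless sample file into a list of Python numbers."""
    samples = array(TYPE_CODES[fmt])
    with open(path, "rb") as handle:
        samples.frombytes(handle.read())
    # The file is little-endian; swap bytes on big-endian machines.
    if sys.byteorder == "big":
        samples.byteswap()
    return samples.tolist()
```

With madmom and ffmpeg installed, the path returned by a call such as decode_to_disk('example.wav', fmt='s16le') (a hypothetical input file) could be passed to read_raw_samples(path, 's16le').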

madmom.audio.ffmpeg.decode_to_memory(infile, fmt='f32le', sample_rate=None, num_channels=1, skip=None, max_len=None, cmd='ffmpeg')

Decodes the given audio file, down-mixes it to num_channels channels (mono by default) and returns the samples as a binary string.

Parameters:

infile : str

Name of the audio file to decode.

fmt : {‘f32le’, ‘s16le’}, optional

Format of the samples:
- ‘f32le’ for float32, little-endian,
- ‘s16le’ for signed 16-bit int, little-endian.

sample_rate : int, optional

Sample rate to re-sample the signal to (if set) [Hz].

num_channels : int, optional

Number of channels to reduce the signal to.

skip : float, optional

Number of seconds to skip at beginning of file.

max_len : float, optional

Maximum length in seconds to decode.

cmd : {‘ffmpeg’, ‘avconv’}, optional

Decoding command (defaults to ffmpeg, alternatively supports avconv).

Returns:

samples : str

Binary string of samples.
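The returned binary string has to be interpreted according to fmt before it can be used as numbers. A minimal sketch using only the struct module follows; unpack_samples, duration_seconds and SAMPLE_LAYOUT are hypothetical helpers, and the byte sizes are derived from the format names documented above.

```python
import struct

# Bytes per sample and struct format character for each documented format.
SAMPLE_LAYOUT = {"f32le": (4, "f"), "s16le": (2, "h")}


def unpack_samples(raw, fmt="f32le"):
    """Turn a binary string of little-endian samples into a tuple."""
    size, code = SAMPLE_LAYOUT[fmt]
    count = len(raw) // size
    # '<' forces little-endian regardless of the host byte order.
    return struct.unpack("<%d%s" % (count, code), raw[:count * size])


def duration_seconds(raw, sample_rate, fmt="f32le", num_channels=1):
    """Length of the decoded audio in seconds."""
    size, _ = SAMPLE_LAYOUT[fmt]
    return len(raw) / (size * num_channels * sample_rate)
```

Note that duration_seconds needs the sample rate of the decoded data, so it is most useful when sample_rate was passed explicitly to decode_to_memory.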

madmom.audio.ffmpeg.decode_to_pipe(infile, fmt='f32le', sample_rate=None, num_channels=1, skip=None, max_len=None, buf_size=-1, cmd='ffmpeg')

Decodes the given audio file, down-mixes it to num_channels channels (mono by default) and returns a file-like object for reading the samples, as well as a process object. To stop decoding the file, call close() on the returned file-like object, then call wait() on the returned process object.

Parameters:

infile : str

Name of the audio file to decode.

fmt : {‘f32le’, ‘s16le’}, optional

Format of the samples:
- ‘f32le’ for float32, little-endian,
- ‘s16le’ for signed 16-bit int, little-endian.

sample_rate : int, optional

Sample rate to re-sample the signal to (if set) [Hz].

num_channels : int, optional

Number of channels to reduce the signal to.

skip : float, optional

Number of seconds to skip at beginning of file.

max_len : float, optional

Maximum length in seconds to decode.

buf_size : int, optional

Size of buffer for the file-like object:
- ‘-1’ means OS default (default),
- ‘0’ means unbuffered,
- ‘1’ means line-buffered,
- any other value is the buffer size in bytes.

cmd : {‘ffmpeg’, ‘avconv’}, optional

Decoding command (defaults to ffmpeg, alternatively supports avconv).

Returns:

pipe : file-like object

File-like object for reading the decoded samples.

proc : process object

Process object for the decoding process.
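The shutdown order described above (close() on the pipe, then wait() on the process) is easy to get wrong when reading is interrupted. The following sketch wraps it in a try/finally; read_chunks is a hypothetical helper that works with any file-like object and process object pair, such as the one decode_to_pipe returns.

```python
def read_chunks(pipe, proc, chunk_bytes=4096, max_chunks=None):
    """Collect chunks from `pipe`, then shut the decoder down cleanly."""
    chunks = []
    try:
        while max_chunks is None or len(chunks) < max_chunks:
            chunk = pipe.read(chunk_bytes)
            if not chunk:  # end of stream
                break
            chunks.append(chunk)
    finally:
        # Order matters: close the pipe first, then wait for the process.
        pipe.close()
        proc.wait()
    return chunks
```

With madmom and ffmpeg installed, it could be called as read_chunks(*decode_to_pipe('example.wav'), max_chunks=10), where 'example.wav' is a placeholder file name.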

madmom.audio.ffmpeg.get_file_info(infile, cmd='ffprobe')

Extracts and returns information about the given audio file.

Parameters:

infile : str

Name of the audio file.

cmd : {‘ffprobe’, ‘avprobe’}, optional

Probing command (defaults to ffprobe, alternatively supports avprobe).

Returns:

dict

Audio file information.
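Since the probing command may be either ffprobe or avprobe, it can help to check which one is installed before calling get_file_info. A minimal sketch with shutil.which follows; first_available is a hypothetical helper and not part of madmom.

```python
import shutil


def first_available(candidates=("ffprobe", "avprobe")):
    """Return the first command name found on PATH, or None."""
    for name in candidates:
        if shutil.which(name) is not None:
            return name
    return None
```

The result could then be passed as the cmd argument, for example get_file_info(infile, cmd=first_available()), after checking that it is not None. The same helper works for the decoding commands with candidates=('ffmpeg', 'avconv').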

madmom.audio.ffmpeg.load_ffmpeg_file(filename, sample_rate=None, num_channels=None, start=None, stop=None, dtype=None, cmd_decode='ffmpeg', cmd_probe='ffprobe')

Load the audio data from the given file and return it as a numpy array.

This uses ffmpeg (or avconv) and thus supports a lot of different file formats, resampling and channel conversions. The file will be fully decoded into memory if no start and stop positions are given.

Parameters:

filename : str

Name of the audio file to load.

sample_rate : int, optional

Sample rate to re-sample the signal to [Hz]; ‘None’ returns the signal in its original rate.

num_channels : int, optional

Reduce or expand the signal to num_channels channels; ‘None’ returns the signal with its original channels.

start : float, optional

Start position [seconds].

stop : float, optional

Stop position [seconds].

dtype : numpy dtype, optional

Numpy dtype to return the signal in (supports signed and unsigned 8/16/32-bit integers, and single and double precision floats, each in little or big endian). If ‘None’, np.int16 is used.

cmd_decode : {‘ffmpeg’, ‘avconv’}, optional

Decoding command (defaults to ffmpeg, alternatively supports avconv).

cmd_probe : {‘ffprobe’, ‘avprobe’}, optional

Probing command (defaults to ffprobe, alternatively supports avprobe).

Returns:

signal : numpy array

Audio samples.

sample_rate : int

Sample rate of the audio samples.
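load_ffmpeg_file takes start and stop positions, whereas the decode_* functions take skip and max_len. One plausible way to translate between the two is sketched below; window_to_skip_and_max_len is a hypothetical helper for illustration and is not claimed to match what madmom does internally.

```python
def window_to_skip_and_max_len(start=None, stop=None):
    """Translate a [start, stop] window in seconds to (skip, max_len)."""
    skip = start
    max_len = None
    if stop is not None:
        # Decode only up to `stop`, measured from where decoding begins.
        max_len = stop - (start or 0.0)
        if max_len < 0:
            raise ValueError("stop must not lie before start")
    return skip, max_len
```

For example, a window from 1.5 s to 4.0 s corresponds to skipping 1.5 s and decoding at most 2.5 s.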