madmom.audio.ffmpeg
This module provides audio handling functionality via ffmpeg.
madmom.audio.ffmpeg.decode_to_disk(infile, fmt='f32le', sample_rate=None, num_channels=1, skip=None, max_len=None, outfile=None, tmp_dir=None, tmp_suffix=None, cmd='ffmpeg')
Decodes the given audio file, optionally down-mixes it to mono and writes it to another file as a sequence of samples. Returns the file name of the output file.
Parameters: infile : str
Name of the audio file to decode.
fmt : {‘f32le’, ‘s16le’}, optional
Format of the samples:
- ‘f32le’ for float32, little-endian,
- ‘s16le’ for signed 16-bit int, little-endian.
sample_rate : int, optional
Sample rate to re-sample the signal to (if set) [Hz].
num_channels : int, optional
Number of channels to reduce the signal to.
skip : float, optional
Number of seconds to skip at beginning of file.
max_len : float, optional
Maximum length in seconds to decode.
outfile : str, optional
The file to write the decoded samples to; if not given, a temporary file is created.
tmp_dir : str, optional
The directory to create the temporary file in (if no outfile is given).
tmp_suffix : str, optional
The file suffix for the temporary file if no outfile is given, e.g. ‘.pcm’ (including the dot).
cmd : {‘ffmpeg’, ‘avconv’}, optional
Decoding command (defaults to ffmpeg, alternatively supports avconv).
Returns: outfile : str
The output file name.
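The parameters above correspond closely to standard ffmpeg command-line options. The following is a minimal sketch of how such a decoding command could be assembled. `build_decode_command` is a hypothetical helper, not part of madmom, and the exact arguments madmom passes to ffmpeg may differ.

```python
def build_decode_command(infile, outfile, fmt='f32le', sample_rate=None,
                         num_channels=1, skip=None, max_len=None,
                         cmd='ffmpeg'):
    """Assemble an ffmpeg-style call as an argument list (illustrative only)."""
    # -v quiet suppresses log output, -y overwrites an existing output file
    call = [cmd, '-v', 'quiet', '-y']
    if skip:
        # -ss placed before -i seeks within the input [seconds]
        call += ['-ss', '%f' % float(skip)]
    # -f selects the raw sample format of the output
    call += ['-i', infile, '-f', fmt]
    if max_len:
        # -t limits the decoded duration [seconds]
        call += ['-t', '%f' % float(max_len)]
    if num_channels:
        # -ac sets the number of output channels
        call += ['-ac', str(int(num_channels))]
    if sample_rate:
        # -ar sets the output sample rate [Hz]
        call += ['-ar', str(int(sample_rate))]
    call.append(outfile)
    return call
```

A list built this way could be passed to `subprocess.check_call`. With madmom itself, a call such as `decode_to_disk('in.mp3', sample_rate=44100, outfile='out.pcm')` covers the same ground using the parameters documented above.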
madmom.audio.ffmpeg.decode_to_memory(infile, fmt='f32le', sample_rate=None, num_channels=1, skip=None, max_len=None, cmd='ffmpeg')
Decodes the given audio file, optionally down-mixes it to mono and returns the samples as a binary string.
Parameters: infile : str
Name of the audio file to decode.
fmt : {‘f32le’, ‘s16le’}, optional
Format of the samples:
- ‘f32le’ for float32, little-endian,
- ‘s16le’ for signed 16-bit int, little-endian.
sample_rate : int, optional
Sample rate to re-sample the signal to (if set) [Hz].
num_channels : int, optional
Number of channels to reduce the signal to.
skip : float, optional
Number of seconds to skip at beginning of file.
max_len : float, optional
Maximum length in seconds to decode.
cmd : {‘ffmpeg’, ‘avconv’}, optional
Decoding command (defaults to ffmpeg, alternatively supports avconv).
Returns: samples : str
Binary string of samples.
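Because the result is a raw binary string, it has to be interpreted according to `fmt` before use. The sketch below unpacks such a string with the standard library. `unpack_samples` and `decode_file_to_samples` are hypothetical helpers introduced for illustration, not part of madmom.

```python
import struct

# Bytes per sample and struct type code for the two documented formats.
_SAMPLE_FORMATS = {'f32le': (4, 'f'), 's16le': (2, 'h')}


def unpack_samples(data, fmt='f32le'):
    """Convert a binary string of samples into a list of Python numbers."""
    size, code = _SAMPLE_FORMATS[fmt]
    count = len(data) // size
    # '<' selects little-endian byte order; a trailing partial sample is dropped
    return list(struct.unpack('<%d%s' % (count, code), data[:count * size]))


def decode_file_to_samples(infile, fmt='f32le'):
    """Decode a file with madmom and unpack the result (requires ffmpeg)."""
    # Imported lazily so that unpack_samples works without madmom installed.
    from madmom.audio.ffmpeg import decode_to_memory
    return unpack_samples(decode_to_memory(infile, fmt=fmt), fmt)
```

With NumPy available, `np.frombuffer(data, dtype='<f4')` (or `'<i2'` for ‘s16le’) yields the same values as an array. If more than one channel is requested, raw PCM output of this kind is interleaved, so the samples of the individual channels alternate.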
madmom.audio.ffmpeg.decode_to_pipe(infile, fmt='f32le', sample_rate=None, num_channels=1, skip=None, max_len=None, buf_size=-1, cmd='ffmpeg')
Decodes the given audio file, optionally down-mixes it to mono and returns a file-like object for reading the samples, as well as a process object. To stop decoding the file, call close() on the returned file-like object, then call wait() on the returned process object.
Parameters: infile : str
Name of the audio file to decode.
fmt : {‘f32le’, ‘s16le’}, optional
Format of the samples:
- ‘f32le’ for float32, little-endian,
- ‘s16le’ for signed 16-bit int, little-endian.
sample_rate : int, optional
Sample rate to re-sample the signal to (if set) [Hz].
num_channels : int, optional
Number of channels to reduce the signal to.
skip : float, optional
Number of seconds to skip at beginning of file.
max_len : float, optional
Maximum length in seconds to decode.
buf_size : int, optional
Size of the buffer for the file-like object:
- ‘-1’ means OS default (default),
- ‘0’ means unbuffered,
- ‘1’ means line-buffered,
- any other value is the buffer size in bytes.
cmd : {‘ffmpeg’, ‘avconv’}, optional
Decoding command (defaults to ffmpeg, alternatively supports avconv).
Returns: pipe : file-like object
File-like object for reading the decoded samples.
proc : process object
Process object for the decoding process.
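The close-then-wait sequence described above is easy to get wrong when reading stops early. The following sketch wraps it in a generator so that the cleanup always runs. `read_chunks`, `stream_file` and `_FakeProcess` are hypothetical names used for illustration; the stand-in process only exists so the helper can be tried without ffmpeg.

```python
import io


def read_chunks(pipe, proc, chunk_bytes=4096):
    """Yield raw byte chunks from pipe, then close it and wait for proc."""
    try:
        while True:
            chunk = pipe.read(chunk_bytes)
            if not chunk:
                break  # end of stream
            yield chunk
    finally:
        # Documented shutdown order: close the pipe first, then wait.
        pipe.close()
        proc.wait()


def stream_file(infile, chunk_bytes=4096):
    """Stream a decoded file in chunks using madmom (requires ffmpeg)."""
    # Imported lazily so that read_chunks works without madmom installed.
    from madmom.audio.ffmpeg import decode_to_pipe
    pipe, proc = decode_to_pipe(infile)
    return read_chunks(pipe, proc, chunk_bytes)


class _FakeProcess(object):
    """Stand-in for the process object, used only to exercise read_chunks."""

    def __init__(self):
        self.waited = False

    def wait(self):
        self.waited = True
        return 0


# Small demonstration with an in-memory stream instead of a real decoder.
demo_pipe = io.BytesIO(b'abcdefghij')
demo_proc = _FakeProcess()
demo_chunks = list(read_chunks(demo_pipe, demo_proc, chunk_bytes=4))
```

Since the cleanup sits in a `finally` clause, it also runs when the consumer abandons the generator before the stream is exhausted and the generator is closed.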
madmom.audio.ffmpeg.get_file_info(infile, cmd='ffprobe')
Extracts and returns information about the given audio file.
Parameters: infile : str
Name of the audio file.
cmd : {‘ffprobe’, ‘avprobe’}, optional
Probing command (defaults to ffprobe, alternatively supports avprobe).
Returns: dict
Audio file information.
madmom.audio.ffmpeg.load_ffmpeg_file(filename, sample_rate=None, num_channels=None, start=None, stop=None, dtype=None, cmd_decode='ffmpeg', cmd_probe='ffprobe')
Load the audio data from the given file and return it as a numpy array.
This uses ffmpeg (or avconv) and thus supports many file formats as well as resampling and channel conversions. The file will be fully decoded into memory if no start and stop positions are given.
Parameters: filename : str
Name of the audio file to load.
sample_rate : int, optional
Sample rate to re-sample the signal to [Hz]; ‘None’ returns the signal in its original rate.
num_channels : int, optional
Reduce or expand the signal to num_channels channels; ‘None’ returns the signal with its original channels.
start : float, optional
Start position [seconds].
stop : float, optional
Stop position [seconds].
dtype : numpy dtype, optional
Numpy dtype to return the signal in (supports signed and unsigned 8/16/32-bit integers, and single and double precision floats, each in little or big endian). If ‘None’, np.int16 is used.
cmd_decode : {‘ffmpeg’, ‘avconv’}, optional
Decoding command (defaults to ffmpeg, alternatively supports avconv).
cmd_probe : {‘ffprobe’, ‘avprobe’}, optional
Probing command (defaults to ffprobe, alternatively supports avprobe).
Returns: signal : numpy array
Audio samples.
sample_rate : int
Sample rate of the audio samples.
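As a usage illustration, the sketch below loads an excerpt and relates the number of returned samples to the sample rate. `load_excerpt` and `duration_seconds` are hypothetical helpers, not part of madmom, and the sketch assumes that the first dimension of the returned array indexes samples.

```python
def duration_seconds(num_samples, sample_rate):
    """Duration [seconds] of num_samples samples at the given rate [Hz]."""
    return num_samples / float(sample_rate)


def load_excerpt(filename, start, stop, sample_rate=22050):
    """Load a mono excerpt with madmom (requires ffmpeg)."""
    # Imported lazily so that duration_seconds works without madmom installed.
    from madmom.audio.ffmpeg import load_ffmpeg_file
    signal, rate = load_ffmpeg_file(filename, sample_rate=sample_rate,
                                    num_channels=1, start=start, stop=stop)
    # len(signal) counts samples under the assumption stated above
    return signal, rate, duration_seconds(len(signal), rate)
```

For an excerpt requested with `start=10` and `stop=12`, the computed duration should come out close to 2 seconds, which gives a quick sanity check on the returned signal.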