madmom.io.audio
This module contains audio input/output functionality.
-
exception madmom.io.audio.LoadAudioFileError(value=None)
Exception to be raised whenever an audio file could not be loaded.
-
madmom.io.audio.decode_to_disk(infile, fmt='f32le', sample_rate=None, num_channels=1, skip=None, max_len=None, outfile=None, tmp_dir=None, tmp_suffix=None, cmd='ffmpeg')
Decode the given audio file to another file.
Parameters: - infile : str
Name of the audio sound file to decode.
- fmt : {‘f32le’, ‘s16le’}, optional
Format of the samples: - ‘f32le’ for float32, little-endian, - ‘s16le’ for signed 16-bit int, little-endian.
- sample_rate : int, optional
Sample rate to re-sample the signal to (if set) [Hz].
- num_channels : int, optional
Number of channels to reduce the signal to.
- skip : float, optional
Number of seconds to skip at beginning of file.
- max_len : float, optional
Maximum length in seconds to decode.
- outfile : str, optional
The file to decode the sound file to; if not given, a temporary file will be created.
- tmp_dir : str, optional
The directory to create the temporary file in (if no outfile is given).
- tmp_suffix : str, optional
The file suffix for the temporary file if no outfile is given; e.g. “.pcm” (including the dot).
- cmd : {‘ffmpeg’, ‘avconv’}, optional
Decoding command (defaults to ffmpeg, alternatively supports avconv).
Returns: - outfile : str
The output file name.
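As a rough sketch of how these parameters interact, the helper below estimates the size of the decoded output for a given fmt, sample_rate, num_channels and max_len. It assumes the output is raw, headerless PCM, which the sample formats and the ".pcm" suffix suggest but this page does not state. `expected_pcm_bytes` and `decode_excerpt` are hypothetical names; the madmom import sits inside the function so that it only happens when the function is called.

```python
# Bytes per sample for the two documented sample formats.
BYTES_PER_SAMPLE = {'f32le': 4, 's16le': 2}


def expected_pcm_bytes(seconds, sample_rate, num_channels=1, fmt='f32le'):
    """Estimate the size in bytes of headerless PCM output (assumption)."""
    num_frames = int(round(seconds * sample_rate))
    return num_frames * num_channels * BYTES_PER_SAMPLE[fmt]


def decode_excerpt(infile, outfile=None):
    """Skip 5 s, then decode 10 s to mono 16-bit samples at 44.1 kHz."""
    from madmom.io.audio import decode_to_disk  # needs madmom and ffmpeg
    return decode_to_disk(infile, fmt='s16le', sample_rate=44100,
                          num_channels=1, skip=5.0, max_len=10.0,
                          outfile=outfile, tmp_suffix='.pcm')


print(expected_pcm_bytes(10.0, 44100, num_channels=1, fmt='s16le'))  # 882000
```

If no outfile is passed, `decode_excerpt` returns the name of the temporary file that was created; removing that file afterwards is left to the caller in this sketch.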
-
madmom.io.audio.decode_to_pipe(infile, fmt='f32le', sample_rate=None, num_channels=1, skip=None, max_len=None, buf_size=-1, cmd='ffmpeg')
Decode the given audio and return a file-like object for reading the samples, as well as a process object.
Parameters: - infile : str
Name of the audio sound file to decode.
- fmt : {‘f32le’, ‘s16le’}, optional
Format of the samples: - ‘f32le’ for float32, little-endian, - ‘s16le’ for signed 16-bit int, little-endian.
- sample_rate : int, optional
Sample rate to re-sample the signal to (if set) [Hz].
- num_channels : int, optional
Number of channels to reduce the signal to.
- skip : float, optional
Number of seconds to skip at beginning of file.
- max_len : float, optional
Maximum length in seconds to decode.
- buf_size : int, optional
Size of buffer for the file-like object: - ‘-1’ means OS default (default), - ‘0’ means unbuffered, - ‘1’ means line-buffered, any other value is the buffer size in bytes.
- cmd : {‘ffmpeg’, ‘avconv’}, optional
Decoding command (defaults to ffmpeg, alternatively supports avconv).
Returns: - pipe : file-like object
File-like object for reading the decoded samples.
- proc : process object
Process object for the decoding process.
Notes
To stop decoding the file, call close() on the returned file-like object, then call wait() on the returned process object.
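The close-then-wait sequence from the note can be sketched as follows. `read_samples` and `peek` are hypothetical helpers, not part of madmom; the struct codes are derived from the two documented sample formats, and the reader is demonstrated on an in-memory stand-in so that it can be tried without ffmpeg.

```python
import io
import struct

# struct codes for the two documented sample formats (both little-endian).
STRUCT_CODES = {'f32le': 'f', 's16le': 'h'}


def read_samples(pipe, num_samples, fmt='f32le'):
    """Read up to num_samples samples from a binary file-like object."""
    code = STRUCT_CODES[fmt]
    size = struct.calcsize('<' + code)
    data = pipe.read(num_samples * size)
    count = len(data) // size  # ignore a trailing partial sample
    return list(struct.unpack('<%d%s' % (count, code), data[:count * size]))


def peek(infile, num_samples=1024):
    """Decode only the first samples of a file, then stop the decoder."""
    from madmom.io.audio import decode_to_pipe  # needs madmom and ffmpeg
    pipe, proc = decode_to_pipe(infile, fmt='f32le', num_channels=1)
    try:
        return read_samples(pipe, num_samples)
    finally:
        pipe.close()  # close the file-like object first ...
        proc.wait()   # ... then wait for the decoding process to end


# Stand-in for a real pipe: three float32 samples in memory.
fake_pipe = io.BytesIO(struct.pack('<3f', 0.0, 0.5, -1.0))
print(read_samples(fake_pipe, 2))  # [0.0, 0.5]
```

Putting close() and wait() in a finally clause means the decoding process is also stopped when reading fails part-way through.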
-
madmom.io.audio.decode_to_memory(infile, fmt='f32le', sample_rate=None, num_channels=1, skip=None, max_len=None, cmd='ffmpeg')
Decode the given audio and return it as a binary string representation.
Parameters: - infile : str
Name of the audio sound file to decode.
- fmt : {‘f32le’, ‘s16le’}, optional
Format of the samples: - ‘f32le’ for float32, little-endian, - ‘s16le’ for signed 16-bit int, little-endian.
- sample_rate : int, optional
Sample rate to re-sample the signal to (if set) [Hz].
- num_channels : int, optional
Number of channels to reduce the signal to.
- skip : float, optional
Number of seconds to skip at beginning of file.
- max_len : float, optional
Maximum length in seconds to decode.
- cmd : {‘ffmpeg’, ‘avconv’}, optional
Decoding command (defaults to ffmpeg, alternatively supports avconv).
Returns: - samples : str
Binary string representation of the audio samples.
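One possible way to turn the returned binary string into numbers is sketched below. It assumes the result is a bytes object on Python 3 and that multi-channel data is interleaved frame by frame, as is usual for raw PCM; neither detail is stated on this page. `bytes_to_frames` and `load_stereo_frames` are hypothetical names.

```python
import struct


def bytes_to_frames(data, num_channels=1, fmt='f32le'):
    """Split raw little-endian samples into one list per frame.

    Assumes interleaved channels: [ch0, ch1, ch0, ch1, ...].
    """
    code = {'f32le': 'f', 's16le': 'h'}[fmt]
    frame = struct.Struct('<' + code * num_channels)
    usable = len(data) - len(data) % frame.size  # drop an incomplete frame
    return [list(values) for values in frame.iter_unpack(data[:usable])]


def load_stereo_frames(infile):
    """Decode a file to stereo 16-bit samples and group them by frame."""
    from madmom.io.audio import decode_to_memory  # needs madmom and ffmpeg
    data = decode_to_memory(infile, fmt='s16le', num_channels=2)
    return bytes_to_frames(data, num_channels=2, fmt='s16le')


# Two stereo frames packed by hand, so no decoder is needed here.
raw = struct.pack('<4h', 1, 2, 3, 4)
print(bytes_to_frames(raw, num_channels=2, fmt='s16le'))  # [[1, 2], [3, 4]]
```

For large signals, numpy.frombuffer with a matching little-endian dtype followed by a reshape to (-1, num_channels) would be the more efficient route.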
-
madmom.io.audio.get_file_info(infile, cmd='ffprobe')
Extract and return information about audio files.
Parameters: - infile : str
Name of the audio file.
- cmd : {‘ffprobe’, ‘avprobe’}, optional
Probing command (defaults to ffprobe, alternatively supports avprobe).
Returns: - dict
Audio file information.
-
madmom.io.audio.load_ffmpeg_file(filename, sample_rate=None, num_channels=None, start=None, stop=None, dtype=None, cmd_decode='ffmpeg', cmd_probe='ffprobe')
Load the audio data from the given file and return it as a numpy array.
This uses ffmpeg (or avconv) and thus supports a lot of different file formats, resampling and channel conversions. The file will be fully decoded into memory if no start and stop positions are given.
Parameters: - filename : str
Name of the audio sound file to load.
- sample_rate : int, optional
Sample rate to re-sample the signal to [Hz]; ‘None’ returns the signal in its original rate.
- num_channels : int, optional
Reduce or expand the signal to num_channels channels; ‘None’ returns the signal with its original channels.
- start : float, optional
Start position [seconds].
- stop : float, optional
Stop position [seconds].
- dtype : numpy dtype, optional
Numpy dtype to return the signal in (supports signed and unsigned 8/16/32-bit integers, and single and double precision floats, each in little or big endian). If ‘None’, np.int16 is used.
- cmd_decode : {‘ffmpeg’, ‘avconv’}, optional
Decoding command (defaults to ffmpeg, alternatively supports avconv).
- cmd_probe : {‘ffprobe’, ‘avprobe’}, optional
Probing command (defaults to ffprobe, alternatively supports avprobe).
Returns: - signal : numpy array
Audio samples.
- sample_rate : int
Sample rate of the audio samples.
-
madmom.io.audio.load_wave_file(filename, sample_rate=None, num_channels=None, start=None, stop=None, dtype=None)
Load the audio data from the given file and return it as a numpy array.
Only supports wave files; it does not support re-sampling or arbitrary channel number conversions. Reads the data as a memory-mapped file with copy-on-write semantics to defer I/O costs until needed.
Parameters: - filename : str
Name of the file.
- sample_rate : int, optional
Desired sample rate of the signal [Hz], or ‘None’ to return the signal in its original rate.
- num_channels : int, optional
Reduce or expand the signal to num_channels channels, or ‘None’ to return the signal with its original channels.
- start : float, optional
Start position [seconds].
- stop : float, optional
Stop position [seconds].
- dtype : numpy data type, optional
The data is returned with the given dtype. If ‘None’, it is returned with its original dtype, otherwise the signal gets rescaled. Integer dtypes use the complete value range, float dtypes the range [-1, +1].
Returns: - signal : numpy array
Audio signal.
- sample_rate : int
Sample rate of the signal [Hz].
Notes
The start and stop positions are rounded to the closest sample; the sample corresponding to the stop value is not returned, thus consecutive segments, each starting at the previous stop, can be concatenated to obtain the original signal without gaps or overlaps.
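The half-open segment behaviour described in this note can be illustrated without any audio file. The sketch below is not madmom's implementation: `segment_bounds` is a hypothetical helper that rounds with Python's built-in round() and clips to the signal length, and the 4 Hz sample rate is chosen only to keep the numbers small.

```python
def segment_bounds(start, stop, sample_rate, num_samples):
    """Convert start/stop [seconds] to a half-open sample range."""
    first = 0 if start is None else int(round(start * sample_rate))
    last = num_samples if stop is None else int(round(stop * sample_rate))
    # Clip both indices to the valid range and keep first <= last.
    first = min(max(first, 0), num_samples)
    last = min(max(last, first), num_samples)
    return first, last


signal = list(range(10))  # ten samples at a hypothetical 4 Hz
sample_rate = 4
cuts = [None, 0.8, 1.6, None]  # each stop is reused as the next start

segments = []
for start, stop in zip(cuts[:-1], cuts[1:]):
    first, last = segment_bounds(start, stop, sample_rate, len(signal))
    segments.append(signal[first:last])  # the stop sample is excluded

print(segments)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]
```

Because each stop index is excluded from its segment and reused as the next start index, joining the segments gives back all ten samples exactly once.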
-
madmom.io.audio.write_wave_file(signal, filename, sample_rate=None)
Write the signal to disk as a .wav file.
Parameters: - signal : numpy array or Signal
The signal to be written to file.
- filename : str
Name of the file.
- sample_rate : int, optional
Sample rate of the signal [Hz].
Returns: - filename : str
Name of the file.
Notes
sample_rate can be ‘None’ if signal is a Signal instance. If set, the given sample_rate is used instead of the signal’s sample rate. Must be given if signal is an ndarray.
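The precedence rule in this note can be written down as a small stand-alone function. This is only a sketch of the documented behaviour: `resolve_sample_rate` and `FakeSignal` are hypothetical, and raising ValueError for a missing rate is an assumption rather than something the library is documented to do.

```python
def resolve_sample_rate(signal, sample_rate=None):
    """Pick the sample rate to write with, following the note above."""
    if sample_rate is not None:
        return sample_rate  # an explicit value always wins
    rate = getattr(signal, 'sample_rate', None)
    if rate is None:
        # Assumption: plain arrays carry no rate, so one must be given.
        raise ValueError('sample_rate must be given for plain arrays')
    return rate


class FakeSignal(list):
    """Stand-in for a Signal instance that carries its own sample rate."""
    sample_rate = 44100


print(resolve_sample_rate(FakeSignal([0, 1])))         # 44100
print(resolve_sample_rate(FakeSignal([0, 1]), 22050))  # 22050
```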
-
madmom.io.audio.load_audio_file(filename, sample_rate=None, num_channels=None, start=None, stop=None, dtype=None)
Load the audio data from the given file and return it as a numpy array. This tries load_wave_file() first and then falls back to load_ffmpeg_file() (for ffmpeg and avconv).
Parameters: - filename : str or file handle
Name of the file or file handle.
- sample_rate : int, optional
Desired sample rate of the signal [Hz], or ‘None’ to return the signal in its original rate.
- num_channels : int, optional
Reduce or expand the signal to num_channels channels, or ‘None’ to return the signal with its original channels.
- start : float, optional
Start position [seconds].
- stop : float, optional
Stop position [seconds].
- dtype : numpy data type, optional
The data is returned with the given dtype. If ‘None’, it is returned with its original dtype, otherwise the signal gets rescaled. Integer dtypes use the complete value range, float dtypes the range [-1, +1].
Returns: - signal : numpy array
Audio signal.
- sample_rate : int
Sample rate of the signal [Hz].
Notes
For wave files, the start and stop positions are rounded to the closest sample; the sample corresponding to the stop value is not returned, thus consecutive segments, each starting at the previous stop, can be concatenated to obtain the original signal without gaps or overlaps. For all other audio files, this cannot be guaranteed.
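The "try one loader, then the next" strategy of this function can be sketched generically. Everything in the block is illustrative: `LoadError` stands in for LoadAudioFileError, `wave_only` and `anything` are toy loaders, and falling back on any exception is an assumption (the real function may react only to specific error types).

```python
class LoadError(Exception):
    """Stand-in for madmom.io.audio.LoadAudioFileError."""


def load_with_fallback(filename, loaders, **kwargs):
    """Return the result of the first loader that succeeds."""
    errors = []
    for loader in loaders:
        try:
            return loader(filename, **kwargs)
        except Exception as exc:  # assumption: any failure triggers fallback
            errors.append('%s: %s' % (loader.__name__, exc))
    raise LoadError('could not load %r (%s)' % (filename, '; '.join(errors)))


def wave_only(filename, **kwargs):
    """Toy loader that accepts only names ending in .wav."""
    if not filename.endswith('.wav'):
        raise ValueError('not a wave file')
    return 'wave', 44100


def anything(filename, **kwargs):
    """Toy loader that accepts every file name."""
    return 'ffmpeg', 44100


print(load_with_fallback('a.wav', [wave_only, anything]))  # ('wave', 44100)
print(load_with_fallback('a.mp3', [wave_only, anything]))  # ('ffmpeg', 44100)
```

Collecting the individual error messages keeps the final exception informative when every loader fails.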