madmom.audio.stft
This module contains Short-Time Fourier Transform (STFT) related functionality.
-
madmom.audio.stft.fft_frequencies(num_fft_bins, sample_rate)
Frequencies of the FFT bins.
Parameters: num_fft_bins : int
Number of FFT bins (i.e. half the FFT length).
sample_rate : float
Sample rate of the signal.
Returns: fft_frequencies : numpy array
Frequencies of the FFT bins [Hz].
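As a rough illustration of the mapping described above, the following NumPy sketch computes the bin frequencies under the assumption that the bins are spaced evenly from 0 Hz up to (but excluding) the Nyquist frequency. The helper name `fft_bin_frequencies` is hypothetical and not part of madmom; the library's own implementation may differ in detail.

```python
import numpy as np


def fft_bin_frequencies(num_fft_bins, sample_rate):
    """Hypothetical helper: frequencies [Hz] of the first `num_fft_bins`
    bins of an FFT of length 2 * num_fft_bins."""
    # bin k of an N-point FFT corresponds to k * sample_rate / N
    fft_length = 2 * num_fft_bins
    return np.arange(num_fft_bins) * sample_rate / fft_length


# 1024 bins (FFT length 2048) at a sample rate of 44.1 kHz
freqs = fft_bin_frequencies(1024, 44100.0)
```

Under these assumptions the spacing between adjacent bins is `sample_rate / (2 * num_fft_bins)`.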
-
madmom.audio.stft.stft(frames, window, fft_size=None, circular_shift=False)
Calculates the complex Short-Time Fourier Transform (STFT) of the given framed signal.
Parameters: frames : numpy array or iterable, shape (num_frames, frame_size)
Framed signal (e.g. FramedSignal instance).
window : numpy array, shape (frame_size,)
Window (function).
fft_size : int, optional
FFT size (should be a power of 2); if ‘None’, the ‘frame_size’ given by frames is used; if the given fft_size is greater than the ‘frame_size’, the frames are zero-padded accordingly.
circular_shift : bool, optional
Circular shift the individual frames before performing the FFT; needed for correct phase.
Returns: stft : numpy array, shape (num_frames, frame_size)
The complex STFT of the framed signal.
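The parameters above (windowing, zero-padding to `fft_size`, optional circular shift) can be sketched in plain NumPy. This is a simplified illustration, not madmom's implementation: the function name `stft_sketch` is hypothetical, the frame size is assumed to be even, `fft_size` is assumed to be at least the frame size, and only the lower half of the FFT bins is kept (an assumption in line with `num_fft_bins` being half the FFT length in `fft_frequencies`).

```python
import numpy as np


def stft_sketch(frames, window, fft_size=None, circular_shift=False):
    """Simplified STFT of a framed signal (hypothetical helper)."""
    frames = np.asarray(frames, dtype=float)
    num_frames, frame_size = frames.shape
    if fft_size is None:
        fft_size = frame_size
    half = frame_size // 2  # assumes an even frame size
    num_bins = fft_size // 2
    out = np.empty((num_frames, num_bins), dtype=complex)
    for i, frame in enumerate(frames):
        windowed = frame * window
        # FFT buffer; stays zero wherever no samples are written
        buf = np.zeros(fft_size)
        if circular_shift:
            # rotate the frame so that its centre sits at index 0,
            # wrapping the first half around to the end of the buffer
            buf[:half] = windowed[half:]
            buf[-half:] = windowed[:half]
        else:
            buf[:frame_size] = windowed
        # keep only the lower half of the spectrum
        out[i] = np.fft.fft(buf)[:num_bins]
    return out


# one frame holding a sinusoid with exactly 8 cycles per frame
n = np.arange(64)
frames = np.sin(2 * np.pi * 8 * n / 64)[np.newaxis, :]
spec = stft_sketch(frames, np.hanning(64))
padded = stft_sketch(frames, np.hanning(64), fft_size=128)
shifted = stft_sketch(frames, np.hanning(64), fft_size=128,
                      circular_shift=True)
```

In this sketch the circular shift changes only the phase of each bin; the magnitudes of `padded` and `shifted` agree.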
-
madmom.audio.stft.phase(stft)
Returns the phase of the complex STFT of a signal.
Parameters: stft : numpy array, shape (num_frames, frame_size)
The complex STFT of a signal.
Returns: phase : numpy array
Phase of the STFT.
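The phase of a complex spectrum is the element-wise angle of its values. A minimal NumPy sketch, assuming the phase is expressed in radians in the range [-pi, pi] (the toy array below is made up for illustration):

```python
import numpy as np

# toy complex "STFT" with two frames and two bins
toy_stft = np.array([[1 + 0j, 0 + 1j],
                     [-1 + 0j, 0 - 1j]])

# element-wise angle in radians
toy_phase = np.angle(toy_stft)
```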
-
madmom.audio.stft.local_group_delay(phase)
Returns the local group delay of the phase of a signal.
Parameters: phase : numpy array, shape (num_frames, frame_size)
Phase of the STFT of a signal.
Returns: lgd : numpy array
Local group delay of the phase.
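One common way to obtain a local group delay is to unwrap the phase along the frequency axis and take the difference between neighbouring bins. The sketch below follows that idea; the function name is hypothetical, and both the sign convention and the choice to set the highest bin to 0 are assumptions rather than a statement of madmom's exact behaviour.

```python
import numpy as np


def local_group_delay_sketch(phase):
    """Hypothetical helper: phase difference between adjacent bins."""
    phase = np.asarray(phase, dtype=float)
    # unwrap along the frequency axis (last axis) to remove 2*pi jumps
    unwrapped = np.unwrap(phase, axis=-1)
    lgd = np.zeros_like(unwrapped)
    # difference between each bin and the next higher bin
    lgd[:, :-1] = unwrapped[:, :-1] - unwrapped[:, 1:]
    # the highest bin has no upper neighbour and is left at 0
    return lgd


# a phase that falls linearly by 0.5 rad per bin
toy_phase = -0.5 * np.arange(6)[np.newaxis, :]
toy_lgd = local_group_delay_sketch(toy_phase)
```

For a phase that is linear in frequency, the result is constant across all bins except the last.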
-
madmom.audio.stft.lgd(phase)
Returns the local group delay of the phase of a signal.
Parameters: phase : numpy array, shape (num_frames, frame_size)
Phase of the STFT of a signal.
Returns: lgd : numpy array
Local group delay of the phase.
-
class madmom.audio.stft.ShortTimeFourierTransform(frames, window=<function hanning>, fft_size=None, circular_shift=False, **kwargs)
ShortTimeFourierTransform class.
Parameters: frames : audio.signal.FramedSignal instance
Framed signal.
window : numpy ufunc or numpy array, optional
Window (function); if a function (e.g. np.hanning) is given, a window with the frame size of frames and the given shape is created.
fft_size : int, optional
FFT size (should be a power of 2); if ‘None’, the frame_size given by frames is used; if the given fft_size is greater than the frame_size, the frames are zero-padded accordingly.
circular_shift : bool, optional
Circular shift the individual frames before performing the FFT; needed for correct phase.
kwargs : dict, optional
If no audio.signal.FramedSignal instance was given, one is instantiated with these additional keyword arguments.
Notes
If the Signal (wrapped in the FramedSignal) has an integer dtype, the window is automatically scaled as if the signal had a float dtype with values in the range [-1, 1]. As a result, the STFT has the same values regardless of the dtype of the signal. It also avoids extra memory consumption, since the data type of the signal does not need to be converted (and if no decoding is needed, the audio signal can be memory-mapped).
Examples
Create a ShortTimeFourierTransform from a Signal or FramedSignal:
>>> sig = Signal('tests/data/audio/sample.wav')
>>> sig
Signal([-2494, -2510, ..., 655, 639], dtype=int16)
>>> frames = FramedSignal(sig, frame_size=2048, hop_size=441)
>>> frames
<madmom.audio.signal.FramedSignal object at 0x...>
>>> stft = ShortTimeFourierTransform(frames)
>>> stft
ShortTimeFourierTransform([[-3.15249+0.j , 2.62216-3.02425j, ..., -0.03634-0.00005j, 0.03670+0.00029j],
                           [-4.28429+0.j , 2.02009+2.01264j, ..., -0.01981-0.00933j, -0.00536+0.02162j],
                           ...,
                           [-4.92274+0.j , 4.09839-9.42525j, ..., 0.00550-0.00257j, 0.00137+0.00577j],
                           [-9.22709+0.j , 8.76929+4.0005j , ..., 0.00981-0.00014j, -0.00984+0.00006j]], dtype=complex64)
A ShortTimeFourierTransform can be instantiated directly from a file name:
>>> stft = ShortTimeFourierTransform('tests/data/audio/sample.wav')
>>> stft
ShortTimeFourierTransform([[...]], dtype=complex64)
Doing the same with a Signal of float data type results in an STFT with the same value range (apart from small rounding differences):
>>> sig = Signal('tests/data/audio/sample.wav', dtype=float)
>>> sig
Signal([-0.07611, -0.0766 , ..., 0.01999, 0.0195 ])
>>> frames = FramedSignal(sig, frame_size=2048, hop_size=441)
>>> frames
<madmom.audio.signal.FramedSignal object at 0x...>
>>> stft = ShortTimeFourierTransform(frames)
>>> stft
ShortTimeFourierTransform([[-3.15240+0.j , 2.62208-3.02415j, ..., -0.03633-0.00005j, 0.03670+0.00029j],
                           [-4.28416+0.j , 2.02003+2.01257j, ..., -0.01981-0.00933j, -0.00536+0.02162j],
                           ...,
                           [-4.92259+0.j , 4.09827-9.42496j, ..., 0.00550-0.00257j, 0.00137+0.00577j],
                           [-9.22681+0.j , 8.76902+4.00038j, ..., 0.00981-0.00014j, -0.00984+0.00006j]], dtype=complex64)
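The equivalence between the integer and float examples follows from the linearity of the FFT: scaling the window gives the same result as scaling the signal. The NumPy sketch below illustrates this with random int16 samples. The scaling constant (the maximum value of the integer dtype) is an assumption made for illustration, and `np.fft.rfft` is used here only as a convenient stand-in for the transform.

```python
import numpy as np

rng = np.random.default_rng(0)
# one frame of random 16-bit integer samples
frame_int = rng.integers(-32768, 32767, size=2048).astype(np.int16)
# assumed scaling constant: largest value of the integer dtype
max_range = float(np.iinfo(frame_int.dtype).max)
window = np.hanning(2048)

# variant 1: leave the samples as integers and scale the window instead
spec_scaled_window = np.fft.rfft(frame_int * (window / max_range))

# variant 2: convert the samples to floats first and use the plain window
spec_float_signal = np.fft.rfft((frame_int / max_range) * window)
```

Both variants agree up to floating-point rounding, but only the second one needs a float copy of the signal.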
Additional arguments are passed to FramedSignal and Signal respectively:
>>> stft = ShortTimeFourierTransform('tests/data/audio/sample.wav', frame_size=2048, fps=100, sample_rate=22050)
>>> stft.frames
<madmom.audio.signal.FramedSignal object at 0x...>
>>> stft.frames.frame_size
2048
>>> stft.frames.hop_size
220.5
>>> stft.frames.signal.sample_rate
22050
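The fractional hop size in this example is consistent with the hop size being derived as the sample rate divided by the frame rate. A minimal arithmetic check, assuming that relationship:

```python
sample_rate = 22050  # samples per second
fps = 100            # frames per second

# number of samples between the starts of two consecutive frames
hop_size = sample_rate / fps
```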
-
spec(**kwargs)
Returns the magnitude spectrogram of the STFT.
Parameters: kwargs : dict, optional
Keyword arguments passed to audio.spectrogram.Spectrogram.
Returns: spec : audio.spectrogram.Spectrogram
audio.spectrogram.Spectrogram instance.
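A magnitude spectrogram is the element-wise absolute value of the complex STFT. The sketch below shows only that step on a made-up array; the Spectrogram class may apply further processing depending on the keyword arguments it receives.

```python
import numpy as np

# toy complex "STFT" with one frame and two bins
toy_stft = np.array([[3 + 4j, 0 - 2j]])

# magnitude of each complex bin
toy_magnitudes = np.abs(toy_stft)
```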
-
-
madmom.audio.stft.STFT
alias of ShortTimeFourierTransform
-
class madmom.audio.stft.ShortTimeFourierTransformProcessor(window=<function hanning>, fft_size=None, circular_shift=False, **kwargs)
ShortTimeFourierTransformProcessor class.
Parameters: window : numpy ufunc, optional
Window function.
fft_size : int, optional
FFT size (should be a power of 2); if ‘None’, it is determined by the size of the frames; if it is greater than the frame size, the frames are zero-padded accordingly.
circular_shift : bool, optional
Circular shift the individual frames before performing the FFT; needed for correct phase.
Examples
Create a ShortTimeFourierTransformProcessor and call it with either a file name or the output of a (Framed-)SignalProcessor to obtain a ShortTimeFourierTransform instance.
>>> proc = ShortTimeFourierTransformProcessor()
>>> stft = proc('tests/data/audio/sample.wav')
>>> stft
ShortTimeFourierTransform([[-3.15249+0.j , 2.62216-3.02425j, ..., -0.03634-0.00005j, 0.03670+0.00029j],
                           [-4.28429+0.j , 2.02009+2.01264j, ..., -0.01981-0.00933j, -0.00536+0.02162j],
                           ...,
                           [-4.92274+0.j , 4.09839-9.42525j, ..., 0.00550-0.00257j, 0.00137+0.00577j],
                           [-9.22709+0.j , 8.76929+4.0005j , ..., 0.00981-0.00014j, -0.00984+0.00006j]], dtype=complex64)
-
process(data, **kwargs)
Perform FFT on a framed signal and return the STFT.
Parameters: data : numpy array
Data to be processed.
kwargs : dict, optional
Keyword arguments passed to ShortTimeFourierTransform.
Returns: stft : ShortTimeFourierTransform
ShortTimeFourierTransform instance.
-
static add_arguments(parser, window=None, fft_size=None)
Add STFT related arguments to an existing parser.
Parameters: parser : argparse parser instance
Existing argparse parser.
window : numpy ufunc, optional
Window function.
fft_size : int, optional
Use this size for FFT (should be a power of 2).
Returns: argparse argument group
STFT argument parser group.
Notes
Parameters are included in the group only if they are not ‘None’.
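The "only if not None" behaviour described in the note can be sketched with the standard argparse module. The function below is a hypothetical stand-in, not madmom's `add_arguments`: the option names, help texts, and the use of a string for the window are assumptions chosen for illustration.

```python
import argparse


def add_stft_arguments_sketch(parser, window=None, fft_size=None):
    """Hypothetical helper: add STFT options to an existing parser."""
    group = parser.add_argument_group('STFT arguments')
    # an option is only added if a default value was supplied for it
    if window is not None:
        group.add_argument('--window', default=window,
                           help='window function name')
    if fft_size is not None:
        group.add_argument('--fft_size', type=int, default=fft_size,
                           help='FFT size (should be a power of 2)')
    return group


parser = argparse.ArgumentParser()
# only fft_size is given, so no window option is created
add_stft_arguments_sketch(parser, fft_size=2048)
args = parser.parse_args(['--fft_size', '4096'])
```

With this pattern a program exposes only those options for which it passes a default value.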
-
-
madmom.audio.stft.STFTProcessor
alias of ShortTimeFourierTransformProcessor
-
class madmom.audio.stft.Phase(stft, **kwargs)
Phase class.
Parameters: stft : ShortTimeFourierTransform instance
ShortTimeFourierTransform instance.
kwargs : dict, optional
If no ShortTimeFourierTransform instance was given, one is instantiated with these additional keyword arguments.
Examples
Create a Phase from a ShortTimeFourierTransform (or anything it can be instantiated from):
>>> stft = ShortTimeFourierTransform('tests/data/audio/sample.wav')
>>> phase = Phase(stft)
>>> phase
Phase([[ 3.14159, -0.85649, ..., -3.14016, 0.00779],
       [ 3.14159, 0.78355, ..., -2.70136, 1.81393],
       ...,
       [ 3.14159, -1.16063, ..., -0.4373 , 1.33774],
       [ 3.14159, 0.42799, ..., -0.0142 , 3.13592]], dtype=float32)
-
local_group_delay(**kwargs)
Returns the local group delay of the phase.
Parameters: kwargs : dict, optional
Keyword arguments passed to LocalGroupDelay.
Returns: lgd : LocalGroupDelay instance
LocalGroupDelay instance.
-
lgd(**kwargs)
Returns the local group delay of the phase.
Parameters: kwargs : dict, optional
Keyword arguments passed to LocalGroupDelay.
Returns: lgd : LocalGroupDelay instance
LocalGroupDelay instance.
-
-
class madmom.audio.stft.LocalGroupDelay(phase, **kwargs)
Local Group Delay class.
Parameters: phase : Phase instance
Phase instance.
kwargs : dict, optional
If no Phase instance was given, one is instantiated with these additional keyword arguments.
Examples
Create a LocalGroupDelay from a ShortTimeFourierTransform (or anything it can be instantiated from):
>>> stft = ShortTimeFourierTransform('tests/data/audio/sample.wav')
>>> lgd = LocalGroupDelay(stft)
>>> lgd
LocalGroupDelay([[-2.2851 , -2.25605, ..., 3.13525, 0. ],
                 [ 2.35804, 2.53786, ..., 1.76788, 0. ],
                 ...,
                 [-1.98..., -2.93039, ..., -1.77505, 0. ],
                 [ 2.7136 , 2.60925, ..., 3.13318, 0. ]])
-
madmom.audio.stft.LGD
alias of LocalGroupDelay