madmom.audio.spectrogram
This module contains spectrogram-related functionality.
-
madmom.audio.spectrogram.spec(stft)
Computes the magnitudes of the complex Short Time Fourier Transform of a signal.
Parameters: stft : numpy array
Complex STFT of a signal.
Returns: spec : numpy array
Magnitude spectrogram.
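As a minimal illustration of what "magnitudes of the complex STFT" means, the following sketch uses plain NumPy rather than madmom itself. The input values are made up, and the array layout (frames along the first axis, frequency bins along the second) is an assumption.

```python
import numpy as np

# A tiny complex "STFT": 2 frames x 3 frequency bins (made-up values).
stft = np.array([[3 + 4j, 0 + 1j, -2 + 0j],
                 [1 - 1j, 0 + 0j, 5 + 12j]])

# The magnitude spectrogram is the element-wise absolute value,
# i.e. sqrt(real**2 + imag**2) for every bin of every frame.
magnitudes = np.abs(stft)
print(magnitudes)
```

The phase information of the STFT is discarded in this step; only the magnitudes remain.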
-
class madmom.audio.spectrogram.Spectrogram(stft, **kwargs)
A Spectrogram represents the magnitude spectrogram of an audio.stft.ShortTimeFourierTransform.
Parameters: stft : audio.stft.ShortTimeFourierTransform instance
Short Time Fourier Transform.
kwargs : dict, optional
If no audio.stft.ShortTimeFourierTransform instance was given, one is instantiated with these additional keyword arguments.
Attributes
stft (audio.stft.ShortTimeFourierTransform instance) Underlying ShortTimeFourierTransform instance.
frames (audio.signal.FramedSignal instance) Underlying FramedSignal instance.
-
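The chain behind a Spectrogram (framed signal, then STFT, then magnitudes) can be sketched with NumPy alone. This is not madmom's implementation: the helper name `magnitude_spectrogram`, the Hann window, the frame and hop sizes, and the lack of any padding or frame centring are all assumptions chosen for illustration.

```python
import numpy as np


def magnitude_spectrogram(signal, frame_size=2048, hop_size=441):
    """Hypothetical helper: frame the signal, window it, FFT, take magnitudes."""
    window = np.hanning(frame_size)
    # number of complete frames that fit into the signal (no padding)
    num_frames = 1 + max(0, (len(signal) - frame_size) // hop_size)
    # framed signal: one row per frame
    frames = np.stack([signal[i * hop_size:i * hop_size + frame_size]
                       for i in range(num_frames)])
    # complex STFT of the windowed frames (non-negative frequencies only)
    stft = np.fft.rfft(frames * window, axis=1)
    return np.abs(stft)


sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 1000 * t)  # one second of a 1 kHz sine

spec = magnitude_spectrogram(signal)
peak_bin = spec[0].argmax()
peak_freq = peak_bin * sample_rate / 2048
print(spec.shape, peak_freq)
```

The strongest bin of the first frame lies within one bin width of 1 kHz, which is a quick sanity check for the framing and FFT steps.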
diff(**kwargs)
Return the difference of the magnitude spectrogram.
Parameters: kwargs : dict
Keyword arguments passed to SpectrogramDifference.
Returns: diff : SpectrogramDifference instance
The differences of the magnitude spectrogram.
-
filter(**kwargs)
Return a filtered version of the magnitude spectrogram.
Parameters: kwargs : dict
Keyword arguments passed to FilteredSpectrogram.
Returns: filt_spec : FilteredSpectrogram instance
Filtered version of the magnitude spectrogram.
-
log(**kwargs)
Return a logarithmically scaled version of the magnitude spectrogram.
Parameters: kwargs : dict
Keyword arguments passed to LogarithmicSpectrogram.
Returns: log_spec : LogarithmicSpectrogram instance
Logarithmically scaled version of the magnitude spectrogram.
-
-
class madmom.audio.spectrogram.SpectrogramProcessor(**kwargs)
SpectrogramProcessor class.
-
process(data, **kwargs)
Create a Spectrogram from the given data.
Parameters: data : numpy array
Data to be processed.
kwargs : dict
Keyword arguments passed to Spectrogram.
Returns: spec : Spectrogram instance
Spectrogram.
-
-
class madmom.audio.spectrogram.FilteredSpectrogram(spectrogram, filterbank=<class 'madmom.audio.filters.LogarithmicFilterbank'>, num_bands=12, fmin=30.0, fmax=17000.0, fref=440.0, norm_filters=True, unique_filters=True, **kwargs)
FilteredSpectrogram class.
Parameters: spectrogram : Spectrogram instance
Spectrogram.
filterbank : audio.filters.Filterbank, optional
Filterbank class or instance; if a class is given (rather than an instance), one will be created with the given type and parameters.
num_bands : int, optional
Number of filter bands (per octave, depending on the type of the filterbank).
fmin : float, optional
Minimum frequency of the filterbank [Hz].
fmax : float, optional
Maximum frequency of the filterbank [Hz].
fref : float, optional
Tuning frequency of the filterbank [Hz].
norm_filters : bool, optional
Normalize the filter bands of the filterbank to area 1.
unique_filters : bool, optional
Indicate if the filterbank should contain only unique filters, i.e. remove duplicate filters resulting from insufficient resolution at low frequencies.
kwargs : dict, optional
If no Spectrogram instance was given, one is instantiated with these additional keyword arguments.
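Conceptually, filtering a magnitude spectrogram amounts to multiplying it with a filterbank matrix that maps frequency bins to bands. The sketch below shows this with NumPy and a hand-written triangular filterbank; the bin and band counts and the filter weights are made up, and it does not construct a real audio.filters.LogarithmicFilterbank.

```python
import numpy as np

num_bins, num_bands = 7, 3

# Magnitude spectrogram: 2 frames x 7 bins (made-up values).
spec = np.array([[0., 1., 2., 3., 4., 5., 6.],
                 [1., 1., 1., 1., 1., 1., 1.]])

# Hypothetical filterbank (bins x bands) with overlapping triangular filters.
filterbank = np.zeros((num_bins, num_bands))
filterbank[0:3, 0] = [1., 2., 1.]
filterbank[2:5, 1] = [1., 2., 1.]
filterbank[4:7, 2] = [1., 2., 1.]

# Counterpart of norm_filters=True: scale each filter to area (sum) 1.
filterbank /= filterbank.sum(axis=0)

# Each band is a weighted sum of the bins covered by its filter.
filt_spec = spec @ filterbank
print(filt_spec)
```

With normalised filters, a flat spectrum (the second frame) stays flat after filtering, regardless of how many bins each filter spans.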
-
class madmom.audio.spectrogram.FilteredSpectrogramProcessor(filterbank=<class 'madmom.audio.filters.LogarithmicFilterbank'>, num_bands=12, fmin=30.0, fmax=17000.0, fref=440.0, norm_filters=True, unique_filters=True, **kwargs)
FilteredSpectrogramProcessor class.
Parameters: filterbank : audio.filters.Filterbank
Filterbank used to filter a spectrogram.
num_bands : int
Number of bands (per octave).
fmin : float, optional
Minimum frequency of the filterbank [Hz].
fmax : float, optional
Maximum frequency of the filterbank [Hz].
fref : float, optional
Tuning frequency of the filterbank [Hz].
norm_filters : bool, optional
Normalize the filter bands of the filterbank to area 1.
unique_filters : bool, optional
Indicate if the filterbank should contain only unique filters, i.e. remove duplicate filters resulting from insufficient resolution at low frequencies.
-
process(data, **kwargs)
Create a FilteredSpectrogram from the given data.
Parameters: data : numpy array
Data to be processed.
kwargs : dict
Keyword arguments passed to FilteredSpectrogram.
Returns: filt_spec : FilteredSpectrogram instance
Filtered spectrogram.
-
-
class madmom.audio.spectrogram.LogarithmicSpectrogram(spectrogram, mul=1.0, add=1.0, **kwargs)
LogarithmicSpectrogram class.
Parameters: spectrogram : Spectrogram instance
Spectrogram.
mul : float, optional
Multiply the magnitude spectrogram with this factor before taking the logarithm.
add : float, optional
Add this value before taking the logarithm of the magnitudes.
kwargs : dict, optional
If no Spectrogram instance was given, one is instantiated with these additional keyword arguments.
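The roles of mul and add can be illustrated with a short NumPy sketch. The helper name `log_scale` is hypothetical, and the use of a base-10 logarithm is an assumption, since the parameter descriptions above do not state the base.

```python
import numpy as np


def log_scale(spec, mul=1.0, add=1.0):
    """Hypothetical helper: log10(mul * spec + add), applied element-wise."""
    return np.log10(mul * spec + add)


# One frame with three magnitudes (made-up values).
spec = np.array([[0.0, 9.0, 99.0]])

log_spec = log_scale(spec)
print(log_spec)
```

With add=1.0, a magnitude of zero maps to zero instead of minus infinity, and all scaled values stay non-negative. A larger mul expands small magnitudes relative to large ones before the compression.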
-
class madmom.audio.spectrogram.LogarithmicSpectrogramProcessor(mul=1.0, add=1.0, **kwargs)
Logarithmic Spectrogram Processor class.
Parameters: mul : float, optional
Multiply the magnitude spectrogram with this factor before taking the logarithm.
add : float, optional
Add this value before taking the logarithm of the magnitudes.
-
process(data, **kwargs)
Perform logarithmic scaling of a spectrogram.
Parameters: data : numpy array
Data to be processed.
kwargs : dict
Keyword arguments passed to LogarithmicSpectrogram.
Returns: log_spec : LogarithmicSpectrogram instance
Logarithmically scaled spectrogram.
-
static add_arguments(parser, log=None, mul=None, add=None)
Add spectrogram scaling related arguments to an existing parser.
Parameters: parser : argparse parser instance
Existing argparse parser object.
log : bool, optional
Take the logarithm of the spectrogram.
mul : float, optional
Multiply the magnitude spectrogram with this factor before taking the logarithm.
add : float, optional
Add this value before taking the logarithm of the magnitudes.
Returns: argparse argument group
Spectrogram scaling argument parser group.
Notes
Parameters are included in the group only if they are not ‘None’.
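The "only if not None" behaviour described in the note can be sketched with the standard argparse module. The function below is a hypothetical stand-in, not madmom's code: the option names (`--log`, `--mul`, `--add`), the group title and the help texts are assumptions.

```python
import argparse


def add_scaling_arguments(parser, log=None, mul=None, add=None):
    """Hypothetical sketch: add an option only if a default was supplied."""
    group = parser.add_argument_group('spectrogram scaling arguments')
    if log is not None:
        group.add_argument('--log', action='store_true', default=log,
                           help='take the logarithm of the spectrogram')
    if mul is not None:
        group.add_argument('--mul', type=float, default=mul,
                           help='multiply the magnitudes with this factor')
    if add is not None:
        group.add_argument('--add', type=float, default=add,
                           help='add this value before taking the logarithm')
    return group


parser = argparse.ArgumentParser()
# 'log' is left at None, so no --log option is added to the parser.
add_scaling_arguments(parser, mul=1.0, add=1.0)
args = parser.parse_args(['--mul', '2'])
print(args)
```

A program built this way exposes only the options for which it passed a default value; the others do not appear in the parser or in its help output.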
-
-
class madmom.audio.spectrogram.LogarithmicFilteredSpectrogram(spectrogram, **kwargs)
LogarithmicFilteredSpectrogram class.
Parameters: spectrogram : FilteredSpectrogram instance
Filtered spectrogram.
kwargs : dict, optional
If no FilteredSpectrogram instance was given, one is instantiated with these additional keyword arguments and logarithmically scaled afterwards, i.e. passed to LogarithmicSpectrogram.
Notes
For the filtering and scaling parameters, please refer to FilteredSpectrogram and LogarithmicSpectrogram.
-
class madmom.audio.spectrogram.LogarithmicFilteredSpectrogramProcessor(filterbank=<class 'madmom.audio.filters.LogarithmicFilterbank'>, num_bands=12, fmin=30.0, fmax=17000.0, fref=440.0, norm_filters=True, unique_filters=True, mul=1.0, add=1.0, **kwargs)
Logarithmic Filtered Spectrogram Processor class.
Parameters: filterbank : audio.filters.Filterbank
Filterbank used to filter a spectrogram.
num_bands : int
Number of bands (per octave).
fmin : float, optional
Minimum frequency of the filterbank [Hz].
fmax : float, optional
Maximum frequency of the filterbank [Hz].
fref : float, optional
Tuning frequency of the filterbank [Hz].
norm_filters : bool, optional
Normalize the filter bands of the filterbank to area 1.
unique_filters : bool, optional
Indicate if the filterbank should contain only unique filters, i.e. remove duplicate filters resulting from insufficient resolution at low frequencies.
mul : float, optional
Multiply the magnitude spectrogram with this factor before taking the logarithm.
add : float, optional
Add this value before taking the logarithm of the magnitudes.
-
process(data, **kwargs)
Perform filtering and logarithmic scaling of a spectrogram.
Parameters: data : numpy array
Data to be processed.
kwargs : dict
Keyword arguments passed to LogarithmicFilteredSpectrogram.
Returns: log_filt_spec : LogarithmicFilteredSpectrogram instance
Logarithmically scaled filtered spectrogram.
-
-
class madmom.audio.spectrogram.SpectrogramDifference(spectrogram, diff_ratio=0.5, diff_frames=None, diff_max_bins=None, positive_diffs=False, **kwargs)
SpectrogramDifference class.
Parameters: spectrogram : Spectrogram instance
Spectrogram.
diff_ratio : float, optional
Calculate the difference to the frame at which the window used for the STFT yields this ratio of the maximum height.
diff_frames : int, optional
Calculate the difference to the diff_frames-th previous frame (if set, this overrides the value calculated from the diff_ratio).
diff_max_bins : int, optional
Apply a maximum filter with this width (in bins in frequency dimension) to the spectrogram the difference is calculated to.
positive_diffs : bool, optional
Keep only the positive differences, i.e. set all diff values < 0 to 0.
kwargs : dict, optional
If no Spectrogram instance was given, one is instantiated with these additional keyword arguments.
Notes
The SuperFlux algorithm [R1] uses a maximum filtered spectrogram with 3 diff_max_bins together with a 24 band logarithmic filterbank to calculate the difference spectrogram with a diff_ratio of 0.5.
The effect of this maximum filter applied to the spectrogram is that the magnitudes are “widened” in frequency direction, i.e. the following difference calculation is less sensitive to frequency fluctuations. This effect is exploited in onset detection to suppress false positive energy fragments originating from vibrato.
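The parameters above can be made concrete with a NumPy-only sketch. It is one plausible reading of the parameter descriptions, not madmom's code: the helper names are hypothetical, the maximum filter assumes an odd width with edge padding, and the first diff_frames frames are simply left at zero.

```python
import numpy as np


def frames_from_ratio(window, hop_size, diff_ratio=0.5):
    """Derive the frame distance from the STFT window (assumed reading)."""
    # first sample at which the window exceeds diff_ratio of its maximum
    sample = np.argmax(window > diff_ratio * window.max())
    # distance from that sample to the window centre, expressed in hops
    diff_samples = len(window) / 2 - sample
    return max(1, int(round(diff_samples / hop_size)))


def max_filter_bins(spec, width):
    """Maximum over `width` neighbouring frequency bins (odd width assumed)."""
    pad = width // 2
    padded = np.pad(spec, ((0, 0), (pad, pad)), mode='edge')
    shifted = [padded[:, i:i + spec.shape[1]] for i in range(width)]
    return np.max(shifted, axis=0)


def spectrogram_difference(spec, diff_frames=1, diff_max_bins=None,
                           positive_diffs=False):
    """Difference to the (optionally maximum filtered) earlier frame."""
    reference = spec
    if diff_max_bins is not None:
        reference = max_filter_bins(spec, diff_max_bins)
    diff = np.zeros_like(spec)
    diff[diff_frames:] = spec[diff_frames:] - reference[:-diff_frames]
    if positive_diffs:
        np.maximum(diff, 0, out=diff)
    return diff


# A single partial that drifts from bin 2 to bin 3, as vibrato would cause.
spec = np.array([[0., 0., 1., 0., 0.],
                 [0., 0., 0., 1., 0.]])

plain = spectrogram_difference(spec, positive_diffs=True)
widened = spectrogram_difference(spec, diff_max_bins=3, positive_diffs=True)
print(plain.sum(), widened.sum())
print(frames_from_ratio(np.hanning(2048), hop_size=441))
```

Without the maximum filter, the drifting partial produces a positive difference in the second frame, which an onset detector could mistake for new energy. With a maximum filter of 3 bins, the earlier frame already covers the neighbouring bin and the positive difference vanishes.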
References
[R1] Sebastian Böck and Gerhard Widmer, “Maximum Filter Vibrato Suppression for Onset Detection”, Proceedings of the 16th International Conference on Digital Audio Effects (DAFx), 2013.
-
class madmom.audio.spectrogram.SpectrogramDifferenceProcessor(diff_ratio=0.5, diff_frames=None, diff_max_bins=None, positive_diffs=False, stack_diffs=None, **kwargs)
Difference Spectrogram Processor class.
Parameters: diff_ratio : float, optional
Calculate the difference to the frame at which the window used for the STFT yields this ratio of the maximum height.
diff_frames : int, optional
Calculate the difference to the diff_frames-th previous frame (if set, this overrides the value calculated from the diff_ratio).
diff_max_bins : int, optional
Apply a maximum filter with this width (in bins in frequency dimension) to the spectrogram the difference is calculated to.
positive_diffs : bool, optional
Keep only the positive differences, i.e. set all diff values < 0 to 0.
stack_diffs : numpy stacking function, optional
If ‘None’, only the differences are returned. If set, the diffs are stacked with the underlying spectrogram data according to the stack function:
np.vstack: the differences and spectrogram are stacked vertically, i.e. in time direction,
np.hstack: the differences and spectrogram are stacked horizontally, i.e. in frequency direction,
np.dstack: the differences and spectrogram are stacked in depth, i.e. returned as a 3D representation with depth as the third dimension.
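The three stacking functions differ only in the axis along which the two arrays are joined, which is easiest to see from the resulting shapes. The sketch below uses placeholder arrays with frames along the first axis and frequency bins along the second; putting the spectrogram before the differences is an assumption.

```python
import numpy as np

# Placeholder data: 4 frames x 6 frequency bins each.
spec = np.ones((4, 6))
diff = np.zeros((4, 6))

stacked_time = np.vstack((spec, diff))   # frames are appended
stacked_freq = np.hstack((spec, diff))   # bins are appended
stacked_depth = np.dstack((spec, diff))  # a new third axis is created

print(stacked_time.shape, stacked_freq.shape, stacked_depth.shape)
```

Stacking in frequency direction keeps one row per frame, so each frame carries both its magnitudes and its differences.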
-
process(data, **kwargs)
Perform a temporal difference calculation on the given data.
Parameters: data : numpy array
Data to be processed.
kwargs : dict
Keyword arguments passed to SpectrogramDifference.
Returns: diff : SpectrogramDifference instance
Spectrogram difference.
-
static add_arguments(parser, diff=None, diff_ratio=None, diff_frames=None, diff_max_bins=None, positive_diffs=None)
Add spectrogram difference related arguments to an existing parser.
Parameters: parser : argparse parser instance
Existing argparse parser object.
diff : bool, optional
Take the difference of the spectrogram.
diff_ratio : float, optional
Calculate the difference to the frame at which the window used for the STFT yields this ratio of the maximum height.
diff_frames : int, optional
Calculate the difference to the diff_frames-th previous frame (if set, this overrides the value calculated from the diff_ratio).
diff_max_bins : int, optional
Apply a maximum filter with this width (in bins in frequency dimension) to the spectrogram the difference is calculated to.
positive_diffs : bool, optional
Keep only the positive differences, i.e. set all diff values < 0 to 0.
Returns: argparse argument group
Spectrogram difference argument parser group.
Notes
Parameters are included in the group only if they are not ‘None’.
Only the diff_frames parameter behaves differently: it is included if either the diff_ratio is set or a value != ‘None’ is given.
-
class madmom.audio.spectrogram.SuperFluxProcessor(**kwargs)
Spectrogram processor which sets the default values suitable for the SuperFlux algorithm.
-
class madmom.audio.spectrogram.MultiBandSpectrogram(spectrogram, crossover_frequencies, fmin=30.0, fmax=17000.0, norm_filters=True, unique_filters=True, **kwargs)
MultiBandSpectrogram class.
Parameters: spectrogram : Spectrogram instance
Spectrogram.
crossover_frequencies : list or numpy array
List of crossover frequencies at which the spectrogram is split into multiple bands.
fmin : float, optional
Minimum frequency of the filterbank [Hz].
fmax : float, optional
Maximum frequency of the filterbank [Hz].
norm_filters : bool, optional
Normalize the filter bands of the filterbank to area 1.
unique_filters : bool, optional
Indicate if the filterbank should contain only unique filters, i.e. remove duplicate filters resulting from insufficient resolution at low frequencies.
kwargs : dict, optional
If no Spectrogram instance was given, one is instantiated with these additional keyword arguments.
Notes
The MultiBandSpectrogram is implemented as a Spectrogram which uses an audio.filters.RectangularFilterbank to combine multiple frequency bins.
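How crossover frequencies turn into rectangular bands can be sketched with NumPy. This does not use audio.filters.RectangularFilterbank: the helper name, the half-open band edges and the sample rate and FFT size in the example are assumptions.

```python
import numpy as np


def rectangular_filterbank(bin_frequencies, crossover_frequencies,
                           fmin=30.0, fmax=17000.0, norm_filters=True):
    """Hypothetical helper: one rectangular filter per frequency band."""
    # band edges: fmin, the crossover frequencies, fmax
    edges = np.concatenate(([fmin], crossover_frequencies, [fmax]))
    num_bands = len(edges) - 1
    # index of the band each bin falls into; -1 or num_bands if outside
    band = np.searchsorted(edges, bin_frequencies, side='right') - 1
    valid = (band >= 0) & (band < num_bands)
    filterbank = np.zeros((len(bin_frequencies), num_bands))
    filterbank[np.nonzero(valid)[0], band[valid]] = 1.0
    if norm_filters:
        # scale every non-empty filter to area (sum) 1
        sums = filterbank.sum(axis=0)
        filterbank[:, sums > 0] /= sums[sums > 0]
    return filterbank


# Centre frequencies of the bins of a 2048-point FFT at 44.1 kHz.
bin_frequencies = np.fft.rfftfreq(2048, d=1.0 / 44100)
filterbank = rectangular_filterbank(bin_frequencies, [200.0, 2000.0])

# Flat placeholder spectrogram: 5 frames x 1025 bins.
spec = np.ones((5, len(bin_frequencies)))
multi_band = spec @ filterbank
print(multi_band.shape)
```

Two crossover frequencies yield three bands. With normalised filters, each band holds the mean magnitude of its bins, so the flat input stays at 1.0 in every band.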
-
class madmom.audio.spectrogram.MultiBandSpectrogramProcessor(crossover_frequencies, fmin=30.0, fmax=17000.0, norm_filters=True, unique_filters=True, **kwargs)
Spectrogram processor which combines the spectrogram magnitudes into multiple bands.
Parameters: crossover_frequencies : list or numpy array
List of crossover frequencies at which a spectrogram is split into the individual bands.
fmin : float, optional
Minimum frequency of the filterbank [Hz].
fmax : float, optional
Maximum frequency of the filterbank [Hz].
norm_filters : bool, optional
Normalize the filter bands of the filterbank to area 1.
unique_filters : bool, optional
Indicate if the filterbank should contain only unique filters, i.e. remove duplicate filters resulting from insufficient resolution at low frequencies.
-
process(data, **kwargs)
Return a multi-band representation of the given data.
Parameters: data : numpy array
Data to be processed.
kwargs : dict
Keyword arguments passed to MultiBandSpectrogram.
Returns: multi_band_spec : MultiBandSpectrogram instance
Spectrogram split into multiple bands.
-
-
class madmom.audio.spectrogram.StackedSpectrogramProcessor
Deprecated in v0.13, will be removed in v0.14.
Functionality added to SpectrogramDifferenceProcessor as stack_diffs argument.