madmom.features.onsets
This module contains onset detection related functionality.
-
madmom.features.onsets.wrap_to_pi(phase)
Wrap the phase information to the range -π...π.
Parameters: phase : numpy array
Phase of the STFT.
Returns: wrapped_phase : numpy array
Wrapped phase.
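As an illustration of the wrapping described above, here is a minimal NumPy sketch. The name wrap_to_pi_sketch is hypothetical and the code is not taken from madmom; it assumes the phase is given in radians and maps it to the half-open interval [-π, π).

```python
import numpy as np


def wrap_to_pi_sketch(phase):
    """Wrap phase values (in radians) into the interval [-pi, pi)."""
    phase = np.asarray(phase, dtype=float)
    # Shift by pi, take the remainder modulo 2*pi, then shift back.
    return np.mod(phase + np.pi, 2.0 * np.pi) - np.pi
```

Values already inside the range are returned unchanged; values outside it are shifted by a multiple of 2π.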
-
madmom.features.onsets.correlation_diff(spec, diff_frames=1, pos=False, diff_bins=1)
Calculates the difference of the magnitude spectrogram relative to the N-th previous frame, shifted in frequency to achieve the highest correlation between the two frames.
Parameters: spec : numpy array
Magnitude spectrogram.
diff_frames : int, optional
Calculate the difference to the diff_frames-th previous frame.
pos : bool, optional
Keep only positive values.
diff_bins : int, optional
Maximum number of bins shifted for correlation calculation.
Returns: correlation_diff : numpy array
(Positive) magnitude spectrogram differences.
Notes
This function is included only for completeness; it is not intended for actual use because it is extremely slow. Consider the superflux() function instead, which performs equally well but is much faster.
-
madmom.features.onsets.high_frequency_content(spectrogram)
High Frequency Content.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
Returns: high_frequency_content : numpy array
High frequency content onset detection function.
References
[R35] Paul Masri, “Computer Modeling of Sound for Transformation and Synthesis of Musical Signals”, PhD thesis, University of Bristol, 1996.
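The idea behind high frequency content can be sketched with plain NumPy. This is an assumption-laden illustration rather than madmom's code: hfc_sketch is a hypothetical name, the input is a 2-D array of magnitudes with shape (frames, bins), and the weighting is assumed to be linear in the bin index.

```python
import numpy as np


def hfc_sketch(magnitudes):
    """Mean of each frame's magnitudes, weighted by the bin index."""
    magnitudes = np.asarray(magnitudes, dtype=float)
    # Linear weights 0, 1, ..., num_bins - 1 emphasise the upper bins.
    weights = np.arange(magnitudes.shape[1])
    return np.mean(magnitudes * weights, axis=1)
```

Energy in higher bins contributes more to the result, so broadband percussive onsets tend to stand out.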
-
madmom.features.onsets.spectral_diff(spectrogram, diff_frames=None)
Spectral Diff.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
diff_frames : int, optional
Number of frames to calculate the diff to.
Returns: spectral_diff : numpy array
Spectral diff onset detection function.
References
[R36] Chris Duxbury, Mark Sandler and Matthew Davis, “A hybrid approach to musical note onset detection”, Proceedings of the 5th International Conference on Digital Audio Effects (DAFx), 2002.
-
madmom.features.onsets.spectral_flux(spectrogram, diff_frames=None)
Spectral Flux.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
diff_frames : int, optional
Number of frames to calculate the diff to.
Returns: spectral_flux : numpy array
Spectral flux onset detection function.
References
[R37] Paul Masri, “Computer Modeling of Sound for Transformation and Synthesis of Musical Signals”, PhD thesis, University of Bristol, 1996.
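Spectral diff and spectral flux both build on the positive frame-to-frame difference of the magnitude spectrogram. The sketch below illustrates that relationship under stated assumptions: all function names are hypothetical, the input is a plain (frames, bins) array, diff_frames defaults to 1 here (the documented default is None), and spectral diff is assumed to sum squared differences while spectral flux sums them directly.

```python
import numpy as np


def positive_diff_sketch(magnitudes, diff_frames=1):
    """Positive difference to the diff_frames-th previous frame."""
    magnitudes = np.asarray(magnitudes, dtype=float)
    diff = np.zeros_like(magnitudes)
    # The first diff_frames frames have no predecessor and stay zero.
    diff[diff_frames:] = magnitudes[diff_frames:] - magnitudes[:-diff_frames]
    # Keep increases in energy only.
    return np.maximum(diff, 0.0)


def spectral_flux_sketch(magnitudes, diff_frames=1):
    """Sum of the positive differences per frame."""
    return np.sum(positive_diff_sketch(magnitudes, diff_frames), axis=1)


def spectral_diff_sketch(magnitudes, diff_frames=1):
    """Sum of the squared positive differences per frame."""
    return np.sum(positive_diff_sketch(magnitudes, diff_frames) ** 2, axis=1)
```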
-
madmom.features.onsets.superflux(spectrogram, diff_frames=None, diff_max_bins=3)
SuperFlux method with a maximum filter vibrato suppression stage.
Calculates the difference of bin k of the magnitude spectrogram relative to the N-th previous frame of the maximum filtered spectrogram.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
diff_frames : int, optional
Number of frames to calculate the diff to.
diff_max_bins : int, optional
Number of bins used for maximum filter.
Returns: superflux : numpy array
SuperFlux onset detection function.
Notes
This method only works properly if the spectrogram is filtered with a filterbank of the right frequency spacing. Filterbanks with 24 bands per octave (i.e. quarter-tone resolution) usually yield good results. With diff_max_bins = 3, the maximum of the bins k-1, k, k+1 of the frame diff_frames to the left is used for the calculation of the difference.
References
[R38] Sebastian Böck and Gerhard Widmer, “Maximum Filter Vibrato Suppression for Onset Detection”, Proceedings of the 16th International Conference on Digital Audio Effects (DAFx), 2013.
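The maximum filter stage can be illustrated with a small NumPy-only sketch. It is not madmom's implementation: the function names are hypothetical, the input is a plain (frames, bins) magnitude array that is assumed to be filtered already, diff_frames defaults to 1, and diff_max_bins is assumed to be odd.

```python
import numpy as np


def max_filter_bins_sketch(magnitudes, size=3):
    """Maximum over `size` neighbouring frequency bins (odd `size`)."""
    magnitudes = np.asarray(magnitudes, dtype=float)
    half = size // 2
    # Pad the frequency axis with -inf so edge bins only see real values.
    padded = np.pad(magnitudes, ((0, 0), (half, half)),
                    mode='constant', constant_values=-np.inf)
    num_bins = magnitudes.shape[1]
    shifted = [padded[:, i:i + num_bins] for i in range(size)]
    return np.max(shifted, axis=0)


def superflux_sketch(magnitudes, diff_frames=1, diff_max_bins=3):
    """Positive difference to the maximum filtered earlier frame."""
    magnitudes = np.asarray(magnitudes, dtype=float)
    reference = max_filter_bins_sketch(magnitudes, diff_max_bins)
    diff = np.zeros_like(magnitudes)
    diff[diff_frames:] = magnitudes[diff_frames:] - reference[:-diff_frames]
    return np.sum(np.maximum(diff, 0.0), axis=1)
```

When energy merely moves to a neighbouring bin, as in vibrato, the maximum filtered reference already covers that bin and the difference stays at zero, whereas plain spectral flux would report an increase.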
-
madmom.features.onsets.complex_flux(spectrogram, diff_frames=None, diff_max_bins=3, temporal_filter=3, temporal_origin=0)
ComplexFlux.
ComplexFlux is based on SuperFlux, but adds a local group delay based tremolo suppression.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
diff_frames : int, optional
Number of frames to calculate the diff to.
diff_max_bins : int, optional
Number of bins used for maximum filter.
temporal_filter : int, optional
Temporal maximum filtering of the local group delay [frames].
temporal_origin : int, optional
Origin of the temporal maximum filter.
Returns: complex_flux : numpy array
ComplexFlux onset detection function.
References
[R39] Sebastian Böck and Gerhard Widmer, “Local group delay based vibrato and tremolo suppression for onset detection”, Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR), 2013.
-
madmom.features.onsets.modified_kullback_leibler(spectrogram, diff_frames=1, epsilon=1e-06)
Modified Kullback-Leibler.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
diff_frames : int, optional
Number of frames to calculate the diff to.
epsilon : float, optional
Add epsilon to the spectrogram to avoid division by 0.
Returns: modified_kullback_leibler : numpy array
MKL onset detection function.
Notes
The implementation presented in [R40] is used instead of the original work presented in [R41].
References
[R40] Paul Brossier, “Automatic Annotation of Musical Audio for Interactive Applications”, PhD thesis, Queen Mary University of London, 2006.
[R41] Stephen Hainsworth and Malcolm Macleod, “Onset Detection in Musical Audio Signals”, Proceedings of the International Computer Music Conference (ICMC), 2003.
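The role of epsilon is easiest to see in code. The following sketch assumes the measure is the per-frame mean of log(1 + current / (previous + epsilon)); that formula, the name mkl_sketch and the plain (frames, bins) array input are assumptions made for illustration, not a copy of madmom's code.

```python
import numpy as np


def mkl_sketch(magnitudes, diff_frames=1, epsilon=1e-6):
    """Log of the frame-to-frame magnitude ratio, averaged over bins."""
    magnitudes = np.asarray(magnitudes, dtype=float)
    ratio = np.zeros_like(magnitudes)
    # epsilon keeps the denominator non-zero for silent bins.
    ratio[diff_frames:] = (magnitudes[diff_frames:] /
                           (magnitudes[:-diff_frames] + epsilon))
    return np.mean(np.log(1.0 + ratio), axis=1)
```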
-
madmom.features.onsets.phase_deviation(spectrogram)
Phase Deviation.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
Returns: phase_deviation : numpy array
Phase deviation onset detection function.
References
[R42] Juan Pablo Bello, Chris Duxbury, Matthew Davies and Mark Sandler, “On the use of phase and energy for musical onset detection in the complex domain”, IEEE Signal Processing Letters, Volume 11, Number 6, 2004.
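A rough sketch of the phase deviation idea, assuming it is the mean absolute second-order phase difference per frame, wrapped to -π...π. The name phase_deviation_sketch is hypothetical and the input is a plain (frames, bins) array of phases in radians rather than a Spectrogram instance.

```python
import numpy as np


def phase_deviation_sketch(phase):
    """Mean absolute second-order phase difference per frame."""
    phase = np.asarray(phase, dtype=float)
    second_diff = np.zeros_like(phase)
    # Deviation from a linear phase progression over three frames.
    second_diff[2:] = phase[2:] - 2.0 * phase[1:-1] + phase[:-2]
    # Wrap the result back into the range -pi...pi.
    wrapped = np.mod(second_diff + np.pi, 2.0 * np.pi) - np.pi
    return np.mean(np.abs(wrapped), axis=1)
```

A steady sinusoid advances its phase by a constant amount per frame, so its second-order difference is zero; an onset disturbs that progression.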
-
madmom.features.onsets.weighted_phase_deviation(spectrogram)
Weighted Phase Deviation.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
Returns: weighted_phase_deviation : numpy array
Weighted phase deviation onset detection function.
References
[R43] Simon Dixon, “Onset Detection Revisited”, Proceedings of the 9th International Conference on Digital Audio Effects (DAFx), 2006.
-
madmom.features.onsets.normalized_weighted_phase_deviation(spectrogram, epsilon=1e-06)
Normalized Weighted Phase Deviation.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
epsilon : float, optional
Add epsilon to the spectrogram to avoid division by 0.
Returns: normalized_weighted_phase_deviation : numpy array
Normalized weighted phase deviation onset detection function.
References
[R44] Simon Dixon, “Onset Detection Revisited”, Proceedings of the 9th International Conference on Digital Audio Effects (DAFx), 2006.
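The difference between the weighted and the normalized weighted variant can be sketched as follows. Both functions are hypothetical illustrations: they take a precomputed per-bin phase deviation and a magnitude array of the same (frames, bins) shape, and the normalization is assumed to divide by the mean magnitude of each frame plus epsilon.

```python
import numpy as np


def weighted_pd_sketch(deviation, magnitudes):
    """Phase deviation weighted by the magnitude of each bin."""
    deviation = np.asarray(deviation, dtype=float)
    magnitudes = np.asarray(magnitudes, dtype=float)
    return np.mean(np.abs(deviation * magnitudes), axis=1)


def normalized_weighted_pd_sketch(deviation, magnitudes, epsilon=1e-6):
    """Weighted phase deviation divided by the frame's mean magnitude."""
    magnitudes = np.asarray(magnitudes, dtype=float)
    # epsilon avoids a division by zero for silent frames.
    norm = np.mean(magnitudes, axis=1) + epsilon
    return weighted_pd_sketch(deviation, magnitudes) / norm
```

Scaling the magnitudes scales the weighted result by the same factor, while the normalized result stays approximately constant.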
-
madmom.features.onsets.complex_domain(spectrogram)
Complex Domain.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
Returns: complex_domain : numpy array
Complex domain onset detection function.
References
[R45] Juan Pablo Bello, Chris Duxbury, Matthew Davies and Mark Sandler, “On the use of phase and energy for musical onset detection in the complex domain”, IEEE Signal Processing Letters, Volume 11, Number 6, 2004.
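The complex domain approach compares each STFT frame with a prediction made from the preceding frames. The sketch below follows the general idea of the cited paper under simplifying assumptions and is not madmom's code: complex_domain_sketch is a hypothetical name, the input is a complex (frames, bins) array, and the prediction keeps the previous magnitude while continuing the phase linearly.

```python
import numpy as np


def complex_domain_sketch(stft):
    """Distance between each frame and its predicted value."""
    stft = np.asarray(stft, dtype=complex)
    magnitude, phase = np.abs(stft), np.angle(stft)
    odf = np.zeros(stft.shape[0])
    # Predict frame n: magnitude of frame n-1, phase continued
    # linearly from frames n-2 and n-1.
    target = magnitude[1:-1] * np.exp(1j * (2.0 * phase[1:-1] - phase[:-2]))
    # The first two frames cannot be predicted and stay zero.
    odf[2:] = np.sum(np.abs(stft[2:] - target), axis=1)
    return odf
```

A stationary sinusoid is predicted exactly and yields zero; a sudden change in magnitude or phase produces a large distance.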
-
madmom.features.onsets.rectified_complex_domain(spectrogram, diff_frames=None)
Rectified Complex Domain.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
diff_frames : int, optional
Number of frames to calculate the diff to.
Returns: rectified_complex_domain : numpy array
Rectified complex domain onset detection function.
References
[R46] Simon Dixon, “Onset Detection Revisited”, Proceedings of the 9th International Conference on Digital Audio Effects (DAFx), 2006.
-
class madmom.features.onsets.SpectralOnsetProcessor(onset_method='superflux', **kwargs)
The SpectralOnsetProcessor class implements most of the common onset detection functions based on the magnitude or phase information of a spectrogram.
Parameters: onset_method : str, optional
Onset detection function. See METHODS for possible values.
-
process(spectrogram)
Compute the onset detection function for the given spectrogram.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
Returns: odf : numpy array
Onset detection function.
-
classmethod add_arguments(parser, onset_method=None)
Add spectral onset detection arguments to an existing parser.
Parameters: parser : argparse parser instance
Existing argparse parser object.
onset_method : str, optional
Default onset detection method.
Returns: parser_group : argparse argument group
Spectral onset detection argument parser group.
-
-
madmom.features.onsets.peak_picking(activations, threshold, smooth=None, pre_avg=0, post_avg=0, pre_max=1, post_max=1)
Perform thresholding and peak-picking on the given activation function.
Parameters: activations : numpy array
Activation function.
threshold : float
Threshold for peak-picking.
smooth : int or numpy array, optional
Smooth the activation function with the kernel (size).
pre_avg : int, optional
Use pre_avg frames past information for moving average.
post_avg : int, optional
Use post_avg frames future information for moving average.
pre_max : int, optional
Use pre_max frames past information for moving maximum.
post_max : int, optional
Use post_max frames future information for moving maximum.
Returns: peak_idx : numpy array
Indices of the detected peaks.
See also
smooth()
Notes
If no moving average is needed (e.g. the activations are independent of the signal’s level as for neural network activations), set pre_avg and post_avg to 0. For peak picking of local maxima, set pre_max and post_max to 1. For online peak picking, set all post_ parameters to 0.
References
[R47] Sebastian Böck, Florian Krebs and Markus Schedl, “Evaluating the Online Capabilities of Onset Detection Methods”, Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR), 2012.
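The interplay of threshold, moving average and moving maximum described in the notes can be sketched with a simple loop. This is a simplified illustration, not madmom's implementation: peak_picking_sketch is a hypothetical name, smoothing is omitted, windows are truncated at the array edges, and every frame of a plateau is reported.

```python
import numpy as np


def peak_picking_sketch(activations, threshold, pre_avg=0, post_avg=0,
                        pre_max=1, post_max=1):
    """Return indices that exceed the threshold and are local maxima."""
    activations = np.asarray(activations, dtype=float)
    peaks = []
    for n, value in enumerate(activations):
        # Moving average over pre_avg past and post_avg future frames.
        avg = 0.0
        if pre_avg + post_avg > 0:
            avg = activations[max(0, n - pre_avg):n + post_avg + 1].mean()
        # Moving maximum over pre_max past and post_max future frames.
        local_max = activations[max(0, n - pre_max):n + post_max + 1].max()
        if value >= avg + threshold and value == local_max:
            peaks.append(n)
    return np.array(peaks, dtype=int)
```

Setting post_max and post_avg to 0 gives the online behaviour: a frame is reported as soon as it exceeds its past, which can add detections on a rising slope.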
-
class madmom.features.onsets.PeakPickingProcessor(threshold=0.5, smooth=0.0, pre_avg=0.0, post_avg=0.0, pre_max=0.0, post_max=0.0, combine=0.03, delay=0.0, online=False, fps=100, **kwargs)
This class implements the onset peak-picking functionality which can be used universally. It transparently converts the chosen values from seconds to frames.
Parameters: threshold : float
Threshold for peak-picking.
smooth : float, optional
Smooth the activation function over smooth seconds.
pre_avg : float, optional
Use pre_avg seconds past information for moving average.
post_avg : float, optional
Use post_avg seconds future information for moving average.
pre_max : float, optional
Use pre_max seconds past information for moving maximum.
post_max : float, optional
Use post_max seconds future information for moving maximum.
combine : float, optional
Only report one onset within combine seconds.
delay : float, optional
Report the detected onsets delay seconds delayed.
online : bool, optional
Use online peak-picking, i.e. no future information.
fps : float, optional
Frames per second used for conversion of timings.
Returns: onsets : numpy array
Detected onsets [seconds].
Notes
If no moving average is needed (e.g. the activations are independent of the signal’s level as for neural network activations), pre_avg and post_avg should be set to 0. For peak picking of local maxima, set pre_max >= 1. / fps and post_max >= 1. / fps. For online peak picking, all post_ parameters are set to 0.
References
[R48] Sebastian Böck, Florian Krebs and Markus Schedl, “Evaluating the Online Capabilities of Onset Detection Methods”, Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR), 2012.
-
process(activations)
Detect the onsets in the given activation function.
Parameters: activations : numpy array
Onset activation function.
Returns: onsets : numpy array
Detected onsets [seconds].
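The conversion between seconds and frames and the effect of combine and delay can be sketched in plain Python. Both helper names are hypothetical, and the rounding and combination rules shown here are assumptions made for illustration; the class may handle boundary cases differently.

```python
def seconds_to_frames_sketch(seconds, fps=100):
    """Convert a duration in seconds to a whole number of frames."""
    return int(round(seconds * fps))


def frames_to_onsets_sketch(peak_frames, fps=100, delay=0.0, combine=0.03):
    """Convert peak indices to onset times and merge close onsets."""
    onsets = []
    for frame in sorted(peak_frames):
        time = frame / fps + delay
        # Keep an onset only if it lies more than `combine` seconds
        # after the last onset that was kept.
        if not onsets or time - onsets[-1] > combine:
            onsets.append(time)
    return onsets
```

With fps=100, peaks at frames 10 and 12 are only 0.02 seconds apart, so only the first one is reported when combine is 0.03.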
-
static add_arguments(parser, threshold=0.5, smooth=None, pre_avg=None, post_avg=None, pre_max=None, post_max=None, combine=0.03, delay=0.0)
Add onset peak-picking related arguments to an existing parser.
Parameters: parser : argparse parser instance
Existing argparse parser object.
threshold : float
Threshold for peak-picking.
smooth : float, optional
Smooth the activation function over smooth seconds.
pre_avg : float, optional
Use pre_avg seconds past information for moving average.
post_avg : float, optional
Use post_avg seconds future information for moving average.
pre_max : float, optional
Use pre_max seconds past information for moving maximum.
post_max : float, optional
Use post_max seconds future information for moving maximum.
combine : float, optional
Only report one onset within combine seconds.
delay : float, optional
Report the detected onsets delay seconds delayed.
Returns: parser_group : argparse argument group
Onset peak-picking argument parser group.
Notes
Parameters are included in the group only if they are not ‘None’.
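The rule from the note above, that only parameters with a non-None default end up in the group, can be illustrated with the standard argparse module. The function name, the group title and the option names below are hypothetical and do not reflect the actual command line options.

```python
import argparse


def add_peak_picking_arguments_sketch(parser, threshold=0.5, smooth=None,
                                      combine=0.03):
    """Add only those options whose default value is not None."""
    group = parser.add_argument_group('peak-picking arguments')
    options = {'threshold': threshold, 'smooth': smooth, 'combine': combine}
    for name, default in options.items():
        # Skip every option whose default is None.
        if default is not None:
            group.add_argument('--' + name, type=float, default=default)
    return group
```

With the defaults shown, the parser accepts --threshold and --combine, but not --smooth. Passing smooth=0.05 would add that option as well.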
-