madmom.features.onsets¶
This module contains onset detection related functionality.
-
madmom.features.onsets.wrap_to_pi(phase)[source]¶
Wrap the phase information to the range -π...π.
Parameters: phase : numpy array
Phase of the STFT.
Returns: wrapped_phase : numpy array
Wrapped phase.
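The wrapping can be sketched with numpy's modulo arithmetic; this is an illustrative re-implementation of the documented behaviour, not necessarily madmom's exact code:

```python
import numpy as np

def wrap_to_pi(phase):
    """Wrap phase values into the range -pi..pi (illustrative sketch)."""
    # shift by pi, wrap into [0, 2*pi), then shift back
    return np.mod(phase + np.pi, 2 * np.pi) - np.pi
```

For example, a phase of 1.5π wraps around to -0.5π.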
-
madmom.features.onsets.correlation_diff(spec, diff_frames=1, pos=False, diff_bins=1)[source]¶
Calculates the difference of the magnitude spectrogram relative to the N-th previous frame, shifted in frequency to achieve the highest correlation between these two frames.
Parameters: spec : numpy array
Magnitude spectrogram.
diff_frames : int, optional
Calculate the difference to the diff_frames-th previous frame.
pos : bool, optional
Keep only positive values.
diff_bins : int, optional
Maximum number of bins shifted for correlation calculation.
Returns: correlation_diff : numpy array
(Positive) magnitude spectrogram differences.
Notes
This function is included only for completeness; it is not intended for actual use, since it is extremely slow. Please consider the superflux() function instead, since it performs equally well but is much faster.
-
madmom.features.onsets.high_frequency_content(spectrogram)[source]¶
High Frequency Content.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
Returns: high_frequency_content : numpy array
High frequency content onset detection function.
References
[R47] Paul Masri, “Computer Modeling of Sound for Transformation and Synthesis of Musical Signals”, PhD thesis, University of Bristol, 1996.
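The underlying idea can be sketched as weighting each bin's magnitude by its bin index and averaging over bins; the exact weighting and normalization madmom uses may differ, so treat this as an illustration of the concept:

```python
import numpy as np

def high_frequency_content(spec):
    """HFC sketch: weight magnitudes by bin index, average over bins.

    `spec` is a plain 2D magnitude array (frames x bins); this is an
    illustrative formulation, not madmom's exact implementation.
    """
    # higher bins get larger weights, emphasizing high-frequency energy
    weights = np.arange(spec.shape[1])
    return np.mean(spec * weights, axis=1)
```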
-
madmom.features.onsets.spectral_diff(spectrogram, diff_frames=None)[source]¶
Spectral Diff.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
diff_frames : int, optional
Number of frames to calculate the diff to.
Returns: spectral_diff : numpy array
Spectral diff onset detection function.
References
[R48] Chris Duxbury, Mark Sandler and Matthew Davis, “A hybrid approach to musical note onset detection”, Proceedings of the 5th International Conference on Digital Audio Effects (DAFx), 2002.
-
madmom.features.onsets.spectral_flux(spectrogram, diff_frames=None)[source]¶
Spectral Flux.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
diff_frames : int, optional
Number of frames to calculate the diff to.
Returns: spectral_flux : numpy array
Spectral flux onset detection function.
References
[R49] Paul Masri, “Computer Modeling of Sound for Transformation and Synthesis of Musical Signals”, PhD thesis, University of Bristol, 1996.
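Conceptually, spectral flux sums the half-wave-rectified first-order differences of the magnitude spectrogram. A minimal sketch operating on a plain numpy array rather than a Spectrogram instance:

```python
import numpy as np

def spectral_flux(spec, diff_frames=1):
    """Spectral flux sketch on a plain (frames x bins) magnitude array."""
    diff = np.zeros_like(spec)
    # difference to the diff_frames-th previous frame
    diff[diff_frames:] = spec[diff_frames:] - spec[:-diff_frames]
    # keep only positive differences (half-wave rectification), sum over bins
    return np.sum(np.maximum(diff, 0), axis=1)
```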
-
madmom.features.onsets.superflux(spectrogram, diff_frames=None, diff_max_bins=3)[source]¶
SuperFlux method with a maximum filter vibrato suppression stage.
Calculates the difference of bin k of the magnitude spectrogram relative to the N-th previous frame with the maximum filtered spectrogram.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
diff_frames : int, optional
Number of frames to calculate the diff to.
diff_max_bins : int, optional
Number of bins used for maximum filter.
Returns: superflux : numpy array
SuperFlux onset detection function.
Notes
This method only works properly if the spectrogram is filtered with a filterbank of the right frequency spacing. Filterbanks with 24 bands per octave (i.e. quarter-tone resolution) usually yield good results. With max_bins = 3, the maximum of the bins k-1, k, k+1 of the frame diff_frames to the left is used for the calculation of the difference.
References
[R50] Sebastian Böck and Gerhard Widmer, “Maximum Filter Vibrato Suppression for Onset Detection”, Proceedings of the 16th International Conference on Digital Audio Effects (DAFx), 2013.
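The vibrato suppression stage can be sketched with scipy's maximum filter along the frequency axis; parameter handling here is simplified compared to madmom's implementation:

```python
import numpy as np
from scipy.ndimage import maximum_filter1d

def superflux(spec, diff_frames=1, diff_max_bins=3):
    """SuperFlux sketch on a plain (frames x bins) magnitude array."""
    # widen each bin to the maximum of its neighbours (vibrato suppression)
    max_spec = maximum_filter1d(spec, size=diff_max_bins, axis=1)
    diff = np.zeros_like(spec)
    # difference of each frame to the maximum-filtered previous frame
    diff[diff_frames:] = spec[diff_frames:] - max_spec[:-diff_frames]
    # keep only positive differences and sum over bins
    return np.sum(np.maximum(diff, 0), axis=1)
```

A frequency-modulated partial that only wanders between neighbouring bins is cancelled by the widened reference frame, while a genuine onset still produces a positive difference.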
-
madmom.features.onsets.complex_flux(spectrogram, diff_frames=None, diff_max_bins=3, temporal_filter=3, temporal_origin=0)[source]¶
ComplexFlux.
ComplexFlux is based on SuperFlux, but adds an additional local group delay based tremolo suppression.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
diff_frames : int, optional
Number of frames to calculate the diff to.
diff_max_bins : int, optional
Number of bins used for maximum filter.
temporal_filter : int, optional
Temporal maximum filtering of the local group delay [frames].
temporal_origin : int, optional
Origin of the temporal maximum filter.
Returns: complex_flux : numpy array
ComplexFlux onset detection function.
References
[R51] Sebastian Böck and Gerhard Widmer, “Local group delay based vibrato and tremolo suppression for onset detection”, Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR), 2013.
-
madmom.features.onsets.modified_kullback_leibler(spectrogram, diff_frames=1, epsilon=2.2204460492503131e-16)[source]¶
Modified Kullback-Leibler.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
diff_frames : int, optional
Number of frames to calculate the diff to.
epsilon : float, optional
Add epsilon to the spectrogram to avoid division by 0.
Returns: modified_kullback_leibler : numpy array
MKL onset detection function.
Notes
The implementation presented in [R52] is used instead of the original work presented in [R53].
References
[R52] Paul Brossier, “Automatic Annotation of Musical Audio for Interactive Applications”, PhD thesis, Queen Mary University of London, 2006.
[R53] Stephen Hainsworth and Malcolm Macleod, “Onset Detection in Musical Audio Signals”, Proceedings of the International Computer Music Conference (ICMC), 2003.
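Following Brossier's formulation, the MKL onset detection function can be sketched as the log of one plus the ratio of each frame to a previous frame; averaging over bins and the exact epsilon placement are assumptions of this sketch:

```python
import numpy as np

def modified_kullback_leibler(spec, diff_frames=1, epsilon=np.spacing(1)):
    """MKL sketch on a plain (frames x bins) magnitude array."""
    odf = np.zeros(len(spec))
    # ratio of each frame to the diff_frames-th previous frame;
    # epsilon guards against division by zero
    ratio = spec[diff_frames:] / (spec[:-diff_frames] + epsilon)
    odf[diff_frames:] = np.mean(np.log(1 + ratio), axis=1)
    return odf
```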
-
madmom.features.onsets.phase_deviation(spectrogram)[source]¶
Phase Deviation.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
Returns: phase_deviation : numpy array
Phase deviation onset detection function.
References
[R54] Juan Pablo Bello, Chris Duxbury, Matthew Davies and Mark Sandler, “On the use of phase and energy for musical onset detection in the complex domain”, IEEE Signal Processing Letters, Volume 11, Number 6, 2004.
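Phase deviation tracks the wrapped second-order difference of the phase over time. A sketch operating on a plain phase array (frames x bins), whereas the actual function takes a Spectrogram instance:

```python
import numpy as np

def phase_deviation(phase):
    """Mean absolute wrapped second phase difference (illustrative sketch)."""
    pd = np.zeros_like(phase)
    # second-order difference over time approximates the change in
    # instantaneous frequency; steady tones yield values near zero
    pd[2:] = phase[2:] - 2 * phase[1:-1] + phase[:-2]
    # wrap back into -pi..pi before averaging over bins
    pd = np.mod(pd + np.pi, 2 * np.pi) - np.pi
    return np.mean(np.abs(pd), axis=1)
```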
-
madmom.features.onsets.weighted_phase_deviation(spectrogram)[source]¶
Weighted Phase Deviation.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
Returns: weighted_phase_deviation : numpy array
Weighted phase deviation onset detection function.
References
[R55] Simon Dixon, “Onset Detection Revisited”, Proceedings of the 9th International Conference on Digital Audio Effects (DAFx), 2006.
-
madmom.features.onsets.normalized_weighted_phase_deviation(spectrogram, epsilon=2.2204460492503131e-16)[source]¶
Normalized Weighted Phase Deviation.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
epsilon : float, optional
Add epsilon to the spectrogram to avoid division by 0.
Returns: normalized_weighted_phase_deviation : numpy array
Normalized weighted phase deviation onset detection function.
References
[R56] Simon Dixon, “Onset Detection Revisited”, Proceedings of the 9th International Conference on Digital Audio Effects (DAFx), 2006.
-
madmom.features.onsets.complex_domain(spectrogram)[source]¶
Complex Domain.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
Returns: complex_domain : numpy array
Complex domain onset detection function.
References
[R57] Juan Pablo Bello, Chris Duxbury, Matthew Davies and Mark Sandler, “On the use of phase and energy for musical onset detection in the complex domain”, IEEE Signal Processing Letters, Volume 11, Number 6, 2004.
-
madmom.features.onsets.rectified_complex_domain(spectrogram, diff_frames=None)[source]¶
Rectified Complex Domain.
Parameters: spectrogram : Spectrogram instance
Spectrogram instance.
diff_frames : int, optional
Number of frames to calculate the diff to.
Returns: rectified_complex_domain : numpy array
Rectified complex domain onset detection function.
References
[R58] Simon Dixon, “Onset Detection Revisited”, Proceedings of the 9th International Conference on Digital Audio Effects (DAFx), 2006.
-
class madmom.features.onsets.SpectralOnsetProcessor(onset_method='spectral_flux', **kwargs)[source]¶
The SpectralOnsetProcessor class implements most of the common onset detection functions based on the magnitude or phase information of a spectrogram.
Parameters: onset_method : str, optional
Onset detection function. See METHODS for possible values.
kwargs : dict, optional
Keyword arguments passed to the pre-processing chain to obtain a spectral representation of the signal.
Notes
If the spectrogram should be filtered, the filterbank parameter must contain a valid Filterbank; if it should be scaled logarithmically, log must be set accordingly.
References
[R59] Paul Masri, “Computer Modeling of Sound for Transformation and Synthesis of Musical Signals”, PhD thesis, University of Bristol, 1996.
[R60] Sebastian Böck and Gerhard Widmer, “Maximum Filter Vibrato Suppression for Onset Detection”, Proceedings of the 16th International Conference on Digital Audio Effects (DAFx), 2013.
Examples
Create a SpectralOnsetProcessor and pass a file through the processor to obtain an onset detection function. Per default the spectral flux [R59] is computed on a simple Spectrogram.
>>> sodf = SpectralOnsetProcessor()
>>> sodf
<madmom.features.onsets.SpectralOnsetProcessor object at 0x...>
>>> sodf.processors[-1]
<function spectral_flux at 0x...>
>>> sodf('tests/data/audio/sample.wav')
array([  0.     , 100.90121, ...,  26.30577,  20.94439], dtype=float32)
The parameters passed to the signal pre-processing chain can be set when creating the SpectralOnsetProcessor. E.g. to obtain the SuperFlux [R60] onset detection function set these parameters:
>>> from madmom.audio.filters import LogarithmicFilterbank
>>> sodf = SpectralOnsetProcessor(onset_method='superflux', fps=200,
...                               filterbank=LogarithmicFilterbank,
...                               num_bands=24, log=np.log10)
>>> sodf('tests/data/audio/sample.wav')
array([ 0.     ,  0.     ,  2.0868 ,  1.02404, ...,  0.29888,  0.12122], dtype=float32)
-
classmethod add_arguments(parser, onset_method=None)[source]¶
Add spectral onset detection arguments to an existing parser.
Parameters: parser : argparse parser instance
Existing argparse parser object.
onset_method : str, optional
Default onset detection method.
Returns: parser_group : argparse argument group
Spectral onset detection argument parser group.
-
class madmom.features.onsets.RNNOnsetProcessor(**kwargs)[source]¶
Processor to get an onset activation function from multiple RNNs.
Parameters: online : bool, optional
Choose networks suitable for online onset detection, i.e. use unidirectional RNNs.
Notes
This class uses either uni- or bi-directional RNNs. Contrary to [R61], it uses simple tanh units as in [R62]. Also, the input representations were changed to use logarithmically filtered and scaled spectrograms.
References
[R61] Florian Eyben, Sebastian Böck, Björn Schuller and Alex Graves, “Universal Onset Detection with bidirectional Long Short-Term Memory Neural Networks”, Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR), 2010.
[R62] Sebastian Böck, Andreas Arzt, Florian Krebs and Markus Schedl, “Online Real-time Onset Detection with Recurrent Neural Networks”, Proceedings of the 15th International Conference on Digital Audio Effects (DAFx), 2012.
Examples
Create a RNNOnsetProcessor and pass a file through the processor to obtain an onset detection function (sampled with 100 frames per second).
>>> proc = RNNOnsetProcessor()
>>> proc
<madmom.features.onsets.RNNOnsetProcessor object at 0x...>
>>> proc('tests/data/audio/sample.wav')
array([ 0.08313,  0.0024 , ...,  0.00205,  0.00527], dtype=float32)
-
class madmom.features.onsets.CNNOnsetProcessor(**kwargs)[source]¶
Processor to get an onset activation function from a CNN.
Notes
The implementation follows the original one as closely as possible, but part of the signal pre-processing differs in minor aspects, so results can differ slightly, too.
References
[R63] Jan Schlüter and Sebastian Böck, “Musical Onset Detection with Convolutional Neural Networks”, Proceedings of the 6th International Workshop on Machine Learning and Music, 2013.
Examples
Create a CNNOnsetProcessor and pass a file through the processor to obtain an onset detection function (sampled with 100 frames per second).
>>> proc = CNNOnsetProcessor()
>>> proc
<madmom.features.onsets.CNNOnsetProcessor object at 0x...>
>>> proc('tests/data/audio/sample.wav')
array([ 0.05369,  0.04205, ...,  0.00024,  0.00014], dtype=float32)
-
madmom.features.onsets.peak_picking(activations, threshold, smooth=None, pre_avg=0, post_avg=0, pre_max=1, post_max=1)[source]¶
Perform thresholding and peak-picking on the given activation function.
Parameters: activations : numpy array
Activation function.
threshold : float
Threshold for peak-picking.
smooth : int or numpy array, optional
Smooth the activation function with the kernel (size).
pre_avg : int, optional
Use pre_avg frames past information for moving average.
post_avg : int, optional
Use post_avg frames future information for moving average.
pre_max : int, optional
Use pre_max frames past information for moving maximum.
post_max : int, optional
Use post_max frames future information for moving maximum.
Returns: peak_idx : numpy array
Indices of the detected peaks.
See also
smooth()
Notes
If no moving average is needed (e.g. the activations are independent of the signal’s level as for neural network activations), set pre_avg and post_avg to 0. For peak picking of local maxima, set pre_max and post_max to 1. For online peak picking, set all post_ parameters to 0.
References
[R64] Sebastian Böck, Florian Krebs and Markus Schedl, “Evaluating the Online Capabilities of Onset Detection Methods”, Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR), 2012.
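A simplified sketch of the thresholding and moving-maximum logic (omitting the smoothing and moving-average stages of the actual function):

```python
import numpy as np

def simple_peak_picking(activations, threshold, pre_max=1, post_max=1):
    """Report indices that exceed the threshold and are the maximum
    within [n - pre_max, n + post_max] (simplified sketch)."""
    peaks = []
    for n, act in enumerate(activations):
        if act < threshold:
            continue
        # the frame must be the moving maximum of its local window
        window = activations[max(0, n - pre_max):n + post_max + 1]
        if act == window.max():
            peaks.append(n)
    return np.array(peaks)
```

For online operation, post_max would be 0, so the window never looks into the future.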
-
class madmom.features.onsets.PeakPickingProcessor(**kwargs)[source]¶
Deprecated as of version 0.15. Will be removed in version 0.16. Use either OnsetPeakPickingProcessor or NotePeakPickingProcessor instead.
-
process(activations, **kwargs)[source]¶
Detect the peaks in the given activation function.
Parameters: activations : numpy array
Onset activation function.
Returns: peaks : numpy array
Detected onsets [seconds[, frequency bin]].
-
static add_arguments(parser, **kwargs)[source]¶
Deprecated as of version 0.15. Will be removed in version 0.16. Use either OnsetPeakPickingProcessor or NotePeakPickingProcessor instead.
-
-
class madmom.features.onsets.OnsetPeakPickingProcessor(threshold=0.5, smooth=0.0, pre_avg=0.0, post_avg=0.0, pre_max=0.0, post_max=0.0, combine=0.03, delay=0.0, online=False, fps=100, **kwargs)[source]¶
This class implements the onset peak-picking functionality. It transparently converts the chosen values from seconds to frames.
Parameters: threshold : float
Threshold for peak-picking.
smooth : float, optional
Smooth the activation function over smooth seconds.
pre_avg : float, optional
Use pre_avg seconds past information for moving average.
post_avg : float, optional
Use post_avg seconds future information for moving average.
pre_max : float, optional
Use pre_max seconds past information for moving maximum.
post_max : float, optional
Use post_max seconds future information for moving maximum.
combine : float, optional
Only report one onset within combine seconds.
delay : float, optional
Report the detected onsets delay seconds delayed.
online : bool, optional
Use online peak-picking, i.e. no future information.
fps : float, optional
Frames per second used for conversion of timings.
Returns: onsets : numpy array
Detected onsets [seconds].
Notes
If no moving average is needed (e.g. the activations are independent of the signal’s level as for neural network activations), pre_avg and post_avg should be set to 0. For peak picking of local maxima, set pre_max >= 1. / fps and post_max >= 1. / fps. For online peak picking, all post_ parameters are set to 0.
References
[R65] Sebastian Böck, Florian Krebs and Markus Schedl, “Evaluating the Online Capabilities of Onset Detection Methods”, Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR), 2012.
Examples
Create an OnsetPeakPickingProcessor. The returned array represents the positions of the onsets in seconds, thus the frame rate (fps) of the activation function has to be given.
>>> proc = OnsetPeakPickingProcessor(fps=100)
>>> proc
<madmom.features.onsets.OnsetPeakPickingProcessor object at 0x...>
Call this OnsetPeakPickingProcessor with the onset activation function from an RNNOnsetProcessor to obtain the onset positions.
>>> act = RNNOnsetProcessor()('tests/data/audio/sample.wav')
>>> proc(act)
array([ 0.09,  0.29,  0.45, ...,  2.34,  2.49,  2.67])
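The seconds-to-frames conversion the class performs internally can be illustrated as follows; the exact rounding behaviour is an assumption of this sketch:

```python
def seconds_to_frames(seconds, fps):
    """Convert a timing parameter given in seconds to a frame count
    (illustrative sketch of the conversion, not madmom's exact code)."""
    return int(round(fps * seconds))

# with fps=100, smooth=0.07 seconds corresponds to 7 frames,
# and combine=0.03 seconds to 3 frames
```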
-
process(activations, **kwargs)[source]¶
Detect the onsets in the given activation function.
Parameters: activations : numpy array
Onset activation function.
Returns: onsets : numpy array
Detected onsets [seconds].
-
process_sequence(activations, **kwargs)[source]¶
Detect the onsets in the given activation function.
Parameters: activations : numpy array
Onset activation function.
Returns: onsets : numpy array
Detected onsets [seconds].
-
process_online(activations, reset=True, **kwargs)[source]¶
Detect the onsets in the given activation function.
Parameters: activations : numpy array
Onset activation function.
reset : bool, optional
Reset the processor to its initial state before processing.
Returns: onsets : numpy array
Detected onsets [seconds].
-
static add_arguments(parser, threshold=0.5, smooth=None, pre_avg=None, post_avg=None, pre_max=None, post_max=None, combine=0.03, delay=0.0)[source]¶
Add onset peak-picking related arguments to an existing parser.
Parameters: parser : argparse parser instance
Existing argparse parser object.
threshold : float
Threshold for peak-picking.
smooth : float, optional
Smooth the activation function over smooth seconds.
pre_avg : float, optional
Use pre_avg seconds past information for moving average.
post_avg : float, optional
Use post_avg seconds future information for moving average.
pre_max : float, optional
Use pre_max seconds past information for moving maximum.
post_max : float, optional
Use post_max seconds future information for moving maximum.
combine : float, optional
Only report one onset within combine seconds.
delay : float, optional
Report the detected onsets delay seconds delayed.
Returns: parser_group : argparse argument group
Onset peak-picking argument parser group.
Notes
Parameters are included in the group only if they are not ‘None’.
-