madmom.features.onsets

This module contains onset detection related functionality.

madmom.features.onsets.wrap_to_pi(phase)

Wrap the phase information to the range -π…π.

Parameters:	phase : numpy array Phase of the STFT.
Returns:	wrapped_phase : numpy array Wrapped phase.
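
The wrapping described above can be illustrated with plain NumPy. This is a minimal sketch and not madmom's implementation: the helper name wrap_phase is hypothetical, and it assumes the half-open interval [-π, π), so the behaviour exactly at the boundary may differ from the library.

```python
import numpy as np


def wrap_phase(phase):
    """Wrap phase values (in radians) into the half-open interval [-pi, pi)."""
    phase = np.asarray(phase, dtype=float)
    # Shift by pi, take the remainder modulo 2*pi, then shift back.
    return np.mod(phase + np.pi, 2.0 * np.pi) - np.pi


# 1.5*pi wraps to -0.5*pi, -1.5*pi wraps to 0.5*pi, 4*pi wraps to (about) 0.
wrapped = wrap_phase([0.0, 1.5 * np.pi, -1.5 * np.pi, 4.0 * np.pi])
```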

madmom.features.onsets.correlation_diff(spec, diff_frames=1, pos=False, diff_bins=1)

Calculates the difference between the magnitude spectrogram and the N-th previous frame, which is shifted in frequency to achieve the highest correlation between the two frames.

Parameters:

spec : numpy array: Magnitude spectrogram.
diff_frames : int, optional: Calculate the difference to the diff_frames-th previous frame.
pos : bool, optional: Keep only positive values.
diff_bins : int, optional: Maximum number of bins shifted for correlation calculation.

Returns:

correlation_diff : numpy array: (Positive) magnitude spectrogram differences.

Notes

This function is included only for completeness. It is not intended for actual use, since it is extremely slow. Please consider the superflux() function instead, since it performs equally well but is much faster.

madmom.features.onsets.high_frequency_content(spectrogram)

High Frequency Content.

Parameters:	spectrogram : `Spectrogram` instance Spectrogram instance.
Returns:	high_frequency_content : numpy array High frequency content onset detection function.

References

[1]	Paul Masri, “Computer Modeling of Sound for Transformation and Synthesis of Musical Signals”, PhD thesis, University of Bristol, 1996.
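
To illustrate the idea behind the high frequency content function above, the following sketch weights every frequency bin by its index and averages over the bins of each frame. It is one common formulation written with plain NumPy; the name hfc_sketch and the use of a bare array instead of a `Spectrogram` instance are assumptions, and the library's exact weighting may differ.

```python
import numpy as np


def hfc_sketch(spec):
    """Weight each frequency bin by its index and average over the bins."""
    spec = np.asarray(spec, dtype=float)
    # Bin 0 gets weight 0, bin 1 weight 1, and so on (frames x bins layout).
    weights = np.arange(spec.shape[1])
    return np.mean(spec * weights, axis=1)


spec = np.array([[1.0, 1.0, 1.0, 1.0],   # flat spectrum
                 [0.0, 0.0, 0.0, 4.0],   # energy in the highest bin
                 [4.0, 0.0, 0.0, 0.0]])  # energy in the lowest bin
hfc = hfc_sketch(spec)  # -> [1.5, 3.0, 0.0]
```

Energy in high bins raises the value, which is why percussive, broadband onsets tend to stand out in this function.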

madmom.features.onsets.spectral_diff(spectrogram, diff_frames=None)

Spectral Diff.

Parameters:

spectrogram : `Spectrogram` instance: Spectrogram instance.
diff_frames : int, optional: Number of frames to calculate the diff to.

Returns:

spectral_diff : numpy array: Spectral diff onset detection function.

References

[1]	Chris Duxbury, Mark Sandler and Matthew Davies, “A hybrid approach to musical note onset detection”, Proceedings of the 5th International Conference on Digital Audio Effects (DAFx), 2002.

madmom.features.onsets.spectral_flux(spectrogram, diff_frames=None)

Spectral Flux.

Parameters:

spectrogram : `Spectrogram` instance: Spectrogram instance.
diff_frames : int, optional: Number of frames to calculate the diff to.

Returns:

spectral_flux : numpy array: Spectral flux onset detection function.

References

[1]	Paul Masri, “Computer Modeling of Sound for Transformation and Synthesis of Musical Signals”, PhD thesis, University of Bristol, 1996.
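
As a rough illustration of spectral flux, the sketch below sums the positive magnitude differences between each frame and the frame diff_frames earlier. It operates on a plain NumPy array (frames x bins); the name spectral_flux_sketch and the default diff_frames=1 are assumptions made for this example, not the library's behaviour for diff_frames=None.

```python
import numpy as np


def spectral_flux_sketch(spec, diff_frames=1):
    """Sum of the positive magnitude differences to an earlier frame."""
    spec = np.asarray(spec, dtype=float)
    diff = np.zeros_like(spec)
    # The first diff_frames frames have no predecessor and stay at zero.
    diff[diff_frames:] = spec[diff_frames:] - spec[:-diff_frames]
    # Keep only increases in energy (half-wave rectification).
    np.maximum(diff, 0, out=diff)
    return diff.sum(axis=1)


spec = np.array([[0.0, 0.0, 0.0],
                 [1.0, 2.0, 0.0],
                 [0.5, 3.0, 1.0],
                 [0.0, 0.0, 0.0]])
flux = spectral_flux_sketch(spec)  # -> [0.0, 3.0, 2.0, 0.0]
```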

madmom.features.onsets.superflux(spectrogram, diff_frames=None, diff_max_bins=3)

SuperFlux method with a maximum filter vibrato suppression stage.

Calculates the difference between bin k of the magnitude spectrogram and the same bin of the maximum-filtered spectrogram N frames earlier.

Parameters:

spectrogram : `Spectrogram` instance: Spectrogram instance.
diff_frames : int, optional: Number of frames to calculate the diff to.
diff_max_bins : int, optional: Number of bins used for maximum filter.

Returns:

superflux : numpy array: SuperFlux onset detection function.

Notes

This method only works properly if the spectrogram is filtered with a filterbank of the right frequency spacing. Filterbanks with 24 bands per octave (i.e. quarter-tone resolution) usually yield good results. With diff_max_bins = 3, the maximum of the bins k-1, k, k+1 of the frame diff_frames to the left is used for the calculation of the difference.

References

[1]	Sebastian Böck and Gerhard Widmer, “Maximum Filter Vibrato Suppression for Onset Detection”, Proceedings of the 16th International Conference on Digital Audio Effects (DAFx), 2013.
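
The maximum filter stage described in the notes above can be sketched with plain NumPy as follows. This is an illustration under stated assumptions, not the library code: superflux_sketch is a hypothetical name, the input is a bare array (frames x bins), diff_max_bins is assumed to be odd, and the borders are handled by repeating the edge bins.

```python
import numpy as np


def superflux_sketch(spec, diff_frames=1, diff_max_bins=3):
    """Positive difference to a frequency-wise maximum-filtered earlier frame."""
    spec = np.asarray(spec, dtype=float)
    num_bins = spec.shape[1]
    half = diff_max_bins // 2  # assumes an odd filter width
    # Repeat the edge bins so border bins only see valid neighbours.
    padded = np.pad(spec, ((0, 0), (half, half)), mode="edge")
    # Maximum over the bins k-half ... k+half of every frame.
    shifted = [padded[:, i:i + num_bins] for i in range(diff_max_bins)]
    max_spec = np.max(shifted, axis=0)
    diff = np.zeros_like(spec)
    diff[diff_frames:] = spec[diff_frames:] - max_spec[:-diff_frames]
    return np.maximum(diff, 0).sum(axis=1)


# A partial drifts from bin 2 to bin 3 (vibrato-like), then a new partial
# appears at bin 1 in the last frame.
spec = np.array([[0.0, 0.0, 1.0, 0.0, 0.0],
                 [0.0, 0.0, 0.0, 1.0, 0.0],
                 [0.0, 1.0, 0.0, 1.0, 0.0]])
sf = superflux_sketch(spec)  # -> [0.0, 0.0, 1.0]
```

A plain spectral flux would report an increase in the second frame, because bin 3 gains energy. With the maximum filter, the drift by one bin is absorbed, and only the genuinely new partial in the last frame contributes.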

madmom.features.onsets.complex_flux(spectrogram, diff_frames=None, diff_max_bins=3, temporal_filter=3, temporal_origin=0)

ComplexFlux.

ComplexFlux is based on SuperFlux, but adds a tremolo suppression stage based on the local group delay.

Parameters:

spectrogram : `Spectrogram` instance: `Spectrogram` instance.
diff_frames : int, optional: Number of frames to calculate the diff to.
diff_max_bins : int, optional: Number of bins used for maximum filter.
temporal_filter : int, optional: Temporal maximum filtering of the local group delay [frames].
temporal_origin : int, optional: Origin of the temporal maximum filter.

Returns:

complex_flux : numpy array: ComplexFlux onset detection function.

References

[1]	Sebastian Böck and Gerhard Widmer, “Local group delay based vibrato and tremolo suppression for onset detection”, Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR), 2013.

madmom.features.onsets.modified_kullback_leibler(spectrogram, diff_frames=1, epsilon=2.220446049250313e-16)

Modified Kullback-Leibler.

Parameters:

spectrogram : `Spectrogram` instance: `Spectrogram` instance.
diff_frames : int, optional: Number of frames to calculate the diff to.
epsilon : float, optional: Add epsilon to the spectrogram to avoid division by 0.

Returns:

modified_kullback_leibler : numpy array: MKL onset detection function.

Notes

The implementation presented in [1] is used instead of the original work presented in [2].

References

[1]	Paul Brossier, “Automatic Annotation of Musical Audio for Interactive Applications”, PhD thesis, Queen Mary University of London, 2006.

[2]	Stephen Hainsworth and Malcolm Macleod, “Onset Detection in Musical Audio Signals”, Proceedings of the International Computer Music Conference (ICMC), 2003.
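
The role of epsilon can be seen in a small sketch of a modified Kullback-Leibler style function: each bin is divided by the same bin diff_frames earlier, and the logarithm of one plus this ratio is aggregated per frame. The name mkl_sketch, the bare array input and the choice of averaging over the bins are assumptions for this illustration and may not match the library in every detail.

```python
import numpy as np


def mkl_sketch(spec, diff_frames=1, epsilon=np.finfo(float).eps):
    """Mean of log(1 + current / (previous + epsilon)) over the bins."""
    spec = np.asarray(spec, dtype=float)
    ratio = np.zeros_like(spec)
    # epsilon keeps the division defined when the earlier frame is silent.
    ratio[diff_frames:] = spec[diff_frames:] / (spec[:-diff_frames] + epsilon)
    return np.mean(np.log(1.0 + ratio), axis=1)


spec = np.array([[1.0, 1.0],
                 [1.0, 1.0],   # unchanged energy: ratio of about 1
                 [3.0, 3.0]])  # energy tripled: ratio of about 3
mkl = mkl_sketch(spec)
```

Because the ratio is relative, a rise from a quiet frame weighs as much as the same proportional rise from a loud one.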

madmom.features.onsets.phase_deviation(spectrogram)

Phase Deviation.

Parameters:	spectrogram : `Spectrogram` instance `Spectrogram` instance.
Returns:	phase_deviation : numpy array Phase deviation onset detection function.

References

[1]	Juan Pablo Bello, Chris Duxbury, Matthew Davies and Mark Sandler, “On the use of phase and energy for musical onset detection in the complex domain”, IEEE Signal Processing Letters, Volume 11, Number 6, 2004.
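
Phase deviation is commonly defined through the second-order difference of the phase: for a stationary sinusoid the phase advances by a constant amount per frame, so the second difference is close to zero, and an onset disturbs this. The sketch below follows that common definition with plain NumPy; the name phase_deviation_sketch, the bare phase array (frames x bins) and the averaging over bins are assumptions for this example.

```python
import numpy as np


def wrap(phase):
    """Wrap phase values into [-pi, pi)."""
    return np.mod(phase + np.pi, 2.0 * np.pi) - np.pi


def phase_deviation_sketch(phase):
    """Mean absolute second-order phase difference per frame."""
    phase = np.asarray(phase, dtype=float)
    pd = np.zeros_like(phase)
    # phi[n] - 2 * phi[n-1] + phi[n-2]; the first two frames stay at zero.
    pd[2:] = phase[2:] - 2.0 * phase[1:-1] + phase[:-2]
    return np.mean(np.abs(wrap(pd)), axis=1)


frames = np.arange(5)
steady = 0.3 * frames                           # constant phase advance
jumping = np.array([0.0, 0.5, 1.0, 2.5, 3.0])   # phase jump at frame 3
pd = phase_deviation_sketch(np.column_stack([steady, jumping]))
```

Only the bin with the phase jump contributes, and it does so in frames 3 and 4, where the second difference is +1 and -1 radian respectively.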

madmom.features.onsets.weighted_phase_deviation(spectrogram)

Weighted Phase Deviation.

Parameters:	spectrogram : `Spectrogram` instance `Spectrogram` instance.
Returns:	weighted_phase_deviation : numpy array Weighted phase deviation onset detection function.

References

[1]	Simon Dixon, “Onset Detection Revisited”, Proceedings of the 9th International Conference on Digital Audio Effects (DAFx), 2006.

madmom.features.onsets.normalized_weighted_phase_deviation(spectrogram, epsilon=2.220446049250313e-16)

Normalized Weighted Phase Deviation.

Parameters:

spectrogram : `Spectrogram` instance: `Spectrogram` instance.
epsilon : float, optional: Add epsilon to the spectrogram to avoid division by 0.

Returns:

normalized_weighted_phase_deviation : numpy array: Normalized weighted phase deviation onset detection function.

References

[1]	Simon Dixon, “Onset Detection Revisited”, Proceedings of the 9th International Conference on Digital Audio Effects (DAFx), 2006.

madmom.features.onsets.complex_domain(spectrogram)

Complex Domain.

Parameters:	spectrogram : `Spectrogram` instance `Spectrogram` instance.
Returns:	complex_domain : numpy array Complex domain onset detection function.

References

[1]	Juan Pablo Bello, Chris Duxbury, Matthew Davies and Mark Sandler, “On the use of phase and energy for musical onset detection in the complex domain”, IEEE Signal Processing Letters, Volume 11, Number 6, 2004.

madmom.features.onsets.rectified_complex_domain(spectrogram, diff_frames=None)

Rectified Complex Domain.

Parameters:

spectrogram : `Spectrogram` instance: `Spectrogram` instance.
diff_frames : int, optional: Number of frames to calculate the diff to.

Returns:

rectified_complex_domain : numpy array: Rectified complex domain onset detection function.

References

[1]	Simon Dixon, “Onset Detection Revisited”, Proceedings of the 9th International Conference on Digital Audio Effects (DAFx), 2006.

class madmom.features.onsets.SpectralOnsetProcessor(onset_method='spectral_flux', **kwargs)

The SpectralOnsetProcessor class implements most of the common onset detection functions based on the magnitude or phase information of a spectrogram.

Parameters:

onset_method : str, optional: Onset detection function. See METHODS for possible values.
kwargs : dict, optional: Keyword arguments passed to the pre-processing chain to obtain a spectral representation of the signal.

Notes

If the spectrogram should be filtered, the filterbank parameter must contain a valid Filterbank. If it should be scaled logarithmically, log must be set accordingly.

References

[1]	Paul Masri, “Computer Modeling of Sound for Transformation and Synthesis of Musical Signals”, PhD thesis, University of Bristol, 1996.

[2]	Sebastian Böck and Gerhard Widmer, “Maximum Filter Vibrato Suppression for Onset Detection”, Proceedings of the 16th International Conference on Digital Audio Effects (DAFx), 2013.

Examples

Create a SpectralOnsetProcessor and pass a file through the processor to obtain an onset detection function. By default, the spectral flux [1] is computed on a simple Spectrogram.

>>> from madmom.features.onsets import SpectralOnsetProcessor
>>> sodf = SpectralOnsetProcessor()
>>> sodf
<madmom.features.onsets.SpectralOnsetProcessor object at 0x...>
>>> sodf.processors[-1]
<function spectral_flux at 0x...>
>>> sodf('tests/data/audio/sample.wav')
array([ 0. , 100.90121, ..., 26.30577, 20.94439], dtype=float32)

The parameters passed to the signal pre-processing chain can be set when creating the SpectralOnsetProcessor. For example, to obtain the SuperFlux [2] onset detection function, set these parameters:

>>> import numpy as np
>>> from madmom.audio.filters import LogarithmicFilterbank
>>> sodf = SpectralOnsetProcessor(onset_method='superflux', fps=200,
...                               filterbank=LogarithmicFilterbank,
...                               num_bands=24, log=np.log10)
>>> sodf('tests/data/audio/sample.wav')
array([ 0. , 0. , 2.0868 , 1.02404, ..., 0.29888, 0.12122], dtype=float32)

classmethod add_arguments(parser, onset_method=None)

Add spectral onset detection arguments to an existing parser.

Parameters:

parser : argparse parser instance: Existing argparse parser object.
onset_method : str, optional: Default onset detection method.

Returns:

parser_group : argparse argument group: Spectral onset detection argument parser group.

class madmom.features.onsets.RNNOnsetProcessor(**kwargs)

Processor to get an onset activation function from multiple RNNs.

Parameters:	online : bool, optional Choose networks suitable for online onset detection, i.e. use unidirectional RNNs.

Notes

This class uses either uni- or bi-directional RNNs. Contrary to [1], it uses simple tanh units as in [2]. The input representation also differs: it uses logarithmically filtered and scaled spectrograms.

References

[1]	Florian Eyben, Sebastian Böck, Björn Schuller and Alex Graves, “Universal Onset Detection with bidirectional Long Short-Term Memory Neural Networks”, Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR), 2010.

[2]	Sebastian Böck, Andreas Arzt, Florian Krebs and Markus Schedl, “Online Real-time Onset Detection with Recurrent Neural Networks”, Proceedings of the 15th International Conference on Digital Audio Effects (DAFx), 2012.

Examples

Create a RNNOnsetProcessor and pass a file through the processor to obtain an onset detection function (sampled with 100 frames per second).

>>> from madmom.features.onsets import RNNOnsetProcessor
>>> proc = RNNOnsetProcessor()
>>> proc
<madmom.features.onsets.RNNOnsetProcessor object at 0x...>
>>> proc('tests/data/audio/sample.wav')
array([0.08313, 0.0024 , ... 0.00527], dtype=float32)

class madmom.features.onsets.CNNOnsetProcessor(**kwargs)

Processor to get an onset activation function from a CNN.

Notes

The implementation follows the original one as closely as possible, but parts of the signal pre-processing differ in minor aspects, so results can differ slightly as well.

References

[1]	Jan Schlüter and Sebastian Böck, “Musical Onset Detection with Convolutional Neural Networks”, Proceedings of the 6th International Workshop on Machine Learning and Music, 2013.

Examples

Create a CNNOnsetProcessor and pass a file through the processor to obtain an onset detection function (sampled with 100 frames per second).

>>> from madmom.features.onsets import CNNOnsetProcessor
>>> proc = CNNOnsetProcessor()
>>> proc
<madmom.features.onsets.CNNOnsetProcessor object at 0x...>
>>> proc('tests/data/audio/sample.wav')
array([0.05369, 0.04205, ... 0.00014], dtype=float32)

madmom.features.onsets.peak_picking(activations, threshold, smooth=None, pre_avg=0, post_avg=0, pre_max=1, post_max=1)

Perform thresholding and peak-picking on the given activation function.

Parameters:

activations : numpy array: Activation function.
threshold : float: Threshold for peak-picking.
smooth : int or numpy array, optional: Smooth the activation function with the kernel (size).
pre_avg : int, optional: Number of past frames used for the moving average.
post_avg : int, optional: Number of future frames used for the moving average.
pre_max : int, optional: Number of past frames used for the moving maximum.
post_max : int, optional: Number of future frames used for the moving maximum.

Returns:

peak_idx : numpy array: Indices of the detected peaks.

See also

smooth()

Notes

If no moving average is needed (e.g. the activations are independent of the signal’s level as for neural network activations), set pre_avg and post_avg to 0. For peak picking of local maxima, set pre_max and post_max to 1. For online peak picking, set all post_ parameters to 0.

References

[1]	Sebastian Böck, Florian Krebs and Markus Schedl, “Evaluating the Online Capabilities of Onset Detection Methods”, Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR), 2012.
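
The interplay of threshold, moving maximum and moving average can be sketched in a few lines of NumPy. This is a simplified illustration, not the library's implementation: pick_peaks_sketch is a hypothetical name, smoothing is omitted, windows are truncated at the signal borders, and the moving average is treated as 0 when pre_avg and post_avg are both 0.

```python
import numpy as np


def pick_peaks_sketch(activations, threshold, pre_avg=0, post_avg=0,
                      pre_max=1, post_max=1):
    """Return indices that are window maxima and exceed average + threshold."""
    act = np.asarray(activations, dtype=float)
    peaks = []
    for n in range(len(act)):
        # Moving maximum over [n - pre_max, n + post_max].
        window = act[max(0, n - pre_max):n + post_max + 1]
        is_max = act[n] == window.max()
        # Moving average over [n - pre_avg, n + post_avg]; skipped (0) if the
        # window would only contain the current frame.
        if pre_avg + post_avg > 0:
            avg = act[max(0, n - pre_avg):n + post_avg + 1].mean()
        else:
            avg = 0.0
        if is_max and act[n] >= avg + threshold:
            peaks.append(n)
    return np.array(peaks, dtype=int)


act = [0.1, 0.2, 0.9, 0.3, 0.1, 0.6, 0.7, 0.2]
offline = pick_peaks_sketch(act, threshold=0.5)             # -> [2, 6]
online = pick_peaks_sketch(act, threshold=0.5, post_max=0)  # -> [2, 5, 6]
```

In the online variant frame 5 is also reported, because without future information the larger value at frame 6 is not yet known.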

class madmom.features.onsets.PeakPickingProcessor(**kwargs)

Deprecated as of version 0.15. Will be removed in version 0.16. Use either OnsetPeakPickingProcessor or NotePeakPickingProcessor instead.

process(activations, **kwargs)

Detect the peaks in the given activation function.

Parameters:	activations : numpy array Onset activation function.
Returns:	peaks : numpy array Detected onsets [seconds[, frequency bin]].

static add_arguments(parser, **kwargs)

Deprecated as of version 0.15. Will be removed in version 0.16. Use either OnsetPeakPickingProcessor or NotePeakPickingProcessor instead.

class madmom.features.onsets.OnsetPeakPickingProcessor(threshold=0.5, smooth=0.0, pre_avg=0.0, post_avg=0.0, pre_max=0.0, post_max=0.0, combine=0.03, delay=0.0, online=False, fps=100, **kwargs)

This class implements the onset peak-picking functionality. It transparently converts the chosen values from seconds to frames.

Parameters:

threshold : float: Threshold for peak-picking.
smooth : float, optional: Smooth the activation function over smooth seconds.
pre_avg : float, optional: Use pre_avg seconds of past information for the moving average.
post_avg : float, optional: Use post_avg seconds of future information for the moving average.
pre_max : float, optional: Use pre_max seconds of past information for the moving maximum.
post_max : float, optional: Use post_max seconds of future information for the moving maximum.
combine : float, optional: Only report one onset within combine seconds.
delay : float, optional: Report the detected onsets delayed by delay seconds.
online : bool, optional: Use online peak-picking, i.e. no future information.
fps : float, optional: Frames per second used for conversion of timings.

Returns:

onsets : numpy array: Detected onsets [seconds].

Notes

If no moving average is needed (e.g. the activations are independent of the signal’s level as for neural network activations), pre_avg and post_avg should be set to 0. For peak picking of local maxima, set pre_max >= 1. / fps and post_max >= 1. / fps. For online peak picking, all post_ parameters are set to 0.
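
The conversion between seconds and frames, and the effect of combine and delay, can be sketched as follows. The helper names seconds_to_frames and frames_to_onsets_sketch are hypothetical, and the rounding rule and the keep-the-first-onset strategy are assumptions made for this illustration rather than a description of the library's internals.

```python
import numpy as np


def seconds_to_frames(seconds, fps):
    """Convert a duration in seconds to a whole number of frames."""
    return int(round(seconds * fps))


def frames_to_onsets_sketch(peak_frames, fps=100, combine=0.03, delay=0.0):
    """Convert peak indices to seconds and merge onsets closer than combine."""
    onsets = np.asarray(peak_frames, dtype=float) / fps + delay
    kept = []
    for onset in onsets:
        # Keep an onset only if it is more than combine seconds after the
        # last onset that was kept.
        if not kept or onset - kept[-1] > combine:
            kept.append(onset)
    return np.array(kept)


# pre_max = 1. / fps corresponds to one frame of past information.
one_frame = seconds_to_frames(1.0 / 100, fps=100)  # -> 1
onsets = frames_to_onsets_sketch([9, 10, 11, 29, 45], fps=100, combine=0.03)
```

The peaks at frames 10 and 11 fall within 0.03 seconds of the onset at 0.09 seconds, so only three onsets remain: 0.09, 0.29 and 0.45 seconds.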

References

[1]	Sebastian Böck, Florian Krebs and Markus Schedl, “Evaluating the Online Capabilities of Onset Detection Methods”, Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR), 2012.

Examples

Create an OnsetPeakPickingProcessor. The returned array represents the positions of the onsets in seconds, thus the frame rate (fps) of the activation function has to be given.

>>> from madmom.features.onsets import OnsetPeakPickingProcessor, RNNOnsetProcessor
>>> proc = OnsetPeakPickingProcessor(fps=100)
>>> proc
<madmom.features.onsets.OnsetPeakPickingProcessor object at 0x...>

Call this OnsetPeakPickingProcessor with the onset activation function from an RNNOnsetProcessor to obtain the onset positions.

>>> act = RNNOnsetProcessor()('tests/data/audio/sample.wav')
>>> proc(act)  
array([0.09, 0.29, 0.45, ..., 2.34, 2.49, 2.67])

reset()

Reset the OnsetPeakPickingProcessor.

process_offline(activations, **kwargs)

Detect the onsets in the given activation function.

Parameters:	activations : numpy array Onset activation function.
Returns:	onsets : numpy array Detected onsets [seconds].

process_online(activations, reset=True, **kwargs)

Detect the onsets in the given activation function.

Parameters:

activations : numpy array: Onset activation function.
reset : bool, optional: Reset the processor to its initial state before processing.

Returns:

onsets : numpy array: Detected onsets [seconds].

process_sequence(activations, **kwargs)

Detect the onsets in the given activation function.

Parameters:	activations : numpy array Onset activation function.
Returns:	onsets : numpy array Detected onsets [seconds].

static add_arguments(parser, threshold=0.5, smooth=None, pre_avg=None, post_avg=None, pre_max=None, post_max=None, combine=0.03, delay=0.0)

Add onset peak-picking related arguments to an existing parser.

Parameters:

parser : argparse parser instance: Existing argparse parser object.
threshold : float: Threshold for peak-picking.
smooth : float, optional: Smooth the activation function over smooth seconds.
pre_avg : float, optional: Use pre_avg seconds of past information for the moving average.
post_avg : float, optional: Use post_avg seconds of future information for the moving average.
pre_max : float, optional: Use pre_max seconds of past information for the moving maximum.
post_max : float, optional: Use post_max seconds of future information for the moving maximum.
combine : float, optional: Only report one onset within combine seconds.
delay : float, optional: Report the detected onsets delayed by delay seconds.

Returns:

parser_group : argparse argument group: Onset peak-picking argument parser group.

Notes

Parameters are included in the group only if they are not ‘None’.
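
The rule stated in the note above can be sketched with the standard argparse module. The function name add_peak_picking_arguments, the group title and the option names are hypothetical and chosen only for this illustration; the actual options registered by the library may be named differently.

```python
import argparse


def add_peak_picking_arguments(parser, threshold=0.5, smooth=None, combine=0.03):
    """Add one option per parameter whose default is not None."""
    group = parser.add_argument_group("peak-picking arguments")
    options = {"threshold": threshold, "smooth": smooth, "combine": combine}
    for name, default in options.items():
        # Parameters with a default of None are left out of the group.
        if default is not None:
            group.add_argument("--" + name, type=float, default=default,
                               help="{} (default: {})".format(name, default))
    return group


parser = argparse.ArgumentParser()
add_peak_picking_arguments(parser)
args = parser.parse_args(["--combine", "0.05"])
```

Here args.threshold keeps its default of 0.5, args.combine is 0.05, and no smooth attribute exists because that parameter was passed as None.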