madmom.features.onsets

This module contains onset detection related functionality.

madmom.features.onsets.wrap_to_pi(phase)[source]

Wrap the phase information to the range -π…π.

Parameters:
phase : numpy array

Phase of the STFT.

Returns:
wrapped_phase : numpy array

Wrapped phase.
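
As an illustration of what wrapping means, the sketch below maps arbitrary phase values into the principal interval using a shift, a modulo, and a shift back. It is a dependency-free approximation, not the library code: the name `wrap_to_pi_sketch` is hypothetical, it works on plain Python lists instead of numpy arrays, and it assumes a half-open interval [-π, π).

```python
import math


def wrap_to_pi_sketch(phase):
    """Wrap each phase value (in radians) into the interval [-pi, pi)."""
    two_pi = 2.0 * math.pi
    # Shift by pi, take the remainder modulo 2*pi, then shift back.
    return [((p + math.pi) % two_pi) - math.pi for p in phase]


# One value already in range, one a full turn too high, one just below -pi.
wrapped = wrap_to_pi_sketch([0.0, 2.0 * math.pi + 0.5, -math.pi - 0.25])
```

Values that differ by a multiple of 2π map to the same wrapped phase, which is why phase differences between frames are usually wrapped before they are compared.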

madmom.features.onsets.correlation_diff(spec, diff_frames=1, pos=False, diff_bins=1)[source]

Calculates the difference of the magnitude spectrogram relative to the N-th previous frame (N = diff_frames), where the previous frame is shifted in frequency to achieve the highest correlation between the two frames.

Parameters:
spec : numpy array

Magnitude spectrogram.

diff_frames : int, optional

Calculate the difference to the diff_frames-th previous frame.

pos : bool, optional

Keep only positive values.

diff_bins : int, optional

Maximum number of bins shifted for correlation calculation.

Returns:
correlation_diff : numpy array

(Positive) magnitude spectrogram differences.

Notes

This function is included only for completeness; it is not intended for actual use because it is extremely slow. Consider the superflux() function instead, which performs equally well but is much faster.

madmom.features.onsets.high_frequency_content(spectrogram)[source]

High Frequency Content.

Parameters:
spectrogram : Spectrogram instance

Spectrogram instance.

Returns:
high_frequency_content : numpy array

High frequency content onset detection function.

References

[1] Paul Masri, “Computer Modeling of Sound for Transformation and Synthesis of Musical Signals”, PhD thesis, University of Bristol, 1996.
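
A common formulation of high frequency content weights every magnitude bin by its bin index, so that energy in higher bins contributes more. The sketch below follows that idea on plain nested lists. The name `hfc_sketch` is hypothetical, and averaging (rather than summing) over bins is an assumption, not a statement about the library's internals.

```python
def hfc_sketch(frames):
    """Return one value per frame: the mean of (bin index * magnitude)."""
    odf = []
    for frame in frames:
        weighted = [k * mag for k, mag in enumerate(frame)]
        odf.append(sum(weighted) / len(frame))
    return odf


frames = [
    [1.0, 1.0, 0.0, 0.0],  # energy concentrated in the low bins
    [0.0, 0.0, 1.0, 1.0],  # same energy, moved to the high bins
]
hfc = hfc_sketch(frames)
```

Both frames carry the same total magnitude, but the second one yields a larger value because its energy sits in higher bins.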
madmom.features.onsets.spectral_diff(spectrogram, diff_frames=None)[source]

Spectral Diff.

Parameters:
spectrogram : Spectrogram instance

Spectrogram instance.

diff_frames : int, optional

Number of frames to calculate the diff to.

Returns:
spectral_diff : numpy array

Spectral diff onset detection function.

References

[1] Chris Duxbury, Mark Sandler and Matthew Davis, “A hybrid approach to musical note onset detection”, Proceedings of the 5th International Conference on Digital Audio Effects (DAFx), 2002.
madmom.features.onsets.spectral_flux(spectrogram, diff_frames=None)[source]

Spectral Flux.

Parameters:
spectrogram : Spectrogram instance

Spectrogram instance.

diff_frames : int, optional

Number of frames to calculate the diff to.

Returns:
spectral_flux : numpy array

Spectral flux onset detection function.

References

[1] Paul Masri, “Computer Modeling of Sound for Transformation and Synthesis of Musical Signals”, PhD thesis, University of Bristol, 1996.
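
Spectral flux is commonly described as the sum of the positive (half-wave rectified) differences between a frame and an earlier frame. The following sketch shows that computation on nested lists. The name `spectral_flux_sketch` is hypothetical, and `diff_frames` defaults to 1 here purely for illustration, whereas the documented function defaults to None.

```python
def spectral_flux_sketch(frames, diff_frames=1):
    """Sum of positive magnitude differences to the diff_frames-th previous frame."""
    odf = []
    for n, frame in enumerate(frames):
        if n < diff_frames:
            # No earlier frame to compare with.
            odf.append(0.0)
            continue
        prev = frames[n - diff_frames]
        # Keep only increases in magnitude; decreases are clipped to zero.
        odf.append(sum(max(cur - old, 0.0) for cur, old in zip(frame, prev)))
    return odf


frames = [[0.0, 0.0, 0.0], [1.0, 2.0, 0.0], [0.5, 2.0, 3.0]]
flux = spectral_flux_sketch(frames)
```

In the last frame the first bin decreases and is ignored, so only the rise in the third bin contributes.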
madmom.features.onsets.superflux(spectrogram, diff_frames=None, diff_max_bins=3)[source]

SuperFlux method with a maximum filter vibrato suppression stage.

Calculates the difference of bin k of the magnitude spectrogram relative to the N-th previous frame of the maximum-filtered spectrogram.

Parameters:
spectrogram : Spectrogram instance

Spectrogram instance.

diff_frames : int, optional

Number of frames to calculate the diff to.

diff_max_bins : int, optional

Number of bins used for maximum filter.

Returns:
superflux : numpy array

SuperFlux onset detection function.

Notes

This method only works properly if the spectrogram is filtered with a filterbank of the right frequency spacing. Filterbanks with 24 bands per octave (i.e. quarter-tone resolution) usually yield good results. With diff_max_bins = 3, the maximum of the bins k-1, k, k+1 of the frame diff_frames to the left is used for the calculation of the difference.

References

[1] Sebastian Böck and Gerhard Widmer, “Maximum Filter Vibrato Suppression for Onset Detection”, Proceedings of the 16th International Conference on Digital Audio Effects (DAFx), 2013.
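
The maximum filter stage described in the notes can be sketched as follows: each bin k of the current frame is compared against the maximum of the neighbouring bins of the earlier frame, and only positive differences are summed. This is a simplified illustration on nested lists; the name `superflux_sketch` and the default `diff_frames=1` are assumptions made for the example.

```python
def superflux_sketch(frames, diff_frames=1, diff_max_bins=3):
    """Positive difference to the maximum-filtered earlier frame, summed per frame."""
    half = diff_max_bins // 2
    odf = []
    for n, frame in enumerate(frames):
        if n < diff_frames:
            odf.append(0.0)
            continue
        prev = frames[n - diff_frames]
        total = 0.0
        for k, cur in enumerate(frame):
            # Maximum over bins k-half .. k+half of the earlier frame.
            lo = max(k - half, 0)
            hi = min(k + half + 1, len(prev))
            reference = max(prev[lo:hi])
            total += max(cur - reference, 0.0)
        odf.append(total)
    return odf


# A partial drifting from bin 1 to bin 2 (vibrato-like), with no new energy.
vibrato = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
plain = superflux_sketch(vibrato, diff_max_bins=1)       # no maximum filter
suppressed = superflux_sketch(vibrato, diff_max_bins=3)  # bins k-1, k, k+1
```

Without the maximum filter the drifting partial looks like new energy in bin 2; with a three-bin filter the difference vanishes, which is the vibrato suppression effect.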
madmom.features.onsets.complex_flux(spectrogram, diff_frames=None, diff_max_bins=3, temporal_filter=3, temporal_origin=0)[source]

ComplexFlux.

ComplexFlux is based on SuperFlux, but adds a local group delay based tremolo suppression.

Parameters:
spectrogram : Spectrogram instance

Spectrogram instance.

diff_frames : int, optional

Number of frames to calculate the diff to.

diff_max_bins : int, optional

Number of bins used for maximum filter.

temporal_filter : int, optional

Temporal maximum filtering of the local group delay [frames].

temporal_origin : int, optional

Origin of the temporal maximum filter.

Returns:
complex_flux : numpy array

ComplexFlux onset detection function.

References

[1] Sebastian Böck and Gerhard Widmer, “Local group delay based vibrato and tremolo suppression for onset detection”, Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR), 2013.
madmom.features.onsets.modified_kullback_leibler(spectrogram, diff_frames=1, epsilon=2.220446049250313e-16)[source]

Modified Kullback-Leibler.

Parameters:
spectrogram : Spectrogram instance

Spectrogram instance.

diff_frames : int, optional

Number of frames to calculate the diff to.

epsilon : float, optional

Add epsilon to the spectrogram to avoid division by 0.

Returns:
modified_kullback_leibler : numpy array

MKL onset detection function.

Notes

The implementation presented in [1] is used instead of the original work presented in [2].

References

[1] Paul Brossier, “Automatic Annotation of Musical Audio for Interactive Applications”, PhD thesis, Queen Mary University of London, 2006.
[2] Stephen Hainsworth and Malcolm Macleod, “Onset Detection in Musical Audio Signals”, Proceedings of the International Computer Music Conference (ICMC), 2003.
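
To show why epsilon is needed, the sketch below computes a log-ratio measure of the form log(1 + current / (previous + epsilon)) per bin. The name `mkl_sketch` is hypothetical, and averaging over bins is an assumption. The point of the example is only that a silent previous frame would otherwise cause a division by zero.

```python
import math
import sys


def mkl_sketch(frames, diff_frames=1, epsilon=sys.float_info.epsilon):
    """Mean of log(1 + current / (previous + epsilon)) per frame."""
    odf = []
    for n, frame in enumerate(frames):
        if n < diff_frames:
            odf.append(0.0)
            continue
        prev = frames[n - diff_frames]
        # epsilon keeps the denominator non-zero for silent earlier frames.
        ratios = [cur / (old + epsilon) for cur, old in zip(frame, prev)]
        odf.append(sum(math.log(1.0 + r) for r in ratios) / len(frame))
    return odf


frames = [[1.0, 1.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
mkl = mkl_sketch(frames)
```

The last frame follows complete silence, so the ratio is very large but still finite.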
madmom.features.onsets.phase_deviation(spectrogram)[source]

Phase Deviation.

Parameters:
spectrogram : Spectrogram instance

Spectrogram instance.

Returns:
phase_deviation : numpy array

Phase deviation onset detection function.

References

[1] Juan Pablo Bello, Chris Duxbury, Matthew Davies and Mark Sandler, “On the use of phase and energy for musical onset detection in the complex domain”, IEEE Signal Processing Letters, Volume 11, Number 6, 2004.
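
Phase deviation is usually described as the second-order difference of the phase per bin, wrapped to the principal interval and averaged in absolute value. The sketch below illustrates this on nested lists of phase values. The names `wrap` and `phase_deviation_sketch` are hypothetical, and the exact aggregation over bins is an assumption.

```python
import math


def wrap(p):
    """Wrap a single phase value into [-pi, pi)."""
    return ((p + math.pi) % (2.0 * math.pi)) - math.pi


def phase_deviation_sketch(phases):
    """Mean absolute second-order phase difference per frame."""
    odf = []
    for n, frame in enumerate(phases):
        if n < 2:
            # Two earlier frames are needed for a second-order difference.
            odf.append(0.0)
            continue
        prev1, prev2 = phases[n - 1], phases[n - 2]
        dev = [abs(wrap(cur - 2.0 * a + b))
               for cur, a, b in zip(frame, prev1, prev2)]
        odf.append(sum(dev) / len(dev))
    return odf


# Steady sinusoids: the phase advances by a constant amount per frame.
steady = [[0.0, 0.0], [0.5, 1.0], [1.0, 2.0], [1.5, 3.0]]
# Same signal, but the first bin's phase jumps in the last frame.
jump = steady[:3] + [[2.5, 3.0]]
pd_steady = phase_deviation_sketch(steady)
pd_jump = phase_deviation_sketch(jump)
```

A stationary partial has a constant phase increment and therefore no deviation; an unexpected phase jump raises the value for that frame.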
madmom.features.onsets.weighted_phase_deviation(spectrogram)[source]

Weighted Phase Deviation.

Parameters:
spectrogram : Spectrogram instance

Spectrogram instance.

Returns:
weighted_phase_deviation : numpy array

Weighted phase deviation onset detection function.

References

[1] Simon Dixon, “Onset Detection Revisited”, Proceedings of the 9th International Conference on Digital Audio Effects (DAFx), 2006.
madmom.features.onsets.normalized_weighted_phase_deviation(spectrogram, epsilon=2.220446049250313e-16)[source]

Normalized Weighted Phase Deviation.

Parameters:
spectrogram : Spectrogram instance

Spectrogram instance.

epsilon : float, optional

Add epsilon to the spectrogram to avoid division by 0.

Returns:
normalized_weighted_phase_deviation : numpy array

Normalized weighted phase deviation onset detection function.

References

[1] Simon Dixon, “Onset Detection Revisited”, Proceedings of the 9th International Conference on Digital Audio Effects (DAFx), 2006.
madmom.features.onsets.complex_domain(spectrogram)[source]

Complex Domain.

Parameters:
spectrogram : Spectrogram instance

Spectrogram instance.

Returns:
complex_domain : numpy array

Complex domain onset detection function.

References

[1] Juan Pablo Bello, Chris Duxbury, Matthew Davies and Mark Sandler, “On the use of phase and energy for musical onset detection in the complex domain”, IEEE Signal Processing Letters, Volume 11, Number 6, 2004.
madmom.features.onsets.rectified_complex_domain(spectrogram, diff_frames=None)[source]

Rectified Complex Domain.

Parameters:
spectrogram : Spectrogram instance

Spectrogram instance.

diff_frames : int, optional

Number of frames to calculate the diff to.

Returns:
rectified_complex_domain : numpy array

Rectified complex domain onset detection function.

References

[1] Simon Dixon, “Onset Detection Revisited”, Proceedings of the 9th International Conference on Digital Audio Effects (DAFx), 2006.
class madmom.features.onsets.SpectralOnsetProcessor(onset_method='spectral_flux', **kwargs)[source]

The SpectralOnsetProcessor class implements most of the common onset detection functions based on the magnitude or phase information of a spectrogram.

Parameters:
onset_method : str, optional

Onset detection function. See METHODS for possible values.

kwargs : dict, optional

Keyword arguments passed to the pre-processing chain to obtain a spectral representation of the signal.

Notes

If the spectrogram should be filtered, the filterbank parameter must contain a valid Filterbank. If it should be scaled logarithmically, log must be set accordingly.

References

[1] Paul Masri, “Computer Modeling of Sound for Transformation and Synthesis of Musical Signals”, PhD thesis, University of Bristol, 1996.
[2] Sebastian Böck and Gerhard Widmer, “Maximum Filter Vibrato Suppression for Onset Detection”, Proceedings of the 16th International Conference on Digital Audio Effects (DAFx), 2013.

Examples

Create a SpectralOnsetProcessor and pass a file through the processor to obtain an onset detection function. By default, the spectral flux [1] is computed on a simple Spectrogram.

>>> from madmom.features.onsets import SpectralOnsetProcessor
>>> sodf = SpectralOnsetProcessor()
>>> sodf
<madmom.features.onsets.SpectralOnsetProcessor object at 0x...>
>>> sodf.processors[-1]
<function spectral_flux at 0x...>
>>> sodf('tests/data/audio/sample.wav')
array([ 0. , 100.90121, ..., 26.30577, 20.94439], dtype=float32)

The parameters passed to the signal pre-processing chain can be set when creating the SpectralOnsetProcessor. E.g. to obtain the SuperFlux [2] onset detection function set these parameters:

>>> import numpy as np
>>> from madmom.audio.filters import LogarithmicFilterbank
>>> sodf = SpectralOnsetProcessor(onset_method='superflux', fps=200,
...                               filterbank=LogarithmicFilterbank,
...                               num_bands=24, log=np.log10)
>>> sodf('tests/data/audio/sample.wav')
array([ 0. , 0. , 2.0868 , 1.02404, ..., 0.29888, 0.12122], dtype=float32)
classmethod add_arguments(parser, onset_method=None)[source]

Add spectral onset detection arguments to an existing parser.

Parameters:
parser : argparse parser instance

Existing argparse parser object.

onset_method : str, optional

Default onset detection method.

Returns:
parser_group : argparse argument group

Spectral onset detection argument parser group.

class madmom.features.onsets.RNNOnsetProcessor(**kwargs)[source]

Processor to get an onset activation function from multiple RNNs.

Parameters:
online : bool, optional

Choose networks suitable for online onset detection, i.e. use unidirectional RNNs.

Notes

This class uses either uni- or bi-directional RNNs. Contrary to [1], it uses simple tanh units as in [2]. The input representation also changed to logarithmically filtered and scaled spectrograms.

References

[1] Florian Eyben, Sebastian Böck, Björn Schuller and Alex Graves, “Universal Onset Detection with bidirectional Long Short-Term Memory Neural Networks”, Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR), 2010.
[2] Sebastian Böck, Andreas Arzt, Florian Krebs and Markus Schedl, “Online Real-time Onset Detection with Recurrent Neural Networks”, Proceedings of the 15th International Conference on Digital Audio Effects (DAFx), 2012.

Examples

Create an RNNOnsetProcessor and pass a file through the processor to obtain an onset detection function (sampled at 100 frames per second).

>>> from madmom.features.onsets import RNNOnsetProcessor
>>> proc = RNNOnsetProcessor()
>>> proc
<madmom.features.onsets.RNNOnsetProcessor object at 0x...>
>>> proc('tests/data/audio/sample.wav')
array([0.08313, 0.0024 , ... 0.00527], dtype=float32)
class madmom.features.onsets.CNNOnsetProcessor(**kwargs)[source]

Processor to get an onset activation function from a CNN.

Notes

The implementation follows as closely as possible the original one, but part of the signal pre-processing differs in minor aspects, so results can differ slightly, too.

References

[1] Jan Schlüter and Sebastian Böck, “Musical Onset Detection with Convolutional Neural Networks”, Proceedings of the 6th International Workshop on Machine Learning and Music, 2013.

Examples

Create a CNNOnsetProcessor and pass a file through the processor to obtain an onset detection function (sampled with 100 frames per second).

>>> from madmom.features.onsets import CNNOnsetProcessor
>>> proc = CNNOnsetProcessor()
>>> proc
<madmom.features.onsets.CNNOnsetProcessor object at 0x...>
>>> proc('tests/data/audio/sample.wav')
array([0.05369, 0.04205, ... 0.00014], dtype=float32)
madmom.features.onsets.peak_picking(activations, threshold, smooth=None, pre_avg=0, post_avg=0, pre_max=1, post_max=1)[source]

Perform thresholding and peak-picking on the given activation function.

Parameters:
activations : numpy array

Activation function.

threshold : float

Threshold for peak-picking.

smooth : int or numpy array, optional

Smooth the activation function with the kernel (size).

pre_avg : int, optional

Use pre_avg frames past information for moving average.

post_avg : int, optional

Use post_avg frames future information for moving average.

pre_max : int, optional

Use pre_max frames past information for moving maximum.

post_max : int, optional

Use post_max frames future information for moving maximum.

Returns:
peak_idx : numpy array

Indices of the detected peaks.

See also

smooth()

Notes

If no moving average is needed (e.g. the activations are independent of the signal’s level as for neural network activations), set pre_avg and post_avg to 0. For peak picking of local maxima, set pre_max and post_max to 1. For online peak picking, set all post_ parameters to 0.
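
The interplay of the threshold, moving average and moving maximum parameters can be sketched as below. This is a simplified, list-based illustration and not the library implementation: the name `peak_picking_sketch` is hypothetical, smoothing is omitted, and the order of operations (subtract the moving average, apply the threshold, keep local maxima) is an assumption.

```python
def peak_picking_sketch(activations, threshold, pre_avg=0, post_avg=0,
                        pre_max=1, post_max=1):
    """Return the frame indices that pass the threshold and are local maxima."""
    detections = list(activations)
    if pre_avg + post_avg > 0:
        # Subtract a moving average over [n - pre_avg, n + post_avg].
        detections = []
        for n, value in enumerate(activations):
            window = activations[max(n - pre_avg, 0):n + post_avg + 1]
            detections.append(value - sum(window) / len(window))
    peaks = []
    for n, value in enumerate(detections):
        if value < threshold:
            continue
        # Moving maximum over [n - pre_max, n + post_max].
        window = detections[max(n - pre_max, 0):n + post_max + 1]
        if value == max(window):
            peaks.append(n)
    return peaks


act = [0.1, 0.9, 0.2, 0.1, 0.6, 0.7, 0.3]
offline = peak_picking_sketch(act, threshold=0.5)
online = peak_picking_sketch(act, threshold=0.5, post_avg=0, post_max=0)
```

With all post_ parameters set to 0, frame 4 is reported as well, because the larger value in frame 5 is not yet known. This is the trade-off of online peak picking.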

References

[1] Sebastian Böck, Florian Krebs and Markus Schedl, “Evaluating the Online Capabilities of Onset Detection Methods”, Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR), 2012.
class madmom.features.onsets.PeakPickingProcessor(**kwargs)[source]

Deprecated as of version 0.15. Will be removed in version 0.16. Use either OnsetPeakPickingProcessor or NotePeakPickingProcessor instead.

process(activations, **kwargs)[source]

Detect the peaks in the given activation function.

Parameters:
activations : numpy array

Onset activation function.

Returns:
peaks : numpy array

Detected onsets [seconds[, frequency bin]].

static add_arguments(parser, **kwargs)[source]

Deprecated as of version 0.15. Will be removed in version 0.16. Use either OnsetPeakPickingProcessor or NotePeakPickingProcessor instead.

class madmom.features.onsets.OnsetPeakPickingProcessor(threshold=0.5, smooth=0.0, pre_avg=0.0, post_avg=0.0, pre_max=0.0, post_max=0.0, combine=0.03, delay=0.0, online=False, fps=100, **kwargs)[source]

This class implements the onset peak-picking functionality. It transparently converts the chosen values from seconds to frames.

Parameters:
threshold : float

Threshold for peak-picking.

smooth : float, optional

Smooth the activation function over smooth seconds.

pre_avg : float, optional

Use pre_avg seconds past information for moving average.

post_avg : float, optional

Use post_avg seconds future information for moving average.

pre_max : float, optional

Use pre_max seconds past information for moving maximum.

post_max : float, optional

Use post_max seconds future information for moving maximum.

combine : float, optional

Only report one onset within combine seconds.

delay : float, optional

Delay the reported onsets by delay seconds.

online : bool, optional

Use online peak-picking, i.e. no future information.

fps : float, optional

Frames per second used for conversion of timings.

Returns:
onsets : numpy array

Detected onsets [seconds].

Notes

If no moving average is needed (e.g. the activations are independent of the signal’s level as for neural network activations), pre_avg and post_avg should be set to 0. For peak picking of local maxima, set pre_max >= 1. / fps and post_max >= 1. / fps. For online peak picking, all post_ parameters are set to 0.
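
The conversion from seconds to frames and the effect of the combine parameter can be illustrated as follows. Both helper names are hypothetical, rounding to the nearest frame is an assumption, and keeping the earliest onset of each group is one possible choice rather than a documented behaviour.

```python
def seconds_to_frames(seconds, fps=100):
    """Convert a duration in seconds to a whole number of frames."""
    return int(round(seconds * fps))


def combine_onsets(onsets, combine=0.03):
    """Keep one onset per group of onsets lying within `combine` seconds."""
    kept = []
    for onset in sorted(onsets):
        # Report an onset only if it is far enough from the last kept one.
        if not kept or onset - kept[-1] > combine:
            kept.append(onset)
    return kept


one_frame = seconds_to_frames(1.0 / 100, fps=100)
onsets = combine_onsets([0.09, 0.10, 0.29, 0.45, 0.47], combine=0.03)
```

With fps=100, setting pre_max and post_max to 1. / fps corresponds to a window of one frame on either side, matching the local maximum condition in the notes above.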

References

[1] Sebastian Böck, Florian Krebs and Markus Schedl, “Evaluating the Online Capabilities of Onset Detection Methods”, Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR), 2012.

Examples

Create an OnsetPeakPickingProcessor. The returned array represents the positions of the onsets in seconds, thus the frame rate (fps) of the activation function has to be given.

>>> from madmom.features.onsets import (OnsetPeakPickingProcessor,
...                                     RNNOnsetProcessor)
>>> proc = OnsetPeakPickingProcessor(fps=100)
>>> proc
<madmom.features.onsets.OnsetPeakPickingProcessor object at 0x...>

Call this OnsetPeakPickingProcessor with the onset activation function from an RNNOnsetProcessor to obtain the onset positions.

>>> act = RNNOnsetProcessor()('tests/data/audio/sample.wav')
>>> proc(act)
array([0.09, 0.29, 0.45, ..., 2.34, 2.49, 2.67])
reset()[source]

Reset OnsetPeakPickingProcessor.

process_offline(activations, **kwargs)[source]

Detect the onsets in the given activation function.

Parameters:
activations : numpy array

Onset activation function.

Returns:
onsets : numpy array

Detected onsets [seconds].

process_online(activations, reset=True, **kwargs)[source]

Detect the onsets in the given activation function.

Parameters:
activations : numpy array

Onset activation function.

reset : bool, optional

Reset the processor to its initial state before processing.

Returns:
onsets : numpy array

Detected onsets [seconds].

process_sequence(activations, **kwargs)

Detect the onsets in the given activation function.

Parameters:
activations : numpy array

Onset activation function.

Returns:
onsets : numpy array

Detected onsets [seconds].

static add_arguments(parser, threshold=0.5, smooth=None, pre_avg=None, post_avg=None, pre_max=None, post_max=None, combine=0.03, delay=0.0)[source]

Add onset peak-picking related arguments to an existing parser.

Parameters:
parser : argparse parser instance

Existing argparse parser object.

threshold : float

Threshold for peak-picking.

smooth : float, optional

Smooth the activation function over smooth seconds.

pre_avg : float, optional

Use pre_avg seconds past information for moving average.

post_avg : float, optional

Use post_avg seconds future information for moving average.

pre_max : float, optional

Use pre_max seconds past information for moving maximum.

post_max : float, optional

Use post_max seconds future information for moving maximum.

combine : float, optional

Only report one onset within combine seconds.

delay : float, optional

Delay the reported onsets by delay seconds.

Returns:
parser_group : argparse argument group

Onset peak-picking argument parser group.

Notes

Parameters are included in the group only if they are not ‘None’.
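
The "only if not None" behaviour can be sketched with the standard argparse module. The function name, group title and option strings below are hypothetical and chosen for illustration only; they are not the options the library registers.

```python
import argparse


def add_peak_picking_arguments(parser, threshold=0.5, smooth=None,
                               combine=0.03):
    """Add an argument group; options whose default is None are skipped."""
    group = parser.add_argument_group('peak-picking arguments')
    group.add_argument('--threshold', type=float, default=threshold,
                       help='detection threshold')
    if smooth is not None:
        group.add_argument('--smooth', type=float, default=smooth,
                           help='smoothing window [seconds]')
    if combine is not None:
        group.add_argument('--combine', type=float, default=combine,
                           help='combine onsets within this window [seconds]')
    return group


parser = argparse.ArgumentParser()
group = add_peak_picking_arguments(parser, smooth=None)
args = parser.parse_args(['--combine', '0.05'])
```

Because smooth was passed as None, no --smooth option exists on the parser and the parsed namespace has no such attribute. This lets a calling program hide options that do not apply to it.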