madmom.features.notes

This module contains note transcription related functionality.

Notes are stored as numpy arrays with the following column definition:

‘note_time’ ‘MIDI_note’ [‘duration’ [‘MIDI_velocity’]]
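As a minimal sketch of this layout, the following builds a notes array by hand and reads its columns. The values are invented for illustration, and the variable names are not part of madmom.

```python
import numpy as np

# Columns: note_time [s], MIDI_note, duration [s], MIDI_velocity
# (all values below are invented for illustration)
notes = np.array([
    [0.14, 72, 0.50, 90],
    [1.56, 41, 0.25, 64],
    [3.37, 75, 1.00, 100],
])

onset_times = notes[:, 0]
midi_pitches = notes[:, 1].astype(int)

# 'duration' and 'MIDI_velocity' are optional columns,
# so check the array width before indexing them
durations = notes[:, 2] if notes.shape[1] > 2 else None
velocities = notes[:, 3] if notes.shape[1] > 3 else None
```

Checking the array width first keeps the same code usable for two-column arrays that carry only onset time and pitch.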

class madmom.features.notes.RNNPianoNoteProcessor(**kwargs)

Processor to get a (piano) note activation function from an RNN.

Examples

Create an RNNPianoNoteProcessor and pass a file through the processor to obtain a note onset activation function (sampled at 100 frames per second).

>>> proc = RNNPianoNoteProcessor()
>>> proc  
<madmom.features.notes.RNNPianoNoteProcessor object at 0x...>
>>> act = proc('tests/data/audio/sample.wav')
>>> act.shape
(281, 88)
>>> act  
array([[-0.00014,  0.0002 , ..., -0.     ,  0.     ],
       [ 0.00008,  0.0001 , ...,  0.00006, -0.00001],
       ...,
       [-0.00005, -0.00011, ...,  0.00005, -0.00001],
       [-0.00017,  0.00002, ...,  0.00009, -0.00009]], dtype=float32)

class madmom.features.notes.NotePeakPickingProcessor(threshold=0.5, smooth=0.0, pre_avg=0.0, post_avg=0.0, pre_max=0.0, post_max=0.0, combine=0.03, delay=0.0, online=False, fps=100, **kwargs)

This class implements the note peak-picking functionality.

Parameters:

threshold : float: Threshold for peak-picking.
smooth : float, optional: Smooth the activation function over smooth seconds.
pre_avg : float, optional: Use pre_avg seconds of past information for the moving average.
post_avg : float, optional: Use post_avg seconds of future information for the moving average.
pre_max : float, optional: Use pre_max seconds of past information for the moving maximum.
post_max : float, optional: Use post_max seconds of future information for the moving maximum.
combine : float, optional: Only report one note per pitch within combine seconds.
delay : float, optional: Report the detected notes delayed by delay seconds.
online : bool, optional: Use online peak-picking, i.e. no future information.
fps : float, optional: Frames per second used for conversion of timings.

Returns:

notes : numpy array: Detected notes [seconds, pitch].

Notes

If no moving average is needed (e.g. the activations are independent of the signal’s level as for neural network activations), pre_avg and post_avg should be set to 0. For peak picking of local maxima, set pre_max >= 1. / fps and post_max >= 1. / fps. For online peak picking, all post_ parameters are set to 0.
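The interplay of threshold, pre_max and post_max can be sketched on a single 1-D activation function. This is a simplified illustration, not madmom's implementation: the helper `pick_peaks` is hypothetical, it takes its window sizes in frames rather than seconds, and it omits smoothing and the moving average.

```python
import numpy as np

def pick_peaks(activation, threshold=0.5, pre_max=1, post_max=1):
    """Return indices of frames that reach the threshold and are the
    maximum of their surrounding window (hypothetical helper).

    pre_max and post_max are given in frames here, i.e. a value of 1
    corresponds to 1. / fps seconds.
    """
    activation = np.asarray(activation, dtype=float)
    peaks = []
    for i, value in enumerate(activation):
        start = max(0, i - pre_max)
        stop = min(len(activation), i + post_max + 1)
        if value >= threshold and value == activation[start:stop].max():
            peaks.append(i)
    return np.array(peaks, dtype=int)

fps = 100
act = np.array([0.0, 0.2, 0.9, 0.3, 0.1, 0.6, 0.7, 0.4])

frames = pick_peaks(act, threshold=0.5)                  # frames 2 and 6
times = frames / fps                                     # positions in seconds
online_frames = pick_peaks(act, threshold=0.5, post_max=0)  # frames 2, 5 and 6
```

With post_max set to 0, as in online mode, frame 5 is also reported because the larger value in the following frame is not yet visible. Note that this sketch reports every frame of a plateau of equal values.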

Examples

Create a NotePeakPickingProcessor. The returned array gives the positions of the detected notes in seconds, so the frame rate (fps) of the activation function has to be given.

>>> proc = NotePeakPickingProcessor(fps=100)
>>> proc  
<madmom.features.notes.NotePeakPickingProcessor object at 0x...>

Call this NotePeakPickingProcessor with the note activations from an RNNPianoNoteProcessor.

>>> act = RNNPianoNoteProcessor()('tests/data/audio/stereo_sample.wav')
>>> proc(act)  
array([[ 0.14, 72.  ],
       [ 1.56, 41.  ],
       [ 3.37, 75.  ]])
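The effect of the combine parameter on such a [seconds, pitch] array can be sketched as follows. The helper `combine_notes` is hypothetical and the input values are invented; in particular, measuring the gap from the last kept note of the same pitch is an assumption of this sketch and may differ from what madmom does internally.

```python
import numpy as np

def combine_notes(notes, combine=0.03):
    """Keep only the first note per pitch within `combine` seconds
    (hypothetical helper, not part of madmom)."""
    notes = np.asarray(notes, dtype=float)
    # process the notes in order of their onset time
    notes = notes[np.argsort(notes[:, 0], kind="stable")]
    last_kept = {}  # pitch -> onset time of the last note kept
    kept = []
    for time, pitch in notes[:, :2]:
        previous = last_kept.get(pitch)
        if previous is None or time - previous > combine:
            kept.append([time, pitch])
            last_kept[pitch] = time
    return np.array(kept).reshape(-1, 2)

raw = [[0.14, 72], [0.16, 72], [0.16, 41], [0.50, 72]]
combined = combine_notes(raw, combine=0.03)
```

Here the second note with pitch 72 is dropped because it follows the first one by only 0.02 seconds, while the note with pitch 41 at the same time is kept because notes are only combined per pitch.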

process(activations, **kwargs)

Detect the notes in the given activation function.

Parameters:

activations : numpy array: Note activation function.

Returns:

onsets : numpy array: Detected notes [seconds, pitches].
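The shape of the returned array can be illustrated with a toy activation matrix. This sketch only thresholds the activations and skips the actual peak picking, and it assumes that column 0 of the 88 activation columns corresponds to the lowest piano key (MIDI note 21); the source does not state this mapping explicitly.

```python
import numpy as np

fps = 100

# toy activation matrix: 5 frames x 88 pitch bins (values are invented)
act = np.zeros((5, 88), dtype=np.float32)
act[1, 51] = 0.9
act[3, 20] = 0.8

# frame and pitch-bin indices of all activations reaching the threshold
frames, bins = np.nonzero(act >= 0.5)

# assumption: bin 0 is the lowest piano key, MIDI note 21 (A0)
notes = np.column_stack([frames / fps, bins + 21])
```

Each row of `notes` then holds one onset time in seconds and one MIDI pitch, which is the same two-column layout as the array returned by process.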