madmom.features.notes

This module contains functionality related to note transcription.

Notes are stored as numpy arrays with the following column definition:

‘note_time’ ‘MIDI_note’ [‘duration’ [‘MIDI_velocity’]]
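
As a minimal sketch of this layout, the following snippet builds a small note array by hand and reads its columns. The note values are made up for illustration, and the variable names are not part of madmom.

```python
import numpy as np

# Hypothetical example data, one row per note:
# onset time (seconds), MIDI pitch, duration (seconds), MIDI velocity.
notes = np.array([
    [0.14, 72, 0.50, 64],
    [1.56, 41, 0.25, 80],
])

# The first two columns are always present.
onset_times = notes[:, 0]
midi_pitches = notes[:, 1].astype(int)

# Duration and velocity are optional, so check the number of columns first.
durations = notes[:, 2] if notes.shape[1] > 2 else None
velocities = notes[:, 3] if notes.shape[1] > 3 else None
```

Since the trailing columns are optional, code that consumes such arrays should check `notes.shape[1]` before accessing them.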

class madmom.features.notes.RNNPianoNoteProcessor(**kwargs)[source]

Processor to get a (piano) note activation function from an RNN.

Examples

Create an RNNPianoNoteProcessor and pass a file through it to obtain a note onset activation function (sampled at 100 frames per second).

>>> proc = RNNPianoNoteProcessor()
>>> proc  
<madmom.features.notes.RNNPianoNoteProcessor object at 0x...>
>>> act = proc('tests/data/audio/sample.wav')
>>> act.shape
(281, 88)
>>> act  
array([[-0.00014,  0.0002 , ..., -0.     ,  0.     ],
       [ 0.00008,  0.0001 , ...,  0.00006, -0.00001],
       ...,
       [-0.00005, -0.00011, ...,  0.00005, -0.00001],
       [-0.00017,  0.00002, ...,  0.00009, -0.00009]], dtype=float32)
class madmom.features.notes.NotePeakPickingProcessor(threshold=0.5, smooth=0.0, pre_avg=0.0, post_avg=0.0, pre_max=0.0, post_max=0.0, combine=0.03, delay=0.0, online=False, fps=100, **kwargs)[source]

This class implements the note peak-picking functionality.

Parameters:
threshold : float

Threshold for peak-picking.

smooth : float, optional

Smooth the activation function over smooth seconds.

pre_avg : float, optional

Use pre_avg seconds of past information for the moving average.

post_avg : float, optional

Use post_avg seconds of future information for the moving average.

pre_max : float, optional

Use pre_max seconds of past information for the moving maximum.

post_max : float, optional

Use post_max seconds of future information for the moving maximum.

combine : float, optional

Only report one note per pitch within combine seconds.

delay : float, optional

Report the detected notes with a delay of delay seconds.

online : bool, optional

Use online peak-picking, i.e. no future information.

fps : float, optional

Frames per second used for conversion of timings.

Returns:
notes : numpy array

Detected notes [seconds, pitch].

Notes

If no moving average is needed (e.g. the activations are independent of the signal’s level as for neural network activations), pre_avg and post_avg should be set to 0. For peak picking of local maxima, set pre_max >= 1. / fps and post_max >= 1. / fps. For online peak picking, all post_ parameters are set to 0.
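
The interplay of threshold, pre_max, post_max, combine and fps can be illustrated with a simplified NumPy sketch. This is not madmom's implementation: the function name `pick_note_peaks` is hypothetical, smoothing, moving average and delay are omitted, and it assumes that the 88 activation columns map to MIDI pitches starting at 21 (A0, the lowest piano key).

```python
import numpy as np


def pick_note_peaks(activations, threshold=0.5, pre_max=0.01, post_max=0.01,
                    combine=0.03, fps=100):
    """Simplified note peak-picking sketch (not madmom's implementation)."""
    # Convert the timing parameters from seconds to frames.
    pre = int(round(pre_max * fps))
    post = int(round(post_max * fps))
    combine_frames = int(round(combine * fps))
    num_frames, num_pitches = activations.shape
    notes = []
    for pitch in range(num_pitches):
        last_frame = None
        for frame in range(num_frames):
            value = activations[frame, pitch]
            if value < threshold:
                continue
            # The frame must be the maximum of its surrounding window.
            start = max(0, frame - pre)
            stop = min(num_frames, frame + post + 1)
            if value < activations[start:stop, pitch].max():
                continue
            # Report only one note per pitch within `combine` seconds.
            if last_frame is None or frame - last_frame > combine_frames:
                # Assumption: column 0 corresponds to MIDI pitch 21 (A0).
                notes.append((frame / fps, pitch + 21))
                last_frame = frame
    # Sort by onset time and return an array of [seconds, pitch] rows.
    return np.array(sorted(notes), dtype=float).reshape(-1, 2)


# Synthetic activations: 1 second at 100 fps for 88 piano keys.
act = np.zeros((100, 88), dtype=np.float32)
act[14, 51] = 0.9   # clear peak for MIDI pitch 72
act[15, 51] = 0.6   # above threshold, but not a local maximum
act[50, 20] = 0.8   # clear peak for MIDI pitch 41
detected = pick_note_peaks(act)
```

With pre_max and post_max equal to 1. / fps, each candidate frame is compared against its direct neighbours, which is why the second value at frame 15 is not reported. Setting post_max to 0 in this sketch mimics the online case, where no future frames are inspected.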

Examples

Create a NotePeakPickingProcessor. The returned array contains the note onset positions in seconds, so the frame rate of the activations (fps) has to be given.

>>> proc = NotePeakPickingProcessor(fps=100)
>>> proc  
<madmom.features.notes.NotePeakPickingProcessor object at 0x...>

Call this NotePeakPickingProcessor with the note activations from an RNNPianoNoteProcessor.

>>> act = RNNPianoNoteProcessor()('tests/data/audio/stereo_sample.wav')
>>> proc(act)  
array([[ 0.14, 72.  ],
       [ 1.56, 41.  ],
       [ 3.37, 75.  ]])
process(activations, **kwargs)[source]

Detect the notes in the given activation function.

Parameters:
activations : numpy array

Note activation function.

Returns:
notes : numpy array

Detected notes [seconds, pitch].