madmom.audio.filters

This module contains filter and filterbank related functionality.

madmom.audio.filters.hz2mel(f)[source]

Convert Hz frequencies to Mel.

Parameters:
f : numpy array

Input frequencies [Hz].

Returns:
m : numpy array

Frequencies in Mel [Mel].

madmom.audio.filters.mel2hz(m)[source]

Convert Mel frequencies to Hz.

Parameters:
m : numpy array

Input frequencies [Mel].

Returns:
f : numpy array

Frequencies in Hz [Hz].
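
The two conversions above are inverses of each other. The following is a minimal sketch in plain Python, assuming the common natural-logarithm Mel formula with the constants 1127.01048 and 700; this section does not state which constants the library uses. The names hz2mel_sketch and mel2hz_sketch are illustrative stand-ins that operate on single floats rather than numpy arrays.

```python
import math

# Assumed constants of the common natural-log Mel formula; the values used
# by the library are not stated in this section.
MEL_SCALE = 1127.01048
MEL_CORNER = 700.0


def hz2mel_sketch(f):
    """Convert a frequency in Hz to Mel (illustrative stand-in)."""
    return MEL_SCALE * math.log(f / MEL_CORNER + 1.0)


def mel2hz_sketch(m):
    """Convert a Mel value back to Hz (inverse of hz2mel_sketch)."""
    return MEL_CORNER * (math.exp(m / MEL_SCALE) - 1.0)


mel = hz2mel_sketch(1000.0)   # very close to 1000 Mel with these constants
back = mel2hz_sketch(mel)     # the round trip recovers the input frequency
```

With these constants, 0 Hz maps to 0 Mel and 1000 Hz maps to roughly 1000 Mel.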

madmom.audio.filters.mel_frequencies(num_bands, fmin, fmax)[source]

Returns frequencies aligned on the Mel scale.

Parameters:
num_bands : int

Number of bands.

fmin : float

Minimum frequency [Hz].

fmax : float

Maximum frequency [Hz].

Returns:
mel_frequencies : numpy array

Frequencies with Mel spacing [Hz].
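
Mel spacing can be illustrated by spacing points evenly on the Mel axis and converting them back to Hz. The sketch below assumes the same Mel formula as is commonly used (constants 1127.01048 and 700) and assumes that both fmin and fmax are included in the result; neither detail is confirmed by this section, and mel_frequencies_sketch is a hypothetical name.

```python
import math


def mel_frequencies_sketch(num_bands, fmin, fmax):
    """Return num_bands frequencies [Hz], equally spaced on the Mel scale."""
    if num_bands < 2:
        raise ValueError("num_bands must be at least 2")

    # Assumed Mel formula; the library's constants are not given here.
    def to_mel(f):
        return 1127.01048 * math.log(f / 700.0 + 1.0)

    def to_hz(m):
        return 700.0 * (math.exp(m / 1127.01048) - 1.0)

    low, high = to_mel(fmin), to_mel(fmax)
    step = (high - low) / (num_bands - 1)
    # Both end points are included (an assumption about the intended range).
    return [to_hz(low + i * step) for i in range(num_bands)]


freqs = mel_frequencies_sketch(5, 20.0, 17000.0)
# Distances between neighbouring frequencies grow towards higher frequencies.
gaps = [b - a for a, b in zip(freqs, freqs[1:])]
```

The points are equidistant in Mel, so the gaps measured in Hz widen as the frequency increases.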

madmom.audio.filters.log_frequencies(bands_per_octave, fmin, fmax, fref=440.0)[source]

Returns frequencies aligned on a logarithmic frequency scale.

Parameters:
bands_per_octave : int

Number of filter bands per octave.

fmin : float

Minimum frequency [Hz].

fmax : float

Maximum frequency [Hz].

fref : float, optional

Tuning frequency [Hz].

Returns:
log_frequencies : numpy array

Logarithmically spaced frequencies [Hz].

Notes

If bands_per_octave = 12 and fref = 440 are used, the frequencies are equivalent to MIDI notes.
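
The note above can be checked with a small sketch. It assumes that the frequencies lie on the grid fref * 2**(k / bands_per_octave) for integer k and that both range limits are inclusive; the library's exact boundary handling is not described here. log_frequencies_sketch is a hypothetical plain-Python stand-in.

```python
import math


def log_frequencies_sketch(bands_per_octave, fmin, fmax, fref=440.0):
    """Frequencies fref * 2**(k / bands_per_octave) inside [fmin, fmax]."""
    # Smallest and largest integer step k whose frequency is within range
    # (inclusive limits are an assumption of this sketch).
    k_min = math.ceil(bands_per_octave * math.log2(fmin / fref))
    k_max = math.floor(bands_per_octave * math.log2(fmax / fref))
    return [fref * 2.0 ** (k / bands_per_octave)
            for k in range(k_min, k_max + 1)]


# 12 bands per octave tuned to 440 Hz: one frequency per semitone.
freqs = log_frequencies_sketch(12, 430.0, 900.0)
```

For this range the result spans exactly one octave, from A4 (440 Hz) to A5 (880 Hz), i.e. 13 semitone frequencies.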

madmom.audio.filters.semitone_frequencies(fmin, fmax, fref=440.0)[source]

Returns frequencies separated by semitones.

Parameters:
fmin : float

Minimum frequency [Hz].

fmax : float

Maximum frequency [Hz].

fref : float, optional

Tuning frequency of A4 [Hz].

Returns:
semitone_frequencies : numpy array

Semitone frequencies [Hz].

madmom.audio.filters.hz2midi(f, fref=440.0)[source]

Convert frequencies to the corresponding MIDI notes.

Parameters:
f : numpy array

Input frequencies [Hz].

fref : float, optional

Tuning frequency of A4 [Hz].

Returns:
m : numpy array

MIDI notes.

Notes

This function does not necessarily return a valid MIDI note; you may need to round the result to the nearest integer. For details, see: http://www.phys.unsw.edu.au/jw/notes.html

madmom.audio.filters.midi2hz(m, fref=440.0)[source]

Convert MIDI notes to corresponding frequencies.

Parameters:
m : numpy array

Input MIDI notes.

fref : float, optional

Tuning frequency of A4 [Hz].

Returns:
f : numpy array

Corresponding frequencies [Hz].
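
The relation between frequencies and MIDI notes can be sketched as follows, using the standard twelve-tone equal temperament mapping in which A4 (fref) is MIDI note 69. The function names are illustrative stand-ins that work on single floats; they are not the library functions.

```python
import math


def hz2midi_sketch(f, fref=440.0):
    """Fractional MIDI note number of frequency f; A4 (fref) maps to 69."""
    return 12.0 * math.log2(f / fref) + 69.0


def midi2hz_sketch(m, fref=440.0):
    """Frequency [Hz] of the (possibly fractional) MIDI note m."""
    return fref * 2.0 ** ((m - 69.0) / 12.0)


note = hz2midi_sketch(450.0)   # fractional value slightly above 69
nearest = int(round(note))     # round to obtain a valid MIDI note number
```

A frequency that does not fall exactly on a semitone yields a fractional note number, which is why rounding may be needed.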

madmom.audio.filters.midi_frequencies(fmin, fmax, fref=440.0)

Returns frequencies separated by semitones.

Parameters:
fmin : float

Minimum frequency [Hz].

fmax : float

Maximum frequency [Hz].

fref : float, optional

Tuning frequency of A4 [Hz].

Returns:
semitone_frequencies : numpy array

Semitone frequencies [Hz].

madmom.audio.filters.hz2erb(f)[source]

Convert Hz to ERB.

Parameters:
f : numpy array

Input frequencies [Hz].

Returns:
e : numpy array

Frequencies in ERB [ERB].

Notes

Information about the ERB scale can be found at: https://ccrma.stanford.edu/~jos/bbt/Equivalent_Rectangular_Bandwidth.html

madmom.audio.filters.erb2hz(e)[source]

Convert ERB scaled frequencies to Hz.

Parameters:
e : numpy array

Input frequencies [ERB].

Returns:
f : numpy array

Frequencies in Hz [Hz].

Notes

Information about the ERB scale can be found at: https://ccrma.stanford.edu/~jos/bbt/Equivalent_Rectangular_Bandwidth.html
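
As an illustration of the two ERB conversions, the sketch below assumes the widely used Glasberg and Moore ERB-rate formula, 21.4 * log10(1 + 4.37 * f / 1000). This section does not state which formula the library implements, so the constants are an assumption and the function names are hypothetical.

```python
import math


def hz2erb_sketch(f):
    """Convert Hz to the ERB-rate scale (assumed Glasberg and Moore form)."""
    return 21.4 * math.log10(1.0 + 4.37 * f / 1000.0)


def erb2hz_sketch(e):
    """Convert an ERB-rate value back to Hz (inverse of hz2erb_sketch)."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37


erb = hz2erb_sketch(1000.0)    # about 15.6 on the ERB-rate scale
back = erb2hz_sketch(erb)      # the round trip recovers 1000 Hz
```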

madmom.audio.filters.frequencies2bins(frequencies, bin_frequencies, unique_bins=False)[source]

Map frequencies to the closest corresponding bins.

Parameters:
frequencies : numpy array

Input frequencies [Hz].

bin_frequencies : numpy array

Frequencies of the (FFT) bins [Hz].

unique_bins : bool, optional

Return only unique bins, i.e. remove all duplicate bins resulting from insufficient resolution at low frequencies.

Returns:
bins : numpy array

Corresponding (unique) bins.

Notes

It can be important to return only unique bins, otherwise the lower frequency bins can be given too much weight if all bins are simply summed up (as in the spectral flux onset detection).
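
The effect of unique_bins can be seen in a small sketch. It maps each frequency to the index of the closest bin frequency and optionally removes duplicates. The tie-breaking rule, the hypothetical FFT setup (2048 points at 44.1 kHz) and the name frequencies2bins_sketch are assumptions made for illustration only.

```python
from bisect import bisect_left


def frequencies2bins_sketch(frequencies, bin_frequencies, unique_bins=False):
    """Map each frequency to the index of the closest bin frequency.

    bin_frequencies must be sorted in ascending order.
    """
    bins = []
    for f in frequencies:
        i = bisect_left(bin_frequencies, f)
        if i == 0:
            bins.append(0)
        elif i == len(bin_frequencies):
            bins.append(i - 1)
        else:
            # Pick whichever neighbour is closer (ties go to the lower bin,
            # which is an arbitrary choice of this sketch).
            below, above = bin_frequencies[i - 1], bin_frequencies[i]
            bins.append(i - 1 if f - below <= above - f else i)
    if unique_bins:
        bins = sorted(set(bins))
    return bins


# Hypothetical setup: a 2048-point FFT at 44.1 kHz, about 21.5 Hz per bin.
bin_freqs = [k * 44100.0 / 2048 for k in range(1025)]
semitones = [27.5 * 2.0 ** (k / 12.0) for k in range(6)]   # A0 upwards
all_bins = frequencies2bins_sketch(semitones, bin_freqs)
unique = frequencies2bins_sketch(semitones, bin_freqs, unique_bins=True)
```

Here six low semitone frequencies fall onto only two FFT bins, so all_bins contains each of them three times while unique keeps each bin once.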

madmom.audio.filters.bins2frequencies(bins, bin_frequencies)[source]

Convert bins to the corresponding frequencies.

Parameters:
bins : numpy array

Bins (e.g. FFT bins).

bin_frequencies : numpy array

Frequencies of the (FFT) bins [Hz].

Returns:
f : numpy array

Corresponding frequencies [Hz].

class madmom.audio.filters.Filter(data, start=0, norm=False)[source]

Generic Filter class.

Parameters:
data : 1D numpy array

Filter data.

start : int, optional

Start position (see notes).

norm : bool, optional

Normalize the filter area to 1.

Notes

The start position is mandatory if a Filter should be used for the creation of a Filterbank.

classmethod band_bins(bins, **kwargs)[source]

Must yield the center/crossover bins needed for filter creation.

Parameters:
bins : numpy array

Center/crossover bins used for the creation of filters.

kwargs : dict, optional

Additional parameters for the creation of filters (e.g. if the filters should overlap or not).

classmethod filters(bins, norm, **kwargs)[source]

Create a list with filters for the given bins.

Parameters:
bins : list or numpy array

Center/crossover bins of the filters.

norm : bool

Normalize the area of the filter(s) to 1.

kwargs : dict, optional

Additional parameters passed to band_bins() (e.g. if the filters should overlap or not).

Returns:
filters : list

Filter(s) for the given bins.

class madmom.audio.filters.TriangularFilter(start, center, stop, norm=False)[source]

Triangular filter class.

Create a triangular filter of length stop and height 1 (unless normalized), with all indices <= start set to 0.

Parameters:
start : int

Start bin of the filter.

center : int

Center bin of the filter.

stop : int

Stop bin of the filter.

norm : bool, optional

Normalize the area of the filter to 1.
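
The shape described above can be sketched in plain Python. The sketch assumes linear rising and falling edges, with the value 1 at the center bin and the falling edge reaching 0 at stop (which is not part of the filter); normalization divides by the sum of all values. triangular_filter_sketch is a hypothetical helper, not the class itself.

```python
def triangular_filter_sketch(start, center, stop, norm=False):
    """Return a list of length stop holding a triangle peaking at center."""
    if not start <= center < stop:
        raise ValueError("expected start <= center < stop")
    data = [0.0] * stop
    # Rising edge: 0 at start, approaching 1 towards center.
    for i in range(start, center):
        data[i] = (i - start) / (center - start)
    # Falling edge: 1 at center, reaching 0 at stop (not included).
    for i in range(center, stop):
        data[i] = (stop - i) / (stop - center)
    if norm:
        total = sum(data)
        data = [v / total for v in data]
    return data


tri = triangular_filter_sketch(2, 4, 7)
tri_norm = triangular_filter_sketch(2, 4, 7, norm=True)
```

For start=2, center=4 and stop=7 the result has 7 entries, is 0 up to and including index 2, and peaks with 1 at index 4.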

classmethod band_bins(bins, overlap=True)[source]

Yields start, center and stop bins for creation of triangular filters.

Parameters:
bins : list or numpy array

Center bins of filters.

overlap : bool, optional

Filters should overlap (see notes).

Yields:
start : int

Start bin of the filter.

center : int

Center bin of the filter.

stop : int

Stop bin of the filter.

Notes

If overlap is ‘False’, the start and stop bins of the filters are interpolated between the center bins; normal rounding applies.
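
One possible reading of this behaviour is sketched below. It assumes that, with overlap enabled, each filter starts at the previous center bin and stops at the next one, and that without overlap both edges are moved to the midpoint towards the neighbouring center, with halves rounded up. Both points are assumptions of this sketch, and triangular_band_bins_sketch is a hypothetical generator.

```python
import math


def triangular_band_bins_sketch(bins, overlap=True):
    """Yield (start, center, stop) triples for a list of center bins."""
    if len(bins) < 3:
        raise ValueError("need at least 3 bins to build one filter")
    for start, center, stop in zip(bins, bins[1:], bins[2:]):
        if not overlap:
            # Assumed reading of the note: move both edges to the midpoint
            # towards the neighbouring center bin, rounding halves up.
            start = int(math.floor((start + center) / 2.0 + 0.5))
            stop = int(math.floor((center + stop) / 2.0 + 0.5))
        yield start, center, stop


centers = [2, 6, 12, 20]
overlapping = list(triangular_band_bins_sketch(centers))
separate = list(triangular_band_bins_sketch(centers, overlap=False))
```

With overlap, neighbouring filters share the range between their centers; without it, the first filter stops at bin 9, where the second one starts.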

class madmom.audio.filters.RectangularFilter(start, stop, norm=False)[source]

Rectangular filter class.

Create a rectangular filter of length stop and height 1 (unless normalized), with all indices < start set to 0.

Parameters:
start : int

Start bin of the filter.

stop : int

Stop bin of the filter.

norm : bool, optional

Normalize the area of the filter to 1.

classmethod band_bins(bins, overlap=False)[source]

Yields start and stop bins for creation of rectangular filters.

Parameters:
bins : list or numpy array

Crossover bins of filters.

overlap : bool, optional

Filters should overlap.

Yields:
start : int

Start bin of the filter.

stop : int

Stop bin of the filter.
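
For the non-overlapping case, the relation between crossover bins and rectangular filters can be sketched as follows. The sketch assumes that each pair of neighbouring crossover bins delimits one filter; the overlapping case is not covered. Both helper names are hypothetical.

```python
def rectangular_band_bins_sketch(bins):
    """Yield (start, stop) for each pair of neighbouring crossover bins."""
    for start, stop in zip(bins, bins[1:]):
        yield start, stop


def rectangular_filter_sketch(start, stop, norm=False):
    """Return a list of length stop: 0 below start, constant from start on."""
    if stop <= start:
        raise ValueError("stop must be greater than start")
    height = 1.0 / (stop - start) if norm else 1.0
    return [0.0] * start + [height] * (stop - start)


crossovers = [0, 4, 10]
bands = list(rectangular_band_bins_sketch(crossovers))
rect = rectangular_filter_sketch(4, 10)
rect_norm = rectangular_filter_sketch(4, 10, norm=True)
```

Three crossover bins yield two bands; the second filter has length 10, with the first 4 entries set to 0.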

class madmom.audio.filters.Filterbank(data, bin_frequencies)[source]

Generic filterbank class.

A Filterbank is a simple numpy array enhanced with several additional attributes, e.g. number of bands.

A Filterbank has a shape of (num_bins, num_bands) and can be used to filter a spectrogram of shape (num_frames, num_bins) to (num_frames, num_bands).

Parameters:
data : numpy array, shape (num_bins, num_bands)

Data of the filterbank.

bin_frequencies : numpy array, shape (num_bins, )

Frequencies of the bins [Hz].

Notes

The length of bin_frequencies must be equal to the first dimension of the given data array.
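
The shape convention can be illustrated with a plain matrix product. The sketch below uses nested lists so that it runs without any dependencies; with numpy arrays the same operation would be a single np.dot(spectrogram, filterbank). The helper name and the example data are made up for illustration.

```python
def apply_filterbank_sketch(spectrogram, filterbank):
    """Multiply (num_frames, num_bins) by (num_bins, num_bands)."""
    num_bins = len(filterbank)
    num_bands = len(filterbank[0])
    filtered = []
    for frame in spectrogram:
        if len(frame) != num_bins:
            raise ValueError("frame length must equal num_bins")
        filtered.append([
            sum(frame[b] * filterbank[b][k] for b in range(num_bins))
            for k in range(num_bands)
        ])
    return filtered


# 4 bins grouped into 2 bands (rows are bins, columns are bands).
fb = [[1.0, 0.0],
      [1.0, 0.0],
      [0.0, 1.0],
      [0.0, 1.0]]
# 3 frames with 4 bins each.
spec = [[1.0, 2.0, 3.0, 4.0],
        [0.5, 0.5, 0.0, 2.0],
        [0.0, 0.0, 0.0, 0.0]]
filtered = apply_filterbank_sketch(spec, fb)
```

The result has one row per frame and one column per band, here 3 frames by 2 bands.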

classmethod from_filters(filters, bin_frequencies)[source]

Create a filterbank with possibly multiple filters per band.

Parameters:
filters : list (of lists) of Filters

List of Filters (per band); if multiple filters per band are desired, they should also be contained in a list, resulting in a list of lists of Filters.

bin_frequencies : numpy array

Frequencies of the bins (needed to determine the expected size of the filterbank).

Returns:
filterbank : Filterbank instance

Filterbank with respective filter elements.
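
How a list (of lists) of filters could be assembled into a filterbank matrix is sketched below. Each filter is represented by a hypothetical (start, data) pair. The sketch assumes that values falling outside the bin range are dropped and that several filters in one band are merged by taking the larger value; neither detail is confirmed by this section.

```python
def filterbank_from_filters_sketch(filters, num_bins):
    """Build a (num_bins, num_bands) nested list from per-band filters.

    Each filter is a hypothetical (start, data) pair; a band is either a
    single filter or a list of filters.
    """
    num_bands = len(filters)
    fb = [[0.0] * num_bands for _ in range(num_bins)]
    for band, band_filters in enumerate(filters):
        if isinstance(band_filters, tuple):
            band_filters = [band_filters]
        for start, data in band_filters:
            for offset, value in enumerate(data):
                b = start + offset
                if 0 <= b < num_bins:   # drop values outside the bin range
                    # Assumption: filters sharing a band are merged by
                    # keeping the larger value per bin.
                    fb[b][band] = max(fb[b][band], value)
    return fb


filters = [
    (0, [1.0, 1.0]),                            # band 0: a single filter
    [(2, [0.5, 1.0]), (5, [1.0, 0.5, 0.25])],   # band 1: two filters
]
fb = filterbank_from_filters_sketch(filters, num_bins=6)
band0 = [row[0] for row in fb]
band1 = [row[1] for row in fb]
```

The start position of each filter determines where its data is placed in the band's column.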

num_bins

Number of bins.

num_bands

Number of bands.

corner_frequencies

Corner frequencies of the filter bands.

center_frequencies

Center frequencies of the filter bands.

fmin

Minimum frequency of the filterbank.

fmax

Maximum frequency of the filterbank.

class madmom.audio.filters.FilterbankProcessor(data, bin_frequencies)[source]

Generic filterbank processor class.

A FilterbankProcessor is a simple wrapper for Filterbank which adds a process() method.

See also

Filterbank

process(data)[source]

Filter the given data with the Filterbank.

Parameters:
data : 2D numpy array

Data to be filtered.

Returns:
filt_data : numpy array

Filtered data.

Notes

This method makes the Filterbank act as a Processor.

static add_arguments(parser, filterbank=None, num_bands=None, crossover_frequencies=None, fmin=None, fmax=None, norm_filters=None, unique_filters=None)[source]

Add filterbank related arguments to an existing parser.

Parameters:
parser : argparse parser instance

Existing argparse parser object.

filterbank : audio.filters.Filterbank, optional

Use a filterbank of that type.

num_bands : int or list, optional

Number of bands (per octave).

crossover_frequencies : list or numpy array, optional

List of crossover frequencies at which the spectrogram is split into bands.

fmin : float, optional

Minimum frequency of the filterbank [Hz].

fmax : float, optional

Maximum frequency of the filterbank [Hz].

norm_filters : bool, optional

Normalize the filters of the filterbank to area 1.

unique_filters : bool, optional

Indicate if the filterbank should contain only unique filters, i.e. remove duplicate filters resulting from insufficient resolution at low frequencies.

Returns:
argparse argument group

Filterbank argument parser group.

Notes

Parameters are included in the group only if they are not ‘None’. Depending on the type of the filterbank, either num_bands or crossover_frequencies should be used.
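
The rule that only parameters which are not ‘None’ end up in the group can be sketched with the standard argparse module. The option names, help texts and the reduced parameter list below are assumptions made for illustration; they are not taken from the library.

```python
import argparse


def add_filterbank_arguments_sketch(parser, num_bands=None, fmin=None,
                                    fmax=None, norm_filters=None):
    """Add a filterbank argument group; skip parameters that are None."""
    group = parser.add_argument_group("filterbank arguments")
    if num_bands is not None:
        group.add_argument("--num_bands", type=int, default=num_bands,
                           help="number of filter bands (per octave)")
    if fmin is not None:
        group.add_argument("--fmin", type=float, default=fmin,
                           help="minimum frequency of the filterbank [Hz]")
    if fmax is not None:
        group.add_argument("--fmax", type=float, default=fmax,
                           help="maximum frequency of the filterbank [Hz]")
    if norm_filters is not None:
        group.add_argument("--norm_filters", action="store_true",
                           default=norm_filters,
                           help="normalize the filters to area 1")
    return group


parser = argparse.ArgumentParser()
add_filterbank_arguments_sketch(parser, num_bands=24, fmin=30.0)
args = parser.parse_args(["--fmin", "60"])
```

Since fmax and norm_filters were left as None, the parser offers no such options and the parsed namespace has no corresponding attributes.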

class madmom.audio.filters.MelFilterbank(bin_frequencies, num_bands=40, fmin=20.0, fmax=17000.0, norm_filters=True, unique_filters=True, **kwargs)[source]

Mel filterbank class.

Parameters:
bin_frequencies : numpy array

Frequencies of the bins [Hz].

num_bands : int, optional

Number of filter bands.

fmin : float, optional

Minimum frequency of the filterbank [Hz].

fmax : float, optional

Maximum frequency of the filterbank [Hz].

norm_filters : bool, optional

Normalize the filters to area 1.

unique_filters : bool, optional

Keep only unique filters, i.e. remove duplicate filters resulting from insufficient resolution at low frequencies.

Notes

Because of rounding and mapping of frequencies to bins and back to frequencies, the actual minimum, maximum and center frequencies do not necessarily match the parameters given.

class madmom.audio.filters.LogarithmicFilterbank(bin_frequencies, num_bands=12, fmin=30.0, fmax=17000.0, fref=440.0, norm_filters=True, unique_filters=True, bands_per_octave=True)[source]

Logarithmic filterbank class.

Parameters:
bin_frequencies : numpy array

Frequencies of the bins [Hz].

num_bands : int, optional

Number of filter bands (per octave).

fmin : float, optional

Minimum frequency of the filterbank [Hz].

fmax : float, optional

Maximum frequency of the filterbank [Hz].

fref : float, optional

Tuning frequency of the filterbank [Hz].

norm_filters : bool, optional

Normalize the filters to area 1.

unique_filters : bool, optional

Keep only unique filters, i.e. remove duplicate filters resulting from insufficient resolution at low frequencies.

bands_per_octave : bool, optional

Indicates whether num_bands is given as number of bands per octave (‘True’, default) or as an absolute number of bands (‘False’).

Notes

num_bands sets either the number of bands per octave or the total number of bands, depending on the setting of bands_per_octave. The name num_bands is also used for the per-octave count so that the argument is the same for all filterbank classes. If 12 bands per octave are used, a filterbank with semitone spacing is created.

madmom.audio.filters.LogFilterbank

alias of madmom.audio.filters.LogarithmicFilterbank

class madmom.audio.filters.RectangularFilterbank(bin_frequencies, crossover_frequencies, fmin=30.0, fmax=17000.0, norm_filters=True, unique_filters=True)[source]

Rectangular filterbank class.

Parameters:
bin_frequencies : numpy array

Frequencies of the bins [Hz].

crossover_frequencies : list or numpy array

Crossover frequencies of the bands [Hz].

fmin : float, optional

Minimum frequency of the filterbank [Hz].

fmax : float, optional

Maximum frequency of the filterbank [Hz].

norm_filters : bool, optional

Normalize the filters to area 1.

unique_filters : bool, optional

Keep only unique filters, i.e. remove duplicate filters resulting from insufficient resolution at low frequencies.

class madmom.audio.filters.SemitoneBandpassFilterbank(order=4, passband_ripple=1, stopband_rejection=50, q_factor=25, fmin=27.5, fmax=4200.0, fref=440.0)[source]

Time domain semitone filterbank of elliptic filters as proposed in [1].

Parameters:
order : int, optional

Order of elliptic filters.

passband_ripple : float, optional

Maximum ripple allowed below unity gain in the passband [dB].

stopband_rejection : float, optional

Minimum attenuation required in the stop band [dB].

q_factor : int, optional

Q-factor of the filters.

fmin : float, optional

Minimum frequency of the filterbank [Hz].

fmax : float, optional

Maximum frequency of the filterbank [Hz].

fref : float, optional

Reference frequency for the first bandpass filter [Hz].

Notes

This is a time domain filterbank and therefore cannot be used like the other (time-frequency) filterbanks of this module. Instead of np.dot(), use scipy.signal.filtfilt() to filter a signal.
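
To give an idea of the bands involved, the sketch below computes passband edges for one bandpass filter per semitone. It assumes the usual definition Q = center frequency / bandwidth with the band centered on the semitone frequency; how the class derives its band edges is not stated in this section. In practice such edges, normalized by the Nyquist frequency of the signal, would be passed to a filter design routine before filtering with scipy.signal.filtfilt(). The helper name is hypothetical.

```python
import math


def semitone_passbands_sketch(fmin=27.5, fmax=4200.0, fref=440.0,
                              q_factor=25):
    """Return (low, center, high) passband edges [Hz], one per semitone."""
    k_min = math.ceil(12 * math.log2(fmin / fref))
    k_max = math.floor(12 * math.log2(fmax / fref))
    bands = []
    for k in range(k_min, k_max + 1):
        center = fref * 2.0 ** (k / 12.0)
        # Assumption: Q = center / bandwidth, band centered on the semitone.
        bandwidth = center / q_factor
        bands.append((center - bandwidth / 2.0, center,
                      center + bandwidth / 2.0))
    return bands


bands = semitone_passbands_sketch()
```

With the default values of fmin, fmax and fref this yields 88 bands, from A0 (27.5 Hz) to C8 (about 4186 Hz).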

References

[1] Meinard Müller, “Information retrieval for music and motion”, Springer, 2007.

num_bands

Number of bands.

fmin

Minimum frequency of the filterbank.

fmax

Maximum frequency of the filterbank.