madmom.ml.nn

Neural Network package.

madmom.ml.nn.average_predictions(predictions)[source]

Returns the average of all predictions.

Parameters:

predictions : list

Predictions (i.e. NN activations).

Returns:

numpy array

Averaged prediction.
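Examples

A minimal sketch with two hypothetical predictions (numpy comparison used to avoid formatting-dependent doctest output):

>>> import numpy as np
>>> from madmom.ml.nn import average_predictions
>>> np.allclose(average_predictions([np.array([0., 1.]), np.array([1., 0.])]),
...             [0.5, 0.5])
True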

class madmom.ml.nn.NeuralNetwork(layers)[source]

Neural Network class.

Parameters:

layers : list

Layers of the Neural Network.

Examples

Create a NeuralNetwork from the given layers.

>>> import numpy as np
>>> from madmom.ml.nn import NeuralNetwork
>>> from madmom.ml.nn.layers import FeedForwardLayer
>>> from madmom.ml.nn.activations import tanh, sigmoid
>>> l1_weights = [[0.5, -1., -0.3, -0.2]]
>>> l1_bias = [0.05, 0., 0.8, -0.5]
>>> l1 = FeedForwardLayer(l1_weights, l1_bias, activation_fn=tanh)
>>> l2_weights = [-1, 0.9, -0.2, 0.4]
>>> l2_bias = [0.5]
>>> l2 = FeedForwardLayer(l2_weights, l2_bias, activation_fn=sigmoid)
>>> nn = NeuralNetwork([l1, l2])
>>> nn  
<madmom.ml.nn.NeuralNetwork object at 0x...>
>>> nn(np.array([0, 0.5, 1, 0, 1, 2, 0]))  
array([ 0.53305, 0.36903, 0.265 , 0.53305, 0.265 , 0.18612, 0.53305])
process(data)[source]

Process the given data with the neural network.

Parameters:

data : numpy array

Activate the network with this data.

Returns:

numpy array

Network predictions for this data.

class madmom.ml.nn.NeuralNetworkEnsemble(networks, ensemble_fn=<function average_predictions>, num_threads=None)[source]

Neural Network ensemble class.

Parameters:

networks : list

List of the Neural Networks.

ensemble_fn : function or callable, optional

Ensemble function to be applied to the predictions of the neural network ensemble (default: average predictions).

num_threads : int, optional

Number of parallel working threads.

Notes

If ensemble_fn is set to None, the predictions are returned as a list with the same length as the number of networks given.

Examples

Create a NeuralNetworkEnsemble. Instead of supplying the neural networks as a parameter, they can also be loaded from files:

>>> import numpy as np
>>> from madmom.ml.nn import NeuralNetworkEnsemble
>>> from madmom.models import ONSETS_BRNN_PP
>>> nn = NeuralNetworkEnsemble.load(ONSETS_BRNN_PP)
>>> nn  
<madmom.ml.nn.NeuralNetworkEnsemble object at 0x...>
>>> nn(np.array([0, 0.5, 1, 0, 1, 2, 0]))  
array([ 0.00116, 0.00213, 0.01428, 0.00729, 0.0088 , 0.21965, 0.00532])
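An ensemble can also be built directly from NeuralNetwork instances. A minimal sketch (reusing the hypothetical l1 and l2 layers from the NeuralNetwork example above) of the ensemble_fn=None behaviour described in the Notes:

>>> nn_1 = NeuralNetwork([l1, l2])
>>> nn_2 = NeuralNetwork([l1, l2])
>>> ensemble = NeuralNetworkEnsemble([nn_1, nn_2], ensemble_fn=None)
>>> predictions = ensemble(np.array([0, 0.5, 1]))
>>> len(predictions)  # one prediction per network
2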
classmethod load(nn_files, **kwargs)[source]

Instantiate a NeuralNetworkEnsemble from a list of model files.

Parameters:

nn_files : list

List of neural network model file names.

kwargs : dict, optional

Keyword arguments passed to NeuralNetworkEnsemble.

Returns:

NeuralNetworkEnsemble

NeuralNetworkEnsemble instance.

static add_arguments(parser, nn_files)[source]

Add neural network options to an existing parser.

Parameters:

parser : argparse parser instance

Existing argparse parser object.

nn_files : list

Neural network model files.

Returns:

argparse argument group

Neural network argument parser group.

madmom.ml.nn.layers

This module contains neural network layers for the ml.nn module.

class madmom.ml.nn.layers.BatchNormLayer

Batch normalization layer with activation function. The previous layer is usually linear with no bias; the BatchNormLayer’s beta parameter replaces it. See [R64] for a detailed explanation of the parameters.

Parameters:

beta : numpy array

Values for the beta parameter. Must be broadcastable to the incoming shape.

gamma : numpy array

Values for the gamma parameter. Must be broadcastable to the incoming shape.

mean : numpy array

Mean values of incoming data. Must be broadcastable to the incoming shape.

inv_std : numpy array

Inverse standard deviation of incoming data. Must be broadcastable to the incoming shape.

activation_fn : numpy ufunc

Activation function.
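With these parameters, the layer computes the standard batch normalization transform, reconstructed here from the parameter names above, where f is the activation function, μ the mean and σ⁻¹ the inv_std:

\[y = f(\gamma \cdot (x - \mu) \cdot \sigma^{-1} + \beta).\]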

References

[R64] Sergey Ioffe and Christian Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”, http://arxiv.org/abs/1502.03167, 2015.
activate(data)

Activate the layer.

Parameters:

data : numpy array

Activate with this data.

Returns:

numpy array

Activations for this data.

class madmom.ml.nn.layers.BidirectionalLayer

Bidirectional network layer.

Parameters:

fwd_layer : Layer instance

Forward layer.

bwd_layer : Layer instance

Backward layer.

activate(data)

Activate the layer.

After activating the fwd_layer with the data and the bwd_layer with the data in reverse temporal order, the two activations are stacked and returned.

Parameters:

data : numpy array

Activate with this data.

Returns:

numpy array

Activations for this data.
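A minimal numpy sketch of this behaviour, with fwd_layer and bwd_layer as callable layers (a sketch, not the library’s exact implementation):

import numpy as np

def bidirectional_activate(fwd_layer, bwd_layer, data):
    # activate the forward layer with the data in normal temporal order
    fwd = fwd_layer(data)
    # activate the backward layer with the data in reverse temporal order ...
    bwd = bwd_layer(data[::-1])
    # ... undo the reversal and stack both activations horizontally
    return np.hstack((fwd, bwd[::-1]))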

class madmom.ml.nn.layers.Cell

Cell as used by LSTM layers.

Parameters:

weights : numpy array, shape (num_inputs, num_hiddens)

Weights.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)

Recurrent weights.

activation_fn : numpy ufunc, optional

Activation function.

Notes

A Cell is the same as a Gate, except that it lacks peephole connections and uses a tanh activation function by default.

class madmom.ml.nn.layers.ConvolutionalLayer

Convolutional network layer.

Parameters:

weights : numpy array, shape (num_feature_maps, num_channels, <kernel>)

Weights.

bias : scalar or numpy array, shape (num_feature_maps,)

Bias.

stride : int, optional

Stride of the convolution.

pad : {‘valid’, ‘same’, ‘full’}

A string indicating the size of the output:

  • full

    The output is the full discrete linear convolution of the inputs.

  • valid

    The output consists only of those elements that do not rely on the zero-padding.

  • same

    The output is the same size as the input, centered with respect to the ‘full’ output.

activation_fn : numpy ufunc

Activation function.

activate(data)

Activate the layer.

Parameters:

data : numpy array, shape (num_frames, num_bins, num_channels)

Activate with this data.

Returns:

numpy array

Activations for this data.

class madmom.ml.nn.layers.FeedForwardLayer

Feed-forward network layer.

Parameters:

weights : numpy array, shape (num_inputs, num_hiddens)

Weights.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

activation_fn : numpy ufunc

Activation function.

activate(data)

Activate the layer.

Parameters:

data : numpy array

Activate with this data.

Returns:

numpy array

Activations for this data.
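The feed-forward activation is just the weighted input plus bias, passed through the activation function; a one-line numpy sketch (assuming compatible shapes):

import numpy as np

def feed_forward_activate(layer, data):
    # weight the input, add the bias and apply the activation function
    return layer.activation_fn(np.dot(data, layer.weights) + layer.bias)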

class madmom.ml.nn.layers.GRUCell

Cell as used by GRU layers proposed in [R65]. Given the reset gate activation r_t, the cell output is computed by

\[h_t = \tanh(W_{xh} \cdot x_t + r_t \circ (W_{hh} \cdot h_{t-1}) + b).\]
Parameters:

weights : numpy array, shape (num_inputs, num_hiddens)

Weights of the connections between inputs and cell.

recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)

Weights of the connections between cell and cell output of the previous time step.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

activation_fn : numpy ufunc, optional

Activation function.

Notes

There are two formulations of the GRUCell in the literature. Here, we adopted the (slightly older) one proposed in [R65], which is also implemented in the Lasagne toolbox.

References

[R65] Kyunghyun Cho, Bart Van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio, “On the Properties of Neural Machine Translation: Encoder-Decoder Approaches”, http://arxiv.org/abs/1409.1259, 2014.
activate(data, reset_gate, prev)

Activate the cell with the given input, the reset gate activation and the previous output.

Parameters:

data : scalar or numpy array, shape (num_frames, num_inputs)

Input data for the cell.

reset_gate : scalar or numpy array, shape (num_hiddens,)

Activation of the reset gate.

prev : scalar or numpy array, shape (num_hiddens,)

Cell output of the previous time step.

Returns:

numpy array, shape (num_frames, num_hiddens)

Activations of the cell for this data.
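A numpy sketch of the cell activation, following the parameter descriptions above (a sketch, not the library’s exact implementation):

import numpy as np

def gru_cell_activate(cell, data, reset_gate, prev):
    # weight the input and add the bias
    out = np.dot(data, cell.weights) + cell.bias
    # add the weighted previous output, scaled by the reset gate activation
    out += reset_gate * np.dot(prev, cell.recurrent_weights)
    # apply the activation function (tanh by default)
    return cell.activation_fn(out)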

class madmom.ml.nn.layers.GRULayer

Recurrent network layer with Gated Recurrent Units (GRU) as proposed in [R66].

Parameters:

reset_gate : Gate

Reset gate.

update_gate : Gate

Update gate.

cell : GRUCell

GRU cell.

hid_init : numpy array, shape (num_hiddens,), optional

Initial state of hidden units.

Notes

There are two formulations of the GRUCell in the literature. Here, we adopted the (slightly older) one proposed in [R66], which is also implemented in the Lasagne toolbox.

References

[R66] Kyunghyun Cho, Bart Van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio, “On the Properties of Neural Machine Translation: Encoder-Decoder Approaches”, http://arxiv.org/abs/1409.1259, 2014.
activate(data)

Activate the GRU layer.

Parameters:

data : numpy array, shape (num_frames, num_inputs)

Activate with this data.

Returns:

numpy array, shape (num_frames, num_hiddens)

Activations for this data.
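A per-time-step sketch of the GRU recurrence (a hypothetical helper, not the library’s exact code); the blending of the previous output and the cell proposal assumes the Lasagne convention mentioned in the Notes:

def gru_step(layer, x, prev):
    # reset and update gates are driven by the input and the previous output
    rg = layer.reset_gate.activate(x, prev)
    ug = layer.update_gate.activate(x, prev)
    # the cell proposes a new hidden state, with the recurrent part reset-gated
    cell = layer.cell.activate(x, rg, prev)
    # blend previous output and proposal (Lasagne convention assumed)
    return (1. - ug) * prev + ug * cell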

class madmom.ml.nn.layers.Gate

Gate as used by LSTM layers.

Parameters:

weights : numpy array, shape (num_inputs, num_hiddens)

Weights.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)

Recurrent weights.

peephole_weights : numpy array, optional, shape (num_hiddens,)

Peephole weights.

activation_fn : numpy ufunc, optional

Activation function.

activate(data, prev, state)

Activate the gate with the given data, state (if peephole connections are used) and the previous output (if recurrent connections are used).

Parameters:

data : scalar or numpy array, shape (num_inputs,)

Input data for the gate.

prev : scalar or numpy array, shape (num_hiddens,)

Output of the previous time step.

state : scalar or numpy array, shape (num_hiddens,)

State of the current (output gate) or the previous (input and forget gates) time step.

Returns:

numpy array

Activations of the gate for this data.
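A numpy sketch of a single-step gate activation, mirroring the parameter descriptions above (a sketch, assuming all shapes broadcast):

import numpy as np

def gate_activate(gate, data, prev, state=None):
    # weight the input and add the bias
    out = np.dot(data, gate.weights) + gate.bias
    # add the weighted previous output (recurrent connection)
    out += np.dot(prev, gate.recurrent_weights)
    # add the weighted state if peephole connections are present
    if gate.peephole_weights is not None and state is not None:
        out += state * gate.peephole_weights
    # apply the activation function
    return gate.activation_fn(out)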

class madmom.ml.nn.layers.LSTMLayer

Recurrent network layer with Long Short-Term Memory units.

Parameters:

input_gate : Gate

Input gate.

forget_gate : Gate

Forget gate.

cell : Cell

Cell (i.e. a Gate without peephole connections).

output_gate : Gate

Output gate.

activation_fn : numpy ufunc, optional

Activation function.

activate(data)

Activate the LSTM layer.

Parameters:

data : numpy array

Activate with this data.

Returns:

numpy array

Activations for this data.
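A sketch of a single LSTM time step with peephole connections, following the gate descriptions above (hypothetical helper; prev_out and prev_state hold the previous output and cell state):

def lstm_step(layer, x, prev_out, prev_state):
    # input and forget gates peek at the previous cell state
    ig = layer.input_gate.activate(x, prev_out, prev_state)
    fg = layer.forget_gate.activate(x, prev_out, prev_state)
    # the cell has no peephole connections, so the state is ignored
    cell = layer.cell.activate(x, prev_out, prev_state)
    # new cell state: gated cell input plus gated previous state
    state = ig * cell + fg * prev_state
    # the output gate peeks at the current cell state
    og = layer.output_gate.activate(x, prev_out, state)
    # squash the state and gate the output
    return og * layer.activation_fn(state), state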

class madmom.ml.nn.layers.Layer

Generic callable network layer.

activate(data)

Activate the layer.

Parameters:

data : numpy array

Activate with this data.

Returns:

numpy array

Activations for this data.

class madmom.ml.nn.layers.MaxPoolLayer

2D max-pooling network layer.

Parameters:

size : tuple

The size of the pooling region in each dimension.

stride : tuple, optional

The strides between successive pooling regions in each dimension. If None, the stride equals the pooling size (stride = size).

activate(data)

Activate the layer.

Parameters:

data : numpy array

Activate with this data.

Returns:

numpy array

Activations for this data.
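A plain numpy sketch of 2D max pooling for a single-channel input (a sketch, assuming non-padded pooling regions):

import numpy as np

def max_pool_2d(data, size, stride=None):
    # the stride defaults to the pooling size (non-overlapping regions)
    stride = stride or size
    rows = (data.shape[0] - size[0]) // stride[0] + 1
    cols = (data.shape[1] - size[1]) // stride[1] + 1
    out = np.empty((rows, cols), dtype=data.dtype)
    for i in range(rows):
        for j in range(cols):
            # take the maximum of each pooling region
            out[i, j] = data[i * stride[0]:i * stride[0] + size[0],
                             j * stride[1]:j * stride[1] + size[1]].max()
    return out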

class madmom.ml.nn.layers.RecurrentLayer

Recurrent network layer.

Parameters:

weights : numpy array, shape (num_inputs, num_hiddens)

Weights.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)

Recurrent weights.

activation_fn : numpy ufunc

Activation function.

activate(data)

Activate the layer.

Parameters:

data : numpy array

Activate with this data.

Returns:

numpy array

Activations for this data.
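A numpy sketch of the recurrent activation, processing the sequence frame by frame and feeding the previous output back through the recurrent weights (hypothetical helper, assuming an array-valued bias and a zero initial hidden state):

import numpy as np

def recurrent_activate(layer, data):
    num_hiddens = len(layer.bias)
    out = np.zeros((len(data), num_hiddens))
    prev = np.zeros(num_hiddens)
    for i, x in enumerate(data):
        # weighted input + bias + weighted previous output
        prev = layer.activation_fn(np.dot(x, layer.weights) + layer.bias +
                                   np.dot(prev, layer.recurrent_weights))
        out[i] = prev
    return out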

class madmom.ml.nn.layers.StrideLayer

Stride network layer.

Parameters:

block_size : int

Re-arrange (stride) the data in blocks of the given size.

activate(data)

Activate the layer.

Parameters:

data : numpy array

Activate with this data.

Returns:

numpy array

Strided data.
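A numpy sketch of the re-arrangement: block_size consecutive frames are stacked into a single feature vector, advancing one frame at a time (a plain sliding-window sketch; edge handling may differ in the library):

import numpy as np

def stride_frames(data, block_size):
    num = len(data) - block_size + 1
    # stack block_size consecutive frames into one vector per output frame
    return np.vstack([data[i:i + block_size].ravel() for i in range(num)])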

madmom.ml.nn.layers.convolve(data, kernel)

Convolve the data with the kernel in ‘valid’ mode, i.e. only where kernel and data fully overlap.

Parameters:

data : numpy array

Data to be convolved.

kernel : numpy array

Convolution kernel.

Returns:

numpy array

Convolved data.
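For reference, ‘valid’ mode keeps only the positions where the kernel fits entirely inside the data, so an (M, N) input convolved with an (m, n) kernel yields an (M - m + 1, N - n + 1) output. A quick shape check with scipy (used here purely for illustration):

>>> import numpy as np
>>> from scipy.signal import convolve2d
>>> data = np.arange(12.).reshape(4, 3)
>>> kernel = np.ones((2, 2))
>>> convolve2d(data, kernel, mode='valid').shape
(3, 2)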

madmom.ml.nn.activations

This module contains neural network activation functions for the ml.nn module.

madmom.ml.nn.activations.linear(x, out=None)[source]

Linear function.

Parameters:

x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:

numpy array

Unaltered input data.

madmom.ml.nn.activations.tanh(x, out=None)[source]

Hyperbolic tangent function.

Parameters:

x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:

numpy array

Hyperbolic tangent of input data.

madmom.ml.nn.activations.sigmoid(x, out=None)[source]

Logistic sigmoid function.

Parameters:

x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:

numpy array

Logistic sigmoid of input data.

madmom.ml.nn.activations.relu(x, out=None)[source]

Rectified linear (unit) transfer function.

Parameters:

x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:

numpy array

Rectified linear activation of input data.

madmom.ml.nn.activations.softmax(x, out=None)[source]

Softmax transfer function.

Parameters:

x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:

numpy array

Softmax of input data.
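A quick doctest-style sanity check of the activation functions (numpy comparisons avoid formatting-dependent output; softmax is given a 2D input on the assumption that it normalizes along the class axis):

>>> import numpy as np
>>> from madmom.ml.nn.activations import linear, tanh, sigmoid, relu, softmax
>>> x = np.array([-1., 0., 1.])
>>> np.allclose(linear(x), x)
True
>>> np.allclose(relu(x), [0., 0., 1.])
True
>>> np.allclose(sigmoid(np.array([0.])), 0.5)
True
>>> np.allclose(softmax(np.array([[1., 2., 3.]])).sum(), 1.0)
True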