madmom.ml.nn¶
Neural Network package.
madmom.ml.nn.average_predictions(predictions)[source]¶
Returns the average of all predictions.
Parameters: predictions : list
Predictions (i.e. NN activations).
Returns: numpy array
Averaged prediction.
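For illustration, a minimal usage sketch (the prediction arrays are made up):
>>> import numpy as np
>>> from madmom.ml.nn import average_predictions
>>> average_predictions([np.array([0.2, 0.8]), np.array([0.4, 0.6])])
array([ 0.3,  0.7])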
class madmom.ml.nn.NeuralNetwork(layers)[source]¶
Neural Network class.
Parameters: layers : list
Layers of the Neural Network.
Examples
Create a NeuralNetwork from the given layers.
>>> import numpy as np
>>> from madmom.ml.nn import NeuralNetwork
>>> from madmom.ml.nn.layers import FeedForwardLayer
>>> from madmom.ml.nn.activations import tanh, sigmoid
>>> l1_weights = np.array([[0.5, -1., -0.3 , -0.2]])
>>> l1_bias = np.array([0.05, 0., 0.8, -0.5])
>>> l1 = FeedForwardLayer(l1_weights, l1_bias, activation_fn=tanh)
>>> l2_weights = np.array([-1, 0.9, -0.2 , 0.4])
>>> l2_bias = np.array([0.5])
>>> l2 = FeedForwardLayer(l2_weights, l2_bias, activation_fn=sigmoid)
>>> nn = NeuralNetwork([l1, l2])
>>> nn
<madmom.ml.nn.NeuralNetwork object at 0x...>
>>> nn(np.array([[0], [0.5], [1], [0], [1], [2], [0]]))
array([ 0.53305, 0.36903, 0.265 , 0.53305, 0.265 , 0.18612, 0.53305])
process(data, reset=True, **kwargs)[source]¶
Process the given data with the neural network.
Parameters: data : numpy array, shape (num_frames, num_inputs)
Activate the network with this data.
reset : bool, optional
Reset the network to its initial state before activating it.
Returns: numpy array, shape (num_frames, num_outputs)
Network predictions for this data.
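Conceptually, processing feeds the data through each layer in turn; a minimal sketch of the idea (not the actual implementation, which additionally handles resetting of stateful layers):
def forward(layers, data):
    # layers are callable and return their activations for the given input
    for layer in layers:
        data = layer(data)
    return data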
class madmom.ml.nn.NeuralNetworkEnsemble(networks, ensemble_fn=<function average_predictions>, num_threads=None, **kwargs)[source]¶
Neural Network ensemble class.
Parameters: networks : list
List of the Neural Networks.
ensemble_fn : function or callable, optional
Ensemble function to be applied to the predictions of the neural network ensemble (default: average predictions).
num_threads : int, optional
Number of parallel working threads.
Notes
If ensemble_fn is set to None, the predictions are returned as a list with the same length as the number of networks given.
Examples
Create a NeuralNetworkEnsemble from the networks. Instead of supplying the neural networks as a parameter, they can also be loaded from file:
>>> import numpy as np
>>> from madmom.ml.nn import NeuralNetworkEnsemble
>>> from madmom.models import ONSETS_BRNN_PP
>>> nn = NeuralNetworkEnsemble.load(ONSETS_BRNN_PP)
>>> nn
<madmom.ml.nn.NeuralNetworkEnsemble object at 0x...>
>>> nn(np.array([[0], [0.5], [1], [0], [1], [2], [0]]))
array([ 0.00116, 0.00213, 0.01428, 0.00729, 0.0088 , 0.21965, 0.00532])
madmom.ml.nn.layers¶
This module contains neural network layers for the ml.nn module.
class madmom.ml.nn.layers.BatchNormLayer¶
Batch normalization layer with activation function. The previous layer is usually linear with no bias; the BatchNormLayer’s beta parameter replaces it. See [R68] for a detailed understanding of the parameters.
Parameters: beta : numpy array
Values for the beta parameter. Must be broadcastable to the incoming shape.
gamma : numpy array
Values for the gamma parameter. Must be broadcastable to the incoming shape.
mean : numpy array
Mean values of incoming data. Must be broadcastable to the incoming shape.
inv_std : numpy array
Inverse standard deviation of incoming data. Must be broadcastable to the incoming shape.
activation_fn : numpy ufunc
Activation function.
References
[R68] Sergey Ioffe and Christian Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”, http://arxiv.org/abs/1502.03167, 2015.
activate(data)¶
Activate the layer.
Parameters: data : numpy array
Activate with this data.
Returns: numpy array
Activations for this data.
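For orientation, the computation can be sketched with the standard batch-normalization inference formula and the parameters listed above (the helper name is hypothetical, not part of madmom):
import numpy as np

def batch_norm_activate(data, beta, gamma, mean, inv_std, activation_fn):
    # normalize with the stored statistics, then scale (gamma) and shift (beta)
    return activation_fn(gamma * (data - mean) * inv_std + beta)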
class madmom.ml.nn.layers.BidirectionalLayer¶
Bidirectional network layer.
Parameters: fwd_layer : Layer instance
Forward layer.
bwd_layer : Layer instance
Backward layer.
activate(data)¶
Activate the layer.
After activating the fwd_layer with the data and the bwd_layer with the data in reverse temporal order, the two activations are stacked and returned.
Parameters: data : numpy array, shape (num_frames, num_inputs)
Activate with this data.
Returns: numpy array, shape (num_frames, num_hiddens)
Activations for this data.
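A minimal sketch of this behaviour (hypothetical helper; the actual layer wraps the same idea as a class):
import numpy as np

def bidirectional_activate(fwd_layer, bwd_layer, data):
    # forward pass, and backward pass over the time-reversed data
    fwd = fwd_layer(data)
    bwd = bwd_layer(data[::-1])
    # re-reverse the backward activations and stack both along the features
    return np.hstack((fwd, bwd[::-1]))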
class madmom.ml.nn.layers.Cell¶
Cell as used by LSTM layers.
Parameters: weights : numpy array, shape (num_inputs, num_hiddens)
Weights.
bias : scalar or numpy array, shape (num_hiddens,)
Bias.
recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)
Recurrent weights.
activation_fn : numpy ufunc, optional
Activation function.
Notes
A Cell is the same as a Gate, except that it has no peephole connections and uses a tanh activation function. It should not be used directly, only inside an LSTMLayer.
class madmom.ml.nn.layers.ConvolutionalLayer¶
Convolutional network layer.
Parameters: weights : numpy array, shape (num_feature_maps, num_channels, <kernel>)
Weights.
bias : scalar or numpy array, shape (num_filters,)
Bias.
stride : int, optional
Stride of the convolution.
pad : {‘valid’, ‘same’, ‘full’}
A string indicating the size of the output:
- ‘full’: the output is the full discrete linear convolution of the inputs.
- ‘valid’: the output consists only of those elements that do not rely on zero-padding.
- ‘same’: the output is the same size as the input, centered with respect to the ‘full’ output.
activation_fn : numpy ufunc
Activation function.
activate(data)¶
Activate the layer.
Parameters: data : numpy array, shape (num_frames, num_bins, num_channels)
Activate with this data.
Returns: numpy array
Activations for this data.
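The three pad modes correspond to the modes of numpy’s 1-D convolution; their effect on the output length can be illustrated as:
>>> import numpy as np
>>> data, kernel = np.ones(5), np.ones(3)
>>> len(np.convolve(data, kernel, mode='full'))   # len(data) + len(kernel) - 1
7
>>> len(np.convolve(data, kernel, mode='valid'))  # len(data) - len(kernel) + 1
3
>>> len(np.convolve(data, kernel, mode='same'))   # len(data)
5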
class madmom.ml.nn.layers.FeedForwardLayer¶
Feed-forward network layer.
Parameters: weights : numpy array, shape (num_inputs, num_hiddens)
Weights.
bias : scalar or numpy array, shape (num_hiddens,)
Bias.
activation_fn : numpy ufunc
Activation function.
activate(data)¶
Activate the layer.
Parameters: data : numpy array, shape (num_frames, num_inputs)
Activate with this data.
Returns: numpy array, shape (num_frames, num_hiddens)
Activations for this data.
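The layer’s computation amounts to an affine transform followed by the activation function; a minimal sketch (hypothetical helper):
import numpy as np

def feed_forward_activate(data, weights, bias, activation_fn):
    # weighted input plus bias, passed through the activation function
    return activation_fn(np.dot(data, weights) + bias)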
class madmom.ml.nn.layers.GRUCell¶
Cell as used by GRU layers proposed in [R69]. The cell output is computed by
\[h = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b).\]
Parameters: weights : numpy array, shape (num_inputs, num_hiddens)
Weights of the connections between inputs and cell.
bias : scalar or numpy array, shape (num_hiddens,)
Bias.
recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)
Weights of the connections between cell and cell output of the previous time step.
activation_fn : numpy ufunc, optional
Activation function.
Notes
There are two formulations of the GRUCell in the literature. Here, we adopted the (slightly older) one proposed in [R69], which is also implemented in the Lasagne toolbox.
It should not be used directly, only inside a GRULayer.
References
[R69] Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio, “On the properties of neural machine translation: Encoder-decoder approaches”, http://arxiv.org/abs/1409.1259, 2014.
activate(data, prev, reset_gate)¶
Activate the cell with the given input, previous output and reset gate.
Parameters: data : numpy array, shape (num_inputs,)
Input data for the cell.
prev : numpy array, shape (num_hiddens,)
Output of the previous time step.
reset_gate : numpy array, shape (num_hiddens,)
Activation of the reset gate.
Returns: numpy array, shape (num_hiddens,)
Activations of the cell for this data.
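Assuming the Lasagne-style formulation referenced above, in which the reset gate multiplies the previous output before the recurrent connection, the cell activation can be sketched as (hypothetical helper):
import numpy as np

def gru_cell_activate(data, prev, reset_gate, weights, recurrent_weights, bias):
    # candidate activation: input term plus gated recurrent term
    return np.tanh(np.dot(data, weights) +
                   np.dot(reset_gate * prev, recurrent_weights) + bias)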
class madmom.ml.nn.layers.GRULayer¶
Recurrent network layer with Gated Recurrent Units (GRU) as proposed in [R70].
Parameters: reset_gate : Gate
Reset gate.
update_gate : Gate
Update gate.
cell : GRUCell
GRU cell.
init : numpy array, shape (num_hiddens,), optional
Initial state of hidden units.
Notes
There are two formulations of the GRUCell in the literature. Here, we adopted the (slightly older) one proposed in [R70], which is also implemented in the Lasagne toolbox.
References
[R70] Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio, “On the properties of neural machine translation: Encoder-decoder approaches”, http://arxiv.org/abs/1409.1259, 2014.
activate(data, reset=True)¶
Activate the GRU layer.
Parameters: data : numpy array, shape (num_frames, num_inputs)
Activate with this data.
reset : bool, optional
Reset the layer to its initial state before activating it.
Returns: numpy array, shape (num_frames, num_hiddens)
Activations for this data.
class madmom.ml.nn.layers.Gate¶
Gate as used by LSTM layers.
Parameters: weights : numpy array, shape (num_inputs, num_hiddens)
Weights.
bias : scalar or numpy array, shape (num_hiddens,)
Bias.
recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)
Recurrent weights.
peephole_weights : numpy array, shape (num_hiddens,), optional
Peephole weights.
activation_fn : numpy ufunc, optional
Activation function.
Notes
Gate should not be used directly, only inside an LSTMLayer.
activate(data, prev, state)¶
Activate the gate with the given data, state (if peephole connections are used) and the previous output (if recurrent connections are used).
Parameters: data : scalar or numpy array, shape (num_hiddens,)
Input data for the cell.
prev : scalar or numpy array, shape (num_hiddens,)
Output data of the previous time step.
state : scalar or numpy array, shape (num_hiddens,)
State data of the {current | previous} time step.
Returns: numpy array, shape (num_hiddens,)
Activations of the gate for this data.
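Putting the pieces together, a hedged sketch of the gate computation (hypothetical helper; the optional connections are only added when the corresponding weights are given):
import numpy as np
from madmom.ml.nn.activations import sigmoid

def gate_activate(data, prev, state, weights, bias,
                  recurrent_weights=None, peephole_weights=None,
                  activation_fn=sigmoid):
    out = np.dot(data, weights) + bias
    if peephole_weights is not None:
        out += state * peephole_weights          # peephole connection
    if recurrent_weights is not None:
        out += np.dot(prev, recurrent_weights)   # recurrent connection
    return activation_fn(out)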
class madmom.ml.nn.layers.LSTMLayer¶
Recurrent network layer with Long Short-Term Memory units.
Parameters: input_gate : Gate
Input gate.
forget_gate : Gate
Forget gate.
cell : Cell
Cell (i.e. a Gate without peephole connections).
output_gate : Gate
Output gate.
activation_fn : numpy ufunc, optional
Activation function.
init : numpy array, shape (num_hiddens,), optional
Initial state of the layer.
cell_init : numpy array, shape (num_hiddens,), optional
Initial state of the cell.
activate(data, reset=True)¶
Activate the LSTM layer.
Parameters: data : numpy array, shape (num_frames, num_inputs)
Activate with this data.
reset : bool, optional
Reset the layer to its initial state before activating it.
Returns: numpy array, shape (num_frames, num_hiddens)
Activations for this data.
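One time step of the standard peephole LSTM recurrence, as a hedged sketch (the helper and the exact call signatures are assumptions; the layer loops over the frames in a similar fashion):
import numpy as np

def lstm_step(frame, prev, state, input_gate, forget_gate, cell, output_gate,
              activation_fn=np.tanh):
    ig = input_gate.activate(frame, prev, state)   # how much new input to take
    fg = forget_gate.activate(frame, prev, state)  # how much old state to keep
    # new cell state (the Cell has no peepholes, so no state argument assumed)
    state = fg * state + ig * cell.activate(frame, prev)
    og = output_gate.activate(frame, prev, state)  # peeks at the updated state
    out = og * activation_fn(state)
    return out, state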
reset(init=None, cell_init=None)¶
Reset the layer to its initial state.
Parameters: init : numpy array, shape (num_hiddens,), optional
Reset the hidden units to this initial state.
cell_init : numpy array, shape (num_hiddens,), optional
Reset the cells to this initial state.
class madmom.ml.nn.layers.Layer¶
Generic callable network layer.
activate(data)¶
Activate the layer.
Parameters: data : numpy array
Activate with this data.
Returns: numpy array
Activations for this data.
reset()¶
Reset the layer to its initial state.
class madmom.ml.nn.layers.MaxPoolLayer¶
2D max-pooling network layer.
Parameters: size : tuple
The size of the pooling region in each dimension.
stride : tuple, optional
The strides between successive pooling regions in each dimension. If None, stride is set to size.
activate(data)¶
Activate the layer.
Parameters: data : numpy array
Activate with this data.
Returns: numpy array
Activations for this data.
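A hedged sketch of strided 2D max-pooling with the size/stride semantics described above (hypothetical helper):
import numpy as np

def max_pool_2d(data, size, stride=None):
    # default the stride to the pooling size (non-overlapping regions)
    stride = size if stride is None else stride
    rows = (data.shape[0] - size[0]) // stride[0] + 1
    cols = (data.shape[1] - size[1]) // stride[1] + 1
    out = np.empty((rows, cols) + data.shape[2:])
    for i in range(rows):
        for j in range(cols):
            r, c = i * stride[0], j * stride[1]
            # maximum over the pooling region, per remaining dimension
            out[i, j] = data[r:r + size[0], c:c + size[1]].max(axis=(0, 1))
    return out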
class madmom.ml.nn.layers.RecurrentLayer¶
Recurrent network layer.
Parameters: weights : numpy array, shape (num_inputs, num_hiddens)
Weights.
bias : scalar or numpy array, shape (num_hiddens,)
Bias.
recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)
Recurrent weights.
activation_fn : numpy ufunc
Activation function.
init : numpy array, shape (num_hiddens,), optional
Initial state of hidden units.
activate(data, reset=True)¶
Activate the layer.
Parameters: data : numpy array, shape (num_frames, num_inputs)
Activate with this data.
reset : bool, optional
Reset the layer to its initial state before activating it.
Returns: numpy array, shape (num_frames, num_hiddens)
Activations for this data.
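The recurrence can be sketched as a frame-wise loop that feeds the previous output back into the layer (hypothetical helper):
import numpy as np

def recurrent_activate(data, weights, bias, recurrent_weights,
                       activation_fn, init):
    out = np.zeros((len(data), len(init)))
    prev = init
    for i, frame in enumerate(data):
        # weighted input plus bias plus the recurrent contribution
        out[i] = activation_fn(np.dot(frame, weights) + bias +
                               np.dot(prev, recurrent_weights))
        prev = out[i]
    return out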
reset(init=None)¶
Reset the layer to its initial state.
Parameters: init : numpy array, shape (num_hiddens,), optional
Reset the hidden units to this initial state.
class madmom.ml.nn.layers.StrideLayer¶
Stride network layer.
Parameters: block_size : int
Re-arrange (stride) the data in blocks of given size.
activate(data)¶
Activate the layer.
Parameters: data : numpy array
Activate with this data.
Returns: numpy array
Strided data.
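One plausible reading of this re-arrangement, as a hedged sketch (hypothetical helper; the boundary handling of the real layer may differ):
import numpy as np

def stride_blocks(data, block_size):
    # concatenate block_size consecutive frames into one output frame
    num_frames = len(data) - block_size + 1
    return np.vstack([data[i:i + block_size].ravel()
                      for i in range(num_frames)])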
madmom.ml.nn.layers.convolve(data, kernel)¶
Convolve the data with the kernel in ‘valid’ mode, i.e. only where kernel and data fully overlap.
Parameters: data : numpy array
Data to be convolved.
kernel : numpy array
Convolution kernel.
Returns: numpy array
Convolved data.
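For 1-D data this behaviour matches numpy’s ‘valid’ convolution mode:
>>> import numpy as np
>>> np.convolve(np.array([1., 2., 3., 4.]), np.array([1., 1.]), mode='valid')
array([ 3.,  5.,  7.])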
madmom.ml.nn.activations¶
This module contains neural network activation functions for the ml.nn module.
madmom.ml.nn.activations.linear(x, out=None)[source]¶
Linear function.
Parameters: x : numpy array
Input data.
out : numpy array, optional
Array to hold the output data.
Returns: numpy array
Unaltered input data.
madmom.ml.nn.activations.tanh(x, out=None)[source]¶
Hyperbolic tangent function.
Parameters: x : numpy array
Input data.
out : numpy array, optional
Array to hold the output data.
Returns: numpy array
Hyperbolic tangent of input data.
madmom.ml.nn.activations.sigmoid(x, out=None)[source]¶
Logistic sigmoid function.
Parameters: x : numpy array
Input data.
out : numpy array, optional
Array to hold the output data.
Returns: numpy array
Logistic sigmoid of input data.
madmom.ml.nn.activations.relu(x, out=None)[source]¶
Rectified linear (unit) transfer function.
Parameters: x : numpy array
Input data.
out : numpy array, optional
Array to hold the output data.
Returns: numpy array
Rectified linear of input data.
madmom.ml.nn.activations.elu(x, out=None)[source]¶
Exponential linear (unit) transfer function.
Parameters: x : numpy array
Input data.
out : numpy array, optional
Array to hold the output data.
Returns: numpy array
Exponential linear of input data.
References
[R71] Djork-Arné Clevert, Thomas Unterthiner, and Sepp Hochreiter, “Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)”, http://arxiv.org/abs/1511.07289, 2015.
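For reference, the element-wise definitions these functions compute can be sketched as follows (the optional out argument is ignored here; the ELU is shown with its alpha parameter fixed to 1, an assumption not stated above):
import numpy as np

def sigmoid(x):
    # logistic sigmoid: 1 / (1 + exp(-x))
    return 1. / (1. + np.exp(-x))

def relu(x):
    # rectified linear: max(0, x), element-wise
    return np.maximum(0., x)

def elu(x):
    # exponential linear: x for x > 0, exp(x) - 1 otherwise (alpha = 1)
    return np.where(x > 0, x, np.exp(x) - 1.)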