madmom.ml.nn

Neural Network package.

madmom.ml.nn.average_predictions(predictions)[source]

Returns the average of all predictions.

Parameters:
predictions : list

Predictions (i.e. NN activations).

Returns:
numpy array

Averaged prediction.
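
For illustration, a minimal usage sketch with two made-up prediction arrays (the values below are invented, not madmom output):

>>> import numpy as np
>>> from madmom.ml.nn import average_predictions
>>> predictions = [np.array([0.2, 0.8, 0.4]), np.array([0.4, 0.6, 0.2])]
>>> avg = average_predictions(predictions)  # element-wise mean, here [0.3, 0.7, 0.3]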

class madmom.ml.nn.NeuralNetwork(layers)[source]

Neural Network class.

Parameters:
layers : list

Layers of the Neural Network.

Examples

Create a NeuralNetwork from the given layers.

>>> import numpy as np
>>> from madmom.ml.nn import NeuralNetwork
>>> from madmom.ml.nn.layers import FeedForwardLayer
>>> from madmom.ml.nn.activations import tanh, sigmoid
>>> l1_weights = np.array([[0.5, -1., -0.3, -0.2]])
>>> l1_bias = np.array([0.05, 0., 0.8, -0.5])
>>> l1 = FeedForwardLayer(l1_weights, l1_bias, activation_fn=tanh)
>>> l2_weights = np.array([-1, 0.9, -0.2, 0.4])
>>> l2_bias = np.array([0.5])
>>> l2 = FeedForwardLayer(l2_weights, l2_bias, activation_fn=sigmoid)
>>> nn = NeuralNetwork([l1, l2])
>>> nn
<madmom.ml.nn.NeuralNetwork object at 0x...>
>>> nn(np.array([[0], [0.5], [1], [0], [1], [2], [0]]))
array([0.53305, 0.36903, 0.265, 0.53305, 0.265, 0.18612, 0.53305])

process(data, reset=True, **kwargs)[source]

Process the given data with the neural network.

Parameters:
data : numpy array, shape (num_frames, num_inputs)

Activate the network with this data.

reset : bool, optional

Reset the network to its initial state before activating it.

Returns:
numpy array, shape (num_frames, num_outputs)

Network predictions for this data.
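
As a sketch of block-wise (online) use, assuming nn is a NeuralNetwork instance and data is a 2D numpy array of input frames, the reset argument can keep the state of recurrent layers between calls:

>>> out_a = nn.process(data[:50])                # first block, starts from the initial state
>>> out_b = nn.process(data[50:], reset=False)   # second block, continues from the previous state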

reset()[source]

Reset the neural network to its initial state.

class madmom.ml.nn.NeuralNetworkEnsemble(networks, ensemble_fn=<function average_predictions>, num_threads=None, **kwargs)[source]

Neural Network ensemble class.

Parameters:
networks : list

List of the Neural Networks.

ensemble_fn : function or callable, optional

Ensemble function to be applied to the predictions of the neural network ensemble (default: average predictions).

num_threads : int, optional

Number of parallel working threads.

Notes

If ensemble_fn is set to None, the predictions are returned as a list with the same length as the number of networks given.

Examples

Create a NeuralNetworkEnsemble from the networks. Instead of supplying the neural networks as a parameter, they can also be loaded from file:

>>> import numpy as np
>>> from madmom.ml.nn import NeuralNetworkEnsemble
>>> from madmom.models import ONSETS_BRNN_PP
>>> nn = NeuralNetworkEnsemble.load(ONSETS_BRNN_PP)
>>> nn
<madmom.ml.nn.NeuralNetworkEnsemble object at 0x...>
>>> nn(np.array([[0], [0.5], [1], [0], [1], [2], [0]]))
array([0.00116, 0.00213, 0.01428, 0.00729, 0.0088, 0.21965, 0.00532])

classmethod load(nn_files, **kwargs)[source]

Instantiate a new Neural Network ensemble from a list of files.

Parameters:
nn_files : list

List of neural network model file names.

kwargs : dict, optional

Keyword arguments passed to NeuralNetworkEnsemble.

Returns:
NeuralNetworkEnsemble

NeuralNetworkEnsemble instance.

static add_arguments(parser, nn_files)[source]

Add neural network options to an existing parser.

Parameters:
parser : argparse parser instance

Existing argparse parser object.

nn_files : list

Neural network model files.

Returns:
argparse argument group

Neural network argument parser group.

madmom.ml.nn.layers

This module contains neural network layers for the ml.nn module.

class madmom.ml.nn.layers.AverageLayer

Average layer.

Parameters:
axis : None or int or tuple of ints, optional

Axis or axes along which the means are computed. The default is to compute the mean of the flattened array.

dtype : data-type, optional

Type to use in computing the mean. For integer inputs, the default is float64; for floating point inputs, it is the same as the input dtype.

keepdims : bool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Averaged data.

class madmom.ml.nn.layers.BatchNormLayer

Batch normalization layer with activation function. The previous layer is usually linear with no bias, since the BatchNormLayer’s beta parameter replaces it. See [1] for a detailed description of the parameters.

Parameters:
beta : numpy array

Values for the beta parameter. Must be broadcastable to the incoming shape.

gamma : numpy array

Values for the gamma parameter. Must be broadcastable to the incoming shape.

mean : numpy array

Mean values of incoming data. Must be broadcastable to the incoming shape.

inv_std : numpy array

Inverse standard deviation of incoming data. Must be broadcastable to the incoming shape.

activation_fn : numpy ufunc

Activation function.

References

[1] Sergey Ioffe and Christian Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”, http://arxiv.org/abs/1502.03167, 2015.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Normalized data.
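
A minimal numpy sketch of the normalization described above, assuming the standard batch normalization inference formula with the parameters listed for this layer:

import numpy as np

def batch_norm_activate(data, mean, inv_std, gamma, beta, activation_fn):
    # normalize with the stored statistics, then scale, shift and activate
    return activation_fn((data - mean) * inv_std * gamma + beta)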

class madmom.ml.nn.layers.BidirectionalLayer

Bidirectional network layer.

Parameters:
fwd_layer : Layer instance

Forward layer.

bwd_layer : Layer instance

Backward layer.

activate()

Activate the layer.

After activating the fwd_layer with the data and the bwd_layer with the data in reverse temporal order, the two activations are stacked and returned.

Parameters:
data : numpy array, shape (num_frames, num_inputs)

Activate with this data.

Returns:
numpy array, shape (num_frames, num_hiddens)

Activations for this data.
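
A sketch of the stacking described above, assuming fwd_layer and bwd_layer are callable layer instances:

import numpy as np

def bidirectional_activate(data, fwd_layer, bwd_layer):
    fwd = fwd_layer(data)        # forward pass over the sequence
    bwd = bwd_layer(data[::-1])  # backward pass over the reversed sequence
    # undo the reversal of the backward activations and stack both halves
    return np.hstack((fwd, bwd[::-1]))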

class madmom.ml.nn.layers.Cell

Cell as used by LSTM layers.

Parameters:
weights : numpy array, shape (num_inputs, num_hiddens)

Weights.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)

Recurrent weights.

activation_fn : numpy ufunc, optional

Activation function.

Notes

A Cell is the same as a Gate except that it lacks peephole connections and has a tanh activation function. It should not be used directly, only inside an LSTMLayer.

class madmom.ml.nn.layers.ConvolutionalLayer

Convolutional network layer.

Parameters:
weights : numpy array, shape (num_feature_maps, num_channels, <kernel>)

Weights.

bias : scalar or numpy array, shape (num_filters,)

Bias.

stride : int, optional

Stride of the convolution.

pad : {‘valid’, ‘same’, ‘full’}

A string indicating the size of the output:

  • full
    The output is the full discrete linear convolution of the inputs.
  • valid
    The output consists only of those elements that do not rely on the zero-padding.
  • same
    The output is the same size as the input, centered with respect to the ‘full’ output.

activation_fn : numpy ufunc

Activation function.

activate()

Activate the layer.

Parameters:
data : numpy array (num_frames, num_bins, num_channels)

Activate with this data.

Returns:
numpy array

Activations for this data.

class madmom.ml.nn.layers.FeedForwardLayer

Feed-forward network layer.

Parameters:
weights : numpy array, shape (num_inputs, num_hiddens)

Weights.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

activation_fn : numpy ufunc

Activation function.

activate()

Activate the layer.

Parameters:
data : numpy array, shape (num_frames, num_inputs)

Activate with this data.

Returns:
numpy array, shape (num_frames, num_hiddens)

Activations for this data.
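
A sketch of the computation performed by activate(), i.e. the usual dense-layer formula, which is consistent with the NeuralNetwork example above:

import numpy as np

def feed_forward_activate(data, weights, bias, activation_fn):
    # project the input onto the hidden units, add the bias and activate
    return activation_fn(np.dot(data, weights) + bias)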

class madmom.ml.nn.layers.GRUCell

Cell as used by GRU layers proposed in [1]. The cell output is computed by

\[h = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b).\]

Parameters:
weights : numpy array, shape (num_inputs, num_hiddens)

Weights of the connections between inputs and cell.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)

Weights of the connections between cell and cell output of the previous time step.

activation_fn : numpy ufunc, optional

Activation function.

Notes

There are two formulations of the GRUCell in the literature. Here, we adopted the (slightly older) one proposed in [1], which is also implemented in the Lasagne toolbox.

It should not be used directly, only inside a GRULayer.

References

[1] Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio, “On the properties of neural machine translation: Encoder-decoder approaches”, http://arxiv.org/abs/1409.1259, 2014.

activate()

Activate the cell with the given input, previous output and reset gate.

Parameters:
data : numpy array, shape (num_inputs,)

Input data for the cell.

prev : numpy array, shape (num_hiddens,)

Output of the previous time step.

reset_gate : numpy array, shape (num_hiddens,)

Activation of the reset gate.

Returns:
numpy array, shape (num_hiddens,)

Activations of the cell for this data.
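
A sketch of the candidate computation with the reset gate included; following the Lasagne-style formulation mentioned in the Notes, the reset gate here scales the recurrent contribution (this placement is an assumption, other GRU formulations apply it to the previous output before the projection):

import numpy as np

def gru_cell_activate(data, prev, reset_gate, weights, recurrent_weights, bias):
    # candidate activation; the recurrent part is gated by the reset gate
    return np.tanh(np.dot(data, weights) + bias +
                   reset_gate * np.dot(prev, recurrent_weights))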

class madmom.ml.nn.layers.GRULayer

Recurrent network layer with Gated Recurrent Units (GRU) as proposed in [1].

Parameters:
reset_gate : Gate

Reset gate.

update_gate : Gate

Update gate.

cell : GRUCell

GRU cell.

init : numpy array, shape (num_hiddens,), optional

Initial state of hidden units.

Notes

There are two formulations of the GRUCell in the literature. Here, we adopted the (slightly older) one proposed in [1], which is also implemented in the Lasagne toolbox.

References

[1] Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio, “On the properties of neural machine translation: Encoder-decoder approaches”, http://arxiv.org/abs/1409.1259, 2014.

activate()

Activate the GRU layer.

Parameters:
data : numpy array, shape (num_frames, num_inputs)

Activate with this data.

reset : bool, optional

Reset the layer to its initial state before activating it.

Returns:
numpy array, shape (num_frames, num_hiddens)

Activations for this data.
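
A sketch of the per-frame recurrence, assuming the Cho et al. / Lasagne convention in which the update gate interpolates between the previous output and the new candidate:

import numpy as np

def gru_layer_activate(data, reset_gate, update_gate, cell, init):
    out = np.zeros((len(data), len(init)))
    prev = init
    for i, frame in enumerate(data):
        r = reset_gate.activate(frame, prev)   # reset gate activation
        u = update_gate.activate(frame, prev)  # update gate activation
        c = cell.activate(frame, prev, r)      # candidate activation (GRUCell)
        out[i] = (1 - u) * prev + u * c        # interpolate old state and candidate
        prev = out[i]
    return out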

reset()

Reset the layer to its initial state.

Parameters:
init : numpy array, shape (num_hiddens,), optional

Reset the hidden units to this initial state.

class madmom.ml.nn.layers.Gate

Gate as used by LSTM layers.

Parameters:
weights : numpy array, shape (num_inputs, num_hiddens)

Weights.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)

Recurrent weights.

peephole_weights : numpy array, shape (num_hiddens,), optional

Peephole weights.

activation_fn : numpy ufunc, optional

Activation function.

Notes

Gate should not be used directly, only inside an LSTMLayer.

activate()

Activate the gate with the given data, state (if peephole connections are used) and the previous output (if recurrent connections are used).

Parameters:
data : scalar or numpy array, shape (num_hiddens,)

Input data for the cell.

prev : scalar or numpy array, shape (num_hiddens,)

Output data of the previous time step.

state : scalar or numpy array, shape (num_hiddens,)

State data of the current or previous time step, depending on the gate.

Returns:
numpy array, shape (num_hiddens,)

Activations of the gate for this data.
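
A sketch of the gate computation, assuming the peephole and recurrent terms are simply added to the weighted input before the activation function is applied:

import numpy as np

def gate_activate(data, prev, state, weights, bias, activation_fn,
                  recurrent_weights=None, peephole_weights=None):
    out = np.dot(data, weights) + bias
    if peephole_weights is not None:
        out += state * peephole_weights          # element-wise peephole connection
    if recurrent_weights is not None:
        out += np.dot(prev, recurrent_weights)   # recurrent connection
    return activation_fn(out)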

class madmom.ml.nn.layers.LSTMLayer

Recurrent network layer with Long Short-Term Memory units.

Parameters:
input_gate : Gate

Input gate.

forget_gate : Gate

Forget gate.

cell : Cell

Cell (i.e. a Gate without peephole connections).

output_gate : Gate

Output gate.

activation_fn : numpy ufunc, optional

Activation function.

init : numpy array, shape (num_hiddens, ), optional

Initial state of the layer.

cell_init : numpy array, shape (num_hiddens, ), optional

Initial state of the cell.

activate()

Activate the LSTM layer.

Parameters:
data : numpy array, shape (num_frames, num_inputs)

Activate with this data.

reset : bool, optional

Reset the layer to its initial state before activating it.

Returns:
numpy array, shape (num_frames, num_hiddens)

Activations for this data.
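
A sketch of the per-frame recurrence, assuming the standard peephole LSTM formulation (input and forget gates see the previous cell state, the output gate sees the updated one):

import numpy as np

def lstm_activate(data, input_gate, forget_gate, cell, output_gate,
                  activation_fn, init, cell_init):
    out = np.zeros((len(data), len(init)))
    prev, state = init, cell_init
    for i, frame in enumerate(data):
        ig = input_gate.activate(frame, prev, state)   # how much new input to admit
        fg = forget_gate.activate(frame, prev, state)  # how much old state to keep
        c = cell.activate(frame, prev)                 # candidate cell input
        state = c * ig + state * fg                    # update the internal cell state
        og = output_gate.activate(frame, prev, state)  # output gate sees the new state
        out[i] = activation_fn(state) * og
        prev = out[i]
    return out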

reset()

Reset the layer to its initial state.

Parameters:
init : numpy array, shape (num_hiddens,), optional

Reset the hidden units to this initial state.

cell_init : numpy array, shape (num_hiddens,), optional

Reset the cells to this initial state.

class madmom.ml.nn.layers.Layer

Generic callable network layer.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Activations for this data.

reset()

Reset the layer to its initial state.

class madmom.ml.nn.layers.MaxPoolLayer

2D max-pooling network layer.

Parameters:
size : tuple

The size of the pooling region in each dimension.

stride : tuple, optional

The strides between successive pooling regions in each dimension. If None, stride is set to size.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Max pooled data.

class madmom.ml.nn.layers.PadLayer

Padding layer that pads the input with a constant value.

Parameters:
width : int

Width of the padding (a single value used for all dimensions).

axes : iterable

Indices of the axes to be padded.

value : float

Value to be used for padding.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Padded data.

class madmom.ml.nn.layers.RecurrentLayer

Recurrent network layer.

Parameters:
weights : numpy array, shape (num_inputs, num_hiddens)

Weights.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)

Recurrent weights.

activation_fn : numpy ufunc

Activation function.

init : numpy array, shape (num_hiddens,), optional

Initial state of hidden units.

activate()

Activate the layer.

Parameters:
data : numpy array, shape (num_frames, num_inputs)

Activate with this data.

reset : bool, optional

Reset the layer to its initial state before activating it.

Returns:
numpy array, shape (num_frames, num_hiddens)

Activations for this data.
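
A sketch of the simple recurrent computation, i.e. the feed-forward formula plus a recurrent term from the previous output:

import numpy as np

def recurrent_activate(data, weights, bias, recurrent_weights, activation_fn, init):
    out = np.zeros((len(data), len(init)))
    prev = init
    for i, frame in enumerate(data):
        # combine the current input with the previous output and activate
        out[i] = activation_fn(np.dot(frame, weights) + bias +
                               np.dot(prev, recurrent_weights))
        prev = out[i]
    return out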

reset()

Reset the layer to its initial state.

Parameters:
init : numpy array, shape (num_hiddens,), optional

Reset the hidden units to this initial state.

class madmom.ml.nn.layers.ReshapeLayer

Reshape Layer.

Parameters:
newshape : int or tuple of ints

The new shape should be compatible with the original shape. If an integer, then the result will be a 1-D array of that length. One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions.

order : {‘C’, ‘F’, ‘A’}, optional

Index order of the input. See np.reshape for a detailed description.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Reshaped data.

class madmom.ml.nn.layers.StrideLayer

Stride network layer.

Parameters:
block_size : int

Re-arrange (stride) the data in blocks of given size.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Strided data.

class madmom.ml.nn.layers.TransposeLayer

Transpose layer.

Parameters:
axes : list of ints, optional

By default, reverse the dimensions of the input, otherwise permute the axes of the input according to the values given.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Transposed data.

madmom.ml.nn.layers.convolve

Convolve the data with the kernel in ‘valid’ mode, i.e. only where the kernel and data fully overlap.

Parameters:
data : numpy array

Data to be convolved.

kernel : numpy array

Convolution kernel.

Returns:
numpy array

Convolved data.
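
For orientation, ‘valid’ mode here corresponds to scipy’s valid-mode 2D convolution; a rough equivalent (using scipy.signal rather than madmom’s compiled routine, with made-up shapes):

>>> import numpy as np
>>> from scipy.signal import convolve2d
>>> data = np.random.rand(100, 80)     # e.g. a spectrogram-like 2D input
>>> kernel = np.random.rand(3, 3)
>>> result = convolve2d(data, kernel, mode='valid')  # only positions with full overlap
>>> result.shape
(98, 78)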

madmom.ml.nn.activations

This module contains neural network activation functions for the ml.nn module.

madmom.ml.nn.activations.linear(x, out=None)[source]

Linear function.

Parameters:
x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:
numpy array

Unaltered input data.

madmom.ml.nn.activations.tanh(x, out=None)[source]

Hyperbolic tangent function.

Parameters:
x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:
numpy array

Hyperbolic tangent of input data.

madmom.ml.nn.activations.sigmoid(x, out=None)[source]

Logistic sigmoid function.

Parameters:
x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:
numpy array

Logistic sigmoid of input data.

madmom.ml.nn.activations.relu(x, out=None)[source]

Rectified linear (unit) transfer function.

Parameters:
x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:
numpy array

Rectified linear of input data.

madmom.ml.nn.activations.elu(x, out=None)[source]

Exponential linear (unit) transfer function.

Parameters:
x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:
numpy array

Exponential linear of the input data.

References

[1] Djork-Arné Clevert, Thomas Unterthiner, and Sepp Hochreiter, “Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)”, http://arxiv.org/abs/1511.07289, 2015.

madmom.ml.nn.activations.softmax(x, out=None)[source]

Softmax transfer function.

Parameters:
x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:
numpy array

Softmax of input data.
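
A sketch of the softmax definition, normalizing the exponentials per frame in a numerically stable way (the exact axis handling of madmom’s implementation is an assumption here):

import numpy as np

def softmax_sketch(x):
    # subtract the per-frame maximum for numerical stability
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    # normalize so that each frame sums to one
    return e / np.sum(e, axis=-1, keepdims=True)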