madmom.ml.nn

Neural Network package.

madmom.ml.nn.average_predictions(predictions)[source]

Returns the average of all predictions.

Parameters:
predictions : list

Predictions (i.e. NN activations).

Returns:
numpy array

Averaged prediction.
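
For illustration, a minimal usage sketch with two made-up prediction arrays (the values below are invented, not madmom output):

>>> import numpy as np
>>> from madmom.ml.nn import average_predictions
>>> predictions = [np.array([0.2, 0.8, 0.4]), np.array([0.4, 0.6, 0.2])]
>>> avg = average_predictions(predictions)  # element-wise mean, here [0.3, 0.7, 0.3]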

class madmom.ml.nn.NeuralNetwork(layers)[source]

Neural Network class.

Parameters:
layers : list

Layers of the Neural Network.

Examples

Create a NeuralNetwork from the given layers.

>>> import numpy as np
>>> from madmom.ml.nn import NeuralNetwork
>>> from madmom.ml.nn.layers import FeedForwardLayer
>>> from madmom.ml.nn.activations import tanh, sigmoid
>>> l1_weights = np.array([[0.5, -1., -0.3, -0.2]])
>>> l1_bias = np.array([0.05, 0., 0.8, -0.5])
>>> l1 = FeedForwardLayer(l1_weights, l1_bias, activation_fn=tanh)
>>> l2_weights = np.array([-1, 0.9, -0.2, 0.4])
>>> l2_bias = np.array([0.5])
>>> l2 = FeedForwardLayer(l2_weights, l2_bias, activation_fn=sigmoid)
>>> nn = NeuralNetwork([l1, l2])
>>> nn
<madmom.ml.nn.NeuralNetwork object at 0x...>
>>> nn(np.array([[0], [0.5], [1], [0], [1], [2], [0]]))
array([0.53305, 0.36903, 0.265, 0.53305, 0.265, 0.18612, 0.53305])

process(data, reset=True, **kwargs)[source]

Process the given data with the neural network.

Parameters:
data : numpy array, shape (num_frames, num_inputs)

Activate the network with this data.

reset : bool, optional

Reset the network to its initial state before activating it.

Returns:
numpy array, shape (num_frames, num_outputs)

Network predictions for this data.
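
As a sketch of block-wise (online) use, assuming nn is a NeuralNetwork instance and data is a 2D numpy array of input frames, the reset argument can keep the state of recurrent layers between calls:

>>> out_a = nn.process(data[:50])                # first block, starts from the initial state
>>> out_b = nn.process(data[50:], reset=False)   # second block, continues from the previous state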

reset()[source]

Reset the neural network to its initial state.

class madmom.ml.nn.NeuralNetworkEnsemble(networks, ensemble_fn=<function average_predictions>, num_threads=None, **kwargs)[source]

Neural Network ensemble class.

Parameters:
networks : list

List of the Neural Networks.

ensemble_fn : function or callable, optional

Ensemble function to be applied to the predictions of the neural network ensemble (default: average predictions).

num_threads : int, optional

Number of parallel working threads.

Notes

If ensemble_fn is set to None, the predictions are returned as a list with the same length as the number of networks given.

Examples

Create a NeuralNetworkEnsemble from the networks. Instead of supplying the neural networks as a parameter, they can also be loaded from file:

>>> import numpy as np
>>> from madmom.ml.nn import NeuralNetworkEnsemble
>>> from madmom.models import ONSETS_BRNN_PP
>>> nn = NeuralNetworkEnsemble.load(ONSETS_BRNN_PP)
>>> nn
<madmom.ml.nn.NeuralNetworkEnsemble object at 0x...>
>>> nn(np.array([[0], [0.5], [1], [0], [1], [2], [0]]))
array([0.00116, 0.00213, 0.01428, 0.00729, 0.0088, 0.21965, 0.00532])

classmethod load(nn_files, **kwargs)[source]

Instantiate a new Neural Network ensemble from a list of files.

Parameters:
nn_files : list

List of neural network model file names.

kwargs : dict, optional

Keyword arguments passed to NeuralNetworkEnsemble.

Returns:
NeuralNetworkEnsemble

NeuralNetworkEnsemble instance.

static add_arguments(parser, nn_files)[source]

Add neural network options to an existing parser.

Parameters:
parser : argparse parser instance

Existing argparse parser object.

nn_files : list

Neural network model files.

Returns:
argparse argument group

Neural network argument parser group.

madmom.ml.nn.layers

This module contains neural network layers for the ml.nn module.

class madmom.ml.nn.layers.AverageLayer

Average layer.

Parameters:
axis : None or int or tuple of ints, optional

Axis or axes along which the means are computed. The default is to compute the mean of the flattened array.

dtype : data-type, optional

Type to use in computing the mean. For integer inputs, the default is float64; for floating point inputs, it is the same as the input dtype.

keepdims : bool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Averaged data.

class madmom.ml.nn.layers.BatchNormLayer

Batch normalization layer with activation function. The previous layer is usually linear with no bias, since the BatchNormLayer’s beta parameter replaces it. See [1] for a detailed description of the parameters.

Parameters:
beta : numpy array

Values for the beta parameter. Must be broadcastable to the incoming shape.

gamma : numpy array

Values for the gamma parameter. Must be broadcastable to the incoming shape.

mean : numpy array

Mean values of incoming data. Must be broadcastable to the incoming shape.

inv_std : numpy array

Inverse standard deviation of incoming data. Must be broadcastable to the incoming shape.

activation_fn : numpy ufunc

Activation function.

References

[1] Sergey Ioffe and Christian Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”, http://arxiv.org/abs/1502.03167, 2015.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Normalized data.
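
A minimal numpy sketch of the normalization described above, assuming the standard batch normalization inference formula with the parameters listed for this layer:

import numpy as np

def batch_norm_activate(data, mean, inv_std, gamma, beta, activation_fn):
    # normalize with the stored statistics, then scale, shift and activate
    return activation_fn((data - mean) * inv_std * gamma + beta)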

class madmom.ml.nn.layers.BidirectionalLayer

Bidirectional network layer.

Parameters:
fwd_layer : Layer instance

Forward layer.

bwd_layer : Layer instance

Backward layer.

activate()

Activate the layer.

After activating the fwd_layer with the data and the bwd_layer with the data in reverse temporal order, the two activations are stacked and returned.

Parameters:
data : numpy array, shape (num_frames, num_inputs)

Activate with this data.

Returns:
numpy array, shape (num_frames, num_hiddens)

Activations for this data.
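
A sketch of the stacking described above, assuming fwd_layer and bwd_layer are callable layer instances:

import numpy as np

def bidirectional_activate(data, fwd_layer, bwd_layer):
    fwd = fwd_layer(data)        # forward pass over the sequence
    bwd = bwd_layer(data[::-1])  # backward pass over the reversed sequence
    # undo the reversal of the backward activations and stack both halves
    return np.hstack((fwd, bwd[::-1]))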

class madmom.ml.nn.layers.Cell

Cell as used by LSTM layers.

Parameters:
weights : numpy array, shape (num_inputs, num_hiddens)

Weights.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)

Recurrent weights.

activation_fn : numpy ufunc, optional

Activation function.

Notes

A Cell is the same as a Gate except that it lacks peephole connections and has a tanh activation function. It should not be used directly, only inside an LSTMLayer.

class madmom.ml.nn.layers.ConvolutionalLayer

Convolutional network layer.

Parameters:
weights : numpy array, shape (num_feature_maps, num_channels, <kernel>)

Weights.

bias : scalar or numpy array, shape (num_filters,)

Bias.

stride : int, optional

Stride of the convolution.

pad : {‘valid’, ‘same’, ‘full’}

A string indicating the size of the output:

  • full
    The output is the full discrete linear convolution of the inputs.
  • valid
    The output consists only of those elements that do not rely on the zero-padding.
  • same
    The output is the same size as the input, centered with respect to the ‘full’ output.

activation_fn : numpy ufunc

Activation function.

activate()

Activate the layer.

Parameters:
data : numpy array (num_frames, num_bins, num_channels)

Activate with this data.

Returns:
numpy array

Activations for this data.

class madmom.ml.nn.layers.FeedForwardLayer

Feed-forward network layer.

Parameters:
weights : numpy array, shape (num_inputs, num_hiddens)

Weights.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

activation_fn : numpy ufunc

Activation function.

activate()

Activate the layer.

Parameters:
data : numpy array, shape (num_frames, num_inputs)

Activate with this data.

Returns:
numpy array, shape (num_frames, num_hiddens)

Activations for this data.
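
A sketch of the computation performed by activate(), i.e. the usual dense-layer formula, which is consistent with the NeuralNetwork example above:

import numpy as np

def feed_forward_activate(data, weights, bias, activation_fn):
    # project the input onto the hidden units, add the bias and activate
    return activation_fn(np.dot(data, weights) + bias)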

class madmom.ml.nn.layers.GRUCell

Cell as used by GRU layers proposed in [1]. The cell output is computed by

\[h = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b).\]

Parameters:
weights : numpy array, shape (num_inputs, num_hiddens)

Weights of the connections between inputs and cell.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)

Weights of the connections between cell and cell output of the previous time step.

activation_fn : numpy ufunc, optional

Activation function.

Notes

There are two formulations of the GRUCell in the literature. Here, we adopted the (slightly older) one proposed in [1], which is also implemented in the Lasagne toolbox.

It should not be used directly, only inside a GRULayer.

References

[1] Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio, “On the properties of neural machine translation: Encoder-decoder approaches”, http://arxiv.org/abs/1409.1259, 2014.

activate()

Activate the cell with the given input, previous output and reset gate.

Parameters:
data : numpy array, shape (num_inputs,)

Input data for the cell.

prev : numpy array, shape (num_hiddens,)

Output of the previous time step.

reset_gate : numpy array, shape (num_hiddens,)

Activation of the reset gate.

Returns:
numpy array, shape (num_hiddens,)

Activations of the cell for this data.
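
A sketch of the candidate computation with the reset gate included; following the Lasagne-style formulation mentioned in the Notes, the reset gate here scales the recurrent contribution (this placement is an assumption, other GRU formulations apply it to the previous output before the projection):

import numpy as np

def gru_cell_activate(data, prev, reset_gate, weights, recurrent_weights, bias):
    # candidate activation; the recurrent part is gated by the reset gate
    return np.tanh(np.dot(data, weights) + bias +
                   reset_gate * np.dot(prev, recurrent_weights))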

class madmom.ml.nn.layers.GRULayer

Recurrent network layer with Gated Recurrent Units (GRU) as proposed in [1].

Parameters:
reset_gate : Gate

Reset gate.

update_gate : Gate

Update gate.

cell : GRUCell

GRU cell.

init : numpy array, shape (num_hiddens,), optional

Initial state of hidden units.

Notes

There are two formulations of the GRUCell in the literature. Here, we adopted the (slightly older) one proposed in [1], which is also implemented in the Lasagne toolbox.

References

[1] Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio, “On the properties of neural machine translation: Encoder-decoder approaches”, http://arxiv.org/abs/1409.1259, 2014.

activate()

Activate the GRU layer.

Parameters:
data : numpy array, shape (num_frames, num_inputs)

Activate with this data.

reset : bool, optional

Reset the layer to its initial state before activating it.

Returns:
numpy array, shape (num_frames, num_hiddens)

Activations for this data.
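
A sketch of the per-frame recurrence, assuming the Cho et al. / Lasagne convention in which the update gate interpolates between the previous output and the new candidate:

import numpy as np

def gru_layer_activate(data, reset_gate, update_gate, cell, init):
    out = np.zeros((len(data), len(init)))
    prev = init
    for i, frame in enumerate(data):
        r = reset_gate.activate(frame, prev)   # reset gate activation
        u = update_gate.activate(frame, prev)  # update gate activation
        c = cell.activate(frame, prev, r)      # candidate activation (GRUCell)
        out[i] = (1 - u) * prev + u * c        # interpolate old state and candidate
        prev = out[i]
    return out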

reset()

Reset the layer to its initial state.

Parameters:
init : numpy array, shape (num_hiddens,), optional

Reset the hidden units to this initial state.

class madmom.ml.nn.layers.Gate

Gate as used by LSTM layers.

Parameters:
weights : numpy array, shape (num_inputs, num_hiddens)

Weights.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)

Recurrent weights.

peephole_weights : numpy array, shape (num_hiddens,), optional

Peephole weights.

activation_fn : numpy ufunc, optional

Activation function.

Notes

Gate should not be used directly, only inside an LSTMLayer.

activate()

Activate the gate with the given data, state (if peephole connections are used) and the previous output (if recurrent connections are used).

Parameters:
data : scalar or numpy array, shape (num_hiddens,)

Input data for the cell.

prev : scalar or numpy array, shape (num_hiddens,)

Output data of the previous time step.

state : scalar or numpy array, shape (num_hiddens,)

State data of the current or previous time step, depending on the gate.

Returns:
numpy array, shape (num_hiddens,)

Activations of the gate for this data.
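
A sketch of the gate computation, assuming the peephole and recurrent terms are simply added to the weighted input before the activation function is applied:

import numpy as np

def gate_activate(data, prev, state, weights, bias, activation_fn,
                  recurrent_weights=None, peephole_weights=None):
    out = np.dot(data, weights) + bias
    if peephole_weights is not None:
        out += state * peephole_weights          # element-wise peephole connection
    if recurrent_weights is not None:
        out += np.dot(prev, recurrent_weights)   # recurrent connection
    return activation_fn(out)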

class madmom.ml.nn.layers.LSTMLayer

Recurrent network layer with Long Short-Term Memory units.

Parameters:
input_gate : Gate

Input gate.

forget_gate : Gate

Forget gate.

cell : Cell

Cell (i.e. a Gate without peephole connections).

output_gate : Gate

Output gate.

activation_fn : numpy ufunc, optional

Activation function.

init : numpy array, shape (num_hiddens, ), optional

Initial state of the layer.

cell_init : numpy array, shape (num_hiddens, ), optional

Initial state of the cell.

activate()

Activate the LSTM layer.

Parameters:
data : numpy array, shape (num_frames, num_inputs)

Activate with this data.

reset : bool, optional

Reset the layer to its initial state before activating it.

Returns:
numpy array, shape (num_frames, num_hiddens)

Activations for this data.
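
A sketch of the per-frame recurrence, assuming the standard peephole LSTM formulation (input and forget gates see the previous cell state, the output gate sees the updated one):

import numpy as np

def lstm_activate(data, input_gate, forget_gate, cell, output_gate,
                  activation_fn, init, cell_init):
    out = np.zeros((len(data), len(init)))
    prev, state = init, cell_init
    for i, frame in enumerate(data):
        ig = input_gate.activate(frame, prev, state)   # how much new input to admit
        fg = forget_gate.activate(frame, prev, state)  # how much old state to keep
        c = cell.activate(frame, prev)                 # candidate cell input
        state = c * ig + state * fg                    # update the internal cell state
        og = output_gate.activate(frame, prev, state)  # output gate sees the new state
        out[i] = activation_fn(state) * og
        prev = out[i]
    return out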

reset()

Reset the layer to its initial state.

Parameters:
init : numpy array, shape (num_hiddens,), optional

Reset the hidden units to this initial state.

cell_init : numpy array, shape (num_hiddens,), optional

Reset the cells to this initial state.

class madmom.ml.nn.layers.Layer

Generic callable network layer.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Activations for this data.

reset()

Reset the layer to its initial state.

class madmom.ml.nn.layers.MaxPoolLayer

2D max-pooling network layer.

Parameters:
size : tuple

The size of the pooling region in each dimension.

stride : tuple, optional

The strides between successive pooling regions in each dimension. If None, stride is set to size.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Max pooled data.

class madmom.ml.nn.layers.PadLayer

Padding layer that pads the input with a constant value.

Parameters:
width : int

Width of the padding (a single value used for all dimensions).

axes : iterable

Indices of the axes to be padded.

value : float

Value to be used for padding.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Padded data.

class madmom.ml.nn.layers.RecurrentLayer

Recurrent network layer.

Parameters:
weights : numpy array, shape (num_inputs, num_hiddens)

Weights.

bias : scalar or numpy array, shape (num_hiddens,)

Bias.

recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)

Recurrent weights.

activation_fn : numpy ufunc

Activation function.

init : numpy array, shape (num_hiddens,), optional

Initial state of hidden units.

activate()

Activate the layer.

Parameters:
data : numpy array, shape (num_frames, num_inputs)

Activate with this data.

reset : bool, optional

Reset the layer to its initial state before activating it.

Returns:
numpy array, shape (num_frames, num_hiddens)

Activations for this data.
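
A sketch of the simple recurrent computation, i.e. the feed-forward formula plus a recurrent term from the previous output:

import numpy as np

def recurrent_activate(data, weights, bias, recurrent_weights, activation_fn, init):
    out = np.zeros((len(data), len(init)))
    prev = init
    for i, frame in enumerate(data):
        # combine the current input with the previous output and activate
        out[i] = activation_fn(np.dot(frame, weights) + bias +
                               np.dot(prev, recurrent_weights))
        prev = out[i]
    return out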

reset()

Reset the layer to its initial state.

Parameters:
init : numpy array, shape (num_hiddens,), optional

Reset the hidden units to this initial state.

class madmom.ml.nn.layers.ReshapeLayer

Reshape Layer.

Parameters:
newshape : int or tuple of ints

The new shape should be compatible with the original shape. If an integer, then the result will be a 1-D array of that length. One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions.

order : {‘C’, ‘F’, ‘A’}, optional

Index order of the input. See np.reshape for a detailed description.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Reshaped data.

class madmom.ml.nn.layers.StrideLayer

Stride network layer.

Parameters:
block_size : int

Re-arrange (stride) the data in blocks of given size.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Strided data.

class madmom.ml.nn.layers.TransposeLayer

Transpose layer.

Parameters:
axes : list of ints, optional

By default, reverse the dimensions of the input, otherwise permute the axes of the input according to the values given.

activate()

Activate the layer.

Parameters:
data : numpy array

Activate with this data.

Returns:
numpy array

Transposed data.

madmom.ml.nn.layers.convolve

Convolve the data with the kernel in ‘valid’ mode, i.e. only where the kernel and data fully overlap.

Parameters:
data : numpy array

Data to be convolved.

kernel : numpy array

Convolution kernel.

Returns:
numpy array

Convolved data.
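
For orientation, ‘valid’ mode here corresponds to scipy’s valid-mode 2D convolution; a rough equivalent (using scipy.signal rather than madmom’s compiled routine, with made-up shapes):

>>> import numpy as np
>>> from scipy.signal import convolve2d
>>> data = np.random.rand(100, 80)     # e.g. a spectrogram-like 2D input
>>> kernel = np.random.rand(3, 3)
>>> result = convolve2d(data, kernel, mode='valid')  # only positions with full overlap
>>> result.shape
(98, 78)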

madmom.ml.nn.activations

This module contains neural network activation functions for the ml.nn module.

madmom.ml.nn.activations.linear(x, out=None)[source]

Linear function.

Parameters:
x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:
numpy array

Unaltered input data.

madmom.ml.nn.activations.tanh(x, out=None)[source]

Hyperbolic tangent function.

Parameters:
x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:
numpy array

Hyperbolic tangent of input data.

madmom.ml.nn.activations.sigmoid(x, out=None)[source]

Logistic sigmoid function.

Parameters:
x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:
numpy array

Logistic sigmoid of input data.

madmom.ml.nn.activations.relu(x, out=None)[source]

Rectified linear (unit) transfer function.

Parameters:
x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:
numpy array

Rectified linear of input data.

madmom.ml.nn.activations.elu(x, out=None)[source]

Exponential linear (unit) transfer function.

Parameters:
x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:
numpy array

Exponential linear of the input data.

References

[1] Djork-Arné Clevert, Thomas Unterthiner, and Sepp Hochreiter, “Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)”, http://arxiv.org/abs/1511.07289, 2015.

madmom.ml.nn.activations.softmax(x, out=None)[source]

Softmax transfer function.

Parameters:
x : numpy array

Input data.

out : numpy array, optional

Array to hold the output data.

Returns:
numpy array

Softmax of input data.
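
A sketch of the softmax definition, normalizing the exponentials per frame in a numerically stable way (the exact axis handling of madmom’s implementation is an assumption here):

import numpy as np

def softmax_sketch(x):
    # subtract the per-frame maximum for numerical stability
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    # normalize so that each frame sums to one
    return e / np.sum(e, axis=-1, keepdims=True)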