madmom.ml.nn¶
Neural Network package.
-
madmom.ml.nn.
average_predictions
(predictions)[source]¶ Returns the average of all predictions.
Parameters: - predictions : list
Predictions (i.e. the activations of the individual neural networks).
Returns: - numpy array
Averaged prediction.
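As a rough illustration of what averaging predictions means here, the following NumPy sketch takes the element-wise mean over a list of equally shaped arrays. The helper name `average_predictions_sketch` is hypothetical, and this is not madmom's implementation.

```python
import numpy as np


def average_predictions_sketch(predictions):
    """Element-wise mean of a list of equally shaped prediction arrays."""
    # stack along a new leading axis, then average over that axis
    return np.mean(np.stack(predictions), axis=0)


# two hypothetical network outputs for the same three frames
preds = [np.array([0.2, 0.4, 0.9]), np.array([0.4, 0.8, 0.7])]
averaged = average_predictions_sketch(preds)
```

The result has the same shape as each individual prediction.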
-
class
madmom.ml.nn.
NeuralNetwork
(layers)[source]¶ Neural Network class.
Parameters: - layers : list
Layers of the Neural Network.
Examples
Create a NeuralNetwork from the given layers.
>>> import numpy as np
>>> from madmom.ml.nn import NeuralNetwork
>>> from madmom.ml.nn.layers import FeedForwardLayer
>>> from madmom.ml.nn.activations import tanh, sigmoid
>>> l1_weights = np.array([[0.5, -1., -0.3, -0.2]])
>>> l1_bias = np.array([0.05, 0., 0.8, -0.5])
>>> l1 = FeedForwardLayer(l1_weights, l1_bias, activation_fn=tanh)
>>> l2_weights = np.array([-1, 0.9, -0.2, 0.4])
>>> l2_bias = np.array([0.5])
>>> l2 = FeedForwardLayer(l2_weights, l2_bias, activation_fn=sigmoid)
>>> nn = NeuralNetwork([l1, l2])
>>> nn
<madmom.ml.nn.NeuralNetwork object at 0x...>
>>> nn(np.array([[0], [0.5], [1], [0], [1], [2], [0]]))
array([0.53305, 0.36903, 0.265 , 0.53305, 0.265 , 0.18612, 0.53305])
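The numbers in this example can be reproduced with plain NumPy, assuming each FeedForwardLayer computes activation_fn(data · weights + bias). That reading follows from the documented parameter shapes; the code below is a sketch under that assumption, not the library code.

```python
import numpy as np


def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))


# same weights and biases as in the example above
l1_weights = np.array([[0.5, -1., -0.3, -0.2]])
l1_bias = np.array([0.05, 0., 0.8, -0.5])
l2_weights = np.array([-1, 0.9, -0.2, 0.4])
l2_bias = np.array([0.5])

data = np.array([[0], [0.5], [1], [0], [1], [2], [0]])
# first layer: (7, 1) x (1, 4) -> (7, 4), tanh activation
hidden = np.tanh(np.dot(data, l1_weights) + l1_bias)
# second layer: (7, 4) x (4,) -> (7,), sigmoid activation
output = sigmoid(np.dot(hidden, l2_weights) + l2_bias)
```

Each input frame is processed independently, which is why identical inputs yield identical outputs.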
-
process
(data, reset=True, **kwargs)[source]¶ Process the given data with the neural network.
Parameters: - data : numpy array, shape (num_frames, num_inputs)
Activate the network with this data.
- reset : bool, optional
Reset the network to its initial state before activating it.
Returns: - numpy array, shape (num_frames, num_outputs)
Network predictions for this data.
-
class
madmom.ml.nn.
NeuralNetworkEnsemble
(networks, ensemble_fn=<function average_predictions>, num_threads=None, **kwargs)[source]¶ Neural Network ensemble class.
Parameters: - networks : list
List of the Neural Networks.
- ensemble_fn : function or callable, optional
Ensemble function to be applied to the predictions of the neural network ensemble (default: average predictions).
- num_threads : int, optional
Number of parallel working threads.
Notes
If ensemble_fn is set to ‘None’, the predictions are returned as a list with the same length as the number of networks given.
Examples
Create a NeuralNetworkEnsemble from the networks. Instead of supplying the neural networks as a parameter, they can also be loaded from files:
>>> import numpy as np
>>> from madmom.ml.nn import NeuralNetworkEnsemble
>>> from madmom.models import ONSETS_BRNN_PP
>>> nn = NeuralNetworkEnsemble.load(ONSETS_BRNN_PP)
>>> nn
<madmom.ml.nn.NeuralNetworkEnsemble object at 0x...>
>>> nn(np.array([[0], [0.5], [1], [0], [1], [2], [0]]))
array([0.00116, 0.00213, 0.01428, 0.00729, 0.0088 , 0.21965, 0.00532])
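The role of ensemble_fn can be sketched without madmom. In the sketch below, `run_ensemble` and the two stand-in networks are hypothetical names chosen for illustration, and parallel processing via num_threads is not modelled.

```python
import numpy as np


def run_ensemble(networks, data, ensemble_fn=None):
    """Apply every network to the data and optionally combine the results."""
    predictions = [network(data) for network in networks]
    if ensemble_fn is None:
        # no ensemble function: return one prediction per network
        return predictions
    return ensemble_fn(predictions)


# two stand-in "networks" (plain callables), for illustration only
networks = [lambda x: x * 0.5, lambda x: x * 1.5]
data = np.array([0.0, 1.0, 2.0])

as_list = run_ensemble(networks, data)
averaged = run_ensemble(networks, data,
                        ensemble_fn=lambda p: np.mean(p, axis=0))
```

With no ensemble function the caller receives a list as long as the number of networks; with an averaging function the list collapses into a single array.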
-
classmethod
load
(nn_files, **kwargs)[source]¶ Instantiate a new Neural Network ensemble from a list of files.
Parameters: - nn_files : list
List of neural network model file names.
- kwargs : dict, optional
Keyword arguments passed to NeuralNetworkEnsemble.
Returns: - NeuralNetworkEnsemble
NeuralNetworkEnsemble instance.
madmom.ml.nn.layers¶
This module contains neural network layers for the ml.nn module.
-
class
madmom.ml.nn.layers.
AverageLayer
¶ Average layer.
Parameters: - axis : None or int or tuple of ints, optional
Axis or axes along which the means are computed. The default is to compute the mean of the flattened array.
- dtype : data-type, optional
Type to use in computing the mean. For integer inputs, the default is float64; for floating point inputs, it is the same as the input dtype.
- keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one.
-
activate
()¶ Activate the layer.
Parameters: - data : numpy array
Activate with this data.
Returns: - numpy array
Averaged data.
-
class
madmom.ml.nn.layers.
BatchNormLayer
¶ Batch normalization layer with activation function. The previous layer is usually linear with no bias - the BatchNormLayer’s beta parameter replaces it. See [1] for a detailed understanding of the parameters.
Parameters: - beta : numpy array
Values for the beta parameter. Must be broadcastable to the incoming shape.
- gamma : numpy array
Values for the gamma parameter. Must be broadcastable to the incoming shape.
- mean : numpy array
Mean values of incoming data. Must be broadcastable to the incoming shape.
- inv_std : numpy array
Inverse standard deviation of incoming data. Must be broadcastable to the incoming shape.
- activation_fn : numpy ufunc
Activation function.
References
[1] Sergey Ioffe and Christian Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”, http://arxiv.org/abs/1502.03167, 2015.
-
activate
()¶ Activate the layer.
Parameters: - data : numpy array
Activate with this data.
Returns: - numpy array
Normalized data.
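A minimal sketch of how the four parameters might combine, assuming the usual inference-time batch normalization formula from [1], i.e. activation_fn(gamma · (data - mean) · inv_std + beta). The function name `batch_norm_sketch` is hypothetical.

```python
import numpy as np


def batch_norm_sketch(data, beta, gamma, mean, inv_std, activation_fn):
    """Normalize, scale and shift the data, then apply the activation."""
    normalized = (data - mean) * inv_std
    return activation_fn(gamma * normalized + beta)


data = np.array([[1.0, 2.0], [3.0, 6.0]])
# statistics computed per feature (column); they broadcast over the frames
mean = data.mean(axis=0)
inv_std = 1.0 / data.std(axis=0)
out = batch_norm_sketch(data, beta=0.0, gamma=1.0, mean=mean,
                        inv_std=inv_std, activation_fn=lambda x: x)
```

With gamma = 1, beta = 0 and an identity activation, every feature ends up with zero mean and unit variance.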
-
class
madmom.ml.nn.layers.
BidirectionalLayer
¶ Bidirectional network layer.
Parameters: - fwd_layer : Layer instance
Forward layer.
- bwd_layer : Layer instance
Backward layer.
-
activate
()¶ Activate the layer.
After activating the fwd_layer with the data and the bwd_layer with the data in reverse temporal order, the two activations are stacked and returned.
Parameters: - data : numpy array, shape (num_frames, num_inputs)
Activate with this data.
Returns: - numpy array, shape (num_frames, num_hiddens)
Activations for this data.
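The activation scheme described above can be sketched as follows. The stand-in layers are plain callables, and re-reversing the backward activations so that both halves are aligned frame by frame is an assumption of this sketch.

```python
import numpy as np


def bidirectional_sketch(fwd_layer, bwd_layer, data):
    """Stack forward activations with time-aligned backward activations."""
    fwd = fwd_layer(data)
    # feed the frames to the backward layer in reverse temporal order
    bwd = bwd_layer(data[::-1])
    # assumption: flip the backward output back before stacking
    return np.hstack((fwd, bwd[::-1]))


def running_sum(x):
    """Stand-in layer: a running sum over time makes the direction visible."""
    return np.cumsum(x, axis=0)


data = np.array([[1.0], [2.0], [3.0]])
out = bidirectional_sketch(running_sum, running_sum, data)
```

The first output column accumulates from the first frame onwards, the second from the last frame backwards.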
-
class
madmom.ml.nn.layers.
Cell
¶ Cell as used by LSTM layers.
Parameters: - weights : numpy array, shape (num_inputs, num_hiddens)
Weights.
- bias : scalar or numpy array, shape (num_hiddens,)
Bias.
- recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)
Recurrent weights.
- activation_fn : numpy ufunc, optional
Activation function.
Notes
A Cell is the same as a Gate except that it lacks peephole connections and uses a tanh activation function. It should not be used directly, only inside an LSTMLayer.
-
class
madmom.ml.nn.layers.
ConvolutionalLayer
¶ Convolutional network layer.
Parameters: - weights : numpy array, shape (num_feature_maps, num_channels, <kernel>)
Weights.
- bias : scalar or numpy array, shape (num_feature_maps,)
Bias.
- stride : int, optional
Stride of the convolution.
- pad : {‘valid’, ‘same’, ‘full’}
A string indicating the size of the output:
- full
- The output is the full discrete linear convolution of the inputs.
- valid
- The output consists only of those elements that do not rely on the zero-padding.
- same
- The output is the same size as the input, centered with respect to the ‘full’ output.
- activation_fn : numpy ufunc
Activation function.
-
activate
()¶ Activate the layer.
Parameters: - data : numpy array, shape (num_frames, num_bins, num_channels)
Activate with this data.
Returns: - numpy array
Activations for this data.
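The three pad modes are easiest to compare in one dimension. The sketch below uses np.convolve, whose mode names happen to coincide with the ones listed above; it is only an analogy, since the layer itself operates on multi-dimensional data.

```python
import numpy as np

signal = np.arange(5, dtype=float)   # 5 input samples: 0, 1, 2, 3, 4
kernel = np.ones(3)                  # kernel of length 3

# every position where signal and kernel overlap at all (zero-padded)
full = np.convolve(signal, kernel, mode='full')
# only positions where the kernel lies completely inside the signal
valid = np.convolve(signal, kernel, mode='valid')
# central part of the 'full' output, same length as the input
same = np.convolve(signal, kernel, mode='same')
```

For an input of length 5 and a kernel of length 3, the output lengths are 7 (full), 3 (valid) and 5 (same).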
-
class
madmom.ml.nn.layers.
FeedForwardLayer
¶ Feed-forward network layer.
Parameters: - weights : numpy array, shape (num_inputs, num_hiddens)
Weights.
- bias : scalar or numpy array, shape (num_hiddens,)
Bias.
- activation_fn : numpy ufunc
Activation function.
-
activate
()¶ Activate the layer.
Parameters: - data : numpy array, shape (num_frames, num_inputs)
Activate with this data.
Returns: - numpy array, shape (num_frames, num_hiddens)
Activations for this data.
-
class
madmom.ml.nn.layers.
GRUCell
¶ Cell as used by GRU layers proposed in [1]. The cell output is computed by
\[h = \tanh(W_{xh} \cdot x_t + W_{hh} \cdot h_{t-1} + b).\]
Parameters: - weights : numpy array, shape (num_inputs, num_hiddens)
Weights of the connections between inputs and cell.
- bias : scalar or numpy array, shape (num_hiddens,)
Bias.
- recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)
Weights of the connections between cell and cell output of the previous time step.
- activation_fn : numpy ufunc, optional
Activation function.
Notes
There are two formulations of the GRUCell in the literature. Here, we adopted the (slightly older) one proposed in [1], which is also implemented in the Lasagne toolbox.
It should not be used directly, only inside a GRULayer.
References
[1] Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio, “On the properties of neural machine translation: Encoder-decoder approaches”, http://arxiv.org/abs/1409.1259, 2014.
-
activate
()¶ Activate the cell with the given input, previous output and reset gate.
Parameters: - data : numpy array, shape (num_inputs,)
Input data for the cell.
- prev : numpy array, shape (num_hiddens,)
Output of the previous time step.
- reset_gate : numpy array, shape (num_hiddens,)
Activation of the reset gate.
Returns: - numpy array, shape (num_hiddens,)
Activations of the cell for this data.
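The formula given for the cell does not show where the reset gate enters. The sketch below assumes that the reset gate scales the recurrent contribution element-wise; this is an assumption made for illustration, and `gru_cell_sketch` is a hypothetical name.

```python
import numpy as np


def gru_cell_sketch(data, prev, reset_gate, weights, recurrent_weights,
                    bias, activation_fn=np.tanh):
    """Cell output for one time step."""
    out = np.dot(data, weights)
    # assumption: the reset gate scales the recurrent contribution
    out += reset_gate * np.dot(prev, recurrent_weights)
    out += bias
    return activation_fn(out)


num_inputs, num_hiddens = 3, 2
rng = np.random.RandomState(0)
weights = rng.randn(num_inputs, num_hiddens)
recurrent_weights = rng.randn(num_hiddens, num_hiddens)
data = rng.randn(num_inputs)
prev = rng.randn(num_hiddens)

# reset gate fully closed: the previous output is ignored
closed = gru_cell_sketch(data, prev, np.zeros(num_hiddens),
                         weights, recurrent_weights, 0.0)
# reset gate fully open: reduces to the formula given for the cell
opened = gru_cell_sketch(data, prev, np.ones(num_hiddens),
                         weights, recurrent_weights, 0.0)
```

With a reset gate of all ones the sketch coincides with the documented formula; with all zeros the cell output depends on the current input only.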
-
class
madmom.ml.nn.layers.
GRULayer
¶ Recurrent network layer with Gated Recurrent Units (GRU) as proposed in [1].
Notes
There are two formulations of the GRUCell in the literature. Here, we adopted the (slightly older) one proposed in [1], which is also implemented in the Lasagne toolbox.
References
[1] Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio, “On the properties of neural machine translation: Encoder-decoder approaches”, http://arxiv.org/abs/1409.1259, 2014.
-
activate
()¶ Activate the GRU layer.
Parameters: - data : numpy array, shape (num_frames, num_inputs)
Activate with this data.
- reset : bool, optional
Reset the layer to its initial state before activating it.
Returns: - numpy array, shape (num_frames, num_hiddens)
Activations for this data.
-
reset
()¶ Reset the layer to its initial state.
Parameters: - init : numpy array, shape (num_hiddens,), optional
Reset the hidden units to this initial state.
-
class
madmom.ml.nn.layers.
Gate
¶ Gate as used by LSTM layers.
Parameters: - weights : numpy array, shape (num_inputs, num_hiddens)
Weights.
- bias : scalar or numpy array, shape (num_hiddens,)
Bias.
- recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)
Recurrent weights.
- peephole_weights : numpy array, shape (num_hiddens,), optional
Peephole weights.
- activation_fn : numpy ufunc, optional
Activation function.
Notes
Gate should not be used directly, only inside an LSTMLayer.
-
activate
()¶ Activate the gate with the given data, state (if peephole connections are used) and the previous output (if recurrent connections are used).
Parameters: - data : scalar or numpy array, shape (num_hiddens,)
Input data for the cell.
- prev : scalar or numpy array, shape (num_hiddens,)
Output data of the previous time step.
- state : scalar or numpy array, shape (num_hiddens,)
State data of the {current | previous} time step.
Returns: - numpy array, shape (num_hiddens,)
Activations of the gate for this data.
-
class
madmom.ml.nn.layers.
LSTMLayer
¶ Recurrent network layer with Long Short-Term Memory units.
Parameters: - input_gate :
Gate
Input gate.
- forget_gate :
Gate
Forget gate.
- cell :
Cell
Cell (i.e. a Gate without peephole connections).
- output_gate :
Gate
Output gate.
- activation_fn : numpy ufunc, optional
Activation function.
- init : numpy array, shape (num_hiddens, ), optional
Initial state of the layer.
- cell_init : numpy array, shape (num_hiddens, ), optional
Initial state of the cell.
-
activate
()¶ Activate the LSTM layer.
Parameters: - data : numpy array, shape (num_frames, num_inputs)
Activate with this data.
- reset : bool, optional
Reset the layer to its initial state before activating it.
Returns: - numpy array, shape (num_frames, num_hiddens)
Activations for this data.
-
reset
()¶ Reset the layer to its initial state.
Parameters: - init : numpy array, shape (num_hiddens,), optional
Reset the hidden units to this initial state.
- cell_init : numpy array, shape (num_hiddens,), optional
Reset the cells to this initial state.
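How the gates, the cell and the two initial states interact can be sketched with a standard peephole LSTM step. The order in which the gates see the previous or the current cell state is an assumption of this sketch, as are all helper names and the dictionary layout of the parameters.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gate(data, prev, state, p):
    """Sigmoid gate with input, recurrent and peephole connections."""
    return sigmoid(np.dot(data, p['weights']) + np.dot(prev, p['recurrent'])
                   + state * p['peephole'] + p['bias'])


def lstm_step(data, prev, state, params):
    """One time step; returns the new output and the new cell state."""
    ig = gate(data, prev, state, params['input_gate'])    # previous state
    fg = gate(data, prev, state, params['forget_gate'])   # previous state
    c = params['cell']                                    # no peepholes, tanh
    cell = np.tanh(np.dot(data, c['weights']) + np.dot(prev, c['recurrent'])
                   + c['bias'])
    state = cell * ig + state * fg
    og = gate(data, prev, state, params['output_gate'])   # current state
    return np.tanh(state) * og, state


def random_params(rng, num_inputs, num_hiddens):
    """Hypothetical random parameters (the cell ignores 'peephole')."""
    return {'weights': rng.randn(num_inputs, num_hiddens),
            'recurrent': rng.randn(num_hiddens, num_hiddens),
            'peephole': rng.randn(num_hiddens),
            'bias': np.zeros(num_hiddens)}


rng = np.random.RandomState(0)
num_frames, num_inputs, num_hiddens = 5, 3, 2
names = ('input_gate', 'forget_gate', 'cell', 'output_gate')
params = {name: random_params(rng, num_inputs, num_hiddens) for name in names}

data = rng.randn(num_frames, num_inputs)
prev = np.zeros(num_hiddens)    # plays the role of init
state = np.zeros(num_hiddens)   # plays the role of cell_init
activations = np.empty((num_frames, num_hiddens))
for t in range(num_frames):
    prev, state = lstm_step(data[t], prev, state, params)
    activations[t] = prev
```

Resetting the layer corresponds to setting `prev` and `state` back to their initial values before the loop starts.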
-
class
madmom.ml.nn.layers.
Layer
¶ Generic callable network layer.
-
activate
()¶ Activate the layer.
Parameters: - data : numpy array
Activate with this data.
Returns: - numpy array
Activations for this data.
-
reset
()¶ Reset the layer to its initial state.
-
class
madmom.ml.nn.layers.
MaxPoolLayer
¶ 2D max-pooling network layer.
Parameters: - size : tuple
The size of the pooling region in each dimension.
- stride : tuple, optional
The strides between successive pooling regions in each dimension. If None, stride = size.
-
activate
()¶ Activate the layer.
Parameters: - data : numpy array
Activate with this data.
Returns: - numpy array
Max pooled data.
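The interplay of size and stride can be sketched on a plain 2-D array. The helper `max_pool_2d_sketch` is hypothetical; dropping incomplete regions at the border and ignoring any channel dimension are assumptions of this sketch.

```python
import numpy as np


def max_pool_2d_sketch(data, size, stride=None):
    """Maximum over each pooling region of a 2-D array."""
    if stride is None:
        # default: non-overlapping regions
        stride = size
    rows = (data.shape[0] - size[0]) // stride[0] + 1
    cols = (data.shape[1] - size[1]) // stride[1] + 1
    out = np.empty((rows, cols), dtype=data.dtype)
    for r in range(rows):
        for c in range(cols):
            r0, c0 = r * stride[0], c * stride[1]
            out[r, c] = data[r0:r0 + size[0], c0:c0 + size[1]].max()
    return out


data = np.arange(16).reshape(4, 4)
pooled = max_pool_2d_sketch(data, size=(2, 2))
overlapping = max_pool_2d_sketch(data, size=(2, 2), stride=(1, 1))
```

With stride equal to size the regions tile the input; a smaller stride makes them overlap and yields a larger output.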
-
class
madmom.ml.nn.layers.
PadLayer
¶ Padding layer that pads the input with a constant value.
Parameters: - width : int
Width of the padding (only one value for all dimensions).
- axes : iterable
Indices of axes to be padded.
- value : float
Value to be used for padding.
-
activate
()¶ Activate the layer.
Parameters: - data : numpy array
Activate with this data.
Returns: - numpy array
Padded data.
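A comparable effect can be obtained with np.pad. The sketch assumes that the given width is added on both sides of every listed axis; `pad_axes_sketch` is a hypothetical helper.

```python
import numpy as np


def pad_axes_sketch(data, width, axes, value=0.0):
    """Pad the selected axes on both sides with a constant value."""
    # one (before, after) pair per dimension; untouched axes get (0, 0)
    pad_width = [(0, 0)] * data.ndim
    for axis in axes:
        pad_width[axis] = (width, width)
    return np.pad(data, pad_width, mode='constant', constant_values=value)


data = np.ones((2, 2))
padded = pad_axes_sketch(data, width=1, axes=(0,), value=0.0)
```

Only the first axis grows here, from 2 to 4 entries; the second axis keeps its size.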
-
class
madmom.ml.nn.layers.
RecurrentLayer
¶ Recurrent network layer.
Parameters: - weights : numpy array, shape (num_inputs, num_hiddens)
Weights.
- bias : scalar or numpy array, shape (num_hiddens,)
Bias.
- recurrent_weights : numpy array, shape (num_hiddens, num_hiddens)
Recurrent weights.
- activation_fn : numpy ufunc
Activation function.
- init : numpy array, shape (num_hiddens,), optional
Initial state of hidden units.
-
activate
()¶ Activate the layer.
Parameters: - data : numpy array, shape (num_frames, num_inputs)
Activate with this data.
- reset : bool, optional
Reset the layer to its initial state before activating it.
Returns: - numpy array, shape (num_frames, num_hiddens)
Activations for this data.
-
reset
()¶ Reset the layer to its initial state.
Parameters: - init : numpy array, shape (num_hiddens,), optional
Reset the hidden units to this initial state.
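A minimal sketch of a recurrent layer, assuming the common recurrence h_t = activation_fn(x_t · weights + h_{t-1} · recurrent_weights + bias). It starts from the initial state on every call, which corresponds to reset=True; `recurrent_sketch` is a hypothetical name.

```python
import numpy as np


def recurrent_sketch(data, weights, bias, recurrent_weights,
                     activation_fn=np.tanh, init=None):
    """Process all frames in temporal order, starting from the init state."""
    num_frames, num_hiddens = len(data), weights.shape[1]
    prev = np.zeros(num_hiddens) if init is None else init
    out = np.empty((num_frames, num_hiddens))
    for t in range(num_frames):
        out[t] = activation_fn(np.dot(data[t], weights)
                               + np.dot(prev, recurrent_weights) + bias)
        # the current output becomes the previous output of the next frame
        prev = out[t]
    return out


# one input, one hidden unit, identity activation to keep the numbers simple
data = np.array([[1.0], [1.0], [1.0]])
out = recurrent_sketch(data, weights=np.array([[1.0]]), bias=0.0,
                       recurrent_weights=np.array([[0.5]]),
                       activation_fn=lambda x: x)
```

Each frame adds the new input to half of the previous output, so a constant input yields a growing, converging sequence.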
-
class
madmom.ml.nn.layers.
ReshapeLayer
¶ Reshape Layer.
Parameters: - newshape : int or tuple of ints
The new shape should be compatible with the original shape. If an integer, then the result will be a 1-D array of that length. One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions.
- order : {‘C’, ‘F’, ‘A’}, optional
Index order of the input. See np.reshape for a detailed description.
-
activate
()¶ Activate the layer.
Parameters: - data : numpy array
Activate with this data.
Returns: - numpy array
Reshaped data.
-
class
madmom.ml.nn.layers.
StrideLayer
¶ Stride network layer.
Parameters: - block_size : int
Re-arrange (stride) the data in blocks of given size.
-
activate
()¶ Activate the layer.
Parameters: - data : numpy array
Activate with this data.
Returns: - numpy array
Strided data.
-
class
madmom.ml.nn.layers.
TransposeLayer
¶ Transpose layer.
Parameters: - axes : list of ints, optional
By default, reverse the dimensions of the input, otherwise permute the axes of the input according to the values given.
-
activate
()¶ Activate the layer.
Parameters: - data : numpy array
Activate with this data.
Returns: - numpy array
Transposed data.
-
madmom.ml.nn.layers.
convolve
¶ Convolve the data with the kernel in ‘valid’ mode, i.e. only where kernel and data fully overlap.
Parameters: - data : numpy array
Data to be convolved.
- kernel : numpy array
Convolution kernel.
Returns: - numpy array
Convolved data.
madmom.ml.nn.activations¶
This module contains neural network activation functions for the ml.nn module.
-
madmom.ml.nn.activations.
linear
(x, out=None)[source]¶ Linear function.
Parameters: - x : numpy array
Input data.
- out : numpy array, optional
Array to hold the output data.
Returns: - numpy array
Unaltered input data.
-
madmom.ml.nn.activations.
tanh
(x, out=None)[source]¶ Hyperbolic tangent function.
Parameters: - x : numpy array
Input data.
- out : numpy array, optional
Array to hold the output data.
Returns: - numpy array
Hyperbolic tangent of input data.
-
madmom.ml.nn.activations.
sigmoid
(x, out=None)[source]¶ Logistic sigmoid function.
Parameters: - x : numpy array
Input data.
- out : numpy array, optional
Array to hold the output data.
Returns: - numpy array
Logistic sigmoid of input data.
-
madmom.ml.nn.activations.
relu
(x, out=None)[source]¶ Rectified linear (unit) transfer function.
Parameters: - x : numpy array
Input data.
- out : numpy array, optional
Array to hold the output data.
Returns: - numpy array
Rectified linear of input data.
-
madmom.ml.nn.activations.
elu
(x, out=None)[source]¶ Exponential linear (unit) transfer function.
Parameters: - x : numpy array
Input data.
- out : numpy array, optional
Array to hold the output data.
Returns: - numpy array
Exponential linear of input data.
References
[1] Djork-Arné Clevert, Thomas Unterthiner, Sepp Hochreiter (2015): Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), http://arxiv.org/abs/1511.07289
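For orientation, the activation functions of this module can be approximated in a few lines of NumPy. The functions below are sketches with hypothetical names; they omit the out parameter, and the ELU uses alpha = 1, which is an assumption.

```python
import numpy as np


def sigmoid_sketch(x):
    """Logistic sigmoid: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def relu_sketch(x):
    """Rectified linear: negative values are clipped to zero."""
    return np.maximum(x, 0)


def elu_sketch(x):
    """Exponential linear (alpha = 1 assumed): exp(x) - 1 for x <= 0."""
    return np.where(x > 0, x, np.expm1(x))


x = np.array([-1.0, 0.0, 2.0])
rectified = relu_sketch(x)
exponential = elu_sketch(x)
logistic = sigmoid_sketch(x)
```

ReLU and ELU agree for positive inputs and differ only in how they treat negative ones.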