madmom.ml.crf

This module contains an implementation of Conditional Random Fields (CRFs).

class madmom.ml.crf.ConditionalRandomField(initial, final, bias, transition, observation)

Implements a linear-chain Conditional Random Field using a matrix-based definition:

\[ \begin{aligned} P(Y \mid X) &= \frac{\exp[E(Y, X)]}{\sum_{Y'} \exp[E(Y', X)]} \\ E(Y, X) &= \sum_{n=1}^{N} \left[ y_{n-1}^T A y_n + y_n^T c + x_n^T W y_n \right] + y_0^T \pi + y_N^T \tau, \end{aligned} \]

where Y is a sequence of labels y_0, ..., y_N in one-hot encoding and X is the sequence of observed features x_1, ..., x_N.

Parameters:

initial : numpy array

Initial potential (π) of the CRF. Also defines the number of states.

final : numpy array

Potential (τ) of the last variable of the CRF.

bias : numpy array

Label bias potential (c).

transition : numpy array

Matrix defining the transition potentials (A), where the rows are the ‘from’ dimension, and columns the ‘to’ dimension.

observation : numpy array

Matrix defining the observation potentials (W), where the rows are the ‘observation’ dimension, and columns the ‘state’ dimension.
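Given the parameter layout above, the energy E(Y, X) of a single label sequence can be evaluated directly with NumPy. The following is a minimal sketch and not part of madmom: the function name `crf_energy` is hypothetical, labels are assumed to be integer state ids y_0, ..., y_N rather than one-hot vectors, and the observations are assumed to be the N feature vectors x_1, ..., x_N.

```python
import numpy as np


def crf_energy(labels, observations, initial, final, bias, transition,
               observation):
    """Hypothetical helper: evaluate E(Y, X) for integer labels y_0..y_N."""
    labels = np.asarray(labels, dtype=int)
    observations = np.asarray(observations, dtype=float)
    initial, final, bias = (np.asarray(p, dtype=float)
                            for p in (initial, final, bias))
    transition = np.asarray(transition, dtype=float)
    observation = np.asarray(observation, dtype=float)
    if len(labels) != len(observations) + 1:
        raise ValueError('expected one more label than observations')

    prev, curr = labels[:-1], labels[1:]
    # y_0^T pi + y_N^T tau
    energy = initial[labels[0]] + final[labels[-1]]
    # sum_n y_{n-1}^T A y_n  (rows: 'from', columns: 'to')
    energy += transition[prev, curr].sum()
    # sum_n y_n^T c
    energy += bias[curr].sum()
    # sum_n x_n^T W y_n  (rows: 'observation', columns: 'state')
    scores = observations @ observation
    energy += scores[np.arange(len(curr)), curr].sum()
    return float(energy)
```

With one-hot labels, each term of the sum reduces to a table lookup, which is why integer indexing is sufficient here.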

Examples

Create a CRF that emulates a simple hidden Markov model (HMM). The bias and final potentials are set to constants: a constant potential adds the same amount to the energy of every label sequence, so it has no effect on the predictions.

>>> eta = np.spacing(1)  # for numerical stability
>>> initial = np.log(np.array([0.7, 0.2, 0.1]) + eta)
>>> final = np.ones(3)
>>> bias = np.ones(3)
>>> transition = np.log(np.array([[0.6, 0.2, 0.2],
...                               [0.1, 0.7, 0.2],
...                               [0.1, 0.1, 0.8]]) + eta)
>>> observation = np.log(np.array([[0.9, 0.5, 0.1],
...                                [0.1, 0.5, 0.1]]) + eta)
>>> crf = ConditionalRandomField(initial, final, bias,
...                              transition, observation)
>>> crf  
<madmom.ml.crf.ConditionalRandomField object at 0x...>

We can now decode the most probable state sequence given an observation sequence. Since we are emulating a discrete HMM, the observation sequence must be given as observation ids in one-hot encoding.

The following observation sequence corresponds to “0, 0, 1, 0, 1, 1”:

>>> obs = np.array([[1, 0], [1, 0], [0, 1], [1, 0], [0, 1], [0, 1]])
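If the observations are available as integer ids, the same one-hot matrix can be built by indexing an identity matrix. This is a small sketch using only NumPy; the names `obs_ids` and `num_symbols` are illustrative and not part of madmom.

```python
import numpy as np

obs_ids = np.array([0, 0, 1, 0, 1, 1])  # illustrative integer observation ids
num_symbols = 2                         # number of distinct observation ids

# Row i of the identity matrix is the one-hot vector for id i, so indexing
# with the id sequence yields one one-hot row per observation.
obs_one_hot = np.eye(num_symbols, dtype=int)[obs_ids]
```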

Now we can find the most likely state sequence:

>>> crf.process(obs)
array([0, 0, 1, 1, 1, 1], dtype=uint32)

process(observations, **kwargs)

Determine the most probable configuration of Y given the observation sequence x:

\[ y^* = \operatorname*{arg\,max}_y P(Y = y \mid X = x) \]

Parameters:

observations : numpy array

Observations (x) to decode the most probable state sequence for.

Returns:

y_star : numpy array

Most probable state sequence.
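For a linear-chain model of this form, the arg max can be found with a Viterbi-style max-sum recursion. The sketch below follows the energy definition given for the class, with π attached to a virtual label y_0 that precedes the first observation. It is an illustration under that assumption, not madmom's actual implementation, and the function name `viterbi_decode` is hypothetical.

```python
import numpy as np


def viterbi_decode(observations, initial, final, bias, transition,
                   observation):
    """Hypothetical max-sum decoder returning labels y_1..y_N as ints."""
    observations = np.asarray(observations, dtype=float)
    transition = np.asarray(transition, dtype=float)
    observation = np.asarray(observation, dtype=float)
    bias = np.asarray(bias, dtype=float)
    num_obs, num_states = len(observations), len(initial)
    if num_obs == 0:
        return np.empty(0, dtype=np.uint32)

    backptr = np.empty((num_obs, num_states), dtype=np.intp)
    # Best score for each value of the virtual initial label y_0.
    score = np.asarray(initial, dtype=float).copy()
    for n in range(num_obs):
        # candidates[i, j]: best score ending in state i, then moving i -> j.
        candidates = score[:, np.newaxis] + transition
        backptr[n] = candidates.argmax(axis=0)
        score = candidates.max(axis=0) + bias + observations[n] @ observation
    score = score + np.asarray(final, dtype=float)

    # Backtrack from the best final state; backptr[0] points to y_0, which
    # is not part of the returned sequence.
    path = np.empty(num_obs, dtype=np.uint32)
    path[-1] = score.argmax()
    for n in range(num_obs - 1, 0, -1):
        path[n - 1] = backptr[n, path[n]]
    return path
```

The recursion needs time proportional to the number of observations times the square of the number of states, instead of enumerating every possible label sequence.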