madmom.ml.gmm

This module contains functionality needed for fitting and scoring Gaussian Mixture Models (GMMs) (needed e.g. in madmom.features.beats_hmm).

The needed functionality is taken from sklearn.mixture.GMM, which is released under the BSD license.

This version works with sklearn v0.16 and onwards. All commits until 0650d5502e01e6b4245ce99729fc8e7a71aacff3 are incorporated.

madmom.ml.gmm.logsumexp(arr, axis=0)[source]

Computes the sum of arr assuming arr is in the log domain.

Parameters:

arr : numpy array

Input data [log domain].

axis : int, optional

Axis to operate on.

Returns:

numpy array

log(sum(exp(arr))) while minimizing the possibility of over/underflow.

Notes

Function copied from sklearn.utils.extmath.
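The over/underflow protection relies on the standard log-sum-exp trick: subtract the maximum before exponentiating so the largest term becomes exp(0) = 1. A minimal numpy sketch of this idea (not the exact sklearn code):

```python
import numpy as np

def logsumexp(arr, axis=0):
    """Compute log(sum(exp(arr))) along `axis` without over/underflow."""
    # Shift by the maximum so the largest exponentiated value is exp(0) = 1
    vmax = arr.max(axis=axis, keepdims=True)
    out = np.log(np.sum(np.exp(arr - vmax), axis=axis))
    out += vmax.squeeze(axis=axis)
    return out

# log(0.25 + 0.25 + 0.5) = log(1) = 0, up to floating-point rounding
print(logsumexp(np.log(np.array([0.25, 0.25, 0.5]))))
```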

madmom.ml.gmm.pinvh(a, cond=None, rcond=None, lower=True)[source]

Compute the (Moore-Penrose) pseudo-inverse of a Hermitian matrix.

Calculate a generalized inverse of a symmetric matrix using its eigenvalue decomposition and including all ‘large’ eigenvalues.

Parameters:

a : array, shape (N, N)

Real symmetric or complex Hermitian matrix to be pseudo-inverted.

cond, rcond : float or None

Cutoff for ‘small’ eigenvalues. Eigenvalues smaller than rcond * largest_eigenvalue are considered zero. If None or -1, a suitable machine precision value is used.

lower : boolean

Whether the pertinent array data is taken from the lower or upper triangle of a.

Returns:

B : array, shape (N, N)

The pseudo-inverse of the matrix a.

Raises:

LinAlgError

If the eigenvalue computation does not converge.

Notes

Function copied from sklearn.utils.extmath.
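The eigendecomposition route can be sketched with numpy (an illustrative re-implementation under the docstring's description, not the copied sklearn code; the function and variable names are mine):

```python
import numpy as np

def pinvh(a, rcond=None):
    """Sketch of a Hermitian pseudo-inverse via eigendecomposition."""
    # eigh exploits symmetry; numpy reads the lower triangle by default
    s, u = np.linalg.eigh(a)
    if rcond is None:
        # Cutoff scaled to machine precision, as described above
        rcond = a.shape[0] * np.finfo(a.dtype).eps
    # Invert only the 'large' eigenvalues; treat the rest as zero
    large = np.abs(s) > rcond * np.abs(s).max()
    inv_s = np.zeros_like(s)
    inv_s[large] = 1.0 / s[large]
    return (u * inv_s) @ u.conj().T

a = np.array([[2.0, 1.0], [1.0, 2.0]])
print(np.allclose(pinvh(a) @ a, np.eye(2)))  # for full-rank a, B is the inverse
```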

madmom.ml.gmm.log_multivariate_normal_density(x, means, covars, covariance_type='diag')[source]

Compute the log probability under a multivariate Gaussian distribution.

Parameters:

x : array_like, shape (n_samples, n_features)

List of n_features-dimensional data points. Each row corresponds to a single data point.

means : array_like, shape (n_components, n_features)

List of n_features-dimensional mean vectors for n_components Gaussians. Each row corresponds to a single mean vector.

covars : array_like

List of n_components covariance parameters for each Gaussian. The shape depends on covariance_type:

  • (n_components, n_features) if ‘spherical’,
  • (n_features, n_features) if ‘tied’,
  • (n_components, n_features) if ‘diag’,
  • (n_components, n_features, n_features) if ‘full’.

covariance_type : {‘diag’, ‘spherical’, ‘tied’, ‘full’}

Type of the covariance parameters. Defaults to ‘diag’.

Returns:

lpr : array_like, shape (n_samples, n_components)

Array containing the log probabilities of each data point in x under each of the n_components multivariate Gaussian distributions.
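For the ‘diag’ case, the quadratic form (x - mu)^2 / cov can be expanded so all sample/component pairs are computed with a few matrix products. A hedged numpy sketch of that expansion (the function name is mine, not madmom's):

```python
import numpy as np

def log_mvn_density_diag(x, means, covars):
    """Per-sample, per-component log density for diagonal covariances."""
    n_samples, n_features = x.shape
    # Expand sum((x - mu)^2 / cov) into terms that vectorize over components
    lpr = -0.5 * (n_features * np.log(2 * np.pi)
                  + np.sum(np.log(covars), axis=1)
                  + np.sum(means ** 2 / covars, axis=1)
                  - 2 * x @ (means / covars).T
                  + x ** 2 @ (1.0 / covars).T)
    return lpr

# One standard-normal component evaluated at the origin:
x = np.zeros((1, 2))
lpr = log_mvn_density_diag(x, np.zeros((1, 2)), np.ones((1, 2)))
print(lpr.shape)  # (1, 1): one sample, one component
```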

class madmom.ml.gmm.GMM(n_components=1, covariance_type='full')[source]

Gaussian Mixture Model

Representation of a Gaussian mixture model probability distribution. This class allows for easy evaluation of, sampling from, and maximum-likelihood estimation of the parameters of a GMM distribution.

Initializes parameters such that every mixture component has zero mean and identity covariance.

Parameters:

n_components : int, optional

Number of mixture components. Defaults to 1.

covariance_type : {‘diag’, ‘spherical’, ‘tied’, ‘full’}

String describing the type of covariance parameters to use. Defaults to ‘full’.

Attributes

weights_ (array, shape (n_components,)) This attribute stores the mixing weights for each mixture component.
means_ (array, shape (n_components, n_features)) Mean parameters for each mixture component.
covars_ (array) Covariance parameters for each mixture component. The shape depends on covariance_type:

  • (n_components, n_features) if ‘spherical’,
  • (n_features, n_features) if ‘tied’,
  • (n_components, n_features) if ‘diag’,
  • (n_components, n_features, n_features) if ‘full’.
converged_ (bool) True when convergence was reached in fit(), False otherwise.
score_samples(x)[source]

Return the per-sample likelihood of the data under the model.

Compute the log probability of x under the model and return the posterior distribution (responsibilities) of each mixture component for each element of x.

Parameters:

x : array_like, shape (n_samples, n_features)

List of n_features-dimensional data points. Each row corresponds to a single data point.

Returns:

logprob : array_like, shape (n_samples,)

Log probabilities of each data point in x.

responsibilities : array_like, shape (n_samples, n_components)

Posterior probabilities of each mixture component for each observation.
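The relation between logprob and the responsibilities is a per-row normalization in the log domain. A small sketch with made-up numbers (the weighted log densities here are illustrative, not madmom output):

```python
import numpy as np

# Hypothetical weighted per-component log densities log(w_k * p(x|k))
# for two samples and two components
lpr = np.log(np.array([[0.2, 0.6],
                       [0.5, 0.5]]))
# logprob is the log-sum-exp over components ...
vmax = lpr.max(axis=1, keepdims=True)
logprob = np.log(np.exp(lpr - vmax).sum(axis=1)) + vmax.ravel()
# ... and the responsibilities renormalize each row into a posterior
responsibilities = np.exp(lpr - logprob[:, np.newaxis])
print(responsibilities[0])  # first sample: ~[0.25, 0.75]
```

Each row of responsibilities sums to one, since it is the posterior over components for that sample.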

score(x)[source]

Compute the log probability under the model.

Parameters:

x : array_like, shape (n_samples, n_features)

List of n_features-dimensional data points. Each row corresponds to a single data point.

Returns:

logprob : array_like, shape (n_samples,)

Log probabilities of each data point in x.

fit(x, random_state=None, tol=0.001, min_covar=0.001, n_iter=100, n_init=1, params='wmc', init_params='wmc')[source]

Estimate model parameters with the expectation-maximization algorithm.

An initialization step is performed before entering the EM algorithm. If you want to avoid this step, pass the empty string ‘’ as the init_params keyword argument. Likewise, if you would just like to do an initialization, set n_iter=0.

Parameters:

x : array_like, shape (n_samples, n_features)

List of n_features-dimensional data points. Each row corresponds to a single data point.

random_state : RandomState or an int seed (None by default)

A random number generator instance.

min_covar : float, optional

Floor on the diagonal of the covariance matrix to prevent overfitting.

tol : float, optional

Convergence threshold. EM iterations will stop when average gain in log-likelihood is below this threshold.

n_iter : int, optional

Number of EM iterations to perform.

n_init : int, optional

Number of initializations to perform; the best result is kept.

params : string, optional

Controls which parameters are updated in the training process. Can contain any combination of ‘w’ for weights, ‘m’ for means, and ‘c’ for covars.

init_params : string, optional

Controls which parameters are updated in the initialization process. Can contain any combination of ‘w’ for weights, ‘m’ for means, and ‘c’ for covars.
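The overall fit() loop can be sketched for the diagonal-covariance case as follows. This is a toy re-implementation of EM on 1-D data with a hard-coded simple initialization, not madmom's actual code or its ‘wmc’ init scheme:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 1-D data drawn from two well-separated clusters
x = np.concatenate([rng.normal(-5, 1, 200), rng.normal(5, 1, 200)])[:, None]

# Simplified initialization of two components (instead of init_params='wmc')
means = np.array([[-1.0], [1.0]])
covars = np.ones((2, 1))
weights = np.array([0.5, 0.5])

for _ in range(100):  # n_iter
    # E-step: responsibilities from the weighted diagonal Gaussian densities
    diff = x[:, None, :] - means[None, :, :]
    log_dens = -0.5 * (np.log(2 * np.pi * covars)[None]
                       + diff ** 2 / covars[None]).sum(-1)
    lpr = log_dens + np.log(weights)
    vmax = lpr.max(axis=1, keepdims=True)
    logprob = np.log(np.exp(lpr - vmax).sum(axis=1, keepdims=True)) + vmax
    resp = np.exp(lpr - logprob)
    # M-step: update weights ('w'), means ('m') and covariances ('c')
    nk = resp.sum(axis=0)
    weights = nk / nk.sum()
    means = resp.T @ x / nk[:, None]
    covars = resp.T @ (x ** 2) / nk[:, None] - means ** 2 + 1e-3  # min_covar floor

print(sorted(means.ravel()))  # converges near the true centers -5 and 5
```

Restricting params (e.g. params='wm') would simply skip the corresponding M-step updates, freezing those parameters at their initialized values.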