madmom.ml.gmm¶
This module contains functionality needed for fitting and scoring Gaussian Mixture Models (GMMs) (needed e.g. in madmom.features.beats_hmm).
The needed functionality is taken from sklearn.mixture.GMM which is released under the BSD license and was written by these authors:
- Ron Weiss <ronweiss@gmail.com>
- Fabian Pedregosa <fabian.pedregosa@inria.fr>
- Bertrand Thirion <bertrand.thirion@inria.fr>
This version works with sklearn v0.16 and onwards. All commits up to 0650d5502e01e6b4245ce99729fc8e7a71aacff3 are incorporated.
-
madmom.ml.gmm.
logsumexp
(arr, axis=0)[source]¶ Computes the sum of arr assuming arr is in the log domain.
Parameters: arr : numpy array
Input data [log domain].
axis : int, optional
Axis to operate on.
Returns: numpy array
log(sum(exp(arr))) while minimizing the possibility of over/underflow.
Notes
Function copied from sklearn.utils.extmath.
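As a sketch of what this helper computes, here is a minimal numpy re-implementation of the usual log-sum-exp trick (an illustration, not madmom's exact code): shifting by the maximum keeps exp() from under/overflowing, and the shift is added back after the log.

```python
import numpy as np

def logsumexp(arr, axis=0):
    # Shift by the maximum so that exp() never overflows; the shift is
    # added back after the log, leaving the result mathematically unchanged.
    vmax = arr.max(axis=axis)
    out = np.log(np.sum(np.exp(arr - vmax), axis=axis))
    out += vmax
    return out

# Summing two probabilities of exp(-1000) each: the naive
# log(sum(exp(arr))) underflows to -inf, the shifted version
# returns -1000 + log(2) as expected.
log_p = np.array([-1000.0, -1000.0])
result = logsumexp(log_p)
```

A naive `np.log(np.exp(log_p).sum())` returns `-inf` here because `exp(-1000)` underflows to zero in double precision.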
-
madmom.ml.gmm.
pinvh
(a, cond=None, rcond=None, lower=True)[source]¶ Compute the (Moore-Penrose) pseudo-inverse of a Hermitian matrix.
Calculate a generalized inverse of a symmetric matrix using its eigenvalue decomposition and including all ‘large’ eigenvalues.
Parameters: a : array, shape (N, N)
Real symmetric or complex Hermitian matrix to be pseudo-inverted.
cond, rcond : float or None
Cutoff for ‘small’ eigenvalues. Eigenvalues smaller than rcond * largest_eigenvalue are considered zero. If None or -1, a suitable machine precision is used.
lower : boolean
Whether the pertinent array data is taken from the lower or upper triangle of a.
Returns: B : array, shape (N, N)
The pseudo-inverse of the matrix a.
Raises: LinAlgError
If the eigenvalue decomposition does not converge.
Notes
Function copied from sklearn.utils.extmath.
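The idea can be sketched in a few lines of numpy (a simplified illustration, not the copied sklearn code): eigendecompose the symmetric matrix, zero out the reciprocal of every eigenvalue below the cutoff, and recompose.

```python
import numpy as np

def pinvh(a, rcond=None):
    # Eigendecomposition of the symmetric matrix a; eigenvalues below the
    # cutoff are treated as zero and excluded from the inverse.
    s, u = np.linalg.eigh(a)
    if rcond is None:
        rcond = a.shape[0] * np.finfo(a.dtype).eps
    above = np.abs(s) > rcond * np.max(np.abs(s))
    psigma = np.zeros_like(s)
    psigma[above] = 1.0 / s[above]
    # Recompose: u @ diag(psigma) @ u.T, written with broadcasting.
    return (u * psigma) @ u.T

# A rank-deficient symmetric matrix: pinvh inverts it on its range
# and maps the null space to zero.
a = np.array([[2.0, 0.0],
              [0.0, 0.0]])
b = pinvh(a)
```

This is exactly why a plain matrix inverse fails here (`a` is singular) while the pseudo-inverse still satisfies `a @ b @ a == a`.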
-
madmom.ml.gmm.
log_multivariate_normal_density
(x, means, covars, covariance_type='diag')[source]¶ Compute the log probability under a multivariate Gaussian distribution.
Parameters: x : array_like, shape (n_samples, n_features)
List of n_features-dimensional data points. Each row corresponds to a single data point.
means : array_like, shape (n_components, n_features)
List of n_features-dimensional mean vectors for n_components Gaussians. Each row corresponds to a single mean vector.
covars : array_like
List of n_components covariance parameters for each Gaussian. The shape depends on covariance_type:
- (n_components, n_features) if ‘spherical’,
- (n_features, n_features) if ‘tied’,
- (n_components, n_features) if ‘diag’,
- (n_components, n_features, n_features) if ‘full’.
covariance_type : {‘diag’, ‘spherical’, ‘tied’, ‘full’}
Type of the covariance parameters. Defaults to ‘diag’.
Returns: lpr : array_like, shape (n_samples, n_components)
Array containing the log probabilities of each data point in x under each of the n_components multivariate Gaussian distributions.
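For the default ‘diag’ case, the computation can be sketched as follows (a minimal numpy version under the diagonal-covariance assumption; the helper name `log_mvn_density_diag` is illustrative, not the module's API):

```python
import numpy as np

def log_mvn_density_diag(x, means, covars):
    # Log-density of each sample under each diagonal-covariance Gaussian,
    # expanded so everything is computed with matrix products:
    # -0.5 * (d*log(2*pi) + sum(log(var)) + (x - mu)^2 / var)
    n_samples, n_features = x.shape
    lpr = -0.5 * (n_features * np.log(2 * np.pi)
                  + np.sum(np.log(covars), axis=1)
                  + np.sum(means ** 2 / covars, axis=1)
                  - 2 * x @ (means / covars).T
                  + x ** 2 @ (1.0 / covars).T)
    return lpr

x = np.zeros((1, 2))          # one 2-dimensional sample at the origin
means = np.zeros((1, 2))      # a single standard Gaussian component
covars = np.ones((1, 2))
lpr = log_mvn_density_diag(x, means, covars)
# at the mean of a 2-D standard Gaussian this equals -log(2*pi)
```

The result has shape (n_samples, n_components), matching the `lpr` return value described above.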
-
class
madmom.ml.gmm.
GMM
(n_components=1, covariance_type='full')[source]¶ Gaussian Mixture Model
Representation of a Gaussian mixture model probability distribution. This class allows for easy evaluation of, sampling from, and maximum-likelihood estimation of the parameters of a GMM distribution.
Initializes parameters such that every mixture component has zero mean and identity covariance.
Parameters: n_components : int, optional
Number of mixture components. Defaults to 1.
covariance_type : {‘diag’, ‘spherical’, ‘tied’, ‘full’}
String describing the type of covariance parameters to use. Defaults to ‘full’.
Attributes
weights_ : array, shape (n_components,)
Mixing weights for each mixture component.
means_ : array, shape (n_components, n_features)
Mean parameters for each mixture component.
covars_ : array
Covariance parameters for each mixture component. The shape depends on covariance_type:
- (n_components, n_features) if ‘spherical’,
- (n_features, n_features) if ‘tied’,
- (n_components, n_features) if ‘diag’,
- (n_components, n_features, n_features) if ‘full’.
converged_ : bool
True when convergence was reached in fit(), False otherwise.
-
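The shapes of these attributes can be sketched for the ‘full’ case (a hypothetical illustration of the zero-mean, identity-covariance initialization described above; uniform weights are an assumption, not stated in the docstring):

```python
import numpy as np

# Hypothetical shapes for covariance_type='full':
# uniform weights, zero means, one identity covariance per component.
n_components, n_features = 3, 2
weights_ = np.full(n_components, 1.0 / n_components)
means_ = np.zeros((n_components, n_features))
covars_ = np.tile(np.eye(n_features), (n_components, 1, 1))
```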
score_samples
(x)[source]¶ Return the per-sample likelihood of the data under the model.
Compute the log probability of x under the model and return the posterior distribution (responsibilities) of each mixture component for each element of x.
Parameters: x : array_like, shape (n_samples, n_features)
List of n_features-dimensional data points. Each row corresponds to a single data point.
Returns: logprob : array_like, shape (n_samples,)
Log probabilities of each data point in x.
responsibilities : array_like, shape (n_samples, n_components)
Posterior probabilities of each mixture component for each observation.
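Conceptually, this combines the per-component log-densities with the mixture weights and normalizes over components via log-sum-exp. A minimal sketch of that combination step (the helper `combine_scores` is hypothetical, not the actual method):

```python
import numpy as np

def combine_scores(log_densities, weights):
    # lpr[i, k] = log p(x_i | component k) + log w_k
    lpr = log_densities + np.log(weights)
    # Per-sample total log-likelihood via a stable log-sum-exp over components.
    vmax = lpr.max(axis=1, keepdims=True)
    logprob = vmax[:, 0] + np.log(np.exp(lpr - vmax).sum(axis=1))
    # Responsibilities: posterior probability of each component per sample.
    responsibilities = np.exp(lpr - logprob[:, np.newaxis])
    return logprob, responsibilities

# One sample with equal density under both components: the posterior
# simply reproduces the mixture weights.
logprob, resp = combine_scores(np.zeros((1, 2)), np.array([0.25, 0.75]))
```

Each row of `responsibilities` sums to one, since it is a posterior distribution over the mixture components.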
-
score
(x)[source]¶ Compute the log probability under the model.
Parameters: x : array_like, shape (n_samples, n_features)
List of n_features-dimensional data points. Each row corresponds to a single data point.
Returns: logprob : array_like, shape (n_samples,)
Log probabilities of each data point in x.
-
fit
(x, random_state=None, tol=0.001, min_covar=0.001, n_iter=100, n_init=1, params='wmc', init_params='wmc')[source]¶ Estimate model parameters with the expectation-maximization algorithm.
An initialization step is performed before entering the EM algorithm. If you want to avoid this step, set the keyword argument init_params to the empty string ‘’ when creating the GMM object. Likewise, if you would just like to do an initialization, set n_iter=0.
Parameters: x : array_like, shape (n, n_features)
List of n_features-dimensional data points. Each row corresponds to a single data point.
random_state : RandomState or an int seed (0 by default)
A random number generator instance.
min_covar : float, optional
Floor on the diagonal of the covariance matrix to prevent overfitting.
tol : float, optional
Convergence threshold. EM iterations will stop when average gain in log-likelihood is below this threshold.
n_iter : int, optional
Number of EM iterations to perform.
n_init : int, optional
Number of initializations to perform; the best result is kept.
params : string, optional
Controls which parameters are updated in the training process. Can contain any combination of ‘w’ for weights, ‘m’ for means, and ‘c’ for covars.
init_params : string, optional
Controls which parameters are updated in the initialization process. Can contain any combination of ‘w’ for weights, ‘m’ for means, and ‘c’ for covars.
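A single EM iteration for the diagonal-covariance case can be sketched as follows (a self-contained numpy illustration of the algorithm, not madmom's implementation; the `min_covar` floor plays the role described above):

```python
import numpy as np

def em_step(x, weights, means, covars, min_covar=1e-3):
    # E-step: per-sample, per-component log-densities (diagonal covariances)
    # plus the log mixture weights.
    n, d = x.shape
    lpr = (-0.5 * (d * np.log(2 * np.pi)
                   + np.sum(np.log(covars), axis=1)
                   + np.sum(means ** 2 / covars, axis=1)
                   - 2 * x @ (means / covars).T
                   + x ** 2 @ (1.0 / covars).T)
           + np.log(weights))
    # Normalize over components with a stable log-sum-exp.
    vmax = lpr.max(axis=1, keepdims=True)
    logprob = vmax[:, 0] + np.log(np.exp(lpr - vmax).sum(axis=1))
    resp = np.exp(lpr - logprob[:, None])
    # M-step: re-estimate weights, means and covariances from the
    # responsibilities; min_covar floors the variances to prevent collapse.
    nk = resp.sum(axis=0) + 10 * np.finfo(float).eps
    weights = nk / n
    means = resp.T @ x / nk[:, None]
    avg_x2 = resp.T @ (x ** 2) / nk[:, None]
    covars = avg_x2 - means ** 2 + min_covar
    return weights, means, covars, logprob.mean()

# Two well-separated 1-D clusters: a few iterations recover the
# cluster means and equal mixing weights.
x = np.vstack([np.zeros((5, 1)), np.full((5, 1), 5.0)])
weights = np.array([0.5, 0.5])
means = np.array([[0.0], [4.0]])
covars = np.ones((2, 1))
for _ in range(10):
    weights, means, covars, ll = em_step(x, weights, means, covars)
```

EM's defining property, visible here, is that the average log-likelihood returned by each step is non-decreasing until convergence, which is exactly what the `tol` threshold above monitors.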
-