depynd.information

Module contents

depynd.information.mutual_information(X, Y, mi_estimator='auto', is_discrete='auto', force_non_negative=False, **kwargs)

Estimate mutual information between X and Y.

Parameters:
  • X (array-like, shape (n_samples, n_features_x) or (n_samples,)) – Observations of the first variable.
  • Y (array-like, shape (n_samples, n_features_y) or (n_samples,)) – Observations of the second variable.
  • mi_estimator ({'knn', 'dr', 'plugin', 'auto'}, default 'auto') – MI estimator to use. If 'auto', the estimator is chosen according to whether all features are purely discrete: 'plugin' if they are, 'knn' otherwise.
  • is_discrete ({'auto', bool}, default 'auto') – If a bool, it states whether all features are to be treated as purely discrete. If 'auto', any column that contains duplicate entries is considered discrete.
  • force_non_negative (bool, default False) – If True, a negative estimate is clipped to zero, i.e. max(estimate, 0) is returned.
  • kwargs (dict) – Optional parameters for MI estimation.
Returns:

mi – Estimated mutual information between X and Y.

Return type:

float
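
The 'auto' rules and the force_non_negative option described above can be sketched in plain Python. This is an illustrative sketch only, not depynd's implementation: the helper names looks_discrete, choose_estimator and plugin_mi are hypothetical, the plug-in formula is the textbook one built from empirical frequencies, and the natural logarithm (nats) is an assumption, since this page does not state the unit of the returned value.

```python
import math
from collections import Counter


def looks_discrete(column):
    """Heuristic from the is_discrete='auto' rule: duplicates imply discrete."""
    return len(set(column)) < len(column)


def choose_estimator(columns):
    """Mirror the mi_estimator='auto' rule: 'plugin' if all columns look discrete."""
    return "plugin" if all(looks_discrete(c) for c in columns) else "knn"


def plugin_mi(x, y, force_non_negative=False):
    """Textbook plug-in MI (in nats) for two 1-D sequences of discrete values."""
    n = len(x)
    joint = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    mi = 0.0
    for (a, b), c in joint.items():
        # Term p(a, b) * log(p(a, b) / (p(a) * p(b))) with empirical frequencies.
        mi += (c / n) * math.log(c * n / (px[a] * py[b]))
    # Same post-processing as force_non_negative: clip negative estimates to zero.
    return max(mi, 0.0) if force_non_negative else mi


x = [0, 0, 1, 1]
print(choose_estimator([x, [0, 1, 0, 1]]))  # both columns contain duplicates
print(plugin_mi(x, x))                      # identical variables: log(2) nats
print(plugin_mi(x, [0, 1, 0, 1]))           # empirically independent: 0.0
```

Clipping matters mostly for estimators that can return small negative values on finite samples; the plug-in sum above is non-negative up to rounding.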

depynd.information.conditional_mutual_information(X, Y, Z, mi_estimator='auto', is_discrete='auto', force_non_negative=False, **kwargs)

Estimate conditional mutual information between X and Y given Z.

Parameters:
  • X (array-like, shape (n_samples, n_features_x) or (n_samples,)) – Observations of the first conditioned variable.
  • Y (array-like, shape (n_samples, n_features_y) or (n_samples,)) – Observations of the second conditioned variable.
  • Z (array-like, shape (n_samples, n_features_z) or (n_samples,)) – Observations of the conditioning variable.
  • mi_estimator ({'knn', 'dr', 'plugin', 'auto'}, default 'auto') – MI estimator to use. If 'auto', the estimator is chosen according to whether all features are purely discrete: 'plugin' if they are, 'knn' otherwise.
  • is_discrete ({'auto', bool}, default 'auto') – If a bool, it states whether all features are to be treated as purely discrete. If 'auto', any column that contains duplicate entries is considered discrete.
  • force_non_negative (bool, default False) – If True, a negative estimate is clipped to zero, i.e. max(estimate, 0) is returned.
  • kwargs (dict) – Optional parameters for MI estimation.
Returns:

cmi – Estimated conditional mutual information between X and Y, given Z.

Return type:

float
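
For purely discrete data, conditional mutual information can be written in terms of joint entropies, I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z). The sketch below applies that identity with empirical frequencies. It is a hedged illustration, not depynd's code: plugin_entropy and plugin_cmi are hypothetical names, inputs are assumed to be 1-D sequences of hashable values, and the natural logarithm is an assumption.

```python
import math
from collections import Counter


def plugin_entropy(samples):
    """Plug-in entropy (in nats) of a sequence of hashable observations."""
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())


def plugin_cmi(x, y, z, force_non_negative=False):
    """Plug-in estimate of I(X; Y | Z) from four joint entropies."""
    h_xz = plugin_entropy(list(zip(x, z)))
    h_yz = plugin_entropy(list(zip(y, z)))
    h_xyz = plugin_entropy(list(zip(x, y, z)))
    h_z = plugin_entropy(list(z))
    cmi = h_xz + h_yz - h_xyz - h_z
    # Rounding can push the difference slightly below zero; clip on request.
    return max(cmi, 0.0) if force_non_negative else cmi


# XOR example: X and Y are independent, but become dependent once Z = X xor Y is known.
x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
z = [a ^ b for a, b in zip(x, y)]
print(plugin_cmi(x, y, z))  # log(2) nats
print(plugin_cmi(x, y, x))  # conditioning on X itself: approximately 0
```

The XOR case shows why the conditional quantity is reported separately: the unconditional mutual information of these X and Y is zero, while conditioning on Z reveals a full bit of dependence.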