Identifying interesting relationships between pairs of variables in large data sets is increasingly important. Here, we present a measure of dependence for two-variable relationships: the maximal information coefficient (MIC). MIC captures a wide range of associations both functional and not, and for functional relationships provides a score that roughly equals the coefficient of determination (R²) of the data relative to the regression function. MIC belongs to a larger class of maximal information-based nonparametric exploration (MINE) statistics for identifying and classifying relationships. We apply MIC and MINE to data sets in global health, gene expression, major-league baseball, and the human gut microbiota and identify known and novel relationships.
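
To make the idea of a grid-based dependence score concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes the commonly stated form of MIC (the largest mutual information over grids with a bounded number of cells, normalized by the logarithm of the smaller grid dimension) and deliberately simplifies it by using equal-frequency bins instead of searching for the best grid placement, so it can only underestimate the true MIC. The function names and the `exponent=0.6` grid budget are illustrative choices, not details taken from the abstract.

```python
import math
import random
from collections import Counter


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal counts.

    Tied values may be split across neighbouring bins; this is acceptable
    for a sketch on continuous data.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * k // n
    return bins


def mutual_information(a, b):
    """Mutual information (in bits) between two discrete label sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a = Counter(a)
    count_b = Counter(b)
    mi = 0.0
    for (i, j), c in joint.items():
        mi += (c / n) * math.log2(c * n / (count_a[i] * count_b[j]))
    return mi


def mic_like_score(x, y, exponent=0.6):
    """Simplified MIC-style score.

    Assumption: grids are limited to nx * ny <= n ** exponent cells and use
    equal-frequency bins on each axis, rather than optimized partitions.
    """
    n = len(x)
    budget = n ** exponent
    best = 0.0
    nx = 2
    while nx * 2 <= budget:
        x_bins = equal_frequency_bins(x, nx)
        ny = 2
        while nx * ny <= budget:
            mi = mutual_information(x_bins, equal_frequency_bins(y, ny))
            # Normalize so that a perfect relationship on this grid scores 1.
            best = max(best, mi / math.log2(min(nx, ny)))
            ny += 1
        nx += 1
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [-1 + 2 * i / 499 for i in range(500)]
    print("linear   :", mic_like_score(xs, xs))
    print("parabola :", mic_like_score(xs, [v * v for v in xs]))
    print("noise    :", mic_like_score([rng.random() for _ in xs],
                                       [rng.random() for _ in xs]))
```

On noiseless functional data such as the line and parabola above, this score should come out close to 1, while independent noise should score close to 0. That mirrors the abstract's claim that MIC responds to functional relationships regardless of their shape, although a simplified score like this one says nothing about the statistical properties of the real statistic.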

Original publication

DOI

10.1126/science.1205438

Type

Journal article

Journal

Science

Publication Date

16/12/2011

Volume

334

Pages

1518 - 1524

Keywords

Algorithms; Animals; Baseball; Data Interpretation, Statistical; Female; Gene Expression; Genes, Fungal; Genomics; Humans; Intestines; Male; Metagenome; Mice; Obesity; Saccharomyces cerevisiae