Dissimilarity

The dissimilarity module provides distance measures and dissimilarity-based tests.

Distance Calculations

nuee.dissimilarity.vegdist(x: ndarray | DataFrame, method: str = 'bray', binary: bool = False, diag: bool = False, upper: bool = False, na_rm: bool = False) → ndarray

Calculate ecological distance matrices.

This function calculates various dissimilarity indices commonly used in community ecology. It is designed to be compatible with the R vegan package’s vegdist function.

Parameters:
  • x (np.ndarray or pd.DataFrame) – Community data matrix with samples in rows and species in columns.

  • method (str, default='bray') – Distance measure. One of 'bray', 'jaccard', 'euclidean', 'manhattan', 'canberra', 'gower', 'altGower', 'morisita', 'horn', 'mountford', 'raup', 'binomial', 'chao', 'cao', 'kulczynski', 'mahalanobis'.

  • binary (bool, default=False) – Convert data to presence/absence before calculating distances.

  • diag (bool, default=False) – Include diagonal elements in output.

  • upper (bool, default=False) – Include upper triangle in output.

  • na_rm (bool, default=False) – Remove samples with missing values.

Returns:

Square symmetric distance matrix of shape (n_samples, n_samples).

Return type:

np.ndarray

Examples

>>> import nuee
>>> species = nuee.datasets.varespec()
>>> dist = nuee.vegdist(species, method="bray")
>>> print(f"Shape: {dist.shape}")
Shape: (24, 24)
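
For intuition about what the default method='bray' measures: the Bray-Curtis dissimilarity between two samples is the sum of absolute abundance differences divided by the sum of all abundances in both samples. The following is a minimal NumPy sketch of that formula, not nuee's implementation. The helper name bray_curtis and the toy counts are illustrative assumptions.

```python
import numpy as np

def bray_curtis(x):
    """Pairwise Bray-Curtis dissimilarity between the rows of x (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # Sum over species of |x_i - x_j| for every pair of samples (i, j)
    num = np.abs(x[:, None, :] - x[None, :, :]).sum(axis=2)
    # Sum over species of (x_i + x_j); assumes no pair of samples is entirely empty
    den = (x[:, None, :] + x[None, :, :]).sum(axis=2)
    return num / den

# Toy community matrix: 3 samples (rows) by 3 species (columns)
counts = np.array([[10, 0, 5],
                   [0, 10, 5],
                   [10, 0, 5]])
d = bray_curtis(counts)
# Samples 0 and 2 are identical, so d[0, 2] is 0; d[0, 1] is 20 / 30
```

The result is square and symmetric with a zero diagonal, matching the shape described under Returns.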

See also

adonis2

PERMANOVA using distance matrices

betadisper

Test for homogeneity of dispersions

mantel

Mantel test for matrix correlation

Permutation Tests

PERMANOVA

nuee.dissimilarity.adonis2(distance_matrix: ndarray | DataFrame | Series, factors: ndarray | DataFrame | Series, permutations: int = 999, random_state: int | Generator | None = None, distance_method: str = 'bray', **kwargs) → DataFrame

Convenience wrapper for distance-based PERMANOVA.

Parameters:
  • distance_matrix – Square or condensed distance matrix.

  • factors – Predictor variables supplied as a DataFrame/Series/array.

  • permutations – Number of permutations for the significance test.

  • random_state – Seed or Generator controlling permutation reproducibility.

Returns:

Same payload as permanova().

Return type:

dict according to this description; note that the signature annotates the return type as DataFrame.
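
To illustrate the kind of statistic a distance-based PERMANOVA tests, the sketch below computes a one-way pseudo-F directly from a square distance matrix and estimates a p-value by shuffling group labels. It is a simplified illustration under stated assumptions (a single categorical factor, a (hits + 1) / (permutations + 1) p-value), not nuee's implementation. The names pseudo_f and permanova_sketch and the result keys are hypothetical.

```python
import numpy as np

def pseudo_f(dist, groups):
    """One-way pseudo-F computed from a square distance matrix."""
    d2 = np.asarray(dist, dtype=float) ** 2
    groups = np.asarray(groups)
    n = len(groups)
    labels = np.unique(groups)
    # Total sum of squares: squared distances over all pairs, divided by n
    ss_total = d2[np.triu_indices(n, k=1)].sum() / n
    # Within-group sum of squares: the same quantity inside each group
    ss_within = 0.0
    for g in labels:
        idx = np.flatnonzero(groups == g)
        sub = d2[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))

def permanova_sketch(dist, groups, permutations=999, seed=None):
    """Permutation p-value for the pseudo-F (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    f_obs = pseudo_f(dist, groups)
    hits = 0
    for _ in range(permutations):
        # Shuffle the group labels; the distance matrix stays fixed
        if pseudo_f(dist, rng.permutation(groups)) >= f_obs:
            hits += 1
    return {"f_statistic": f_obs, "p_value": (hits + 1) / (permutations + 1)}

# Toy data: two well-separated groups on a line, Euclidean distances
points = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
dist = np.abs(points[:, None] - points[None, :])
groups = np.array(["a", "a", "a", "b", "b", "b"])
result = permanova_sketch(dist, groups, permutations=199, seed=0)
```

With Euclidean distances on univariate data, the pseudo-F coincides with the classical ANOVA F, which makes the toy example easy to check by hand.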

ANOSIM

nuee.dissimilarity.anosim(distance_matrix: ndarray | DataFrame | Series, grouping: ndarray | DataFrame | Series, permutations: int = 999, random_state: int | Generator | None = None, **kwargs) → Dict[str, float | int]

Analysis of similarities (ANOSIM).

Parameters:
  • distance_matrix – Square or condensed distance matrix.

  • grouping – Group assignments for each observation.

  • permutations – Number of permutations used to estimate the p-value. Set to 0 to skip.

  • random_state – Seed or Generator controlling permutation reproducibility.

Returns:

Dictionary with r_statistic, p_value and permutations.

Return type:

dict
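
As a rough illustration of the ANOSIM statistic, the sketch below ranks all pairwise distances and compares the mean rank of between-group pairs with that of within-group pairs, scaled by n(n - 1) / 4. It is a plain NumPy sketch, not nuee's implementation. The helper names, the (hits + 1) / (permutations + 1) p-value, and returning NaN when permutations is 0 are assumptions; only the result keys follow the description above.

```python
import numpy as np

def average_ranks(values):
    """Rank values starting at 1, giving tied values their average rank."""
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)  # highest rank inside each block of ties
    return (last_rank - (counts - 1) / 2.0)[inverse]

def anosim_r(dist, grouping):
    """ANOSIM R: (mean between-group rank - mean within-group rank) / (n(n-1)/4)."""
    dist = np.asarray(dist, dtype=float)
    grouping = np.asarray(grouping)
    n = len(grouping)
    i, j = np.triu_indices(n, k=1)
    ranks = average_ranks(dist[i, j])
    within = grouping[i] == grouping[j]
    return (ranks[~within].mean() - ranks[within].mean()) / (n * (n - 1) / 4.0)

def anosim_sketch(dist, grouping, permutations=999, seed=None):
    rng = np.random.default_rng(seed)
    grouping = np.asarray(grouping)
    r_obs = anosim_r(dist, grouping)
    hits = 0
    for _ in range(permutations):
        # Shuffle group labels and recompute R under the null hypothesis
        if anosim_r(dist, rng.permutation(grouping)) >= r_obs:
            hits += 1
    # Assumption: report NaN when the permutation test is skipped
    p_value = (hits + 1) / (permutations + 1) if permutations > 0 else float("nan")
    return {"r_statistic": r_obs, "p_value": p_value, "permutations": permutations}

# Toy data: every between-group distance exceeds every within-group distance
points = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
dist = np.abs(points[:, None] - points[None, :])
grouping = np.array(["a", "a", "a", "b", "b", "b"])
result = anosim_sketch(dist, grouping, permutations=199, seed=0)
```

Because the two toy groups are perfectly separated, R reaches its maximum value of 1.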

MRPP

nuee.dissimilarity.mrpp(distance_matrix: ndarray | DataFrame | Series, grouping: ndarray | DataFrame | Series, permutations: int = 999, random_state: int | Generator | None = None, distance_method: str = 'euclidean', **kwargs) → Dict[str, float | int]

Multi-Response Permutation Procedure (MRPP).

Parameters:
  • distance_matrix – Square or condensed distance matrix.

  • grouping – Group assignments for each observation.

  • permutations – Number of permutations for p-value estimation. Set to 0 to skip permutations.

  • random_state – Seed or Generator controlling permutation reproducibility.

Returns:

Dictionary containing delta (observed weighted mean within-group distance), expected_delta (mean delta under permutations), a_statistic (chance-corrected within-group agreement), p_value, and permutations.

Return type:

dict
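
The returned quantities relate to each other as sketched below: delta is a weighted mean of within-group average distances, expected_delta is its mean over label permutations, and the A statistic is 1 - delta / expected_delta. This is an illustrative NumPy sketch, not nuee's implementation. The group weights n_g / N, the one-sided (hits + 1) / (permutations + 1) p-value, and the helper names are assumptions.

```python
import numpy as np

def mrpp_delta(dist, grouping):
    """Weighted mean of within-group average distances (weights n_g / N).

    Assumes every group has at least two members.
    """
    dist = np.asarray(dist, dtype=float)
    grouping = np.asarray(grouping)
    n = len(grouping)
    delta = 0.0
    for g in np.unique(grouping):
        idx = np.flatnonzero(grouping == g)
        sub = dist[np.ix_(idx, idx)]
        within_mean = sub[np.triu_indices(len(idx), k=1)].mean()
        delta += (len(idx) / n) * within_mean
    return delta

def mrpp_sketch(dist, grouping, permutations=999, seed=None):
    rng = np.random.default_rng(seed)
    grouping = np.asarray(grouping)
    delta_obs = mrpp_delta(dist, grouping)
    # Delta recomputed under shuffled group labels
    perm = np.array([mrpp_delta(dist, rng.permutation(grouping))
                     for _ in range(permutations)])
    expected = perm.mean()
    return {
        "delta": delta_obs,
        "expected_delta": expected,
        "a_statistic": 1.0 - delta_obs / expected,
        # Small delta means tight groups, so count permutations at least as small
        "p_value": (np.sum(perm <= delta_obs) + 1) / (permutations + 1),
        "permutations": permutations,
    }

points = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
dist = np.abs(points[:, None] - points[None, :])
grouping = np.array(["a", "a", "a", "b", "b", "b"])
result = mrpp_sketch(dist, grouping, permutations=199, seed=0)
```

An A statistic near 1 indicates that observations within groups are much closer to one another than expected by chance; values near 0 indicate no more agreement than chance.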

Beta Dispersion

nuee.dissimilarity.betadisper(distance_matrix: ndarray | DataFrame | Series, grouping: ndarray | DataFrame | Series, permutations: int = 999, random_state: int | Generator | None = None, distance_method: str = 'bray', **kwargs) → Dict[str, float | int | ndarray | DataFrame]

Multivariate homogeneity of group dispersions.

Parameters:
  • distance_matrix – Square or condensed distance matrix, or original data matrix which will be converted via vegdist.

  • grouping – Group assignments for each observation.

  • permutations – Number of permutations for the ANOVA-like dispersion test (default: 999). Set to 0 to skip the permutation test.

  • random_state – Seed or Generator controlling permutation reproducibility.

  • distance_method – Distance metric when converting from raw data (default bray).

Returns:

  • distances: Series of distances to group centroids.

  • centroids: DataFrame of group centroids in PCoA space.

  • f_statistic / p_value: results of the dispersion test.

  • group_means: within-group mean distances.

  • permutations: number of permutations performed.

Return type:

dict
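
The sketch below shows one way the listed quantities can be derived: embed the distance matrix with principal coordinates analysis (PCoA), take each group's centroid in that space, measure every observation's distance to its own centroid, and compare those distances across groups with a one-way ANOVA F. It is a simplified NumPy illustration, not nuee's implementation. It assumes the distances are Euclidean-embeddable (negative eigenvalues are ignored), omits the permutation test, and uses hypothetical helper names.

```python
import numpy as np

def pcoa_coordinates(dist, tol=1e-10):
    """Principal coordinates from a square distance matrix (positive axes only)."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering  # Gower double centering
    eigval, eigvec = np.linalg.eigh(gram)
    keep = eigval > tol * eigval.max()
    return eigvec[:, keep] * np.sqrt(eigval[keep])

def dispersion_sketch(dist, grouping):
    grouping = np.asarray(grouping)
    coords = pcoa_coordinates(dist)
    labels = np.unique(grouping)
    distances = np.empty(len(grouping))
    for g in labels:
        mask = grouping == g
        centroid = coords[mask].mean(axis=0)
        distances[mask] = np.linalg.norm(coords[mask] - centroid, axis=1)
    # One-way ANOVA F on the distances to centroid
    means = {str(g): distances[grouping == g].mean() for g in labels}
    grand = distances.mean()
    ss_among = sum((grouping == g).sum() * (means[str(g)] - grand) ** 2 for g in labels)
    ss_within = sum(((distances[grouping == g] - means[str(g)]) ** 2).sum() for g in labels)
    df_among, df_within = len(labels) - 1, len(grouping) - len(labels)
    f_stat = (ss_among / df_among) / (ss_within / df_within)
    return {"distances": distances, "group_means": means, "f_statistic": f_stat}

# Toy data: group "b" is four times more spread out than group "a"
points = np.array([0.0, 1.0, 2.0, 10.0, 14.0, 18.0])
dist = np.abs(points[:, None] - points[None, :])
grouping = np.array(["a", "a", "a", "b", "b", "b"])
result = dispersion_sketch(dist, grouping)
```

A large F relative to its permutation distribution suggests that group dispersions differ, which matters when interpreting a PERMANOVA on the same data.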

Mantel Test

nuee.dissimilarity.mantel(x: ndarray | DataFrame | Series, y: ndarray | DataFrame | Series, method: str = 'pearson', permutations: int = 999, random_state: int | Generator | None = None, **kwargs) → Dict[str, float | int | str]
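
As general background on the Mantel test for matrix correlation, the sketch below correlates the upper triangles of two square distance matrices and builds a null distribution by permuting the rows and columns of one matrix together. It is a NumPy illustration of the Pearson variant only, not nuee's implementation. The helper name, the result keys, and the one-sided (hits + 1) / (permutations + 1) p-value are assumptions.

```python
import numpy as np

def mantel_sketch(x, y, permutations=999, seed=None):
    """Pearson Mantel correlation with a one-sided permutation p-value."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    upper = np.triu_indices(n, k=1)
    r_obs = np.corrcoef(x[upper], y[upper])[0, 1]
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(permutations):
        # Permute rows and columns of x with the same ordering
        order = rng.permutation(n)
        x_perm = x[np.ix_(order, order)]
        if np.corrcoef(x_perm[upper], y[upper])[0, 1] >= r_obs:
            hits += 1
    return {"r_statistic": r_obs, "p_value": (hits + 1) / (permutations + 1)}

# Toy data: y is an exact rescaling of x, so the correlation is 1
points = np.array([0.0, 1.0, 3.0, 7.0, 12.0, 20.0])
x = np.abs(points[:, None] - points[None, :])
y = 3.0 * x
result = mantel_sketch(x, y, permutations=199, seed=0)
```

Permuting rows and columns jointly, rather than shuffling individual entries, preserves the dependence structure among distances that share an observation.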

Procrustes Test

nuee.dissimilarity.protest(x: ndarray | DataFrame, y: ndarray | DataFrame, permutations: int = 999, *, random_state: int | RandomState | Generator | None = None, scale: bool = True) → Dict[str, float | int | ndarray]

Perform a Procrustean randomization test (PROTEST) between two ordinations.

Parameters:
  • x – First configuration matrix, with observations in rows and axes in columns.

  • y – Second configuration matrix, with rows matched to the observations in x.

  • permutations – Number of row permutations applied to y to approximate the null distribution. Set to 0 to skip permutation testing.

  • random_state – Seed or numpy RNG used for the permutation stream. When None the global numpy RNG is used.

  • scale – Forwarded to nuee.procrustes() to control symmetric scaling.

Returns:

Dictionary capturing the observed correlation, permutation p-value, number of permutations, and Procrustes transformation details.

Return type:

dict
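
The procedure described above can be sketched as follows: center both configurations, scale each to unit sum of squares, take the sum of the singular values of x.T @ y as the symmetric Procrustes correlation, and permute the rows of y to approximate the null distribution. This is a NumPy illustration, not nuee's implementation. It always applies symmetric scaling, uses a local Generator instead of the global NumPy RNG, and the helper names, result keys, and (hits + 1) / (permutations + 1) p-value are assumptions.

```python
import numpy as np

def procrustes_correlation(x, y):
    """Symmetric Procrustes correlation between two matched configurations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Remove translation, then scale each configuration to unit Frobenius norm
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    x = x / np.linalg.norm(x)
    y = y / np.linalg.norm(y)
    # After this standardization the optimal-rotation fit equals the sum of singular values
    return np.linalg.svd(x.T @ y, compute_uv=False).sum()

def protest_sketch(x, y, permutations=999, seed=None):
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    r_obs = procrustes_correlation(x, y)
    hits = 0
    for _ in range(permutations):
        # Break the row matching by shuffling the observations of y
        if procrustes_correlation(x, y[rng.permutation(len(y))]) >= r_obs:
            hits += 1
    # Assumption: report NaN when the permutation test is skipped
    p_value = (hits + 1) / (permutations + 1) if permutations > 0 else float("nan")
    return {"correlation": r_obs, "p_value": p_value, "permutations": permutations}

# Toy data: y is x rotated by 30 degrees, doubled in size, and shifted
x = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 3.0], [4.0, 1.0], [5.0, 5.0], [7.0, 2.0]])
theta = np.pi / 6
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
y = 2.0 * x @ rotation + np.array([5.0, -3.0])
result = protest_sketch(x, y, permutations=199, seed=0)
```

Because y differs from x only by rotation, uniform scaling, and translation, the observed correlation in the toy example is 1 up to floating-point error.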