Dissimilarity
The dissimilarity module provides distance measures and dissimilarity-based tests.
Distance Calculations
- nuee.dissimilarity.vegdist(x: ndarray | DataFrame, method: str = 'bray', binary: bool = False, diag: bool = False, upper: bool = False, na_rm: bool = False) → ndarray
Calculate ecological distance matrices.
This function calculates various dissimilarity indices commonly used in community ecology. It is designed to be compatible with the R vegan package’s vegdist function.
- Parameters:
x (np.ndarray or pd.DataFrame) – Community data matrix with samples in rows and species in columns.
method (str, default='bray') – Distance measure: ‘bray’, ‘jaccard’, ‘euclidean’, ‘manhattan’, ‘canberra’, ‘gower’, ‘altGower’, ‘morisita’, ‘horn’, ‘mountford’, ‘raup’, ‘binomial’, ‘chao’, ‘cao’, ‘kulczynski’, ‘mahalanobis’
binary (bool, default=False) – Convert data to presence/absence before calculating distances.
diag (bool, default=False) – Include diagonal elements in output.
upper (bool, default=False) – Include upper triangle in output.
na_rm (bool, default=False) – Remove samples with missing values.
- Returns:
Square symmetric distance matrix of shape (n_samples, n_samples).
- Return type:
np.ndarray
Examples
>>> import nuee
>>> species = nuee.datasets.varespec()
>>> dist = nuee.vegdist(species, method="bray")
>>> print(f"Shape: {dist.shape}")
Shape: (24, 24)
See also
adonis2 – PERMANOVA using distance matrices
betadisper – Test for homogeneity of dispersions
mantel – Mantel test for matrix correlation
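To make concrete what the default method='bray' measures, here is a minimal pure-Python sketch of the Bray-Curtis dissimilarity and of the square symmetric matrix built from it. The helper names bray_curtis and bray_matrix and the three-site table are hypothetical illustrations, not part of nuee, and the library's own implementation may differ in detail (for example in how it treats empty samples).

```python
def bray_curtis(u, v, binary=False):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    if binary:
        # Presence/absence transformation, mirroring the idea of binary=True.
        u = [1 if a > 0 else 0 for a in u]
        v = [1 if b > 0 else 0 for b in v]
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    # Two empty samples have no defined dissimilarity (assumption: return NaN).
    return num / den if den else float("nan")


def bray_matrix(rows, binary=False):
    """Square symmetric matrix of pairwise dissimilarities (samples in rows)."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = bray_curtis(rows[i], rows[j], binary)
    return d


# Hypothetical community table: 3 sites (rows) x 3 species (columns).
sites = [[10, 0, 5], [5, 5, 0], [0, 8, 2]]
dist = bray_matrix(sites)
```

For the first two sites the absolute differences sum to 15 and the abundances sum to 25, so the dissimilarity is 15 / 25 = 0.6.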
Permutation Tests
PERMANOVA
- nuee.dissimilarity.adonis2(distance_matrix: ndarray | DataFrame | Series, factors: ndarray | DataFrame | Series, permutations: int = 999, random_state: int | Generator | None = None, distance_method: str = 'bray', **kwargs) → DataFrame
Convenience wrapper for distance-based PERMANOVA.
- Parameters:
distance_matrix – Square or condensed distance matrix.
factors – Predictor variables supplied as a DataFrame/Series/array.
permutations – Number of permutations for the significance test.
random_state – Seed or Generator controlling permutation reproducibility.
- Returns:
Same payload as permanova().
- Return type:
DataFrame
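The role of permutations and random_state is easier to see in a small one-factor PERMANOVA sketch. The code below is not nuee's implementation: pseudo_f and permanova_sketch are hypothetical names, only a single grouping factor is handled, and the p-value convention (hits + 1) / (permutations + 1) is an assumption about how the test is usually reported.

```python
import random


def pseudo_f(dist, groups):
    """One-factor PERMANOVA pseudo-F computed from a square distance matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares: squared pairwise distances divided by N.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares: same quantity per group, divided by group size.
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        ss_within += sum(
            dist[i][j] ** 2 for a, i in enumerate(idx) for j in idx[a + 1:]
        ) / len(idx)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova_sketch(dist, groups, permutations=999, seed=None):
    """Return (pseudo-F, permutation p-value); p is None when permutations == 0."""
    rng = random.Random(seed)  # a fixed seed makes the permutation stream repeatable
    observed = pseudo_f(dist, groups)
    if not permutations:
        return observed, None
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # permute group labels, keep distances fixed
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical data: six points on a line, two well-separated groups.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
groups = ["a", "a", "a", "b", "b", "b"]
f_stat, p_value = permanova_sketch(dist, groups, permutations=999, seed=42)
```

Calling the function twice with the same seed yields the same p-value, which is the behavior the random_state parameter is documented to control.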
ANOSIM
- nuee.dissimilarity.anosim(distance_matrix: ndarray | DataFrame | Series, grouping: ndarray | DataFrame | Series, permutations: int = 999, random_state: int | Generator | None = None, **kwargs) → Dict[str, float | int]
Analysis of similarities (ANOSIM).
- Parameters:
distance_matrix – Square or condensed distance matrix.
grouping – Group assignments for each observation.
permutations – Number of permutations used to estimate the p-value. Set to 0 to skip.
random_state – Seed controlling permutation reproducibility.
- Returns:
Dictionary with r_statistic, p_value and permutations.
- Return type:
Dict[str, float | int]
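As an illustration of what r_statistic summarizes, the sketch below computes the ANOSIM R statistic from ranked distances, R = (mean between-group rank - mean within-group rank) / (N(N - 1) / 4), with an optional label-permutation p-value. The function names are hypothetical and the tie handling (average ranks) is an assumption; nuee's own code may differ.

```python
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """ANOSIM R statistic from a square distance matrix and group labels."""
    n = len(groups)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    diff = sum(between) / len(between) - sum(within) / len(within)
    return diff / (n * (n - 1) / 4)


def anosim_sketch(dist, groups, permutations=999, seed=None):
    """Return a dict shaped like the documented payload (illustrative only)."""
    observed = anosim_r(dist, groups)
    p_value = None  # stays None when permutations == 0 (test skipped)
    if permutations:
        rng = random.Random(seed)
        shuffled = list(groups)
        hits = 0
        for _ in range(permutations):
            rng.shuffle(shuffled)
            hits += anosim_r(dist, shuffled) >= observed
        p_value = (hits + 1) / (permutations + 1)
    return {"r_statistic": observed, "p_value": p_value, "permutations": permutations}


# Hypothetical data: two tight, well-separated groups on a line.
points = [0, 1, 10, 11]
dist = [[abs(a - b) for b in points] for a in points]
result = anosim_sketch(dist, ["a", "a", "b", "b"], permutations=0)
```

Here every within-group distance ranks below every between-group distance, so R reaches its maximum of 1.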
MRPP
- nuee.dissimilarity.mrpp(distance_matrix: ndarray | DataFrame | Series, grouping: ndarray | DataFrame | Series, permutations: int = 999, random_state: int | Generator | None = None, distance_method: str = 'euclidean', **kwargs) → Dict[str, float | int]
Multi-Response Permutation Procedure (MRPP).
- Parameters:
distance_matrix – Square or condensed distance matrix.
grouping – Group assignments for each observation.
permutations – Number of permutations for p-value estimation. Set to 0 to skip permutations.
random_state – Seed controlling permutation reproducibility.
- Returns:
Contains delta (observed within-group distance), expected_delta (mean delta under permutations), a_statistic (chance-corrected agreement), p_value, and permutations.
- Return type:
Dict[str, float | int]
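The returned quantities relate to each other as a_statistic = 1 - delta / expected_delta. The sketch below shows one way to obtain them in pure Python. It assumes group weights proportional to group size (n_g / N), which is a common MRPP choice but not confirmed for nuee, and the names mrpp_delta and mrpp_sketch are hypothetical.

```python
import random


def mrpp_delta(dist, groups):
    """Weighted mean of within-group average distances (weights n_g / N, assumed)."""
    n = len(groups)
    delta = 0.0
    for g in set(groups):
        idx = [i for i in range(n) if groups[i] == g]
        pairs = [dist[i][j] for a, i in enumerate(idx) for j in idx[a + 1:]]
        if pairs:  # singleton groups contribute no within-group distance
            delta += len(idx) / n * (sum(pairs) / len(pairs))
    return delta


def mrpp_sketch(dist, groups, permutations=999, seed=None):
    """Return a dict with keys named after the documented payload."""
    observed = mrpp_delta(dist, groups)
    result = {"delta": observed, "permutations": permutations}
    if permutations:
        rng = random.Random(seed)
        shuffled = list(groups)
        permuted = []
        for _ in range(permutations):
            rng.shuffle(shuffled)  # reassign labels at random
            permuted.append(mrpp_delta(dist, shuffled))
        expected = sum(permuted) / permutations
        result["expected_delta"] = expected
        result["a_statistic"] = 1 - observed / expected
        # Small delta means tight groups, so count permutations at least as small.
        hits = sum(d <= observed for d in permuted)
        result["p_value"] = (hits + 1) / (permutations + 1)
    return result


# Hypothetical data: two tight, well-separated groups on a line.
points = [0, 1, 10, 11]
dist = [[abs(a - b) for b in points] for a in points]
res = mrpp_sketch(dist, ["a", "a", "b", "b"], permutations=199, seed=1)
```

An a_statistic near 1 indicates that observations within groups are much closer to each other than expected by chance; values near 0 indicate no more agreement than random labeling.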
Beta Dispersion
- nuee.dissimilarity.betadisper(distance_matrix: ndarray | DataFrame | Series, grouping: ndarray | DataFrame | Series, permutations: int = 999, random_state: int | Generator | None = None, distance_method: str = 'bray', **kwargs) → Dict[str, float | int | ndarray | DataFrame]
Multivariate homogeneity of group dispersions.
- Parameters:
distance_matrix – Square or condensed distance matrix, or original data matrix which will be converted via vegdist.
grouping – Group assignments for each observation.
permutations – Number of permutations for the ANOVA-like dispersion test (default: 999). Set to 0 to skip the permutation test.
random_state – Seed controlling permutation reproducibility.
distance_method – Distance metric when converting from raw data (default bray).
- Returns:
distances: Series of distances to group centroids.
centroids: DataFrame of group centroids in PCoA space.
f_statistic / p_value: results of the dispersion test.
group_means: within-group mean distances.
permutations: number of permutations performed.
- Return type:
Dict[str, float | int | ndarray | DataFrame]
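The distances and group_means entries can be illustrated without an explicit PCoA. For Euclidean-embeddable distances, the squared distance from observation i to its group centroid equals (1/m) Σ_j d_ij² - (1/m²) Σ_{j<k} d_jk², where the sums run over the m members of the group. The sketch below uses that identity; it is not nuee's algorithm, the function name is hypothetical, and clamping negative values to zero is a simplification for dissimilarities that are not Euclidean.

```python
import math


def distances_to_centroid(dist, groups):
    """Distance of each observation to its own group centroid.

    Uses only the pairwise distance matrix; exact for Euclidean distances.
    """
    n = len(groups)
    out = [0.0] * n
    for g in set(groups):
        idx = [i for i in range(n) if groups[i] == g]
        m = len(idx)
        # Sum of squared distances over all unordered pairs within the group.
        pair_ss = sum(dist[i][j] ** 2 for a, i in enumerate(idx) for j in idx[a + 1:])
        for i in idx:
            sq = sum(dist[i][j] ** 2 for j in idx) / m - pair_ss / m ** 2
            out[i] = math.sqrt(max(sq, 0.0))  # clamp: simplification, see lead-in
    return out


def group_means(distances, groups):
    """Mean distance to centroid per group (the quantity the test compares)."""
    means = {}
    for g in set(groups):
        vals = [d for d, lab in zip(distances, groups) if lab == g]
        means[g] = sum(vals) / len(vals)
    return means


# Hypothetical data: points on a line, so centroids are plain group means.
points = [0, 2, 4, 10, 10, 13]
groups = ["a", "a", "a", "b", "b", "b"]
dist = [[abs(p - q) for q in points] for p in points]
d_centroid = distances_to_centroid(dist, groups)
means = group_means(d_centroid, groups)
```

Group a has centroid 2 and group b has centroid 11, so the distances come out as 2, 0, 2 and 1, 1, 2, and both groups have the same mean dispersion of 4/3. A dispersion test would then compare these per-group distances with an ANOVA-like statistic.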
Mantel Test
Procrustes Test
- nuee.dissimilarity.protest(x: ndarray | DataFrame, y: ndarray | DataFrame, permutations: int = 999, *, random_state: int | RandomState | Generator | None = None, scale: bool = True) → Dict[str, float | int | ndarray]
Perform a Procrustean randomization test (PROTEST) between two ordinations.
- Parameters:
x – Configuration matrices with matched observations (rows) and axes (columns).
y – Configuration matrices with matched observations (rows) and axes (columns).
permutations – Number of row permutations applied to y to approximate the null distribution. Set to 0 to skip permutation testing.
random_state – Seed or numpy RNG used for the permutation stream. When None, the global numpy RNG is used.
scale – Forwarded to nuee.procrustes() to control symmetric scaling.
- Returns:
Dictionary capturing the observed correlation, permutation p-value, number of permutations, and Procrustes transformation details.
- Return type:
Dict[str, float | int | ndarray]