API Reference
This page contains the API reference for the nuee package.
nuee Package
nuee: Community Ecology Analysis in Python
nuee is a comprehensive Python implementation of the popular R package vegan
for community ecology analysis. It provides tools for ordination, diversity analysis,
dissimilarity measures, and statistical testing commonly used in ecological research.
Modules
- ordination module
Ordination methods including NMDS, RDA, CCA, PCA, and environmental fitting
- diversity module
Diversity indices and rarefaction analysis
- dissimilarity module
Distance measures and dissimilarity-based tests (PERMANOVA, ANOSIM, etc.)
- permutation module
Permutation-based statistical tests
- plotting module
Visualization functions for ecological data
- composition module
Compositional data analysis utilities (CLR/ILR/ALR transforms, closure, etc.)
- datasets module
Sample datasets for testing and examples
Examples
Basic NMDS ordination:
>>> import nuee
>>> species_data = nuee.datasets.varespec()
>>> nmds_result = nuee.metaMDS(species_data, k=2, distance="bray")
>>> print(f"NMDS Stress: {nmds_result.stress:.3f}")
NMDS Stress: 0.133
Calculate diversity indices:
>>> shannon_div = nuee.shannon(species_data)
>>> simpson_div = nuee.simpson(species_data)
>>> richness = nuee.specnumber(species_data)
Perform PERMANOVA test:
>>> distances = nuee.vegdist(species_data, method="bray")
>>> env_data = nuee.datasets.varechem()
>>> permanova_result = nuee.adonis2(distances, env_data)
Notes
nuee is inspired by the R package vegan developed by Jari Oksanen and the vegan development team. It aims to provide similar functionality in a Pythonic interface while leveraging the scientific Python ecosystem (NumPy, SciPy, pandas, matplotlib).
- nuee.metaMDS(X: ndarray | DataFrame, k: int = 2, distance: str = 'bray', trymax: int = 20, maxit: int = 200, trace: bool = False, autotransform: bool = True, wascores: bool = True, expand: bool = True, random_state: int | None = None, **kwargs) -> OrdinationResult
Non-metric Multidimensional Scaling with automatic transformation.
This function provides a high-level interface for NMDS ordination, following the conventions of the R vegan package’s metaMDS function. It automatically handles data transformation and uses multiple random starts to find the best ordination solution.
- Parameters:
X (np.ndarray or pd.DataFrame) – Community data matrix with samples in rows and species in columns. Values should be non-negative abundances or counts.
k (int, default=2) – Number of dimensions for the ordination. Common choices are 2 or 3.
distance (str, default="bray") – Distance metric to use for calculating dissimilarities. See nuee.vegdist() for available options.
trymax (int, default=20) – Maximum number of random starts to find the best solution. Higher values increase computation time but may find better solutions.
maxit (int, default=200) – Maximum number of iterations for each random start.
trace (bool, default=False) – If True, print progress information including stress values.
autotransform (bool, default=True) – If True, automatically apply square root transformation to abundance data (values > 1) to reduce the influence of dominant species.
wascores (bool, default=True) – If True, calculate weighted average species scores based on site scores and species abundances.
expand (bool, default=True) – If True, expand the result to include additional information.
random_state (int, optional) – Seed used for reproducible random starts. If None, each run is initialised independently.
**kwargs (dict) – Additional parameters passed to the NMDS class.
- Returns:
Result object containing:
- points : pd.DataFrame
Site (sample) scores in the ordination space
- species : pd.DataFrame, optional
Species scores (if wascores=True)
- stress : float
Final stress value (lower is better)
- converged : bool
Whether the solution converged
- Return type:
OrdinationResult
Notes
NMDS is a rank-based ordination method that attempts to preserve the rank order of dissimilarities between samples. Unlike metric methods like PCA, NMDS makes no assumptions about the linearity of relationships.
Stress values provide a measure of fit:
- < 0.05: excellent
- 0.05 - 0.10: good
- 0.10 - 0.20: acceptable
- > 0.20: poor (consider increasing k or using a different method)
The function uses multiple random starts (trymax) because NMDS can get stuck in local optima. The solution with the lowest stress is returned.
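As a rough illustration, the rule-of-thumb stress bands in these notes can be written as a small helper. This is a sketch only: interpret_stress is a hypothetical name, not part of nuee, and assigning the shared boundary values (0.10 and 0.20) to the "acceptable" band is an assumption.

```python
def interpret_stress(stress: float) -> str:
    """Map an NMDS stress value onto a rule-of-thumb label.

    Hypothetical helper, not part of nuee. Boundary values that
    appear in two bands (0.10, 0.20) are treated as "acceptable".
    """
    if stress < 0.05:
        return "excellent"
    if stress < 0.10:
        return "good"
    if stress <= 0.20:
        return "acceptable"
    return "poor"


print(interpret_stress(0.133))   # acceptable
print(interpret_stress(0.0869))  # good
```

A label of "poor" is a prompt to try a larger k or a different method, not a hard rejection of the ordination.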
Examples
Basic NMDS ordination:
>>> import nuee
>>> species = nuee.datasets.varespec()
>>> nmds = nuee.metaMDS(species, k=2, distance="bray")
>>> print(f"Stress: {nmds.stress:.3f}")
Stress: 0.133
With custom parameters:
>>> nmds = nuee.metaMDS(species, k=3, distance="bray", trymax=50, trace=True)
Applying square root transformation
Applying Wisconsin double standardization
NMDS stress: 0.0869
NMDS converged
Visualize the ordination:
>>> import matplotlib.pyplot as plt
>>> fig = nuee.plot_ordination(nmds, display="sites")
>>> plt.show()
References
[1] Kruskal, J.B. (1964). Nonmetric multidimensional scaling: a numerical method. Psychometrika 29, 115-129.
[2] Minchin, P.R. (1987). An evaluation of relative robustness of techniques for ecological ordinations. Vegetatio 69, 89-107.
[3] Oksanen, J., et al. (2020). vegan: Community Ecology Package. https://CRAN.R-project.org/package=vegan
- nuee.rda(X: ndarray | DataFrame, Y: ndarray | DataFrame | None = None, Z: ndarray | DataFrame | None = None, formula: str | None = None, data: DataFrame | None = None, scale: bool = False, center: bool = True, **kwargs) -> ConstrainedOrdinationResult
Redundancy Analysis (RDA).
RDA is a constrained ordination method that finds linear combinations of explanatory variables that best explain the variation in the response matrix.
- Parameters:
X – Response matrix (samples x species)
Y – Explanatory matrix (samples x variables)
Z – Conditioning matrix for partial RDA (optional)
formula – Formula string (e.g., “~ var1 + var2”)
data – DataFrame containing variables for formula
scale – Whether to scale species to unit variance
**kwargs – Additional parameters
- Returns:
ConstrainedOrdinationResult with RDA results
Examples
# Simple RDA
result = rda(species_data, environmental_data)
# RDA with formula
result = rda(species_data, formula="~ pH + temperature", data=env_data)
# Partial RDA
result = rda(species_data, environmental_data, conditioning_data)
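For orientation, the core computation behind RDA can be sketched with NumPy alone: regress the centred response matrix on the centred explanatory matrix, then take a singular value decomposition of the fitted values. The name rda_sketch and the simulated data are assumptions for illustration; this is not nuee's implementation, and it leaves out scaling, formulas, and partial RDA.

```python
import numpy as np


def rda_sketch(X, Y):
    """Return (canonical eigenvalues, total variance) for response X and predictors Y."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)  # centre the responses
    Yc = Y - Y.mean(axis=0)  # centre the explanatory variables
    # Least-squares fit of every response column on the explanatory variables
    coef, *_ = np.linalg.lstsq(Yc, Xc, rcond=None)
    fitted = Yc @ coef
    # Squared singular values of the fitted matrix give the constrained eigenvalues
    s = np.linalg.svd(fitted, compute_uv=False)
    eigvals = s**2 / (n - 1)
    total = (Xc**2).sum() / (n - 1)
    return eigvals, total


# Simulated data: 5 "species" driven by 2 "environmental" variables plus noise
rng = np.random.default_rng(0)
env = rng.normal(size=(20, 2))
spe = env @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(20, 5))

eigvals, total = rda_sketch(spe, env)
print(f"Constrained proportion: {eigvals.sum() / total:.3f}")
```

With two explanatory variables, at most two eigenvalues are non-zero; their sum divided by the total variance is the proportion explained by the constraints.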
- nuee.cca(Y: ndarray | DataFrame, X: ndarray | DataFrame | None = None, formula: str | None = None, **kwargs) -> ConstrainedOrdinationResult | OrdinationResult
Canonical Correspondence Analysis (or CA when X is None).
- Parameters:
Y – Species data matrix (sites x species).
X – Environmental data matrix (sites x variables) or DataFrame for formula evaluation. When None and no formula is given, an unconstrained Correspondence Analysis (CA) is performed.
formula – R-style formula string referencing columns in X.
- Return type:
ConstrainedOrdinationResult (CCA) or OrdinationResult (CA).
- nuee.ca(Y: ndarray | DataFrame, **kwargs) -> OrdinationResult
Correspondence Analysis (unconstrained).
- Parameters:
Y – Species data matrix (sites x species).
- Return type:
OrdinationResult with CA results.
- nuee.pca(X: ndarray | DataFrame, n_components: int | None = None, scale: bool = True, center: bool = True, **kwargs) -> OrdinationResult
Principal Component Analysis.
- Parameters:
X – Data matrix (samples x variables).
n_components – Number of components to keep.
scale – Whether to scale variables (unit variance) prior to analysis.
center – Whether to subtract column means prior to analysis.
- Return type:
OrdinationResult with PCA results.
- nuee.lda(X: ndarray | DataFrame, y: ndarray | Series, n_components: int | None = None, solver: str = 'svd', **kwargs) -> OrdinationResult
Linear Discriminant Analysis.
- Parameters:
- Return type:
OrdinationResult with LDA results.
- nuee.envfit(ordination: OrdinationResult, env: ndarray | DataFrame, permutations: int = 999, scaling: int | str | None = 2, choices: Sequence[int] | None = None, random_state: int | None = None, **kwargs) -> Dict[str, Any]
Fit environmental vectors to an ordination configuration.
- Parameters:
ordination – OrdinationResult providing site scores.
env – Environmental data matrix (matching site order).
permutations – Number of permutations used for significance testing.
scaling – Scaling option passed to ordination.get_scores.
choices – Optional 1-based axis indices to include; defaults to all axes.
random_state – Seed for the permutation generator.
- Returns:
Dictionary mirroring vegan’s envfit output layout with a vectors entry containing scores, r, r², and p-values.
- Return type:
Dict[str, Any]
- nuee.ordistep(ordination: OrdinationResult, env: ndarray | DataFrame, direction: str = 'both', **kwargs) -> dict
Stepwise model selection for ordination.
- Parameters:
ordination – OrdinationResult object
env – Environmental data matrix
direction – Direction of selection (“forward”, “backward”, “both”)
**kwargs – Additional parameters
- Returns:
Dictionary with selection results
- nuee.procrustes(X: ndarray | DataFrame, Y: ndarray | DataFrame, scale: bool = True) -> dict
Procrustes analysis to compare two ordinations.
- Parameters:
X – First ordination matrix
Y – Second ordination matrix
scale – Whether to scale configurations
- Returns:
Dictionary with Procrustes results
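To make the idea concrete, here is a minimal NumPy sketch of an orthogonal Procrustes fit that rotates (and optionally scales) Y onto X. The name procrustes_sketch and the returned tuple are assumptions for illustration; the keys of the dictionary returned by nuee.procrustes are not documented here, and the library may differ in details such as symmetric scaling.

```python
import numpy as np


def procrustes_sketch(X, Y, scale=True):
    """Fit Y to X by centring, rotation and optional scaling.

    Returns the transformed Y and the residual sum of squares.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # The SVD of the cross-product matrix yields the optimal orthogonal rotation
    U, s, Vt = np.linalg.svd(Xc.T @ Yc)
    rotation = Vt.T @ U.T
    # Least-squares scaling factor for the rotated configuration
    factor = s.sum() / (Yc**2).sum() if scale else 1.0
    Y_fit = factor * Yc @ rotation
    ss = ((Xc - Y_fit) ** 2).sum()
    return Y_fit, ss


# Y is a rotated, rescaled and shifted copy of X, so the fit should be exact
rng = np.random.default_rng(1)
X = rng.normal(size=(10, 2))
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
Y = 2.5 * X @ rot + np.array([3.0, -1.0])

Y_fit, ss = procrustes_sketch(X, Y)
print(f"Residual sum of squares: {ss:.2e}")
```

A residual close to zero means the two configurations differ only by translation, rotation, and scale.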
- nuee.diversity(x: ndarray | DataFrame, index: str = 'shannon', groups: ndarray | Series | None = None, base: float = 2.718281828459045) -> float | ndarray | Series
Calculate diversity indices for community data.
This function provides a unified interface for calculating various diversity indices commonly used in ecology. It can calculate diversity for individual samples or for pooled groups.
- Parameters:
x (np.ndarray or pd.DataFrame) – Community data matrix with samples in rows and species in columns. Values should be non-negative abundances or counts. Can also be a 1D array for a single sample.
index ({'shannon', 'simpson', 'invsimpson', 'fisher'}, default='shannon') – Diversity index to calculate:
- 'shannon': Shannon entropy H’ = -sum(p_i * log(p_i))
- 'simpson': Gini-Simpson index 1 - sum(p_i^2)
- 'invsimpson': Inverse Simpson 1 / sum(p_i^2)
- 'fisher': Fisher’s alpha
groups (np.ndarray or pd.Series, optional) – Grouping factor for calculating pooled diversities. If provided, samples are pooled within each group before calculating diversity.
base (float, default=e) – Base of logarithm for Shannon index. Common choices:
- e (natural log): nats
- 2: bits
- 10: dits
- Returns:
Diversity values for each sample (or group if groups is provided). If input is a DataFrame, returns a pd.Series with sample/group names.
- Return type:
float, np.ndarray, or pd.Series
Notes
Shannon diversity (H’) measures both richness and evenness:
- Higher values indicate more diverse communities
- Ranges from 0 (single species) to log(S) where S is species richness
- Most common diversity index in ecology
Simpson’s index measures dominance:
- We report the Gini-Simpson form (1 - sum(p_i^2)), matching vegan::diversity
- Larger values indicate greater diversity
- The inverse Simpson (1 / sum(p_i^2)) is available via index='invsimpson'
Fisher’s alpha assumes a log-series distribution:
- Useful for abundance data
- Less sensitive to sample size than richness
- Can be slow for large datasets
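The index formulas given for diversity() are short enough to write out directly. The following NumPy sketch handles a single sample (a 1D vector of counts); diversity_sketch is a hypothetical name, and the code illustrates the formulas only, not nuee's implementation (Fisher's alpha and grouping are omitted).

```python
import numpy as np


def diversity_sketch(counts, index="shannon", base=np.e):
    """Shannon, Gini-Simpson or inverse Simpson for one sample."""
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()
    p = p[p > 0]  # species with zero abundance contribute nothing
    if index == "shannon":
        # Dividing by log(base) converts from nats to the requested base
        return float(-(p * np.log(p)).sum() / np.log(base))
    if index == "simpson":
        return float(1.0 - (p**2).sum())
    if index == "invsimpson":
        return float(1.0 / (p**2).sum())
    raise ValueError(f"unknown index: {index!r}")


even = [10, 10, 10, 10]  # four equally abundant species
print(diversity_sketch(even, "shannon"))          # log(4), about 1.386
print(diversity_sketch(even, "shannon", base=2))  # 2.0 bits
print(diversity_sketch(even, "simpson"))          # 0.75
print(diversity_sketch(even, "invsimpson"))       # 4.0
```

For a perfectly even sample of S species, Shannon diversity equals log(S) and inverse Simpson equals S, which gives a quick sanity check for any implementation.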
Examples
Calculate Shannon diversity:
>>> import nuee
>>> species = nuee.datasets.varespec()
>>> div = nuee.diversity(species, index="shannon")
>>> print(f"Mean diversity: {div.mean():.3f}")
Mean diversity: 1.754
Calculate Simpson diversity:
>>> div_simp = nuee.diversity(species, index="simpson")
Calculate diversity for grouped samples:
>>> import numpy as np
>>> groups = np.array(['A', 'A', 'B', 'B', 'C', 'C'])
>>> div_grouped = nuee.diversity(species[:6], index="shannon", groups=groups)
Use different logarithm bases:
>>> div_bits = nuee.diversity(species, index="shannon", base=2)
>>> print("Diversity in bits:", div_bits.mean())
Diversity in bits: 2.5300454694165375
See also
shannon – Shannon diversity (convenience function)
simpson – Simpson diversity (convenience function)
fisher_alpha – Fisher’s alpha (convenience function)
renyi – Renyi entropy for multiple scales
specnumber – Species richness
References
[1] Shannon, C.E. (1948). A mathematical theory of communication. Bell System Technical Journal 27, 379-423.
[2] Simpson, E.H. (1949). Measurement of diversity. Nature 163, 688.
[3] Fisher, R.A., Corbet, A.S., Williams, C.B. (1943). The relation between the number of species and the number of individuals in a random sample of an animal population. Journal of Animal Ecology 12, 42-58.
- nuee.specnumber(x: ndarray | DataFrame, groups: ndarray | Series | None = None) -> DiversityResult
Calculate species richness (number of species).
- Parameters:
x – Community data matrix or vector
groups – Grouping factor for pooled richness
- Returns:
DiversityResult object with automatic plotting
- nuee.fisher_alpha(x: ndarray | DataFrame) -> DiversityResult
Calculate Fisher’s alpha diversity index.
- Parameters:
x – Community data matrix or vector
- Returns:
DiversityResult object with automatic plotting
- nuee.renyi(x: ndarray | DataFrame, scales: float | List[float] = [0, 1, 2, inf], hill: bool = False) -> ndarray | DataFrame
Calculate Renyi entropy or Hill numbers.
- Parameters:
x – Community data matrix or vector
scales – Scale parameters (alpha values)
hill – Whether to return Hill numbers instead of Renyi entropy
- Returns:
Renyi entropy or Hill numbers for each scale
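As an illustration of what the scales mean, the sketch below evaluates the standard Renyi entropy H_a = log(sum(p_i^a)) / (1 - a) for one sample, with the usual limits at a = 1 (Shannon entropy) and a = inf (-log of the largest proportion). Hill numbers are taken as exp(H_a). The name renyi_sketch is hypothetical and the code is not nuee's implementation.

```python
import numpy as np


def renyi_sketch(counts, scales=(0, 1, 2, np.inf), hill=False):
    """Renyi entropies (or Hill numbers) of one sample at the given scales."""
    p = np.asarray(counts, dtype=float)
    p = p[p > 0] / p.sum()
    values = []
    for a in scales:
        if a == 1:
            h = -(p * np.log(p)).sum()  # limit a -> 1: Shannon entropy
        elif np.isinf(a):
            h = -np.log(p.max())        # limit a -> inf: dominance only
        else:
            h = np.log((p**a).sum()) / (1.0 - a)
        values.append(h)
    values = np.array(values)
    return np.exp(values) if hill else values


uneven = [90, 5, 5]
print(renyi_sketch(uneven))             # decreases as the scale grows
print(renyi_sketch(uneven, hill=True))  # first entry is richness, 3.0
```

Scale 0 depends only on richness, while larger scales give progressively more weight to dominant species, so the profile never increases with the scale.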
- nuee.simpson(x: ndarray | DataFrame) -> DiversityResult
Calculate Gini-Simpson diversity (1 - sum(p_i^2)).
- Parameters:
x – Community data matrix or vector
- Returns:
DiversityResult object with automatic plotting
- nuee.shannon(x: ndarray | DataFrame, base: float = 2.718281828459045) -> DiversityResult
Calculate Shannon diversity index.
H = -sum(p_i * log(p_i))
- Parameters:
x – Community data matrix or vector
base – Base of logarithm
- Returns:
DiversityResult object with automatic plotting
- nuee.evenness(x: ndarray | DataFrame, method: str = 'pielou') -> DiversityResult
Calculate evenness indices.
- Parameters:
x – Community data matrix or vector
method – Evenness method (“pielou”, “simpson”, “evar”)
- Returns:
DiversityResult object with automatic plotting
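For the default method, Pielou's evenness is conventionally defined as J = H / log(S), where H is Shannon diversity and S is the number of species present. A minimal single-sample sketch follows; pielou_sketch is a hypothetical name, and returning NaN for a single species is an assumption about how to handle the undefined case.

```python
import numpy as np


def pielou_sketch(counts):
    """Pielou's J = H / log(S) for one sample."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    richness = p.size
    if richness < 2:
        return float("nan")  # log(1) = 0, so J is undefined
    shannon = -(p * np.log(p)).sum()
    return float(shannon / np.log(richness))


print(pielou_sketch([10, 10, 10, 10]))  # perfectly even community, close to 1.0
print(pielou_sketch([90, 5, 5]))        # dominated community, well below 1
```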
- nuee.rarefy(x: ndarray | DataFrame, sample: int) -> ndarray | Series
Rarefy species richness to a standard sample size.
- Parameters:
x – Community data matrix
sample – Sample size for rarefaction
- Returns:
Rarefied species richness
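Rarefied richness is classically computed with Hurlbert's formula: the expected number of species in a random subsample of n individuals is the sum over species of 1 - C(N - N_i, n) / C(N, n), where N is the sample total and N_i the count of species i. The sketch below applies it to one sample using only the standard library; rarefy_sketch is a hypothetical name, and whether nuee.rarefy uses exactly this estimator is an assumption.

```python
from math import comb


def rarefy_sketch(counts, sample):
    """Expected richness in a random subsample of `sample` individuals."""
    counts = [int(c) for c in counts if c > 0]
    total = sum(counts)
    if sample > total:
        raise ValueError("sample size exceeds the number of individuals")
    denom = comb(total, sample)
    # comb(n, k) is 0 when k > n, so abundant species contribute a full 1
    return sum(1 - comb(total - c, sample) / denom for c in counts)


community = [5, 3, 2]
print(rarefy_sketch(community, 10))  # full sample: observed richness, 3.0
print(rarefy_sketch(community, 4))   # smaller subsample: fewer expected species
```

Rarefying every sample to the size of the smallest one makes richness comparable across samples with unequal sampling effort.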
- nuee.rarecurve(x: ndarray | DataFrame, step: int = 1, sample: int | None = None) -> dict
Calculate rarefaction curves.
- Parameters:
x – Community data matrix
step – Step size for rarefaction
sample – Maximum sample size
- Returns:
Dictionary with rarefaction curves
- nuee.estimateR(x: ndarray | DataFrame) -> DataFrame
Estimate species richness using various estimators.
- Parameters:
x – Community data matrix
- Returns:
DataFrame with richness estimates
- nuee.specaccum(x: ndarray | DataFrame, method: str = 'random', permutations: int = 100) -> dict
Calculate species accumulation curves.
- Parameters:
x – Community data matrix
method – Accumulation method (“random”, “exact”, “rarefaction”)
permutations – Number of permutations for random method
- Returns:
Dictionary with accumulation results
- nuee.poolaccum(x: ndarray | DataFrame) -> dict
Calculate pooled species accumulation.
- Parameters:
x – Community data matrix
- Returns:
Dictionary with pooled accumulation results
- nuee.vegdist(x: ndarray | DataFrame, method: str = 'bray', binary: bool = False, diag: bool = False, upper: bool = False, na_rm: bool = False) -> ndarray
Calculate ecological distance matrices.
This function calculates various dissimilarity indices commonly used in community ecology. It is designed to be compatible with the R vegan package’s vegdist function.
- Parameters:
x (np.ndarray or pd.DataFrame) – Community data matrix with samples in rows and species in columns.
method (str, default='bray') – Distance measure: ‘bray’, ‘jaccard’, ‘euclidean’, ‘manhattan’, ‘canberra’, ‘gower’, ‘altGower’, ‘morisita’, ‘horn’, ‘mountford’, ‘raup’, ‘binomial’, ‘chao’, ‘cao’, ‘kulczynski’, ‘mahalanobis’
binary (bool, default=False) – Convert data to presence/absence before calculating distances.
diag (bool, default=False) – Include diagonal elements in output.
upper (bool, default=False) – Include upper triangle in output.
na_rm (bool, default=False) – Remove samples with missing values.
- Returns:
Square symmetric distance matrix of shape (n_samples, n_samples).
- Return type:
np.ndarray
Examples
>>> import nuee
>>> species = nuee.datasets.varespec()
>>> dist = nuee.vegdist(species, method="bray")
>>> print(f"Shape: {dist.shape}")
Shape: (24, 24)
See also
adonis2 – PERMANOVA using distance matrices
betadisper – Test for homogeneity of dispersions
mantel – Mantel test for matrix correlation
References
[1] Legendre, P. and Legendre, L. (2012). Numerical Ecology. Elsevier.
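For reference, the default 'bray' method corresponds to the Bray-Curtis dissimilarity, d_jk = sum_i |x_ij - x_ik| / sum_i (x_ij + x_ik). A compact NumPy sketch of that one measure is shown here; bray_curtis_sketch is a hypothetical name, and the code does not reproduce vegdist's other methods or options.

```python
import numpy as np


def bray_curtis_sketch(x):
    """Square Bray-Curtis dissimilarity matrix for samples in rows."""
    x = np.asarray(x, dtype=float)
    # Broadcasting builds all pairwise row differences and row sums at once
    diff = np.abs(x[:, None, :] - x[None, :, :]).sum(axis=2)
    total = (x[:, None, :] + x[None, :, :]).sum(axis=2)
    return diff / total  # undefined (0/0) for pairs of empty samples


abund = np.array([[1, 0],
                  [0, 1],
                  [1, 1]])
d = bray_curtis_sketch(abund)
print(d.round(3))
```

Samples with no shared species (rows 0 and 1) reach the maximum dissimilarity of 1, and the diagonal is 0.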
- nuee.adonis2(distance_matrix: ndarray | DataFrame | Series, factors: ndarray | DataFrame | Series, permutations: int = 999, random_state: int | Generator | None = None, distance_method: str = 'bray', **kwargs) -> DataFrame
Convenience wrapper for distance-based PERMANOVA.
- Parameters:
distance_matrix – Square or condensed distance matrix.
factors – Predictor variables supplied as a DataFrame/Series/array.
permutations – Number of permutations for the significance test.
random_state – Seed or Generator controlling permutation reproducibility.
- Returns:
Same payload as permanova().
- Return type:
DataFrame
- nuee.anosim(distance_matrix: ndarray | DataFrame | Series, grouping: ndarray | DataFrame | Series, permutations: int = 999, random_state: int | Generator | None = None, **kwargs) -> Dict[str, float | int]
Analysis of similarities (ANOSIM).
- Parameters:
distance_matrix – Square or condensed distance matrix.
grouping – Group assignments for each observation.
permutations – Number of permutations used to estimate the p-value. Set to 0 to skip.
random_state – Seed controlling permutation reproducibility.
- Returns:
Dictionary with r_statistic, p_value and permutations.
- Return type:
Dict[str, float | int]
- nuee.mrpp(distance_matrix: ndarray | DataFrame | Series, grouping: ndarray | DataFrame | Series, permutations: int = 999, random_state: int | Generator | None = None, distance_method: str = 'euclidean', **kwargs) -> Dict[str, float | int]
Multi-Response Permutation Procedure (MRPP).
- Parameters:
distance_matrix – Square or condensed distance matrix.
grouping – Group assignments for each observation.
permutations – Number of permutations for p-value estimation. Set to 0 to skip permutations.
random_state – Seed controlling permutation reproducibility.
- Returns:
Contains delta (observed within-group distance), expected_delta (mean delta under permutations), a_statistic (chance-corrected agreement), p_value, and permutations.
- Return type:
Dict[str, float | int]
- nuee.betadisper(distance_matrix: ndarray | DataFrame | Series, grouping: ndarray | DataFrame | Series, permutations: int = 999, random_state: int | Generator | None = None, distance_method: str = 'bray', **kwargs) -> Dict[str, float | int | ndarray | DataFrame]
Multivariate homogeneity of group dispersions.
- Parameters:
distance_matrix – Square or condensed distance matrix, or original data matrix which will be converted via vegdist.
grouping – Group assignments for each observation.
permutations – Number of permutations for the ANOVA-like dispersion test (default: 999). Set to 0 to skip the permutation test.
random_state – Seed controlling permutation reproducibility.
distance_method – Distance metric when converting from raw data (default bray).
- Returns:
- distances: Series of distances to group centroids.
- centroids: DataFrame of group centroids in PCoA space.
- f_statistic / p_value: results of the dispersion test.
- group_means: within-group mean distances.
- permutations: number of permutations performed.
- Return type:
Dict[str, float | int | ndarray | DataFrame]
- nuee.mantel(x: ndarray | DataFrame | Series, y: ndarray | DataFrame | Series, method: str = 'pearson', permutations: int = 999, random_state: int | Generator | None = None, **kwargs) -> Dict[str, float | int | str]
- nuee.mantel_partial(x: ndarray | DataFrame | Series, y: ndarray | DataFrame | Series, z: ndarray | DataFrame | Series, method: str = 'pearson', permutations: int = 999, random_state: int | Generator | None = None, **kwargs) -> Dict[str, float | int | str]
- nuee.protest(x: ndarray | DataFrame, y: ndarray | DataFrame, permutations: int = 999, *, random_state: int | RandomState | Generator | None = None, scale: bool = True) -> Dict[str, float | int | ndarray]
Perform a Procrustean randomization test (PROTEST) between two ordinations.
- Parameters:
x – Configuration matrices with matched observations (rows) and axes (columns).
y – Configuration matrices with matched observations (rows) and axes (columns).
permutations – Number of row permutations applied to y to approximate the null distribution. Set to 0 to skip permutation testing.
random_state – Seed or numpy RNG used for the permutation stream. When None the global numpy RNG is used.
scale – Forwarded to nuee.procrustes() to control symmetric scaling.
- Returns:
Dictionary capturing the observed correlation, permutation p-value, number of permutations, and Procrustes transformation details.
- Return type:
Dict[str, float | int | ndarray]
- nuee.permanova(distance_matrix: ndarray | DataFrame | Series, factors: ndarray | DataFrame | Series, permutations: int = 999, random_state: int | Generator | None = None, distance_method: str = 'bray', **kwargs) -> DataFrame
Distance-based PERMANOVA (sequential sums of squares).
- Parameters:
distance_matrix – Square or condensed distance matrix.
factors – DataFrame, Series, or array of predictor variables. Each column is treated as a separate term evaluated sequentially.
permutations – Number of permutations for significance testing. Set to 0 or None to skip permutation p-values.
random_state – Seed or Generator for reproducible permutations.
- Returns:
Dictionary containing a result table (table), the total sum of squares, and the permutation F-statistics.
- Return type:
DataFrame
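To show what a distance-based PERMANOVA computes, the sketch below derives the pseudo-F statistic for a single grouping factor from a square distance matrix (following Anderson's partitioning of squared distances) and estimates a p-value by permuting the group labels. The names pseudo_f and permanova_sketch are hypothetical; this illustrates the technique and is not nuee's implementation, which handles several sequential terms.

```python
import numpy as np


def pseudo_f(dist, groups):
    """Pseudo-F for one factor from a square distance matrix."""
    dist = np.asarray(dist, dtype=float)
    groups = np.asarray(groups)
    n = dist.shape[0]
    labels = np.unique(groups)
    # Total sum of squares: squared distances over all pairs, divided by n
    ss_total = (dist[np.triu_indices(n, k=1)] ** 2).sum() / n
    ss_within = 0.0
    for g in labels:
        idx = np.flatnonzero(groups == g)
        sub = dist[np.ix_(idx, idx)]
        ss_within += (np.triu(sub, k=1) ** 2).sum() / idx.size
    ss_among = ss_total - ss_within
    a = labels.size
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova_sketch(dist, groups, permutations=999, random_state=None):
    """Observed pseudo-F and permutation p-value."""
    rng = np.random.default_rng(random_state)
    groups = np.asarray(groups)
    f_obs = pseudo_f(dist, groups)
    hits = sum(pseudo_f(dist, rng.permutation(groups)) >= f_obs
               for _ in range(permutations))
    # Counting the observed arrangement keeps the p-value above zero
    return f_obs, (hits + 1) / (permutations + 1)


values = np.array([0.0, 1.0, 10.0, 11.0])
dist = np.abs(values[:, None] - values[None, :])  # Euclidean distances in 1D
groups = np.array(["a", "a", "b", "b"])
print(f"pseudo-F: {pseudo_f(dist, groups):.1f}")  # pseudo-F: 200.0
```

With Euclidean distances on a single variable, the pseudo-F equals the classical one-way ANOVA F statistic, which is a convenient check.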
- nuee.permtest(statistic_func: Callable, data: ndarray, permutations: int = 999, **kwargs) -> dict
General permutation test.
- Parameters:
statistic_func – Function to calculate test statistic
data – Data array
permutations – Number of permutations
**kwargs – Additional parameters
- Returns:
Dictionary with test results
- nuee.permutest(ordination_result: ConstrainedOrdinationResult, permutations: int = 999, *, random_state: int | Generator | None = None) -> Dict[str, float | int | ndarray | DataFrame]
Permutation test for constrained ordination (RDA/CCA).
- Parameters:
ordination_result – Constrained ordination result (e.g., from nuee.rda()).
permutations – Number of permutations used to build the null distribution.
random_state – Optional random seed or Generator for reproducible results.
- Returns:
Dictionary containing the ANOVA-style table and permutation details.
- Return type:
Dict[str, float | int | ndarray | DataFrame]
- nuee.anova_cca(ordination_result: ConstrainedOrdinationResult, permutations: int = 999, *, random_state: int | Generator | None = None) -> Dict[str, float | int | ndarray | DataFrame]
Permutation ANOVA for constrained ordination results.
- Parameters:
ordination_result – Result object returned by nuee.rda() or nuee.cca().
permutations – Number of permutations used to approximate the null distribution.
random_state – Optional seed/Generator for reproducible permutations.
- Returns:
Dictionary containing the ANOVA table and permutation metadata.
- Return type:
Dict[str, float | int | ndarray | DataFrame]
- nuee.plot_ordination(result: OrdinationResult, axes: Tuple[int, int] = (0, 1), display: str = 'sites', choices: List[int] | None = None, type: str = 'points', groups: ndarray | Series | None = None, colors: List[str] | None = None, figsize: Tuple[int, int] = (8, 6), scaling: int | str | None = None, title: str | None = None, **kwargs) -> Figure
Plot ordination results.
- Parameters:
result – OrdinationResult object
axes – Which axes to plot
display – What to display (“sites”, “species”, “both”)
choices – Alternative way to specify axes
type – Plot type (“points”, “text”, “none”)
groups – Grouping factor for coloring points
colors – Colors for groups
figsize – Figure size
**kwargs – Additional plotting arguments
- Returns:
matplotlib Figure object
- nuee.plot_diversity(diversity_data: ndarray | Series | DataFrame | DiversityResult, figsize: tuple = (8, 6), **kwargs) -> Figure
Plot diversity indices.
- Parameters:
diversity_data – Diversity values
figsize – Figure size
**kwargs – Additional plotting arguments
- Returns:
matplotlib Figure object
- nuee.plot_dissimilarity(distance_matrix: ndarray | DataFrame, figsize: tuple = (8, 6), **kwargs) -> Figure
Plot dissimilarity matrix as heatmap.
- Parameters:
distance_matrix – Distance/dissimilarity matrix
figsize – Figure size
**kwargs – Additional plotting arguments
- Returns:
matplotlib Figure object
- nuee.biplot(result: OrdinationResult, axes: Tuple[int, int] = (0, 1), scaling: str | int = 'species', correlation: bool = False, figsize: Tuple[int, int] = (10, 8), title: str | None = None, arrow_mul: float | None = None, n_species: int | None = 15, show_site_labels: bool = True, show_species_labels: bool = True, repel: bool = True, fontsize: int = 8, site_kw: Dict[str, Any] | None = None, species_kw: Dict[str, Any] | None = None, env_kw: Dict[str, Any] | None = None, groups: ndarray | Series | list | None = None, color_by: ndarray | Series | list | None = None, cmap: str | None = None, **kwargs) -> Figure
Create a biplot for ordination results.
For unconstrained ordination (PCA, CA, LDA), species loadings are drawn as arrows from the origin. For constrained ordination (RDA / CCA), species are shown as points and environmental variables as arrows.
- Parameters:
result (OrdinationResult) – Ordination result object.
axes (tuple of int) – Which ordination axes to plot (0-indexed).
scaling (str or int) – Scaling mode: 1/”sites”, 2/”species”, 3/”symmetric”.
correlation (bool) – If True, use raw correlation values without auto-scaling.
title (str, optional) – Plot title.
arrow_mul (float, optional) – Manual multiplier for arrow length.
n_species (int or None) – Show only the top n_species by loading magnitude. None shows all species.
show_site_labels (bool) – Whether to display site name labels.
show_species_labels (bool) – Whether to display species name labels.
repel (bool) – Use adjustText for ggrepel-style label placement.
fontsize (int) – Base font size for labels.
site_kw (dict, optional) – Extra keyword arguments for site scatter points.
species_kw (dict, optional) – Extra keyword arguments for species scatter/arrows.
env_kw (dict, optional) – Extra keyword arguments for environmental arrows.
groups (array-like, optional) – Categorical group labels (one per site) for coloured scatter. Auto-detected from LDA results.
color_by (array-like, optional) – Continuous values (one per site) for colour-mapped scatter with a colourbar. Mutually exclusive with groups.
cmap (str, optional) – Matplotlib colormap name. Default is the colour cycle for groups (up to 10) and "viridis" for color_by.
**kwargs – Additional keyword arguments passed to site scatter.
- nuee.ordiplot(result: OrdinationResult, axes: Tuple[int, int] = (0, 1), display: str = 'sites', figsize: Tuple[int, int] = (8, 6), scaling: int | str | None = None, **kwargs) -> Figure
Basic ordination plot.
- Parameters:
result – OrdinationResult object
axes – Which axes to plot
display – What to display (“sites”, “species”, “both”)
figsize – Figure size
**kwargs – Additional plotting arguments
- Returns:
matplotlib Figure object
- nuee.plot_rarecurve(rarecurve_data: Dict[str, Dict[str, ndarray]], figsize: tuple = (10, 6), **kwargs) -> Figure
Plot rarefaction curves.
- Parameters:
rarecurve_data – Rarefaction curve data
figsize – Figure size
**kwargs – Additional plotting arguments
- Returns:
matplotlib Figure object
- nuee.plot_specaccum(specaccum_data: Dict[str, ndarray], figsize: tuple = (8, 6), **kwargs) -> Figure
Plot species accumulation curves.
- Parameters:
specaccum_data – Species accumulation data
figsize – Figure size
**kwargs – Additional plotting arguments
- Returns:
matplotlib Figure object
- nuee.ordiellipse(result: OrdinationResult, groups: ndarray | Series, axes: Tuple[int, int] = (0, 1), conf: float = 0.95, figsize: Tuple[int, int] = (8, 6), scaling: int | str | None = None, **kwargs) -> Figure
Add confidence ellipses to ordination plot.
- Parameters:
result – OrdinationResult object
groups – Grouping factor
axes – Which axes to plot
conf – Confidence level for ellipses
figsize – Figure size
**kwargs – Additional plotting arguments
- Returns:
matplotlib Figure object
- nuee.ordispider(result: OrdinationResult, groups: ndarray | Series, axes: Tuple[int, int] = (0, 1), figsize: Tuple[int, int] = (8, 6), scaling: int | str | None = None, **kwargs) -> Figure
Add spider plots (lines from centroid to points) to ordination.
- Parameters:
result – OrdinationResult object
groups – Grouping factor
axes – Which axes to plot
figsize – Figure size
**kwargs – Additional plotting arguments
- Returns:
matplotlib Figure object
- nuee.plot_betadisper(betadisper_result: dict, figsize: tuple = (8, 6), **kwargs) -> Figure
Plot beta dispersion results.
- Parameters:
betadisper_result – Beta dispersion results
figsize – Figure size
**kwargs – Additional plotting arguments
- Returns:
matplotlib Figure object
- nuee.closure(mat: ndarray, *, out: ndarray | None = None) -> ndarray
Perform closure so that each composition sums to 1.
- Parameters:
mat – A matrix where rows are compositions and columns are components.
out – Optional array where the result is stored.
- Returns:
Matrix of proportions with non-negative entries that sum to 1 per row.
- Return type:
ndarray
- nuee.multiplicative_replacement(mat: ndarray, delta: float | None = None) -> ndarray
Replace structural zeros with a small non-zero value.
- nuee.power(x: ndarray, a: float) -> ndarray
Raise each component to a power and renormalise via closure.
- nuee.clr(mat: ndarray, ignore_zero: bool = False) -> ndarray
Compute the centred log-ratio transformation.
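As a brief illustration of closure and the centred log-ratio, the sketch below implements both for strictly positive data: closure divides each row by its sum, and CLR subtracts the row mean of the logs (the log of the geometric mean). The names closure_sketch and clr_sketch are hypothetical, and zero handling (as offered by ignore_zero in nuee.clr) is deliberately left out.

```python
import numpy as np


def closure_sketch(mat):
    """Rescale each row so that it sums to 1."""
    mat = np.atleast_2d(np.asarray(mat, dtype=float))
    return mat / mat.sum(axis=1, keepdims=True)


def clr_sketch(mat):
    """Centred log-ratio: log proportions minus their row mean."""
    logp = np.log(closure_sketch(mat))  # requires strictly positive entries
    return logp - logp.mean(axis=1, keepdims=True)


comp = np.array([[1.0, 2.0, 7.0],
                 [4.0, 4.0, 2.0]])
print(closure_sketch(comp))             # each row sums to 1
print(clr_sketch(comp).sum(axis=1))     # each row sums to (numerically) 0
```

Because CLR depends only on ratios between components, multiplying a row by any positive constant leaves its CLR coordinates unchanged.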
- nuee.inner(mat: ndarray) -> ndarray
Compute the inner product matrix in the Aitchison simplex.
- nuee.ilr(mat: ndarray, basis: ndarray | None = None) -> ndarray
Perform the isometric log-ratio transformation.
- nuee.ilr_inv(mat: ndarray, basis: ndarray | None = None) -> ndarray
Inverse isometric log-ratio transformation.
- nuee.alr(mat: ndarray, denominator_idx: int = -1) -> ndarray
Perform the additive log-ratio transformation.
- nuee.alr_inv(mat: ndarray, denominator_idx: int = -1) -> ndarray
Inverse additive log-ratio transformation.
- nuee.sbp_basis(sbp: ndarray) -> ndarray
Construct an orthonormal basis from a sequential binary partition.
- nuee.center(mat: ndarray) -> ndarray
Alias for centralize() kept for API parity.
- nuee.replace_zeros(X: ndarray | DataFrame, detection_limits: ndarray | None = None, delta: float | None = None) -> ndarray | DataFrame
Multiplicative zero replacement for compositional data.
Replaces zeros with a small value proportional to the detection limit (or column minimum of non-zero values) and adjusts non-zero entries so that each row sum is preserved exactly.
- Parameters:
X (array-like or DataFrame, shape (n, D)) – Compositional data matrix. Zeros mark below-detection-limit values.
detection_limits (array-like of shape (D,), optional) – Per-component detection limits. When None, the column-wise minimum of strictly positive values is used as a proxy.
delta (float, optional) – Fraction of the detection limit used as the replacement value. Default is 0.65 (Martín-Fernández et al. 2003).
- Returns:
Data with zeros replaced. Row sums match the input exactly.
- Return type:
numpy.ndarray or DataFrame
References
[1] Martín-Fernández, J. A., Barceló-Vidal, C. & Pawlowsky-Glahn, V. (2003). Dealing with zeros and missing values in compositional data sets using nonparametric imputation. Mathematical Geology 35(3).
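The replacement rule described for replace_zeros can be sketched in a few lines of NumPy: each zero becomes delta times the detection limit of its column, and the non-zero entries of the same row are shrunk by a common factor so the row sum is unchanged. The name replace_zeros_sketch is hypothetical, and the code assumes every column has at least one positive value; it is an illustration rather than nuee's implementation.

```python
import numpy as np


def replace_zeros_sketch(X, detection_limits=None, delta=0.65):
    """Multiplicative zero replacement that preserves row sums."""
    X = np.asarray(X, dtype=float)
    if detection_limits is None:
        # Proxy: smallest strictly positive value in each column
        detection_limits = np.where(X > 0, X, np.inf).min(axis=0)
    fill = delta * np.asarray(detection_limits, dtype=float)
    zeros = X == 0
    row_sums = X.sum(axis=1, keepdims=True)
    # Total mass added to each row by the replacement values
    added = (zeros * fill).sum(axis=1, keepdims=True)
    # Shrink the non-zero entries so that each row keeps its original sum
    return np.where(zeros, fill, X * (1.0 - added / row_sums))


data = np.array([[0.0, 2.0, 8.0],
                 [1.0, 0.0, 9.0],
                 [3.0, 3.0, 4.0]])
filled = replace_zeros_sketch(data)
print(filled)
print(filled.sum(axis=1))  # row sums match the input
```

Rows without zeros pass through unchanged, since no mass is added to them.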
- nuee.impute_missing(X: ndarray | DataFrame, method: str = 'lrEM', max_iter: int = 100, tol: float = 0.0001, random_state: int | None = None) -> ndarray | DataFrame
Impute missing values in compositional data using the lrEM algorithm.
Uses the ALR (additive log-ratio) EM algorithm of Palarea-Albaladejo & Martín-Fernández (2008), matching the approach in R’s zCompositions package. Observed values are preserved exactly in the output.
Ideally one column should be fully observed (no NaN values) to serve as the ALR denominator. When no column is complete, the column with the fewest missing values is chosen and its gaps are pre-filled using row-proportional estimation from column-mean ratios before running EM.
- Parameters:
X (array-like or DataFrame, shape (n, D)) – Compositional data matrix. NaN marks missing components. Observed (non-NaN) values must be strictly positive.
method ({"lrEM", "lrDA"}, default "lrEM") – "lrEM" returns the conditional expectation (deterministic). "lrDA" adds noise from the conditional covariance for multiple imputation / data augmentation.
max_iter (int, default 100) – Maximum number of EM iterations.
tol (float, default 1e-4) – Convergence tolerance on the relative change of the log-likelihood.
random_state (int, optional) – Seed for the random number generator (only used when method="lrDA").
- Returns:
Completed data. Observed values are unchanged; imputed values are scaled consistently with the original observed components.
- Return type:
numpy.ndarray or DataFrame
References
[1] Palarea-Albaladejo, J. & Martín-Fernández, J. A. (2008). A modified EM alr-algorithm for replacing rounded zeros in compositional data sets. Computers & Geosciences 34(8).