MNM: Fit a Multi-Species N-Mixture Model (MNM) using Nimble

Description

Fits a multi-species N-mixture (MNM) model to observed count data, leveraging Nimble for Bayesian inference. This model accounts for variation in detection probability and abundance across multiple species and sites.

Usage

MNM(
  Y = NULL,
  iterations = 60000,
  burnin = 20000,
  thin = 10,
  Xp = NULL,
  Xn = NULL,
  verbose = TRUE,
  ...
)

Value

An MNM object that contains the following components:

summary: Nimble model summary (mean, standard deviation, standard error, quantiles, effective sample size and Rhat value for all monitored values)
n_parameters: Number of parameters in the model (for use in calculating information criteria)
data: Observed abundances
fitted_Y: Predicted values for Y. Posterior predictive checks can be performed by comparing fitted_Y with the observed data.
logLik: Log-likelihood of the observed data (Y) given the model parameters.
n_converged: Number of parameters with successful convergence (Rhat < 1.1).
plot: traceplots and density plots for all monitored variables.

Arguments

Y

An array of observed count data of dimension (R, T, S), where:

R: Number of sites
T: Number of replicates
S: Number of species

This array is typically produced by the simulateData function.

iterations

Integer. Total number of MCMC iterations to run. Default is 60,000.

burnin

Integer. Number of initial MCMC iterations to discard as burn-in. Default is 20,000.

thin

Integer. Thinning interval for MCMC samples to reduce autocorrelation. Default is 10.

Xp

An array of covariates affecting detection probability, with dimensions (R, S, P1), where:

R: Number of sites
S: Number of species
P1: Number of detection-related covariates

See examples for implementation details.

Xn

An array of covariates affecting abundance, with dimensions (R, S, P2), where:

R: Number of sites
S: Number of species
P1: Number of abundance-related covariates

See examples for implementation details.

verbose

Control the level of output displayed during function execution. Default is TRUE.

...

Additional arguments passed for prior distribution specification. Supported distributions include dnorm, dexp, dgamma, dbeta, dunif, dlnorm, dbern, dpois, dbinom, dcat, dmnorm, dwish, dchisq, dinvgamma, dt, dweib, ddirch, dmulti, dmvt. Default prior distributions are:

prior_detection_probability: prior distribution for the detection probability intercept (gamma). Default is 'dnorm(0, 0.001)'.
prior_precision: prior distribution for the precision matrix for the species-level random effect. Default is 'dwish(Omega[1:S,1:S], df)'.
prior_mean: prior distribution for the mean of the species-level random effect (mu). Default is 'dnorm(0,0.001)'.
prior_hurdle: prior distribution for theta, the probability of structural zero in hurdle models. Default is 'dbeta(1,1)'.
prior_mean_AR: prior distribution for the mean of the autoregressive random effect (phi). Default is 'dnorm(0,0.001)'.
prior_sd_AR: prior distribution for the standard deviation of the autoregressive random effect (phi). Default is 'dexp(1)'.

See Nimble (r-nimble.org) documentation for distribution details.

Details

This function takes observed count data and covariates, then fits an MNM model using Nimble. The model estimates species-specific detection probabilities and abundances, allowing for covariate effects. The function also supports posterior predictive checks by comparing observed counts with predicted values.

References

Royle, J. A. (2004). N-mixture models for estimating population size from spatially replicated counts. Biometrics, 60(1), 108-115.
Mimnagh, N., Parnell, A., Prado, E., & Moral, R. D. A. (2022). Bayesian multi-species N-mixture models for unmarked animal communities. Environmental and Ecological Statistics, 29(4), 755-778.

Examples

Run this code

# Example 1:
# Covariates must be of dimension (R, S, P1/P2). If covariates of an alternative dimension are used,
#  they must first be coerced into the right format.
# If we have two abundance-covariates, one site-level covariate and one species-level
# covariate, they may be combined as follows:
R <- 10  # Number of sites
S <- 5   # Number of species
T<-5
Y <- array(sample(0:10, 100, replace = TRUE), dim = c(R, T, S))
covariate_1 <- runif(R)  # Site-level covariate
covariate_2 <- runif(S)  # Species-level covariate

# Expand covariate_1 to have S columns
expanded_covariate_1 <- matrix(rep(covariate_1, S), nrow = R, ncol = S)
# Expand covariate_2 to have R rows
expanded_covariate_2 <- t(matrix(rep(covariate_2, R), nrow = S, ncol = R))
# Combine into an array of dimensions (R, S, 2)
Xn <- array(c(expanded_covariate_1, expanded_covariate_2), dim = c(R, S, 2))
dim(Xn) # this is now in the correct format and can be used.
result <- MNM(Y, Xn = Xn)
# nimble creates auxiliary functions that may be removed after model
# run is complete using rm(list = ls(pattern = "^str"))
print(result@summary)
 data(birds)
# Example 2: North American Breeding Bird Data
# Data must first be reformatted to an array of dimension (R,T,S,K)
R <- 15
T <- 10
S <- 10
K <- 4
# Ensure data is ordered consistently
birds <- birds[order(birds$Route, birds$Year, birds$English_Common_Name), ]
# Create a 4D array with proper dimension
Y <- array(NA, dim = c(R, T, S, K))
# Map route, species, and year to indices
route_idx <- as.numeric(factor(birds$Route))
species_idx <- as.numeric(factor(birds$English_Common_Name))
year_idx <- as.numeric(factor(birds$Year))
# Populate the array
stop_data <- as.matrix(birds[, grep("^Stop", colnames(birds))])
for (i in seq_len(nrow(birds))) {
  Y[route_idx[i], , species_idx[i], year_idx[i]] <- stop_data[i, ]
  }
  # Assign dimnames
  dimnames(Y) <- list(
    Route = sort(unique(birds$Route)),
      Stop = paste0("Stop", 1:T),
        Species = sort(unique(birds$English_Common_Name)),
          Year = sort(unique(birds$Year))
          )
# Selecting only 5 bird species and 1 year for analysis:
Y<-Y[,,1:5,1]
model<-MNM_fit(Y=Y, AR=FALSE, Hurdle=FALSE, iterations=5000, burnin=1000)

Run the code above in your browser using DataLab