MNM_Hurdle: Fit a Multi-Species N-Mixture Model with Hurdle Component using Nimble

Description

This function fits a multi-species N-mixture (MNM) model incorporating a Hurdle component to handle zero-inflated data, allowing for robust estimation of abundance and detection probabilities across multiple species and sites.

Usage

MNM_Hurdle(
  Y = NULL,
  iterations = 60000,
  burnin = 20000,
  thin = 10,
  Xp = NULL,
  Xn = NULL,
  verbose = TRUE,
  ...
)

Value

An MNM object that contains the following components:

summary: Nimble model summary statistics (mean, standard deviation, standard error, quantiles, effective sample size and Rhat value for all monitored values)
n_parameters: Number of parameters in the model (for use in calculating information criteria).
data: Observed abundances.
fitted_Y: Predicted values for Y. Posterior predictive checks can be performed by comparing fitted_Y with the observed data.
logLik: Log-likelihood of the observed data (Y) given the model parameters.
n_converged: Number of parameters with successful convergence (Rhat < 1.1).
plot: Traceplots and density plots for all monitored variables.
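The Rhat < 1.1 rule used for n_converged can be illustrated with a basic Gelman-Rubin calculation. This is a minimal sketch in base R: the helper rhat_basic() and the simulated chains are hypothetical, and the Rhat values reported in the summary may be computed with a different variant (for example split-Rhat).

```r
# Basic (non-split) Gelman-Rubin statistic for a single parameter.
# `chains` is a hypothetical matrix: rows = iterations, columns = chains.
rhat_basic <- function(chains) {
  n <- nrow(chains)
  chain_means <- colMeans(chains)
  B <- n * var(chain_means)            # between-chain variance
  W <- mean(apply(chains, 2, var))     # mean within-chain variance
  var_hat <- (n - 1) / n * W + B / n   # pooled variance estimate
  sqrt(var_hat / W)
}

set.seed(1)
mixed <- matrix(rnorm(4000), ncol = 4)            # four chains, same target
stuck <- mixed + rep(c(0, 0, 0, 3), each = 1000)  # fourth chain shifted by 3

rhat_basic(mixed) < 1.1  # chains agree
rhat_basic(stuck) < 1.1  # one chain sits elsewhere
```

A parameter whose chains overlap passes the threshold, while a chain stuck in a different region inflates the between-chain variance and pushes Rhat above 1.1.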

Arguments

Y

Array of observed counts, with dimensions (R, T, S), where:

R: Number of sites.
T: Number of repeated counts (replicates).
S: Number of species.
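As a rough illustration of this layout, the sketch below builds a simulated count array indexed by site, replicate and species. The sizes and the Poisson rate are arbitrary assumptions, and n_rep stands in for T to avoid masking R's shorthand for TRUE.

```r
set.seed(42)
R <- 10      # sites (assumed)
n_rep <- 5   # replicate counts per site, i.e. T (assumed)
S <- 2       # species (assumed)

# Simulated counts; real data would come from field surveys.
Y <- array(rpois(R * n_rep * S, lambda = 5), dim = c(R, n_rep, S))

dim(Y)      # site x replicate x species
Y[1, , 2]   # all replicate counts at site 1 for species 2
```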

iterations

Integer. Number of MCMC iterations to be used in the Nimble model. Defaults to 60,000.

burnin

Integer. Number of iterations to be discarded as burn-in. Defaults to 20,000.

thin

Integer. Thinning interval for the MCMC chains. Defaults to 10.
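Taken together, these three arguments determine how many posterior draws are kept per chain. The arithmetic below is a sketch that assumes the values are forwarded to Nimble in the usual way, with burn-in discarded before thinning is applied.

```r
iterations <- 60000
burnin <- 20000
thin <- 10

# Draws retained per chain after burn-in and thinning.
retained_per_chain <- floor((iterations - burnin) / thin)
retained_per_chain  # 4000
```

Shorter runs, such as iterations = 5000 and burnin = 1000, keep far fewer draws and are better suited to quick checks than to final inference.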

Xp

Array of detection covariates with dimensions (R, S, P1), where:

R: Number of sites.
S: Number of species.
P1: Number of detection probability covariates.

Xn

Array of abundance covariates with dimensions (R, S, P2), where:

R: Number of sites.
S: Number of species.
P2: Number of abundance covariates.
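When covariates are measured per site rather than per site and species, one way to reach the (R, S, P2) layout is to copy the site-level matrix across the species dimension. This is only a sketch: site_cov and the sizes are hypothetical, and standardizing with scale() is a common convention rather than a documented requirement of this function.

```r
set.seed(7)
R <- 10    # sites (assumed)
S <- 2     # species (assumed)
P2 <- 3    # abundance covariates (assumed)

# Hypothetical site-level covariates, standardized column by column.
site_cov <- scale(matrix(rnorm(R * P2), nrow = R, ncol = P2))

# Repeat the same R x P2 matrix for every species.
Xn <- array(NA_real_, dim = c(R, S, P2))
for (s in seq_len(S)) {
  Xn[, s, ] <- site_cov
}

dim(Xn)
```

The same pattern applies to Xp with P1 detection covariates.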

verbose

Logical. Controls the level of output displayed during function execution. Defaults to TRUE.

...

Additional arguments passed for prior distribution specification. Supported distributions include dnorm, dexp, dgamma, dbeta, dunif, dlnorm, dbern, dpois, dbinom, dcat, dmnorm, dwish, dchisq, dinvgamma, dt, dweib, ddirch, dmulti, dmvt. Default prior distributions are:

prior_detection_probability: prior distribution for the detection probability intercept (gamma). Default is 'dnorm(0, 0.001)'.
prior_precision: prior distribution for the precision matrix for the species-level random effect. Default is 'dwish(Omega[1:S,1:S], df)'.
prior_mean: prior distribution for the mean of the species-level random effect (mu). Default is 'dnorm(0,0.001)'.
prior_hurdle: prior distribution for theta, the probability of structural zero in hurdle models. Default is 'dbeta(1,1)'.
prior_mean_AR: prior distribution for the mean of the autoregressive random effect (phi). Default is 'dnorm(0,0.001)'.
prior_sd_AR: prior distribution for the standard deviation of the autoregressive random effect (phi). Default is 'dexp(1)'.

See Nimble (r-nimble.org) documentation for distribution details.
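A sketch of how prior overrides might be passed through ... is shown below. The prior strings are arbitrary illustrations, not recommendations, and the call is guarded so that it only runs when MNM_Hurdle is available in the session.

```r
# Hypothetical prior overrides, written as Nimble distribution strings.
custom_priors <- list(
  prior_detection_probability = "dnorm(0, 0.01)",
  prior_hurdle = "dbeta(2, 2)"
)

# Only attempt the fit if the function has been loaded.
if (exists("MNM_Hurdle")) {
  Y <- array(rpois(100, lambda = 5), dim = c(10, 5, 2))
  fit_args <- c(list(Y = Y, iterations = 5000, burnin = 1000), custom_priors)
  model <- do.call(MNM_Hurdle, fit_args)
}
```

Any prior not named in the call keeps the default listed above.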

Details

This function uses the Nimble framework to fit a Hurdle model, which combines a truncated Poisson distribution for non-zero counts with a separate process for modeling zero counts. The model is particularly suitable for ecological data with excess zeros, such as species occurrence data.
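The data-generating idea behind a hurdle N-mixture model can be sketched with a short base R simulation for a single species. The parameter values and the helper rztpois() are assumptions made for illustration, and the exact likelihood used by this function may differ in detail.

```r
set.seed(123)
R <- 200       # sites (assumed)
n_rep <- 4     # replicate counts per site (assumed)
theta <- 0.3   # probability of a structural zero (assumed)
lambda <- 4    # Poisson rate before truncation (assumed)
p <- 0.6       # detection probability (assumed)

# Zero-truncated Poisson via the inverse-CDF method: draws are always >= 1.
rztpois <- function(n, lambda) {
  qpois(runif(n, min = dpois(0, lambda), max = 1), lambda)
}

# Hurdle step: a site is either a structural zero or has positive abundance.
structural_zero <- rbinom(R, 1, theta) == 1
N <- ifelse(structural_zero, 0, rztpois(R, lambda))

# Observation step: each replicate count is a binomial draw from N.
Y <- matrix(rbinom(R * n_rep, size = rep(N, n_rep), prob = p), nrow = R)

mean(N == 0)   # share of structural zeros, close to theta
mean(Y == 0)   # observed zeros also include missed detections
```

In this sketch all zeros in the latent abundance come from the hurdle step, while the observed counts contain additional zeros from imperfect detection.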

The model supports covariates influencing both abundance and detection probabilities, and outputs posterior distributions for model parameters, derived quantities, and predicted values. Convergence diagnostics and posterior predictive checks can also be performed using the returned results.

References

Royle, J. A. (2004). N-mixture models for estimating population size from spatially replicated counts. Biometrics, 60(1), 108-115.
Mimnagh, N., Parnell, A., Prado, E., & Moral, R. D. A. (2022). Bayesian multi-species N-mixture models for unmarked animal communities. Environmental and Ecological Statistics, 29(4), 755-778.

Examples


# Example 1:
Y <- array(rpois(100, lambda = 5), dim = c(10, 5, 2))
Xp <- array(runif(100), dim = c(10, 2, 5))
Xn <- array(runif(60), dim = c(10, 2, 3))

model <- MNM_Hurdle(Y = Y, Xp = Xp, Xn = Xn)
# nimble creates auxiliary functions that may be removed after model
# run is complete using rm(list=ls(pattern = "^str"))
# Accessing results
print(model@summary)

# Example 2: North American Breeding Bird Data
data(birds)
# Data must first be reformatted to an array of dimension (R,T,S,K)
R <- 15
T <- 10
S <- 10
K <- 4
# Ensure data is ordered consistently
birds <- birds[order(birds$Route, birds$Year, birds$English_Common_Name), ]

# Create a 4D array with proper dimension
Y <- array(NA, dim = c(R, T, S, K))

# Map route, species, and year to indices
route_idx <- as.numeric(factor(birds$Route))
species_idx <- as.numeric(factor(birds$English_Common_Name))
year_idx <- as.numeric(factor(birds$Year))

# Populate the array
stop_data <- as.matrix(birds[, grep("^Stop", colnames(birds))])

for (i in seq_len(nrow(birds))) {
  Y[route_idx[i], , species_idx[i], year_idx[i]] <- stop_data[i, ]
}

# Assign dimnames
dimnames(Y) <- list(
  Route = sort(unique(birds$Route)),
  Stop = paste0("Stop", 1:T),
  Species = sort(unique(birds$English_Common_Name)),
  Year = sort(unique(birds$Year))
)

# Select only 5 bird species and 1 year for analysis
Y <- Y[, , 1:5, 1]

model <- MNM_fit(Y = Y, AR = FALSE, Hurdle = TRUE, iterations = 5000, burnin = 1000)
