MNM_Hurdle_AR: Fit a multi-species N-mixture (MNM) Hurdle-AR model using Nimble

Description

Fits a multi-species N-mixture model (MNM) with an autoregressive (AR-1) component and a Hurdle (zero-altered) component using Nimble. This model is suitable for zero-inflated data and data collected over extended time periods.

Usage

MNM_Hurdle_AR(
  Y = NULL,
  iterations = 60000,
  burnin = 20000,
  thin = 10,
  Xp = NULL,
  Xn = NULL,
  verbose = TRUE,
  ...
)

Value

An object of class MNM with the following components:

summary: Summary statistics for monitored parameters, including mean, standard deviation, standard error, quantiles, effective sample size, and Rhat values.
n_parameters: Number of parameters in the model (useful for calculating information criteria).
data: Observed abundances (Y).
fitted_Y: Predicted values for Y. Use these for posterior predictive checks by comparing them with observed data.
logLik: Log-likelihood of the observed data (Y) given the model parameters.
n_converged: Number of parameters with successful convergence (Rhat < 1.1).
plot: traceplots and density plots for all monitored variables.

Arguments

Y

Array of observed counts with dimensions (R, T, S, K), where:

R: Number of sites.
T: Number of replicates.
S: Number of species.
K: Number of time periods.

iterations

Number of MCMC iterations for model fitting. Default is 60,000.

burnin

Number of initial iterations to discard as burn-in. Default is 20,000.

thin

Thinning interval for the MCMC chains. Default is 10.

Xp

Array of detection covariates with dimensions (R, S, P1), where:

R: Number of sites.
S: Number of species.
P1: Number of detection probability covariates.

Xn

Array of abundance covariates with dimensions (R, S, K, P2), where:

R: Number of sites.
S: Number of species.
K: Number of time points.
P2: Number of abundance covariates.

verbose

Control the level of output displayed during function execution. Default is TRUE.

...

Additional arguments passed for prior distribution specification. Supported distributions include dnorm, dexp, dgamma, dbeta, dunif, dlnorm, dbern, dpois, dbinom, dcat, dmnorm, dwish, dchisq, dinvgamma, dt, dweib, ddirch, dmulti, dmvt. Default prior distributions are:

prior_detection_probability: prior distribution for the detection probability intercept (gamma). Default is 'dnorm(0, 0.001)'.
prior_precision: prior distribution for the precision matrix for the species-level random effect. Default is 'dwish(Omega[1:S,1:S], df)'.
prior_mean: prior distribution for the mean of the species-level random effect (mu). Default is 'dnorm(0,0.001)'.
prior_hurdle: prior distribution for theta, the probability of structural zero in hurdle models. Default is 'dbeta(1,1)'.
prior_mean_AR: prior distribution for the mean of the autoregressive random effect (phi). Default is 'dnorm(0,0.001)'.
prior_sd_AR: prior distribution for the standard deviation of the autoregressive random effect (phi). Default is 'dexp(1)'.

See Nimble (r-nimble.org) documentation for distribution details.

Details

The MNM_Hurdle_AR model extends the standard N-mixture model by incorporating:

Hurdle (zero-altered) component: Handles zero-inflated data by modelling excess zeros separately.
Autoregressive (AR-1) component: Accounts for temporal dependencies in abundance data.

The model is fitted to data formatted as produced by the simulateData function. Covariates affecting detection probability and abundance may also be provided. Results include posterior summaries, model diagnostics, and predictions for posterior predictive checks.

References

Royle, J. A. (2004). N-mixture models for estimating population size from spatially replicated counts. Biometrics, 60(1), 108-115.
Mimnagh, N., Parnell, A., Prado, E., & Moral, R. D. A. (2022). Bayesian multi-species N-mixture models for unmarked animal communities. Environmental and Ecological Statistics, 29(4), 755-778.

Examples

Run this code

# Example 1: Simulate data and fit the model
# Simulating example data
set.seed(42)
R <- 5  # Number of sites
T <- 10  # Number of replicates
S <- 3  # Number of species
K <- 2  # Number of time periods
P1 <- 2  # Number of detection covariates
P2 <- 3  # Number of abundance covariates

x<-simulateData(model="HurdleAR", R=R, T=T, ,S=S, K=K)
Xp <- array(runif(R * S * K * P1), dim = c(R, S, K, P1))
Xn <- array(runif(R * S * K * P2), dim = c(R, S, K, P2))
# Fit the MNM_Hurdle_AR model
result <- MNM_Hurdle_AR(Y = x[["Y"]], Xp = Xp, Xn = Xn)
# nimble creates auxiliary functions that may be removed after model run
# is complete using rm(list = ls(pattern = "^str"))
# Access results
print(result@summary)

data(birds)

# Example 2: North American Breeding Bird Data
# Data must first be reformatted to an array of dimension (R,T,S,K)
R <- 15
T <- 10
S <- 10
K <- 4
# Ensure data is ordered consistently
birds <- birds[order(birds$Route, birds$Year, birds$English_Common_Name), ]

# Create a 4D array with proper dimension
Y <- array(NA, dim = c(R, T, S, K))

# Map route, species, and year to indices
route_idx <- as.numeric(factor(birds$Route))
species_idx <- as.numeric(factor(birds$English_Common_Name))
year_idx <- as.numeric(factor(birds$Year))

# Populate the array
stop_data <- as.matrix(birds[, grep("^Stop", colnames(birds))])

for (i in seq_len(nrow(birds))) {
  Y[route_idx[i], , species_idx[i], year_idx[i]] <- stop_data[i, ]
  }

  # Assign dimnames
  dimnames(Y) <- list(
    Route = sort(unique(birds$Route)),
      Stop = paste0("Stop", 1:T),
        Species = sort(unique(birds$English_Common_Name)),
          Year = sort(unique(birds$Year))
          )

# Selecting only 5 bird species for analysis:
Y<-Y[,,1:5,]

model<-MNM_fit(Y=Y, AR=TRUE, Hurdle=TRUE, iterations=5000, burnin=1000)

Run the code above in your browser using DataLab