Function mpm_create() is the core workhorse function that creates all flavors of MPM in lefko3. All other MPM creation functions act as wrappers for this function. As such, this function provides the most general and most detailed control over the MPM creation process.
mpm_create(
historical = FALSE,
stage = TRUE,
age = FALSE,
devries = FALSE,
reduce = FALSE,
simple = FALSE,
err_check = FALSE,
data = NULL,
year = NULL,
pop = NULL,
patch = NULL,
stageframe = NULL,
supplement = NULL,
overwrite = NULL,
repmatrix = NULL,
alive = NULL,
obsst = NULL,
size = NULL,
sizeb = NULL,
sizec = NULL,
repst = NULL,
matst = NULL,
fec = NULL,
stages = NULL,
yearcol = NULL,
popcol = NULL,
patchcol = NULL,
indivcol = NULL,
agecol = NULL,
censorcol = NULL,
modelsuite = NULL,
paramnames = NULL,
inda = NULL,
indb = NULL,
indc = NULL,
dev_terms = NULL,
density = NA_real_,
CDF = TRUE,
random_inda = FALSE,
random_indb = FALSE,
random_indc = FALSE,
negfec = FALSE,
exp_tol = 700L,
theta_tol = 100000000L,
censor = FALSE,
censorkeep = NULL,
start_age = NA_integer_,
last_age = NA_integer_,
fecage_min = NA_integer_,
fecage_max = NA_integer_,
fectime = 2L,
fecmod = 1,
cont = TRUE,
prebreeding = TRUE,
stage_NRasRep = FALSE,
sparse_output = FALSE
)
An object of class lefkoMat. This is a list that holds the matrix projection model and all of its metadata. The structure has the following elements:
A: A list of full projection matrices in order of sorted patches and occasion times. All matrices are output in R's matrix class, or in the dgCMatrix class from the Matrix package if sparse.
U: A list of survival transition matrices sorted as in A. All matrices are output in R's matrix class, or in the dgCMatrix class from the Matrix package if sparse.
F: A list of fecundity matrices sorted as in A. All matrices are output in R's matrix class, or in the dgCMatrix class from the Matrix package if sparse.
hstages: A data frame showing the pairing of ahistorical stages used to create historical stage pairs. Only used in historical MPMs.
agestages: A data frame showing age-stage pairs. Only used in age-by-stage MPMs.
ahstages: A data frame detailing the characteristics of associated ahistorical stages, in the form of a modified stageframe that includes status as an entry stage through reproduction. Used in all stage-based and age-by-stage MPMs.
labels: A data frame giving the population, patch, and year of each matrix in order.
dataqc: A vector showing the numbers of individuals and rows in the vertical dataset used as input.
matrixqc: A short vector describing the number of non-zero elements in U and F matrices, and the number of annual matrices.
modelqc: This is the qc portion of the modelsuite input.
prob_out: An optional element only added if err_check = TRUE. This is a list of vital rate probability matrices, with 7 columns in the order of survival, observation probability, reproduction probability, primary size transition probability, secondary size transition probability, tertiary size transition probability, and probability of juvenile transition to maturity.
allstages: An optional element only added if err_check = TRUE. This is a data frame giving the values used to determine each matrix element capable of being estimated.
data: An optional element only added if err_check = TRUE and a raw MPM is requested. This consists of the original dataset as edited by this function for indexing purposes.
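As a rough illustration of how the main elements relate to one another, the sketch below builds a toy list in base R that mimics only the A, U, and F elements of a lefkoMat object for a single year. The stage names and transition values are invented, and the non-zero counts are computed by hand here in the spirit of the matrixqc element rather than taken from lefko3 itself.

```r
# Toy 3-stage example; all values are made up for illustration.
stages <- c("Sdl", "Juv", "Ad")

# Survival-transition matrix (columns = stage in time t, rows = stage in t+1).
Umat <- matrix(c(0.0, 0.0, 0.0,
                 0.3, 0.4, 0.0,
                 0.0, 0.5, 0.8),
               nrow = 3, byrow = TRUE, dimnames = list(stages, stages))

# Fecundity matrix: in this toy example only adults produce seedlings.
Fmat <- matrix(0, nrow = 3, ncol = 3, dimnames = list(stages, stages))
Fmat["Sdl", "Ad"] <- 2.5

# The full projection matrix is the sum of the survival and fecundity parts.
Amat <- Umat + Fmat

# A plain list mimicking part of a lefkoMat object (one matrix per element).
toy_mpm <- list(A = list(Amat), U = list(Umat), F = list(Fmat))

# Counts of non-zero elements in U and F, and the number of matrices.
nonzero_U <- sum(sapply(toy_mpm$U, function(m) sum(m != 0)))
nonzero_F <- sum(sapply(toy_mpm$F, function(m) sum(m != 0)))
n_matrices <- length(toy_mpm$A)
print(c(U = nonzero_U, F = nonzero_F, matrices = n_matrices))
```

With a real lefkoMat object the same elements are accessed in the same way, for example with `$A[[1]]` for the first full projection matrix.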
historical: A logical value indicating whether to build a historical MPM. Defaults to FALSE.
stage: A logical value indicating whether to build a stage-based MPM. If both stage = TRUE and age = TRUE, then an age-by-stage MPM is built. Defaults to TRUE.
age: A logical value indicating whether to build an age-based MPM. If both stage = TRUE and age = TRUE, then an age-by-stage MPM is built. Defaults to FALSE.
devries: A logical value indicating whether to use deVries format for historical MPMs. Defaults to FALSE, in which case historical MPMs are created in Ehrlen format.
reduce: A logical value denoting whether to remove ages, ahistorical stages, or historical stages associated exclusively with zero transitions. These are removed only if the respective row and column sums in ALL matrices estimated equal 0. Defaults to FALSE.
simple: A logical value indicating whether to produce A, U, and F matrices, or only the latter two. Defaults to FALSE, in which case all three are output.
err_check: A logical value indicating whether to append extra information used in matrix calculation within the output list. Defaults to FALSE.
data: A data frame of class hfvdata. Required for all MPMs, except for function-based MPMs in which modelsuite is set to a vrm_input object.
year: A variable corresponding to observation occasion, or a set of such values, given in values associated with the year term used in vital rate model development. Can also equal "all", in which case matrices will be estimated for all occasions. Defaults to "all".
pop: A variable designating which populations will have matrices estimated. Should be set to specific population names, or to "all" if all populations should have matrices estimated. Only used in raw MPMs.
patch: A variable designating which patches or subpopulations will have matrices estimated. Should be set to specific patch names, or to "all" if matrices should be estimated for all patches. Defaults to NULL, in which case patch designations are ignored.
stageframe: An object of class stageframe. These objects are generated by function sf_create(), and include information on the size, observation status, propagule status, reproduction status, immaturity status, maturity status, stage group, size bin widths, and other key characteristics of each ahistorical stage. Not needed for purely age-based MPMs.
supplement: An optional data frame of class lefkoSD that provides supplemental data that should be incorporated into the MPM. Three kinds of data may be integrated this way: transitions to be estimated via the use of proxy transitions, transition overwrites from the literature or supplemental studies, and transition multipliers for survival and fecundity. This data frame should be produced using the supplemental() function. Can be used in place of or in addition to an overwrite table (see overwrite below) and a reproduction matrix (see repmatrix below).
overwrite: An optional data frame developed with the overwrite() function describing transitions to be overwritten either with given values or with other estimated transitions. Note that this function supplements overwrite data provided in supplement.
repmatrix: An optional reproduction matrix. This matrix is composed mostly of 0s, with non-zero entries acting as element identifiers and multipliers for fecundity (with 1 equaling full fecundity). If left blank, and no supplement is provided, then all stages marked as reproductive produce offspring at 1x estimated fecundity, and offspring production will yield the first stage noted as propagule or immature. May be the dimensions of either a historical or an ahistorical matrix. If the latter, then all stages will be used in occasion t-1 for each suggested ahistorical transition. Not used in purely age-based MPMs.
alive: A vector of names of binomial variables corresponding to status as alive (1) or dead (0) in occasions t+1, t, and t-1, respectively. Defaults to c("alive3", "alive2", "alive1") for historical MPMs, and c("alive3", "alive2") for ahistorical MPMs. Only needed for raw MPMs.
obsst: A vector of names of binomial variables corresponding to observation status in occasions t+1, t, and t-1, respectively. Defaults to c("obsstatus3", "obsstatus2", "obsstatus1") for historical MPMs, and c("obsstatus3", "obsstatus2") for ahistorical MPMs. Only needed for raw MPMs.
size: A vector of names of variables coding the primary size variable in occasions t+1, t, and t-1, respectively. Defaults to c("sizea3", "sizea2", "sizea1") for historical MPMs, and c("sizea3", "sizea2") for ahistorical MPMs. Only needed for raw, stage-based MPMs.
sizeb: A vector of names of variables coding the secondary size variable in occasions t+1, t, and t-1, respectively. Defaults to an empty set, assuming that secondary size is not used. Only needed for raw, stage-based MPMs.
sizec: A vector of names of variables coding the tertiary size variable in occasions t+1, t, and t-1, respectively. Defaults to an empty set, assuming that tertiary size is not used. Only needed for raw, stage-based MPMs.
repst: A vector of names of binomial variables corresponding to reproductive status in occasions t+1, t, and t-1, respectively. Defaults to c("repstatus3", "repstatus2", "repstatus1") for historical MPMs, and c("repstatus3", "repstatus2") for ahistorical MPMs. Only needed for raw MPMs.
matst: A vector of names of binomial variables corresponding to maturity status in occasions t+1, t, and t-1, respectively. Defaults to c("matstatus3", "matstatus2", "matstatus1") for historical MPMs, and c("matstatus3", "matstatus2") for ahistorical MPMs. Must be provided if building raw MPMs and stages is not provided.
fec: A vector of names of variables coding for fecundity in occasions t+1, t, and t-1, respectively. Defaults to c("feca3", "feca2", "feca1") for historical MPMs, and c("feca3", "feca2") for ahistorical MPMs. Only needed for raw, stage-based MPMs.
stages: An optional vector denoting the names of the variables within the main vertical dataset coding for the stages of each individual in occasions t+1 and t, and t-1 if historical. The names of stages in these variables should match those used in the stageframe exactly. If left blank, then rlefko3() will attempt to infer stages by matching values of alive, obsst, size, sizeb, sizec, repst, and matst to characteristics noted in the associated stageframe. Only used in raw, stage-based MPMs.
yearcol: The variable name or column number corresponding to occasion t in the dataset. Defaults to "year2". Only needed for raw MPMs.
popcol: The variable name or column number corresponding to the identity of the population. Defaults to "popid" if a value is provided for pop; otherwise empty. Only needed for raw MPMs.
patchcol: The variable name or column number corresponding to patch in the dataset. Defaults to "patchid" if a value is provided for patch; otherwise empty. Only needed for raw MPMs.
indivcol: The variable name or column number coding individual identity. Only needed for raw MPMs.
agecol: The variable name or column number corresponding to age in time t. Defaults to "obsage". Only used in raw age-based and age-by-stage MPMs.
censorcol: The variable name or column number denoting the censor status. Only needed in raw MPMs, and only if censor = TRUE.
modelsuite: One of three kinds of lists. The first is a lefkoMod object holding the vital rate models and associated metadata. Alternatively, an object of class vrm_input may be provided. Finally, this argument may simply be a list of models used to parameterize the MPM. In the final scenario, data and paramnames must also be given, and all variable names must match across all objects. If entered, then a function-based MPM will be developed. Otherwise, a raw MPM will be developed. Only used in function-based MPMs.
paramnames: A data frame with three columns, the first describing all terms used in linear modeling, the second (must be called mainparams) giving the general model terms that will be used in matrix creation, and the third showing the equivalent terms used in modeling (must be named modelparams). Function create_pm() can be used to create a skeleton paramnames object, which can then be edited. Only required to build function-based MPMs if modelsuite is neither a lefkoMod object nor a vrm_input object.
inda: Can be a single value to use for individual covariate a in all matrices, a pair of values to use for times t and t-1 in historical matrices, or a vector of such values corresponding to each occasion in the dataset. Defaults to NULL. Only used in function-based MPMs.
indb: Can be a single value to use for individual covariate b in all matrices, a pair of values to use for times t and t-1 in historical matrices, or a vector of such values corresponding to each occasion in the dataset. Defaults to NULL. Only used in function-based MPMs.
indc: Can be a single value to use for individual covariate c in all matrices, a pair of values to use for times t and t-1 in historical matrices, or a vector of such values corresponding to each occasion in the dataset. Defaults to NULL. Only used in function-based MPMs.
dev_terms: A numeric vector of 2 elements in the case of a Leslie MPM, and of 14 elements in all other cases. Consists of scalar additions to the y-intercepts of vital rate linear models used to estimate vital rates in function-based MPMs. Defaults to 0 values for all vital rates.
density: A numeric value indicating the density value to use to propagate matrices. Only needed if density is an explanatory term used in one or more vital rate models. Defaults to NA. Only used in function-based MPMs.
CDF: A logical value indicating whether to use the cumulative distribution function to estimate size transition probabilities in function-based MPMs. Defaults to TRUE, and should only be changed to FALSE if approximate probabilities calculated via the midpoint method are preferred.
random_inda: A logical value denoting whether to treat individual covariate a as a random, categorical variable. Otherwise it is treated as a fixed, numeric variable. Defaults to FALSE. Only used in function-based MPMs.
random_indb: A logical value denoting whether to treat individual covariate b as a random, categorical variable. Otherwise it is treated as a fixed, numeric variable. Defaults to FALSE. Only used in function-based MPMs.
random_indc: A logical value denoting whether to treat individual covariate c as a random, categorical variable. Otherwise it is treated as a fixed, numeric variable. Defaults to FALSE. Only used in function-based MPMs.
negfec: A logical value denoting whether fecundity values estimated to be negative should be reset to 0. Defaults to FALSE.
exp_tol: A numeric value used to indicate a maximum value to set exponents to in the core kernel to prevent numerical overflow. Defaults to 700. Only used in function-based MPMs.
theta_tol: A numeric value used to indicate a maximum value for theta as used in the negative binomial probability density kernel. Defaults to 100000000, but can be reset to other values during error checking. Only used in function-based MPMs.
censor: If TRUE, then data will be removed according to the variable set in censorcol, such that only data with censor values equal to censorkeep will remain. Defaults to FALSE. Only used in raw MPMs.
censorkeep: The value of the censor variable denoting data elements to keep. Defaults to 0. Only used in raw MPMs.
start_age: The age from which to start the matrix. Defaults to NA, in which case age 1 is used if prebreeding = TRUE, and age 0 is used if prebreeding = FALSE. Only used in age-based MPMs.
last_age: The final age to use in the matrix. Defaults to NA, in which case the highest age in the dataset is used. Only used in age-based and age-by-stage MPMs.
fecage_min: The minimum age at which reproduction is possible. Defaults to NA, which is interpreted to mean that fecundity should be assessed starting in the minimum age observed in the dataset. Only used in age-based MPMs.
fecage_max: The maximum age at which reproduction is possible. Defaults to NA, which is interpreted to mean that fecundity should be assessed until the final observed age. Only used in age-based MPMs.
fectime: An integer indicating whether to estimate fecundity using the variable given for fec in time t (2) or time t+1 (3). Only used for purely age-based MPMs. Defaults to 2.
fecmod: A scalar multiplier for fecundity. Only used for purely age-based MPMs. Defaults to 1.0.
cont: A logical value designating whether to allow continued survival of individuals past the final age noted in age-based and age-by-stage MPMs, using the demographic characteristics of the final age. Defaults to TRUE.
prebreeding: A logical value indicating whether the life history model is a pre-breeding model. Defaults to TRUE.
stage_NRasRep: A logical value indicating whether to treat non-reproductive individuals as reproductive. Used only in raw, stage-based MPMs in cases where stage assignment must still be handled. Not used in function-based MPMs, or in stage-based MPMs in which a valid hfvdata class data frame with stages already assigned is provided.
sparse_output: A logical value indicating whether to output matrices in sparse format. Defaults to FALSE, in which case all matrices are output in standard matrix format.
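The removal rule stated for the reduce argument can be sketched in base R as follows. This only illustrates the stated criterion (an index is dropped when its row and column sums are zero in every matrix). The helper name drop_empty_stages() and the toy matrices are hypothetical and are not part of lefko3.

```r
# Hypothetical helper: drop indices whose row AND column sums equal 0
# in ALL matrices of a list. Assumes square matrices of equal dimension.
drop_empty_stages <- function(mats) {
  # For each matrix, flag indices with a zero row sum and a zero column sum.
  empty_each <- lapply(mats, function(m) rowSums(m) == 0 & colSums(m) == 0)
  # An index is removed only if it is flagged in every matrix.
  empty_all <- Reduce(`&`, empty_each)
  keep <- which(!empty_all)
  lapply(mats, function(m) m[keep, keep, drop = FALSE])
}

# Two toy annual matrices in which the second stage has no transitions.
m1 <- matrix(c(0.2, 0, 1.5,
               0.0, 0, 0.0,
               0.4, 0, 0.9), nrow = 3, byrow = TRUE)
m2 <- matrix(c(0.1, 0, 2.0,
               0.0, 0, 0.0,
               0.5, 0, 0.8), nrow = 3, byrow = TRUE)

reduced <- drop_empty_stages(list(m1, m2))
print(dim(reduced[[1]]))
```

Because the test is applied across all matrices jointly, a stage that is empty in only one year would be retained in every matrix, which keeps the matrices comparable across occasions.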
This function automatically determines whether to create a raw or function-based MPM given inputs supplied by the user.
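The way the logical switches and modelsuite combine can be summarized with a small helper. The function describe_mpm() below is hypothetical: it only restates the rules given in the argument descriptions (a supplied modelsuite implies a function-based MPM, and stage and age together imply an age-by-stage MPM) and does not reproduce the checks that mpm_create() performs internally. The error raised when both stage and age are FALSE is an assumption made for this sketch.

```r
# Hypothetical helper restating the documented rules; not part of lefko3.
describe_mpm <- function(historical = FALSE, stage = TRUE, age = FALSE,
                         modelsuite = NULL) {
  # A supplied modelsuite implies a function-based MPM; otherwise raw.
  basis <- if (is.null(modelsuite)) "raw" else "function-based"

  # stage and age jointly determine the structure of the matrix.
  structure_type <- if (stage && age) {
    "age-by-stage"
  } else if (age) {
    "age-based"
  } else if (stage) {
    "stage-based"
  } else {
    stop("At least one of stage or age must be TRUE (assumed rule).")
  }

  history <- if (historical) "historical" else "ahistorical"
  paste(basis, history, structure_type, "MPM")
}

print(describe_mpm())
print(describe_mpm(age = TRUE, modelsuite = list()))
```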
If used, the reproduction matrix (field repmatrix) may be supplied as either historical or ahistorical. If provided as historical, then a historical MPM must be estimated.
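A minimal sketch of an ahistorical reproduction matrix for an invented three-stage life history is shown below, built in base R. The stage names and multipliers are assumptions chosen for illustration, and rows are taken to index the stage in occasion t+1 while columns index the stage in occasion t.

```r
# Invented stages: seed, juvenile, adult.
stages <- c("Sd", "Juv", "Ad")

# Start from a matrix of zeros with one row and column per stage.
repmat <- matrix(0, nrow = length(stages), ncol = length(stages),
                 dimnames = list(stages, stages))

# Non-zero entries mark fecundity transitions and act as multipliers.
repmat["Sd", "Ad"]  <- 1    # adults produce seeds at full estimated fecundity
repmat["Juv", "Ad"] <- 0.5  # adults produce juveniles at half that rate

print(repmat)
```

A matrix of this kind would then be passed through the repmatrix argument, alongside a stageframe whose stage order matches the rows and columns.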
If neither a supplement nor a reproduction matrix are used, and the MPM to create is stage-based, then fecundity will be assumed to occur from all reproductive stages to all propagule and immature stages.
Users may at times wish to estimate MPMs using a dataset incorporating multiple patches or subpopulations, but without discriminating between those patches or subpopulations. Should the aim of analysis be a general MPM that does not distinguish these patches or subpopulations, the modelsearch() run should not include patch terms.
Input options including multiple variable names must be entered in the order of variables in occasion t+1, t, and t-1. Rearranging the order will lead to erroneous calculations, and may lead to fatal errors.
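For example, a historical raw MPM built from a dataset that uses the default variable names would take vectors ordered as below. The helper check_order() is hypothetical: it merely tests that the trailing digits of the names run 3, 2, 1 (or 3, 2 for ahistorical MPMs), which is how the default names encode occasions t+1, t, and t-1. Datasets with other naming schemes would need a different check.

```r
# Variable name vectors in the required order: t+1, t, t-1.
alive_vars <- c("alive3", "alive2", "alive1")
size_vars  <- c("sizea3", "sizea2", "sizea1")

# Hypothetical helper: TRUE if trailing digits run 3, 2, (1) in that order.
check_order <- function(vars) {
  # Extract the final digit of each variable name.
  suffix <- as.integer(sub("^.*([0-9])$", "\\1", vars))
  # Expected suffixes: 3, 2, 1 for three names, or 3, 2 for two names.
  expected <- 3:(4 - length(vars))
  identical(suffix, expected)
}

print(check_order(alive_vars))
print(check_order(rev(size_vars)))
```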
This function provides two different means of estimating the probability of size transition. The midpoint method (CDF = FALSE) estimates the probability by first evaluating the corresponding probability density function at the exact size at the midpoint of the size class, and then multiplying that value by the bin width of the size class. Doak et al. 2021 (Ecological Monographs) noted that this method can produce biased results, with total size transitions associated with a specific size not totaling to 1.0 and even specific size transition probabilities capable of being estimated at values greater than 1.0. The alternative and default method (CDF = TRUE) uses the cumulative distribution function to estimate the probability of size transition as the cumulative probability of size transition at the upper limit of the size class minus the cumulative probability of size transition at the lower limit of the size class. This latter method avoids this bias. Note, however, that both methods are exact and unbiased for negative binomial and Poisson distributions.
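The two approaches can be compared numerically for a Gaussian size distribution using base R. The mean, standard deviation, and size bins below are invented, and the code is only a sketch of the idea rather than the computation performed inside lefko3. A deliberately narrow distribution is used so that the midpoint bias is easy to see.

```r
# Invented Gaussian size distribution for an individual's size in t+1.
mu    <- 2.0
sigma <- 0.1

# Three size classes with midpoints 1, 2, 3 and a half-width of 0.5.
midpoint  <- c(1, 2, 3)
halfwidth <- 0.5
lower <- midpoint - halfwidth
upper <- midpoint + halfwidth

# Midpoint method: density at the bin midpoint times the bin width.
p_mid <- dnorm(midpoint, mean = mu, sd = sigma) * (upper - lower)

# CDF method: cumulative probability at the upper limit minus that
# at the lower limit of each bin.
p_cdf <- pnorm(upper, mean = mu, sd = sigma) -
  pnorm(lower, mean = mu, sd = sigma)

print(round(p_mid, 4))
print(round(p_cdf, 4))
```

In this toy case the midpoint estimate for the middle bin exceeds 1, whereas the CDF estimates stay within the unit interval and sum to approximately 1 across the three bins.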
Under the Gaussian and gamma size distributions, the number of estimated parameters may differ between the two CDF settings. Because the midpoint method has a tendency to incorporate upward bias in the estimation of size transition probabilities, it is more likely to yield non-zero values when the true probability is extremely close to 0. As a result, the summary.lefkoMat() function will in some cases report higher numbers of estimated parameters under CDF = FALSE than under CDF = TRUE.
# \donttest{
# Lathyrus historical function-based MPM example
data(lathyrus)
sizevector <- c(0, 4.6, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8,
9)
stagevector <- c("Sd", "Sdl", "Dorm", "Sz1nr", "Sz2nr", "Sz3nr", "Sz4nr",
"Sz5nr", "Sz6nr", "Sz7nr", "Sz8nr", "Sz9nr", "Sz1r", "Sz2r", "Sz3r",
"Sz4r", "Sz5r", "Sz6r", "Sz7r", "Sz8r", "Sz9r")
repvector <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1)
obsvector <- c(0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
matvector <- c(0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
immvector <- c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
propvector <- c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0)
indataset <- c(0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
binvec <- c(0, 4.6, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5)
lathframeln <- sf_create(sizes = sizevector, stagenames = stagevector,
repstatus = repvector, obsstatus = obsvector, matstatus = matvector,
immstatus = immvector, indataset = indataset, binhalfwidth = binvec,
propstatus = propvector)
lathvertln <- verticalize3(lathyrus, noyears = 4, firstyear = 1988,
patchidcol = "SUBPLOT", individcol = "GENET", blocksize = 9,
juvcol = "Seedling1988", sizeacol = "lnVol88", repstracol = "Intactseed88",
fecacol = "Intactseed88", deadacol = "Dead1988",
nonobsacol = "Dormant1988", stageassign = lathframeln, stagesize = "sizea",
censorcol = "Missing1988", censorkeep = NA, NAas0 = TRUE, censor = TRUE)
lathvertln$feca2 <- round(lathvertln$feca2)
lathvertln$feca1 <- round(lathvertln$feca1)
lathvertln$feca3 <- round(lathvertln$feca3)
lathmodelsln3 <- modelsearch(lathvertln, historical = TRUE,
approach = "mixed", suite = "main",
vitalrates = c("surv", "obs", "size", "repst", "fec"), juvestimate = "Sdl",
bestfit = "AICc&k", sizedist = "gaussian", fecdist = "poisson",
indiv = "individ", patch = "patchid", year = "year2", year.as.random = TRUE,
patch.as.random = TRUE, show.model.tables = TRUE, quiet = "partial")
lathsupp3 <- supplemental(stage3 = c("Sd", "Sd", "Sdl", "Sdl", "mat", "Sd", "Sdl"),
stage2 = c("Sd", "Sd", "Sd", "Sd", "Sdl", "rep", "rep"),
stage1 = c("Sd", "rep", "Sd", "rep", "Sd", "mat", "mat"),
eststage3 = c(NA, NA, NA, NA, "mat", NA, NA),
eststage2 = c(NA, NA, NA, NA, "Sdl", NA, NA),
eststage1 = c(NA, NA, NA, NA, "Sdl", NA, NA),
givenrate = c(0.345, 0.345, 0.054, 0.054, NA, NA, NA),
multiplier = c(NA, NA, NA, NA, NA, 0.345, 0.054),
type = c(1, 1, 1, 1, 1, 3, 3), type_t12 = c(1, 2, 1, 2, 1, 1, 1),
stageframe = lathframeln, historical = TRUE)
lathmat3ln <- mpm_create(historical = TRUE, year = "all", patch = "all",
stageframe = lathframeln, modelsuite = lathmodelsln3, data = lathvertln,
supplement = lathsupp3)
# }