Runs (or continues running) MCMCs for simulating the life expectancy for all countries of the world, using a Bayesian hierarchical model.
run.e0.mcmc(sex = c("Female", "Male"), nr.chains = 3, iter = 160000,
    output.dir = file.path(getwd(), "bayesLife.output"),
    thin = 10, replace.output = FALSE, annual = FALSE,
    start.year = 1873, present.year = 2020, wpp.year = 2019,
    my.e0.file = NULL, my.locations.file = NULL, use.wpp.data = TRUE,
    constant.variance = FALSE, seed = NULL,
    parallel = FALSE, nr.nodes = nr.chains, compression.type = 'None',
    verbose = FALSE, verbose.iter = 100, mcmc.options = NULL, ...)
continue.e0.mcmc(iter, chain.ids = NULL,
    output.dir = file.path(getwd(), "bayesLife.output"),
    parallel = FALSE, nr.nodes = NULL, auto.conf = NULL,
    verbose = FALSE, verbose.iter = 10, ...)
An object of class bayesLife.mcmc.set, which is a list with two components:
An object of class bayesLife.mcmc.meta.
A list of objects of class bayesLife.mcmc, one for each MCMC.
sex: Sex for which to run the simulation.
nr.chains: Number of MCMC chains to run.
iter: Number of iterations to run in each chain. In addition to a single value, it can have the value ‘auto’ for an automatic assessment of the convergence. In such a case, the function runs for the number of iterations given in the global option auto.conf list (see e0mcmc.options), then checks if the MCMCs converged (using the auto.conf settings). If they did not converge, the procedure is repeated until convergence is reached or the number of repetitions exceeds auto.conf$max.loops.
output.dir: Directory into which the simulation output should be written.
thin: Thinning interval between consecutive observations to be stored on disk.
replace.output: If TRUE, existing outputs in output.dir will be replaced by results of this simulation.
annual: If TRUE, the model will be trained based on annual data. In such a case, argument my.e0.file must be used to provide the annual observed data.
start.year: Start year for using historical data.
present.year: End year for using historical data.
wpp.year: Year for which WPP data is used. The function loads a package called wpp\(x\) where \(x\) is the wpp.year and uses the e0* datasets.
my.e0.file: File name containing user-specified e0 time series for one or more countries. See Details below.
my.locations.file: File name containing user-specified locations. See Details below.
use.wpp.data: Logical indicating if default WPP data should be used, i.e. if my.e0.file will be matched with the WPP data in terms of time periods and locations. If FALSE, it is assumed that the my.e0.file contains all locations and time periods to be included in the simulation.
constant.variance: Logical indicating if the model should be estimated using constant variance. It should only be used if the standard deviation lowess is to be analysed, see compute.loess.
seed: Seed of the random number generator. If NULL, no seed is set. It can be used to generate reproducible results.
parallel: Logical determining if the simulation should run multiple chains in parallel. If it is TRUE, the package snowFT is required.
nr.nodes: Relevant only if parallel is TRUE. It gives the number of nodes for running the simulation in parallel. By default it equals the number of chains.
compression.type: One of ‘None’, ‘gz’, ‘xz’, ‘bz’, determining the type of compression of the MCMC files.
verbose: Logical switching log messages on and off.
verbose.iter: Integer determining how often (in number of iterations) log messages are printed during the estimation.
mcmc.options: List of options that overwrites global MCMC options as defined in e0mcmc.options. Type e0mcmc.options() to view default values.
auto.conf: In continue.e0.mcmc, one can overwrite the global auto.conf option, see e0mcmc.options for its definition. This argument is only used if the function argument iter is set to ‘auto’.
...: Additional parameters to be passed to the function snowFT::performParallel, if parallel is TRUE.
chain.ids: Array of chain identifiers that should be resumed. If it is NULL, all existing chains in output.dir are resumed.
Hana Sevcikova, Patrick Gerland contributed to the documentation.
The function run.e0.mcmc uses a set of global options (for priors, initial values etc.), possibly modified by the mcmc.options argument. One can also modify these options using e0mcmc.options. Call e0mcmc.options() for the full set of options. Function continue.e0.mcmc inherits its set of options from the corresponding run.e0.mcmc call.
The function run.e0.mcmc creates an object of class bayesLife.mcmc.meta and stores it in output.dir. It launches nr.chains MCMCs, either sequentially or in parallel. Parameter traces of each chain are stored as (possibly compressed) ASCII files in a subdirectory of output.dir called mcx, where x is the identifier of that chain. There is one file per parameter, named after the parameter with the suffix “.txt”, possibly followed by a compression suffix if compression.type is given. Country-specific parameters have the suffix _countryc, where c is the country code. In addition to the trace files, each mcx directory contains the object bayesLife.mcmc in binary format. All chain-specific files are written to disk after the first, the last and each \(i\)-th (thinned) iteration, where \(i\) is given by the global option buffer.size.
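The directory layout described above can be pictured with a small base R sketch. The helper trace_file_path() is hypothetical (it is not part of bayesLife), the parameter names "omega" and "k.c" are placeholders, and placing the _country suffix directly before ".txt" is an assumption drawn from the description. The compression suffix is passed through unchanged because its exact form is not stated here.

```r
# Hypothetical helper (not part of bayesLife): builds the path at which a
# parameter trace would be expected, following the layout described above.
trace_file_path <- function(output.dir, chain.id, par.name,
                            country.code = NULL, compression.suffix = "") {
  # Country-specific parameters carry the suffix _country<c> (assumed placement)
  if (!is.null(country.code))
    par.name <- paste0(par.name, "_country", country.code)
  file.path(output.dir, paste0("mc", chain.id),
            paste0(par.name, ".txt", compression.suffix))
}

# Placeholder parameter names, used only for illustration
p_world   <- trace_file_path("bayesLife.output", 1, "omega")
p_country <- trace_file_path("bayesLife.output", 2, "k.c", country.code = 4)
print(p_world)
print(p_country)
```

Such a helper could be handy for checking by hand which chains and parameters have already been written to output.dir.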
Using the function continue.e0.mcmc, one can continue simulating existing MCMCs by iter iterations for either all or selected chains. The global options used for generating the existing MCMCs will be used. Only the auto.conf option can be overwritten, by passing the new value as an argument.
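When iter is set to ‘auto’, the run-check-repeat procedure can be pictured as the loop below. This is only a sketch under stated assumptions: run_block() and has_converged() are hypothetical stand-ins for "run a block of iterations" and "apply the auto.conf convergence check", and whether the first block counts towards max.loops is an assumption.

```r
# Sketch only: run_block() and has_converged() are hypothetical stand-ins,
# not bayesLife functions.
run_until_converged <- function(run_block, has_converged, max.loops) {
  loops <- 0
  repeat {
    run_block()                 # run one block of iterations
    loops <- loops + 1
    # stop on convergence or when the loop budget is used up
    if (has_converged() || loops >= max.loops) break
  }
  loops
}

# Toy stand-ins: "convergence" is declared once three blocks have been run
blocks_done <- 0
n_loops <- run_until_converged(
  run_block     = function() blocks_done <<- blocks_done + 1,
  has_converged = function() blocks_done >= 3,
  max.loops     = 5
)
print(n_loops)
```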
The function loads observed data (further denoted as WPP dataset), depending on the specified sex, from the e0F (e0M) and e0F_supplemental (e0M_supplemental) datasets in a wpp\(x\) package where \(x\) is the wpp.year. It is then merged with the include dataset that corresponds to the same wpp.year. The argument my.e0.file can be used to overwrite those default data. If use.wpp.data is FALSE, it fully replaces the default dataset. Otherwise (by default), such a file can include a subset of countries contained in the WPP dataset, as well as a set of new countries. In the former case, the function replaces the corresponding country data from the WPP dataset with values in this file. Only columns that match column names of the WPP dataset are replaced; in addition, the columns ‘last.observed’ and ‘include_code’ are used, if present. Countries are merged with WPP using the column ‘country_code’. For the countries to be included in the simulation, in both cases (whether they are included in the WPP dataset or not), they must be contained in the table of locations (UNlocations). In addition, their corresponding ‘include_code’ must be set to 2. If the column ‘include_code’ is present in my.e0.file, its value overwrites the default include code, unless it is -1.
If annual is TRUE, the default WPP dataset is not used and the my.e0.file argument must provide the dataset to be used for estimation. Its time-related columns should be single years.
The default UN table of locations mentioned above can be overwritten/extended by using a file passed as the my.locations.file argument. Such a file must have the same structure as the UNlocations dataset. Entries in this file will overwrite corresponding entries in UNlocations matched by the column ‘country_code’. If there is no such entry in the default dataset, it will be appended. This option of appending new locations is especially useful in cases when my.e0.file contains new countries/regions that are not included in UNlocations. In such a case, one must provide a my.locations.file with a definition of those countries/regions.
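The overwrite-or-append behaviour described above can be illustrated in base R. This is a sketch of the rule, not the package's own code: the two data frames are toy stand-ins, only the ‘country_code’ column is taken from the text, and the ‘name’ column and the code 9001 are assumptions made for illustration.

```r
# Toy stand-in for the default table of locations (only two columns shown)
default_locs <- data.frame(country_code = c(4, 8),
                           name = c("Afghanistan", "Albania"),
                           stringsAsFactors = FALSE)
# Hypothetical user file content: one redefined entry and one new region
my_locs <- data.frame(country_code = c(8, 9001),
                      name = c("Albania (revised)", "My Region"),
                      stringsAsFactors = FALSE)

# Drop default rows that the user file redefines, then append all user rows
keep <- !(default_locs$country_code %in% my_locs$country_code)
locs <- rbind(default_locs[keep, ], my_locs)
rownames(locs) <- NULL
print(locs)
```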
For simulation of the hyperparameters of the Bayesian hierarchical model, all countries are used that are included in the WPP dataset, possibly complemented by the my.e0.file, and that have include_code equal to 2. The hyperparameters are used to simulate country-specific parameters, which is done for all countries with include_code equal to 1 or 2. The following values of include_code in my.e0.file are recognized: -1 (do not overwrite the default include code), 0 (ignore), 1 (include in prediction but not estimation), 2 (include in both estimation and prediction). Thus, the set of countries included in the estimation and prediction can be fully specified by the user.
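As a rough illustration of how the include codes partition the countries, the base R sketch below applies the overwrite rule (a user code replaces the default unless it is -1) and then derives the two sets. The data frames and their values are invented for the example and the code does not reproduce the package's internal implementation.

```r
# Toy data: default include codes and user-supplied codes (hypothetical values)
default_inc <- data.frame(country_code = c(4, 8, 12, 16),
                          include_code = c(2, 2, 1, 0))
user_inc <- data.frame(country_code = c(8, 12, 16),
                       include_code = c(-1, 2, 1))

# A user code overwrites the default unless it is -1
idx <- match(user_inc$country_code, default_inc$country_code)
overwrite <- user_inc$include_code != -1
final <- default_inc
final$include_code[idx[overwrite]] <- user_inc$include_code[overwrite]

# Code 2: used for estimating the hyperparameters
estimation_set <- final$country_code[final$include_code == 2]
# Codes 1 and 2: country-specific parameters are simulated
prediction_set <- final$country_code[final$include_code %in% c(1, 2)]
print(final)
```

In this toy case, country 8 keeps its default code because the user value is -1, while countries 12 and 16 take the user-supplied codes.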
Optionally, my.e0.file can contain a column called last.observed containing the year of the last observation for each country. In such a case, the code ignores any data after that time point. Furthermore, the function e0.predict fills in the missing values using the median of the BHM procedure (stored in e0.matrix.reconstructed of the bayesLife.prediction object). For last.observed values that are below the middle year of a time interval \([t_i, t_{i+1}]\) (computed as \(t_i+3\)), the last valid data point is the time interval \([t_{i-1}, t_i]\), whereas for values greater than or equal to the middle year, the data point in \([t_i, t_{i+1}]\) is valid.
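The middle-year rule can be made concrete with a short base R sketch. The helper last_valid_period() is hypothetical (not part of bayesLife) and assumes five-year periods whose boundaries fall on multiples of five, such as 2010-2015; it does not apply to annual data.

```r
# Hypothetical helper (not part of bayesLife): returns the last valid
# five-year interval for a given last.observed year, assuming period
# boundaries fall on multiples of five (e.g. 2010-2015).
last_valid_period <- function(last.observed) {
  # start of the interval that contains the year
  t_i <- 5 * floor(last.observed / 5)
  # below the middle year t_i + 3: step back one interval
  if (last.observed < t_i + 3) t_i <- t_i - 5
  c(start = t_i, end = t_i + 5)
}

# The middle year of 2010-2015 is 2013
p_2012 <- last_valid_period(2012)
p_2013 <- last_valid_period(2013)
print(p_2012)
print(p_2013)
```

Under these assumptions, a last.observed of 2012 leaves 2005-2010 as the last valid data point, whereas 2013 keeps 2010-2015.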
The package contains a dataset called my_e0_template (in the extdata directory), which is a template for a user-specified my.e0.file.
J. L. Chunn, A. E. Raftery, P. Gerland, H. Sevcikova (2013): Bayesian Probabilistic Projections of Life Expectancy for All Countries. Demography 50(3):777-801. <doi:10.1007/s13524-012-0193-x>
get.e0.mcmc, summary.bayesLife.mcmc.set, e0mcmc.options, e0.predict.
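if (FALSE) {
# Toy run: one chain with 5 iterations (far too short for inference)
library(bayesLife)
m <- run.e0.mcmc(nr.chains = 1, iter = 5, thin = 1, verbose = TRUE)
summary(m)
# Continue the existing chain for 5 more iterations
m <- continue.e0.mcmc(iter = 5, verbose = TRUE)
summary(m)
}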