The as.ESS.object function compiles the main information from a
previous run of R2GUESS into an ESS object that can be analyzed
further. The main parameters (e.g. nsweep, burn.in) are read from the
'feature' file automatically generated at the end of every
R2GUESS run. The main outputs are also included in the object to
enable post-processing and further analyses.
as.ESS.object(dataY, dataX, path.input, path.output,
root.file.output, label.X = NULL, label.Y = NULL, path.par,
path.init = NULL, file.par, file.init = NULL, file.log = NULL,
MAP.file = NULL, command = TRUE)
dataY: a character vector (such as
'dataY.txt') specifying, assuming that data are in the
path.input folder, the location of the response
matrix. In the corresponding file, observations are presented in
rows and the (possibly multivariate) outcome(s) in columns. The
first two rows (single integers) give the number of rows
(n) and columns (q) in the matrix.
dataX: a character vector (such as
'dataX.txt') specifying, assuming that data are in the
path.input folder, the location of the predictor
matrix. In the corresponding file, observations are presented in
rows and the predictors in columns. The first two rows (single
integers) give the number of rows (n) and columns
(p) in the matrix.
path.input: path to the directory containing the data
(dataX and dataY). If
dataX and/or dataY have been entered
as data frame(s), the function will generate the corresponding
text files required to run GUESS in the path.input folder.
path.output: path indicating the directory in which output files are saved.
root.file.output: name specifying the file stem of the
different output files in the path.output directory.
label.X: a character vector specifying the names of the
predictors. If not specified (=NULL), the variables are labelled
by their position in the matrix. For SNP data, predictor names and
information can be given in the MAP.file
(field SNPName).
label.Y: a character vector specifying the names of
the outcomes. If not specified (=NULL), the outcomes are
labelled Y1, ..., Yq, where q is the dimension of the
response matrix, or are named after the argument
dataY (if specified as a data frame).
path.par: path to the directory containing the
parameter file (argument file.par).
path.init: path to the directory containing the init
file (argument file.init) specifying which variables
were included at the first iteration of the MCMC run. By
default (file.init=NULL) no init file is
required.
file.par: name of the parameter file containing all the
user-specified parameters used to set up the run and the features
of the moves. This file is located in path.par and contains
fields that are extensively described in
http://www.bgx.org.uk/software/GUESS_Doc_short.pdf.
file.init: name of the file specifying which variables were included at the first iteration of the MCMC run.
file.log: name of the log file. This file compiles in real time
summary information describing the initial parameters, the
computational time and the state of the run. It also
contains information about the moves sampled at each sweep. By default
(=NULL), the name is given by the argument
root.file.output extended by '_log' and, for
computational efficiency (especially when GPU is enabled), a
minimal amount of information is returned.
MAP.file: either a one-element character vector or a data
frame. If a character vector is used, it specifies, assuming that data are in the
path.input folder, the location of the annotation
file. In the corresponding file, each row describes one predictor,
in the MAP.file format. If a data frame
is passed, it corresponds to a p x 3 matrix.
command: Boolean specifying whether the automatically generated C++ command line is saved in the object or not.
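The file layout described for dataY and dataX (two leading integer lines, then the matrix) can be sketched as follows. This is only an illustration: write_guess_matrix is a hypothetical helper, not part of R2GUESS, and the space separator between values is an assumption, since the text above does not state which delimiter GUESS expects.

```r
# Hypothetical helper (not an R2GUESS function): write a numeric matrix with
# the number of rows on line 1, the number of columns on line 2, and then
# the data, one observation per line.
write_guess_matrix <- function(mat, file) {
  mat <- as.matrix(mat)
  # Header: n (rows), then q or p (columns), each as a single integer line
  writeLines(c(as.character(nrow(mat)), as.character(ncol(mat))), file)
  # Data rows; the space separator is an assumption
  write.table(mat, file, append = TRUE, sep = " ",
              row.names = FALSE, col.names = FALSE)
  invisible(file)
}

# Toy response matrix with n = 3 observations and q = 2 outcomes
Y <- matrix(1:6, nrow = 3)
out_file <- tempfile(fileext = ".txt")
write_guess_matrix(Y, out_file)
file_lines <- readLines(out_file)
```

As noted under path.input, the function generates these text files itself when dataX or dataY is supplied as a data frame, so a helper like this would only matter when preparing the files by hand.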
An object of class ESS which compiles the following
information:
dataY: a character vector defining the location of the response matrix, assuming that data are in the path.input folder.
dataX: a character vector defining the location of the predictor matrix, assuming that data are in the path.input folder.
path.input: path to the directory containing the data (dataX and dataY).
path.output: path indicating the directory in which output files were saved.
path.par: path indicating the directory in which to find the parameter file used for the run.
path.init: path indicating the location of the file
describing the initial guess of the MCMC procedure. If no
init file was specified, the field is set to NULL.
time: Boolean value indicating whether a file recording the
time each sweep took has been created and saved in the path.output directory.
file.par: name of the parameter file containing all the user-specified parameters used to set up the run and the features of the moves.
file.init: name of the file specifying which variables
were arbitrarily included at the first iteration of the MCMC run. If no init file was specified (=NULL),
initial guesses were defined by a stepwise regression approach.
file.log: location of the log file.
root.file.output: file name specifying the file stem used
to write the output files in the directory specified by path.output.
nsweep: integer specifying the number of sweeps of the MCMC run (including the burn-in).
top: the number of top models that are reported in the output.
BestModels: a list describing the best model
visited, with respect to the fields listed in summary.ESS.
label.X: a character vector specifying the names of the predictors. If not specified (=NULL), the variables are labelled by their position in the matrix from 1 to p.
label.Y: a character vector specifying the names of the outcomes. If not specified (=NULL), the outcomes are labelled Y1, ..., Yq, where q is the dimension of the outcome matrix.
p: the number of predictors in the X matrix.
q: the number of outcomes in the response matrix.
n: the number of observations.
nb.chain: the number of chains in the evolutionary algorithm.
burn.in: integer specifying the number of sweeps which were discarded to account for burn-in.
conf: a character vector defining the location of the file compiling observed values for the confounders of interest.
cuda: a Boolean value indicating whether linear algebra operations have been re-routed towards the GPU.
Egam: a priori average model size.
Sgam: a priori standard deviation of the model size.
MAP.file: a character vector specifying the location of
the predictor annotation file, assuming that data are in
path.input.
command: a character vector describing the C++ command line
used to generate the results, if saved.
seed: the random seed used to initialise the pseudo-random number generator.
Finish: a Boolean value indicating whether the run terminated, or was interrupted before reaching the user-defined time limit.
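To make the relationship between some of these fields concrete, here is a small sketch that uses a plain list as a stand-in for an ESS object. The list, its values, and the assumption that fields are reachable with `$` are all illustrative and are not taken from a real R2GUESS run.

```r
# Hypothetical stand-in for a few ESS fields (all values are made up)
ess_fields <- list(nsweep = 22000L, burn.in = 2000L, p = 5L, q = 4L,
                   label.X = NULL, label.Y = NULL)

# nsweep includes the burn-in, so the sweeps kept for inference are:
retained_sweeps <- ess_fields$nsweep - ess_fields$burn.in

# Default labels as described above: positions 1..p for predictors,
# Y1..Yq for outcomes, used when no labels were supplied (NULL)
default_label_X <- as.character(seq_len(ess_fields$p))
default_label_Y <- paste0("Y", seq_len(ess_fields$q))
```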
# NOT RUN {
dataX <- "data-X-C-CODE.txt"
dataY <- "data-Y-ALL-C-CODE.txt"
path.input <- system.file("Input", package="R2GUESS")
path.output <- tempdir()
file.copy(system.file("Output", package="R2GUESS"), path.output, recursive = TRUE)
path.output <- file.path(path.output, "Output")
path.par <- system.file("extdata", package="R2GUESS")
file.par <- "Par_file_example_Hopx.xml"
root.file.output <- "Example-GUESS-Y-Hopx"
label.Y <- c("ADR","Fat","Heart","Kidney")
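# Load the example annotation data into a separate environment
my.env <- new.env()
data(MAP.file, envir = my.env)
MAP.file <- my.env$MAP.file
# Build the ESS object from the output files of the example run
modelY_Hopx <- as.ESS.object(dataY = dataY, dataX = dataX, path.input = path.input,
  path.output = path.output, root.file.output = root.file.output, label.X = NULL,
  label.Y = label.Y, path.par = path.par, file.par = file.par, MAP.file = MAP.file)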
print(modelY_Hopx)
class(modelY_Hopx)
# }