prepareCGPairedDifferenceData reads in a data frame and settings in order to create a cgPairedDifferenceData object. The created object is designed to have exploratory and fit methods applied to it.
prepareCGPairedDifferenceData(dfr, format = "listed", analysisname = "", endptname = "", endptunits = "", logscale = TRUE, zeroscore = NULL, addconstant = NULL, digits = NULL, expunitname = "", refgrp = NULL, stamps = FALSE)
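dfr: A data frame; see the format argument.
format: The default value is "listed". Either "listed" or "groupcolumns" must be used. Abbreviations of "l" or "g", respectively, or otherwise sufficient matching values can be used.
analysisname: The default value is "".
endptname: The default value is "".
endptunits: The default value is "".
logscale: The default value is TRUE.
zeroscore: A value to replace response values of zero when evaluation on the logarithmic scale (logscale=TRUE) is specified. The default value is NULL. To derive a score value to replace zero, "estimate" can be specified; see Details below on the algorithm used.
addconstant: A constant to add to all response values when evaluation on the logarithmic scale (logscale=TRUE) is desired. The default value is NULL. A positive numeric value can be specified to be added, or a "simple" algorithm specified to estimate a value to add. See the Details section below on the algorithm used.
digits: The default value is NULL, which will examine each individual data value and choose the one that has the maximum number of digits after any trailing zeroes are ignored. The maximum number of digits will be 4.
expunitname: The default value is "".
refgrp: The default value is NULL, which will just use the first level determined in the data frame.
stamps: The default value is FALSE.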
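The default digits rule (the maximum number of decimal digits across the data values, ignoring trailing zeroes, capped at 4) can be sketched in base R. This is only an illustration of the stated rule, not the package's internal code; the helper name guess_digits is hypothetical.

```r
# Hypothetical helper (not part of the cg package): count the decimal digits
# of each value, ignoring trailing zeroes, and return the maximum, capped at 4.
guess_digits <- function(x, max_digits = 4) {
  x <- x[!is.na(x)]
  # Fixed notation with max_digits decimals, e.g. 2.5 becomes "2.5000"
  s <- formatC(x, format = "f", digits = max_digits)
  # Drop trailing zeroes, e.g. "2.5000" becomes "2.5" and "3.0000" becomes "3."
  s <- sub("0+$", "", s)
  # Keep only the characters after the decimal point
  decimals <- sub("^[^.]*\\.?", "", s)
  max(nchar(decimals))
}

guess_digits(c(3, 2.5, 0.125, 10))
```

Because the values are first formatted to four decimals, the result can never exceed 4, which matches the cap stated in the argument description.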
A cgPairedDifferenceData object is returned, with the following slots:
dfr: The value of the dfr argument in the function call.
Where paired differences are computed, the refgrp column of values is the subtrahend (second term) in the subtraction.
analysisname: Taken from the input argument analysisname.
endptname: Taken from the input argument endptname, and set to "Endpoint" if the input was left at the default "".
endptunits: Taken from the input argument endptunits.
endptscale: "log" if logscale=TRUE and "original" if logscale=FALSE.
zeroscore: NULL if the input argument was NULL. Otherwise has the derived (from zeroscore="estimate") or specified numeric value.
addconstant: NULL if the input argument was NULL. Otherwise has the specified or derived numeric value.
digits: Taken from the input argument digits, or set to the value of digits determined from the input data. Will be an integer of 0, 1, 2, 3, or 4.
grpnames: Determined from dfr and the refgrp specification.
expunitname: Determined from the input argument expunitname and processing of the data frame.
refgrp: Taken from the input argument refgrp.
stamps: Taken from the input argument stamps.
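To make the role of refgrp as the subtrahend concrete, here is a small base R sketch on the original scale. The data frame and its column names are made up for illustration, and the code does not reproduce how the package stores its differences.

```r
# Toy data in a groupcolumns-like layout; all names here are hypothetical.
gc <- data.frame(Patient = 1:3,
                 PreTreatment = c(83.8, 83.3, 86.0),
                 PostTreatment = c(95.2, 94.3, 91.5))

refgrp <- "PreTreatment"
othergrp <- setdiff(names(gc)[-1], refgrp)

# refgrp is the second term: difference = other group - reference group
gc$difference <- gc[[othergrp]] - gc[[refgrp]]
gc
```

With this convention, a positive difference means the non-reference group is larger than the reference group for that experimental unit.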
dfr can be of the format "listed" or "groupcolumns". If format="listed" for dfr is specified, then there must be three columns in the input data frame. The first column needs to be the experimental unit identifier, the second column needs to be the group identifier, and the third is the endpoint. The first column needs to have two sets of distinct values, since it is the experimental unit identifier of response pairs. The second column needs to have exactly 2 distinct values, since it is the group identifier.
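The requirements on a "listed" data frame can be checked by hand before calling the function. The sketch below uses made-up data and reads "two sets of distinct values" as each experimental unit appearing once in each group, which is an assumption about the intended meaning rather than the package's own validation.

```r
# Toy "listed" data: unit identifier, group identifier, endpoint.
# Column names and values are hypothetical.
listed <- data.frame(
  Patient = rep(1:3, times = 2),
  Group   = rep(c("Pre", "Post"), each = 3),
  Weight  = c(83.8, 83.3, 86.0, 95.2, 94.3, 91.5)
)

# The second column must have exactly 2 distinct values.
n_groups <- length(unique(listed[[2]]))

# Assumed pairing rule: every unit occurs exactly once in each group.
counts <- table(listed[[1]], listed[[2]])
is_paired <- n_groups == 2 && all(counts == 1)
is_paired
```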
If format="groupcolumns" for dfr is specified, then there can be two columns or three columns. With two columns, as the data are prepared for the cgPairedDifferenceData object, another column will be bound on the left and become the first column, with the column header of expunitname if expunitname is specified, and "expunit" if the default expunitname="" is specified. A sequence of integers starting with 1 up to the number of pairs/rows will be generated to uniquely identify each experimental unit pair. With three columns, the header of the experimental unit identifier column is used as the expunitname setting if expunitname is not explicitly specified to something else instead of its default expunitname="".
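The handling of a two-column "groupcolumns" input can be mimicked in base R as follows. This is a sketch of the described behavior with made-up data, not the package's internal code; the variable names are hypothetical.

```r
# Toy two-column "groupcolumns" data: one column per group, one row per pair.
gc2 <- data.frame(Pre  = c(83.8, 83.3, 86.0),
                  Post = c(95.2, 94.3, 91.5))

expunitname <- ""   # the default value of the argument

# Header is expunitname if given, otherwise "expunit"
id_header <- if (expunitname == "") "expunit" else expunitname

# Sequence of integers from 1 to the number of pairs/rows
ids <- data.frame(seq_len(nrow(gc2)))
names(ids) <- id_header

# Bind the identifier column on the left so it becomes the first column
gc3 <- cbind(ids, gc2)
gc3
```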
As the evaluation data set is prepared for the cgPairedDifferenceData object, any experimental unit pairs/rows with missing values in the endpoint are flagged. This includes a check to make sure that each experimental unit identified has a complete pair of numeric observations.
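A comparable check for incomplete pairs can be run on the input beforehand. The sketch below uses base R's complete.cases on made-up data; how the package itself reports flagged pairs is not shown here.

```r
# Toy groupcolumns data with missing endpoint values; names are hypothetical.
gc <- data.frame(expunit = 1:4,
                 Pre  = c(83.8, NA, 86.0, 80.1),
                 Post = c(95.2, 94.3, 91.5, NA))

# A pair is incomplete if either of its two observations is missing
incomplete <- !complete.cases(gc[, c("Pre", "Post")])

# Experimental units without a complete pair of numeric observations
gc$expunit[incomplete]
```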
If zeroscore="estimate" is specified, a number close to zero is derived to replace all zeroes for subsequent log-scale analyses. A spline fit (using spline and method="natural") of the log of the response vector on the original response vector is performed. The zeroscore is then derived from the log-scale value of the spline curve at the original-scale value of zero. This approach comes from the concept of arithmetic-logarithmic scaling discussed in Tukey, Ciminera, and Heyse (1985).
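A minimal sketch of this idea in base R is shown below. It assumes that only the positive, distinct response values enter the fit (the log of zero is undefined) and that the spline is evaluated at zero by extrapolation; the package's actual implementation may differ in these details.

```r
# Toy response vector containing a zero; values are made up.
y <- c(0, 0.5, 1, 2, 4, 8)

# Assumption: fit on the positive, distinct values only.
pos <- sort(unique(y[y > 0]))

# Natural spline of log(response) on response, evaluated at 0.
# A natural spline extrapolates linearly outside the data range.
fit <- spline(x = pos, y = log(pos), method = "natural", xout = 0)

# Back-transform the log-scale value at zero to get the score
zeroscore <- exp(fit$y)

# Replace the zeroes before any log-scale analysis
y_scored <- replace(y, y == 0, zeroscore)
y_scored
```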
If addconstant="simple" is specified, a number is derived and added to all response values. The approach taken is from the "white" book on S (Chambers and Hastie, 1992), page 68. The range (max - min) of the response values is multiplied by 0.0001 to derive the number to add to all the response values.
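The arithmetic of the "simple" rule is short enough to write out directly. The response values below are made up, and the variable names are for illustration only.

```r
# Toy response vector containing a zero; values are made up.
y <- c(0, 2.5, 10)

# Range (max - min) multiplied by 0.0001
addconstant <- diff(range(y)) * 0.0001

# Add the constant to every response value
y_shifted <- y + addconstant
y_shifted
```

Here the range is 10, so the constant is 0.001, and every shifted value is strictly positive and can be log-transformed.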
Chambers, J.M., and Hastie, T.J. (1992), Statistical Models in S. Chapman & Hall/CRC.
library(cg)

data(anorexiaFT)
anorexiaFT.data <- prepareCGPairedDifferenceData(anorexiaFT, format="groupcolumns",
                                                 analysisname="Anorexia FT",
                                                 endptname="Weight",
                                                 endptunits="lbs",
                                                 expunitname="Patient",
                                                 digits=1, logscale=TRUE)