Moving Estimates Using Overlapping Windows
movStats(
formula,
stat = NULL,
discrete = FALSE,
space = c("n", "x"),
eps = if (space == "n") 15,
varyeps = FALSE,
xinc = NULL,
xlim = NULL,
times = NULL,
tunits = "year",
msmooth = c("smoothed", "raw", "both"),
tsmooth = c("supsmu", "lowess"),
bass = 8,
span = 1/4,
maxdim = 6,
penalty = NULL,
trans = function(x) x,
itrans = function(x) x,
loess = FALSE,
ols = FALSE,
qreg = FALSE,
lrm = FALSE,
orm = FALSE,
hare = FALSE,
family = "logistic",
k = 5,
tau = (1:3)/4,
melt = FALSE,
data = environment(formula),
pr = c("none", "kable", "plain", "margin")
)
a data table, with attribute infon, which is a data frame with rows corresponding to strata and columns N, Wmean, Wmin, Wmax if stat computed N. These summarize the number of observations used in the windows. If varyeps=TRUE there is an additional column eps with the computed per-stratum eps. When space='n' and xinc is not given, the computed xinc also appears as a column. An additional attribute info is a kable object ready for printing to describe the window characteristics.
formula: a formula with the analysis variable on the left and the x-variable on the right, followed by optional stratification variables
stat: function of one argument that returns a named list of computed values. Defaults to computing mean and quartiles + N except when y is binary, in which case it computes moving proportions. If y has two columns the default statistics are Kaplan-Meier estimates of cumulative incidence at a vector of times.
discrete: set to TRUE if the x-axis variable is discrete and no intervals should be created for windows
space: defines whether intervals use fixed width ('x') or fixed sample size ('n')
eps: tolerance for window (half width of window). For space='x' eps is in data units; otherwise it is the sample size for half the window, not counting the middle target point.
varyeps: applies to space='n' and causes a smaller eps to be used in strata with fewer than `` observations so as to arrive at three x points
xinc: increment in x at which to evaluate stats. For space='x' the default is the xlim range/100. For space='n' xinc defaults to m observations, where m = max(n/200, 1).
xlim: 2-vector of limits to evaluate if space='x' (default is the 10th smallest to the 10th largest observed x)
times: vector of times for evaluating one minus Kaplan-Meier estimates
tunits: time units when times is given
msmooth: set to 'smoothed' or 'both' to compute smoothed moving estimates (see tsmooth). msmooth='both' will display both. 'raw' will display only the moving statistics. msmooth='smoothed' (the default) will display only the smoothed moving estimates.
tsmooth: defaults to the super-smoother 'supsmu' for after-moving smoothing. Use tsmooth='lowess' to instead use lowess.
bass: the supsmu bass parameter used to smooth the moving statistics if tsmooth='supsmu'. The default of 8 represents quite heavy smoothing.
span: the lowess span used to smooth the moving statistics
maxdim: passed to hare, default is 6
penalty: passed to hare, default is to use BIC. Specify 2 to use AIC.
trans: transformation to apply to x
itrans: inverse transformation
loess: set to TRUE to also compute loess estimates
ols: set to TRUE to include rcspline estimate of mean using ols
qreg: set to TRUE to include quantile regression estimates with rcspline
lrm: set to TRUE to include logistic regression estimates with rcspline
orm: set to TRUE to include ordinal logistic regression estimates with rcspline (mean + quantiles in tau)
hare: set to TRUE to include hazard regression estimates of incidence at times, using the polspline package
family: link function for ordinal regression (see rms::orm)
k: number of knots to use for ols and/or qreg rcspline
tau: quantile numbers to estimate with quantile regression
melt: set to TRUE to melt data table and derive Type and Statistic
data: data.table or data.frame, default is calling frame
pr: defaults to no printing of window information. Use pr='plain' to print in the ordinary way, pr='kable' to convert the object to knitr::kable and print, or pr='margin' to convert to kable and place in the Quarto right margin. For the latter two, results='asis' must be in the chunk header.
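The stat argument is described above only as a function of one argument that returns a named list. As a rough illustration of that contract, here is a minimal base R sketch; the name my_stat, the trimmed mean, and the NA handling are assumptions made for the example and are not taken from the package. An N element is included because the window-size summary is documented as depending on stat computing N.

```r
# Hypothetical custom summary function (illustrative only):
# takes the y values that fall in one window and returns a named list.
my_stat <- function(y) {
  y <- y[!is.na(y)]                       # NA handling is an assumption
  list(TrimmedMean = mean(y, trim = 0.1), # 10% trimmed mean
       Median      = median(y),
       N           = length(y))           # window sample size
}

res <- my_stat(c(1, 2, 3, 4, 100))
str(res)
```

A function of this shape would presumably be supplied as stat=my_stat in the call to movStats.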
Frank Harrell
Function to compute moving averages and other statistics as a function
of a continuous variable, possibly stratified by other variables.
Estimates are made by creating overlapping moving windows and
computing the statistics defined in the stat function for each window.
The default method, space='n', creates varying-width intervals each having a sample size of 2*eps + 1, and the smooth estimates are made every xinc observations. Outer intervals are not symmetric in sample size (but the mean x in those intervals will reflect that) unless eps=10, as outer intervals are centered at observations 10 and n - 10 + 1. The mean x-variable within each window is taken to represent that window. If trans and itrans are given, x means are computed on the trans(x) scale and then itrans'd. For space='x', by default estimates are made from the 10th smallest to the 10th largest observed values of the x variable to avoid extrapolation and to help get the moving statistics off to an adequate start for the left tail. Also by default the moving estimates are smoothed using supsmu.
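The fixed-sample-size idea behind space='n' can be sketched in base R. This is not the package's implementation: moving_n is a hypothetical helper, it uses only full windows of 2*eps + 1 observations (so it skips the special handling of outer intervals described above), and rounding n/200 down for the default increment is an assumption. Each window is represented by its mean x, computed on the trans scale and back-transformed, and the raw moving means are then smoothed with stats::supsmu.

```r
# Illustrative sketch only; not the Hmisc code.
moving_n <- function(x, y, eps = 15, xinc = NULL, stat = mean,
                     trans = identity, itrans = identity) {
  o <- order(x)                      # sort observations by x
  x <- x[o]
  y <- y[o]
  n <- length(x)
  stopifnot(n >= 2 * eps + 1)
  if (is.null(xinc)) xinc <- max(floor(n / 200), 1)  # rounding is an assumption
  # Only centers with eps observations on each side (full windows)
  centers <- seq(eps + 1, n - eps, by = xinc)
  est <- vapply(centers, function(i) {
    w <- (i - eps):(i + eps)         # indices of the 2*eps + 1 window members
    c(x = itrans(mean(trans(x[w]))), # mean x represents the window
      y = stat(y[w]),
      N = length(w))
  }, numeric(3))
  as.data.frame(t(est))
}

set.seed(1)
x <- runif(500, 20, 80)
y <- 1 + 0.02 * x + rnorm(500, sd = 0.3)

raw      <- moving_n(x, y, eps = 15)          # raw moving means
smoothed <- supsmu(raw$x, raw$y, bass = 8)    # heavy super-smoothing
head(raw)
```

With 500 observations the sketch evaluates a window at every second sorted observation, and every window contains 31 points.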
When melt=TRUE you can feed the result into ggplot like this:
library(ggplot2)
# w is the melted result of movStats(..., melt=TRUE); age is x and crea is y
ggplot(w, aes(x=age, y=crea, col=Type)) + geom_line() +
  facet_wrap(~ Statistic)
See here for several examples.