LocalControl (version 1.1.2.2)

SPSloess: LOESS Smoothing of Outcome by Treatment in Supervised Propensity Scoring

Description

Express Expected Outcome by Treatment as LOESS Smooths of Fitted Propensity Scores.

Usage

SPSloess(
  envir,
  dframe,
  trtm,
  pscr,
  yvar,
  faclev = 3,
  deg = 2,
  span = 0.75,
  fam = "symmetric"
)

Arguments

envir

Local control classic environment.

dframe

data.frame of the form returned by SPSlogit().

trtm

the two-level factor on the left-hand-side in the formula argument to SPSlogit().

pscr

fitted propensity scores of the form returned by SPSlogit().

yvar

continuous outcome measure or result unknown at the time the patient was assigned (possibly non-randomly) to treatment; "NA"s are allowed in yvar.

faclev

optional; maximum number of distinct numerical values a variable can assume and yet still be converted into a factor variable; faclev=1 causes a binary indicator to be treated as a continuous variable determining a proportion.

deg

optional; degree (1=linear or 2=quadratic) of the local fit.

span

optional; span (0 to 2) argument for the loess() function.

fam

optional; "gaussian" or "symmetric".
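
The deg, span and fam arguments read like the degree, span and family arguments of base R loess(). The sketch below shows those defaults in a direct loess() call on simulated data. It assumes, without confirmation from the package source, that SPSloess() passes them through unchanged; the simulated values and the choice of evaluation points are purely illustrative.

```r
# Simulated stand-ins; nothing here is LocalControl output.
set.seed(1)
n    <- 200
pscr <- runif(n, 0.05, 0.95)                 # stand-in for fitted propensity scores
yvar <- 1 + 2 * pscr + rnorm(n, sd = 0.5)    # stand-in for a continuous outcome
yvar[sample(n, 5)] <- NA                     # a few missing outcomes

# Defaults listed under Usage: deg = 2, span = 0.75, fam = "symmetric".
# Assumption: these map onto loess()'s degree, span and family.
fit <- loess(yvar ~ pscr, degree = 2, span = 0.75, family = "symmetric",
             na.action = na.omit)

# Smoothed outcome and its standard error at three propensity scores.
pred <- predict(fit, newdata = data.frame(pscr = c(0.25, 0.50, 0.75)),
                se = TRUE)
print(pred$fit)
print(pred$se.fit)
```

Setting family = "symmetric" makes loess() use a robust, iteratively re-weighted fit, while "gaussian" is ordinary local least squares.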

Details

SPSloess

Once one has fitted a somewhat smooth curve through scatters of observed outcomes, Y, versus the fitted propensity scores, X, for the patients in each of the two treatment groups, one can consider the question: "Over the range where both smooth curves are defined (i.e. their common support), what is the (weighted) average signed difference between these two curves?"

If the distribution of patients (either treated or untreated) were UNIFORM over this range, the (unweighted) average signed difference (treated minus untreated) would be an appropriate estimate of the overall difference in outcome due to choice of treatment.
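
As an illustration of the two paragraphs above, the following sketch fits one loess() curve per treatment group on simulated data, finds the grid cells where both curves are defined, and takes the unweighted average signed difference. The data, the variable names (ps, trt, y) and the 100-point grid are assumptions made for the example, not the internals of SPSloess().

```r
# Simulated study: treatment assignment depends on the propensity score.
set.seed(2)
n <- 400
d <- data.frame(ps = runif(n, 0.1, 0.9))
d$trt <- rbinom(n, 1, d$ps)
d$y   <- 1 + d$ps + 0.5 * d$trt + rnorm(n, sd = 0.4)

# One smooth of outcome versus propensity score per treatment group.
fit0 <- loess(y ~ ps, data = d[d$trt == 0, ], degree = 2, span = 0.75,
              family = "symmetric")
fit1 <- loess(y ~ ps, data = d[d$trt == 1, ], degree = 2, span = 0.75,
              family = "symmetric")

# Evaluate both curves on a common grid; predict() returns NA outside
# the range of propensity scores observed in each group.
grid <- seq(0.005, 0.995, by = 0.01)
f0 <- predict(fit0, newdata = data.frame(ps = grid))
f1 <- predict(fit1, newdata = data.frame(ps = grid))

# Common support: cells where both smooths are defined.
common <- !is.na(f0) & !is.na(f1)

# Unweighted average signed difference (treated minus untreated),
# appropriate only if patients were uniform over the common support.
unweighted <- mean(f1[common] - f0[common])
print(unweighted)
```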

Histogram patient counts within 100 cells of width 0.01 provide a naive "non-parametric density estimate" for the distribution of total patients (treated or untreated) along the propensity score axis. The weighted average difference (and standard error) displayed by SPSsmoot() are based on an R density() smooth of these counts.
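
A minimal sketch of this weighting step follows. It bins simulated propensity scores into 100 cells of width 0.01, obtains a density() smooth, and uses the density at the cell midpoints as weights for a difference curve. Two things are assumptions: density() is applied here to the raw scores rather than to the counts, because the exact call used by the package is not documented on this page, and the dif curve is a made-up placeholder. The standard error calculation is not shown.

```r
set.seed(3)
ps <- rbeta(500, 2, 3)                       # simulated propensity scores

# Naive estimate: patient counts in 100 cells of width 0.01.
breaks <- seq(0, 1, by = 0.01)
counts <- hist(ps, breaks = breaks, plot = FALSE)$counts
mids   <- breaks[-1] - 0.005                 # cell midpoints 0.005, ..., 0.995

# Kernel density smooth over [0, 1], interpolated at the cell midpoints.
dens <- density(ps, from = 0, to = 1)
den  <- approx(dens$x, dens$y, xout = mids)$y

# Placeholder difference curve (treated minus untreated); hypothetical.
dif <- 0.5 + 0.2 * mids

# Density-weighted average difference over cells where dif is defined.
ok   <- !is.na(dif)
w    <- den / sum(den)
wavg <- sum(w[ok] * dif[ok]) / sum(w[ok])
print(wavg)
```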

In situations where the propensity scoring distribution for all patients in a therapeutic class is known to differ from that of the patients within the current study, that population weighted average would also be of interest. Thus the SPSloess() output object contains two data frames, logrid and lofit, useful in further computations.

  • logrid: loess grid data.frame containing 11 variables and 100 observations. The PS variable contains propensity score "cell means" of 0.005 to 0.995 in steps of 0.010. Variables F0, S0 and C0 for treatment 0 and variables F1, S1 and C1 for treatment 1 contain fitted loess smooth values, standard error estimates and patient counts, respectively. The DIF variable is simply (F1-F0), the SED variable is sqrt(S1^2+S0^2), the HST variable is proportional to (C0+C1), and the DEN variable is the estimated probability density of patients along the PS axis. Observations with "NA" for variables F0, S0, F1 or S1 represent "extremes" where the loess fits could not be extrapolated because no observed outcomes were available.

  • losub0, losub1: loess fit data.frames, each containing 4 variables for each distinct PS value in lofit. These 4 variables are named PS, YAVG, TRT (equal to 0 in losub0 and 1 in losub1) and FIT, the spline prediction for the specified degrees-of-freedom (default df=1).

  • span: loess span setting.

  • lotdif: outcome treatment difference mean.

  • lotsde: outcome treatment difference standard deviation.
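
To make the layout of logrid concrete, the sketch below builds a data frame with the same 11 variable names from simulated data using base R only, then re-weights DIF with an external population density as suggested in Details. The column order, the scaling of HST, the helper smooth_arm() and the Beta(3, 2) population density are all assumptions for illustration; the object returned by SPSloess() may differ in these respects.

```r
set.seed(4)
n <- 600
d <- data.frame(ps = rbeta(n, 2, 2))
d$trt <- rbinom(n, 1, d$ps)
d$y   <- 1 + d$ps + 0.5 * d$trt + rnorm(n, sd = 0.4)

breaks <- seq(0, 1, by = 0.01)
PS     <- breaks[-1] - 0.005                 # cell means 0.005, ..., 0.995

# Hypothetical helper: smooth, standard error and cell counts for one arm.
smooth_arm <- function(arm) {
  sub <- d[d$trt == arm, ]
  fit <- loess(y ~ ps, data = sub, degree = 2, span = 0.75,
               family = "symmetric")
  p <- predict(fit, newdata = data.frame(ps = PS), se = TRUE)
  list(F = as.numeric(p$fit),                # NA where the arm has no data
       S = as.numeric(p$se.fit),
       C = hist(sub$ps, breaks = breaks, plot = FALSE)$counts)
}
a0 <- smooth_arm(0)
a1 <- smooth_arm(1)

dens <- density(d$ps, from = 0, to = 1)
grid <- data.frame(
  PS  = PS,
  F0  = a0$F, S0 = a0$S, C0 = a0$C,
  F1  = a1$F, S1 = a1$S, C1 = a1$C,
  DIF = a1$F - a0$F,                         # F1 - F0
  SED = sqrt(a1$S^2 + a0$S^2),               # sqrt(S1^2 + S0^2)
  HST = (a0$C + a1$C) / n,                   # proportional to C0 + C1
  DEN = approx(dens$x, dens$y, xout = PS)$y  # density along the PS axis
)

# Population-weighted average: swap DEN for a known external density.
popden <- dbeta(PS, 3, 2)                    # hypothetical population density
ok     <- !is.na(grid$DIF)
popavg <- sum(popden[ok] * grid$DIF[ok]) / sum(popden[ok])
print(popavg)
```

Replacing popden with grid$DEN in the last step gives the study-weighted average instead.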

References

Cleveland WS, Devlin SJ. (1988) Locally-weighted regression: an approach to regression analysis by local fitting. J Amer Stat Assoc 83: 596-610.

Cleveland WS, Grosse E, Shyu WM. (1992) Local regression models. Chapter 8 of Statistical Models in S eds Chambers JM and Hastie TJ. Wadsworth & Brooks/Cole.

Obenchain RL. (2011) USPSinR.pdf USPS R-package vignette, 40 pages.

Ripley BD, loess() based on the 'cloess' package of Cleveland, Grosse and Shyu.