Fits a logit or probit that uses weights and variance estimates appropriate for the edsurvey.data.frame, light.edsurvey.data.frame, or edsurvey.data.frame.list.
glm.sdf(formula, family = binomial(link = "logit"), data,
weightVar = NULL, relevels = list(), jrrIMax = 1,
omittedLevels = TRUE, defaultConditions = TRUE, recode = NULL,
returnNumberOfPSU=FALSE, returnVarEstInputs = FALSE)
logit.sdf(formula, data, weightVar = NULL, relevels = list(),
jrrIMax = 1, omittedLevels = TRUE, defaultConditions = TRUE,
recode = NULL, returnNumberOfPSU = FALSE,
returnVarEstInputs = FALSE)
probit.sdf(formula, data, weightVar = NULL, relevels = list(),
jrrIMax = 1, omittedLevels = TRUE, defaultConditions = TRUE,
recode = NULL, returnVarEstInputs = FALSE)
the glm.sdf function currently fits only the binomial outcome models, such as logit and probit, although other link functions are available for binomial models. See the link argument in the help for family.
an edsurvey.data.frame
character indicating the weight variable to use (see Details). The weightVar must be one of the weights for the edsurvey.data.frame. If NULL, uses the default for the edsurvey.data.frame.
a list; used when the user wants to change the contrasts from the default treatment contrasts to the treatment contrasts with a chosen omitted group. The name of each element should be the variable name, and the value should be the group to be omitted.
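As a rough illustration of what choosing a different omitted group means, the base R sketch below uses relevel() on a small invented factor. It does not call glm.sdf; the variable name dsex_example and its levels are made up for this example, and the comparison to the relevels argument is only an analogy.

```r
# Invented data: the variable name and levels are for illustration only
dsex_example <- factor(c("Male", "Female", "Female", "Male"),
                       levels = c("Male", "Female"))

# Under default treatment contrasts, the first level is the omitted group
default_omitted <- levels(dsex_example)[1]

# Choose "Female" as the omitted group instead; this is analogous in spirit
# to passing relevels = list(dsex_example = "Female")
dsex_releveled <- relevel(dsex_example, ref = "Female")
new_omitted <- levels(dsex_releveled)[1]

# The design matrix columns show which group remains in the model
design_columns <- colnames(model.matrix(~ dsex_releveled))
print(design_columns)
```

The omitted group is the one absorbed into the intercept, so the remaining coefficient is read as a difference from that group.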
the \(V_{jrr}\) term (see Details) can be estimated with any positive number of plausible values and is estimated on the lower of the number of available plausible values and jrrIMax. When jrrIMax is set to Inf, all plausible values will be used. Higher values of jrrIMax lead to longer computing times and more accurate variance estimates.
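The "lower of the two" rule can be sketched in a few lines of base R. The helper n_pv_for_jrr below is hypothetical and is not part of EdSurvey; it only shows how many plausible values would enter the \(V_{jrr}\) term for a given jrrIMax.

```r
# Hypothetical helper (not an EdSurvey function): how many plausible values
# are used for the V_jrr term, given the number available and jrrIMax
n_pv_for_jrr <- function(n_plausible_values, jrrIMax = 1) {
  stopifnot(n_plausible_values >= 1, jrrIMax >= 1)
  min(n_plausible_values, jrrIMax)
}

n_default <- n_pv_for_jrr(5)        # default jrrIMax = 1
n_three   <- n_pv_for_jrr(5, 3)     # capped by jrrIMax
n_all     <- n_pv_for_jrr(5, Inf)   # Inf means all available plausible values
print(c(n_default, n_three, n_all))
```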
a logical value. When set to the default value of TRUE, drops those levels of all factor variables that are specified in edsurvey.data.frame. Use print on an edsurvey.data.frame to see the omitted levels.
a logical value. When set to the default value of TRUE, uses the default conditions stored in an edsurvey.data.frame to subset the data. Use print on an edsurvey.data.frame to see the default conditions.
a list of lists to recode variables. Defaults to NULL. Can be set as recode=list(var1=list(from=c("a", "b", "c"), to="d")). See Examples.
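To make the shape of the recode list concrete, here is a minimal base R sketch of what such a from/to mapping does to a variable. The helper apply_recode and the data frame are hypothetical and are not EdSurvey code; the package's own handling may differ in details such as level ordering.

```r
# Hypothetical helper (not EdSurvey code): apply a recode list of the form
# list(var = list(from = c(...), to = "...")) to columns of a data frame
apply_recode <- function(df, recode) {
  for (v in names(recode)) {
    x <- as.character(df[[v]])
    # every value listed in `from` is collapsed into the single `to` value
    x[x %in% recode[[v]]$from] <- recode[[v]]$to
    df[[v]] <- factor(x)
  }
  df
}

# Invented data for illustration
df <- data.frame(var1 = c("a", "b", "c", "e"), stringsAsFactors = FALSE)
out <- apply_recode(df, list(var1 = list(from = c("a", "b", "c"), to = "d")))
print(levels(out$var1))
```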
a logical value set to TRUE to return the number of primary sampling units (PSU).
a logical value set to TRUE to return the inputs to the jackknife and imputation variance estimates. This is intended to allow for the computation of covariances between estimates.
An edsurveyGlm with the following elements:
the function call
the formula used to fit the model
the estimates of the coefficients
the standard error estimates of the coefficients
the estimated variance due to uncertainty in the scores (plausible values variables)
the estimated variance due to sampling
the number of plausible values
the number of PSUs used in calculation
the variance estimates under the various plausible values
the values of the coefficients under the various plausible values
the coefficient matrix (typically produced by the summary of a model)
the name of the weight variable
the number of plausible values
the number of jackknife replicates used
always jackknife
when returnVarEstInputs is TRUE, this element is returned. These are used for calculating covariances with varEstToCov.
This function implements an estimator that correctly handles left-hand side variables that are logical, allows for survey sampling weights, and estimates variances using the jackknife replication method. The vignette titled Statistics describes estimation of the reported statistics.
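For readers who want to see the general idea of a weighted binomial fit with a logical left-hand side, the sketch below uses base R glm on simulated data. It is not glm.sdf: the data, the variable names score, female, and w are invented, and the standard errors that glm reports are not jackknife estimates.

```r
set.seed(1)
n <- 200

# Simulated data; all names and values are invented for illustration
dat <- data.frame(
  score  = rnorm(n, mean = 280, sd = 30),
  female = rbinom(n, 1, 0.5),
  w      = runif(n, 0.5, 2)   # stand-in for survey sampling weights
)

# Logical left-hand side built with I(), in the same style as
# I(composite >= 300) in the Examples. quasibinomial is used here so that
# non-integer weights do not trigger a warning.
fit <- glm(I(score >= 300) ~ female, data = dat, weights = w,
           family = quasibinomial(link = "logit"))
print(coef(fit))
```

Only the point estimates from this sketch are comparable in spirit to a survey-weighted fit; variance estimation for survey data needs the replicate-weight approach described in this section.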
The coefficients are estimated using the sample weights according to the section “Estimation of Weighted Means When Plausible Values Are Not Present” or the section “Estimation of Weighted Means When Plausible Values Are Present,” depending on whether the model includes assessment variables or other variables with plausible values.
How the standard errors of the coefficients are estimated depends on the presence of plausible values (assessment variables), but once the variance is obtained, the t statistic is given by $$t=\frac{\hat{\beta}}{\sqrt{\mathrm{var}(\hat{\beta})}}$$ where \( \hat{\beta} \) is the estimated coefficient and \(\mathrm{var}(\hat{\beta})\) is the variance of that estimate.
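A small worked example of the t statistic formula, using made-up numbers rather than output from any real model:

```r
# Made-up values for illustration only
beta_hat     <- 0.50   # estimated coefficient
var_beta_hat <- 0.04   # estimated variance of that coefficient

# t statistic: the coefficient divided by its standard error
se_beta_hat <- sqrt(var_beta_hat)
t_stat <- beta_hat / se_beta_hat
print(t_stat)
```

With these inputs the standard error is 0.2, so the t statistic is 2.5.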
Note that logit.sdf and probit.sdf are included for convenience only; they give the same results as a call to glm.sdf with the binomial family and the link function named in the function call (logit or probit). By default, glm.sdf fits a logistic regression when family is not set, so glm.sdf and logit.sdf are expected to give the same results in that case. Other types of generalized linear models are not supported.
All variance estimation methods are shown in the vignette titled Statistics. When the predicted value does not have plausible values, the variance of the coefficients is estimated according to the section “Estimation of Standard Errors of Weighted Means When Plausible Values Are Not Present, Using the Jackknife Method.”
When plausible values are present, the variance of the coefficients is estimated according to the section “Estimation of Standard Errors of Weighted Means When Plausible Values Are Present, Using the Jackknife Method.”
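The sketch below shows, with invented numbers, how a sampling variance term and an imputation variance term might be combined into a standard error when plausible values are present. The formulas used here (a jackknife multiplier of 1 and a Rubin-style imputation variance) are assumptions made for illustration; the Statistics vignette remains the authoritative description of what the package computes.

```r
# Invented inputs for one coefficient: M = 3 plausible values and
# 4 jackknife replicate estimates for the first plausible value (jrrIMax = 1)
beta_pv  <- c(0.48, 0.52, 0.50)        # full-sample estimate per plausible value
beta_rep <- c(0.47, 0.50, 0.49, 0.46)  # replicate estimates, first plausible value

# Sampling variance: squared deviations of the replicate estimates from the
# full-sample estimate (assumes a jackknife multiplier of 1)
V_jrr <- sum((beta_rep - beta_pv[1])^2)

# Imputation variance across plausible values (assumed Rubin-style form)
M <- length(beta_pv)
V_imp <- (1 + 1 / M) * var(beta_pv)

# Combined estimate and standard error
beta_hat <- mean(beta_pv)
se_beta  <- sqrt(V_jrr + V_imp)
print(c(estimate = beta_hat, V_jrr = V_jrr, V_imp = V_imp, se = se_beta))
```

The two variance pieces correspond in spirit to the returned elements described above as the variance due to sampling and the variance due to uncertainty in the scores.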
# NOT RUN {
# read in the example data (generated, not real student data)
sdf <- readNAEP(system.file("extdata/data", "M36NT2PM.dat", package = "NAEPprimer"))
# By default uses jackknife variance method using replicate weights
table(sdf$b013801)
logit1 <- logit.sdf(I(b013801 %in% c("26-100", ">100")) ~ dsex + b017451, data=sdf)
# use summary to get detailed results
summary(logit1)
logit2 <- logit.sdf(I(composite >= 300) ~ dsex + b013801, data=sdf)
summary(logit2)
logit3 <- glm.sdf(I(composite >= 300) ~ dsex + b013801, data=sdf,
                  family=quasibinomial(link="logit"))
summary(logit3)
# }