This function will return a list of lists where the top-level keys (names) of
the items indicate the component of the full model (i.e. the term) that the
generated models can be used to test. At each of these keys is a list with
both the complex and simple models that can be compared to test the
component. The complex models always include the target term, and the simple
models are identical to the complex except the target term is removed. Thus,
when the models are compared (e.g. using anova, except for Type III; see
details below), the resulting values will show the effect of adding the
target term to the model. There are three generally used approaches to
determining what the appropriate comparison models should be, called Type I,
II, and III. See the sections below for more information on these types.
generate_models(model, type)
The type of sums of squares to calculate:
Use 1, I, or sequential for Type I.
Use 2, II, or hierarchical for Type II.
Use 3, III, or orthogonal for Type III.
A list of the augmented models for each term, where the associated term is the key for each model in the list.
For Type I SS, or sequential SS, each term is considered in
order after the preceding terms are considered. Consider the example model
Y ~ A + B + A:B, where ":" indicates an interaction. To determine the Type I
effect of A, we would compare the model Y ~ A to the same model without the
term: Y ~ NULL. For B, we compare Y ~ A + B to Y ~ A; and for A:B, we compare
Y ~ A + B + A:B to Y ~ A + B. Incidentally, the anova function that ships
with the base installation of R computes Type I statistics.
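The sequential comparisons described above can be reproduced by hand with base R. The following is a minimal sketch, not the package's implementation: the mtcars data and the predictors wt and hp are stand-ins chosen only for illustration, and the intercept-only model written Y ~ NULL in the text is spelled mpg ~ 1 here.

```r
# Type I (sequential) comparisons built by hand.
# Assumption: mtcars, wt, and hp are illustrative stand-ins for Y, A, and B.
fit_null <- lm(mpg ~ 1, data = mtcars)              # Y ~ NULL (intercept only)
fit_a    <- lm(mpg ~ wt, data = mtcars)             # Y ~ A
fit_ab   <- lm(mpg ~ wt + hp, data = mtcars)        # Y ~ A + B
fit_full <- lm(mpg ~ wt + hp + wt:hp, data = mtcars) # Y ~ A + B + A:B

# Each comparison is simple model first, complex model second;
# row 2 of the resulting table holds the SS gained by adding the target term.
ss_by_comparison <- c(
  wt      = anova(fit_null, fit_a)[["Sum of Sq"]][2],
  hp      = anova(fit_a, fit_ab)[["Sum of Sq"]][2],
  `wt:hp` = anova(fit_ab, fit_full)[["Sum of Sq"]][2]
)

# anova() on the full model alone reports the same sequential SS
# (the last row is the residual row, so keep only the first three).
ss_sequential <- anova(fit_full)[["Sum Sq"]][1:3]

all.equal(unname(ss_by_comparison), ss_sequential)
```

Because Type I SS depend on the order in which terms enter the model, reordering wt and hp in the formulas would generally change the first two values.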
For Type II SS, or hierarchical SS, each term is considered
in the presence of all of the terms that do not include it. For example,
consider the three-way factorial model
Y ~ A + B + C + A:B + A:C + B:C + A:B:C, where ":" indicates an interaction.
The effect of A is found by comparing Y ~ B + C + B:C + A to Y ~ B + C + B:C
(the only terms included are those that do not include A). For B, the
comparison models would be Y ~ A + C + A:C + B and Y ~ A + C + A:C; for A:B,
the models would be Y ~ A + B + C + A:C + B:C + A:B and
Y ~ A + B + C + A:C + B:C; and so on.
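The rule "keep every term that does not include the target" can be sketched in base R. The helper below, type2_terms(), is hypothetical and written only for illustration: it is not part of the package, and it returns term labels rather than fitted models. It treats a term as including the target when all of the target's factors appear in it.

```r
# Hypothetical helper (not part of any package): list the term labels of the
# Type II complex and simple models for every term in a formula.
type2_terms <- function(formula) {
  labels  <- attr(terms(formula), "term.labels")
  factors <- strsplit(labels, ":", fixed = TRUE)

  out <- lapply(seq_along(labels), function(i) {
    # A term "includes" the target if it contains all of the target's factors;
    # this is also true of the target itself, so it is dropped from `simple`.
    includes_target <- vapply(
      factors,
      function(other) all(factors[[i]] %in% other),
      logical(1)
    )
    simple <- labels[!includes_target]
    list(complex = c(simple, labels[i]), simple = simple)
  })

  names(out) <- labels
  out
}

comparisons <- type2_terms(Y ~ A + B + C + A:B + A:C + B:C + A:B:C)
comparisons[["A"]]$simple      # terms that do not include A
comparisons[["A:B"]]$complex   # terms that do not include A:B, plus A:B
```

For the highest-order term A:B:C no other term includes it, so its complex model is the full model and its simple model is the full model without A:B:C.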
For Type III SS, or orthogonal SS, each term is considered
in the presence of all of the other terms. For example, consider the two-way
factorial model Y ~ A + B + A:B, where ":" indicates an interaction. The
effect of A is found by comparing Y ~ B + A:B + A to Y ~ B + A:B; for B, the
comparison models would be Y ~ A + A:B + B and Y ~ A + A:B; and for A:B, the
models would be Y ~ A + B + A:B and Y ~ A + B.
Unfortunately, anova() cannot be used to compare Type III models. anova()
does not allow for violation of the principle of marginality, which is the
rule that interactions should only be tested in the context of their
lower-order terms. When an interaction term is present in a model, anova()
will automatically add in the lower-order terms, so a model like Y ~ A + A:B
cannot be compared as written: the lower-order term B is added, and the
model Y ~ A + B + A:B is used instead. To get the appropriate statistics for
Type III comparisons, use drop1() with the full scope, i.e.
drop1(model_fit, scope = . ~ .).
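As a minimal sketch of the drop1() approach just described, the example below fits a two-way model in base R and drops each term in turn with the full scope. The mtcars data and the predictors wt and hp are assumptions made for illustration; test = "F" is an optional drop1() argument that adds F tests to the table.

```r
# Type III-style comparisons via drop1() with the full scope.
# Assumption: mtcars, wt, and hp are illustrative stand-ins for Y, A, and B.
fit <- lm(mpg ~ wt * hp, data = mtcars)   # expands to wt + hp + wt:hp

# scope = . ~ . makes every term in the model a candidate for removal,
# including main effects that are marginal to the interaction.
type3 <- drop1(fit, scope = . ~ ., test = "F")
type3

# One row per dropped term, after a "<none>" row for the full model.
rownames(type3)
type3[["Sum of Sq"]]
```

The sum of squares for the interaction row matches what a sequential anova() reports for the last term, since in both cases the interaction is tested after everything else is in the model.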
# load supernova for generate_models() (assumed to also provide the Fingers data)
library(supernova)
# create all type 2 comparison models
mod <- lm(Thumb ~ Height * Sex, data = Fingers)
mods_2 <- generate_models(mod, type = 2)
# compute the SS for the Height term (row 2 is the complex model's row)
mod_Height <- anova(mods_2[["Height"]]$simple, mods_2[["Height"]]$complex)
mod_Height[["Sum of Sq"]][[2]]