If there is no grouping (i.e. no | in the formula), the result is an
lm.madlib object. Otherwise, it is an lm.madlib.grps object, which is
just a list of lm.madlib objects.
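
Because the class of the result depends on whether the formula contains grouping, downstream code may want to treat both cases uniformly. The following is a minimal sketch, not part of the package: the objects are hand-built mocks that only mimic the class names described above, and as_fit_list is a hypothetical helper introduced for illustration.

```r
# Mock objects standing in for real fitting results (no database needed).
# Only the class names follow the description above; the numbers are made up.
fit_a <- structure(list(coef = matrix(c(1.5, 0.2), nrow = 1), r2 = 0.91),
                   class = "lm.madlib")
fit_b <- structure(list(coef = matrix(c(0.7, 0.4), nrow = 1), r2 = 0.84),
                   class = "lm.madlib")
grouped <- structure(list(fit_a, fit_b), class = "lm.madlib.grps")

# Hypothetical helper: always return a plain list of lm.madlib objects.
as_fit_list <- function(x) {
  if (inherits(x, "lm.madlib.grps")) {
    unclass(x)            # already a list of lm.madlib objects
  } else if (inherits(x, "lm.madlib")) {
    list(x)               # wrap the single fit
  } else {
    stop("expected an lm.madlib or lm.madlib.grps object")
  }
}

# The same extraction code now works for both kinds of result.
r2_single  <- vapply(as_fit_list(fit_a),   function(f) f$r2, numeric(1))
r2_grouped <- vapply(as_fit_list(grouped), function(f) f$r2, numeric(1))
```

With this helper, code that loops over fits does not need a separate branch for the ungrouped case.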
An lm.madlib object is a list which contains the following items:
grouping column(s)
When there are grouping columns in the formula, the resulting list
has multiple items, each of which has the same name as one of the
grouping columns. All of these items are vectors of the same length,
which is equal to the number of distinct combinations of the grouping
column values. Taken together, the elements at the same position in
these vectors form one distinct combination of the grouping values.
When there is no grouping column in the formula, no such items appear
in the resulting list.
coef
A numeric matrix, the fitted coefficients. Each row contains the
coefficients of the linear regression for one group of data, so the
number of rows is equal to the number of distinct combinations of
the grouping column values. The number of columns is equal to
the number of features (including the intercept if it is present in
the formula).
r2
A numeric array, the R-squared values for all combinations of the
grouping column values.
std_err
A numeric matrix, the standard error of each coefficient.
t_stats
A numeric matrix, the t-statistic of each coefficient, which is
the ratio of coef to std_err.
p_values
A numeric matrix, the p-values of t_stats. Each row corresponds to
the fit for one group of the data.
condition_no
A numeric array, the condition number for each combination of the
grouping column values.
bp_stats
A numeric array, present when hetero = TRUE: the Breusch-Pagan test
statistic for each combination of the grouping column values.
bp_p_value
A numeric array, present when hetero = TRUE: the Breusch-Pagan test
p-value for each combination of the grouping column values.
grps
An integer, the number of groups that the data is divided into
according to the grouping columns in the formula.
grp.cols
An array of strings, the names of the grouping columns.
has.intercept
A logical, whether the intercept is included in the fit.
ind.vars
An array of strings, all the different terms used as independent
variables in the fit.
ind.str
A string, the independent variables written as an array-format string.
call
A language object, the function call that generated this result.
col.name
An array of strings, the column names used in the fit.
appear
An array of strings, the same length as the number of independent
variables. The strings are used to print a clean result, especially
when dealing with factor variables, where the dummy variable names
can be very long because a random string is inserted to avoid naming
conflicts; see as.factor,db.obj-method for details. The list also
contains dummy and dummy.expr, which are likewise used for processing
the categorical variables but do not contain any important
information.
model
A db.data.frame object, which wraps the result table of this
function.
terms
A terms object, describing the terms in the model formula.
nobs
The number of observations used to fit the model.
data
A db.obj object, which wraps all the data used in the database. If
there are fits for multiple groups, then this wraps only the data of
one group.
origin.data
The original db.obj object. When there is no grouping, it is equal to
data above; otherwise it is the "sum" of data from all groups.
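
The relationship between the coef, std_err, t_stats and p_values items can be illustrated with base R alone. This is only a sketch under stated assumptions: it uses stats::lm on the built-in mtcars data rather than a database table, and it assumes the conventional definitions (t-statistic as coefficient divided by standard error, two-sided p-value from a Student t distribution with the residual degrees of freedom). Whether the in-database computation uses exactly the same conventions is not confirmed here.

```r
# Base R stand-in: an ordinary least-squares fit on a built-in data set.
fit <- lm(mpg ~ wt + hp, data = mtcars)
tab <- summary(fit)$coefficients

coef_hat <- tab[, "Estimate"]     # plays the role of one row of coef
std_err  <- tab[, "Std. Error"]   # plays the role of one row of std_err

# Conventional t-statistic: coefficient divided by its standard error.
t_stats <- coef_hat / std_err

# Two-sided p-value from the t distribution with residual degrees of freedom.
p_values <- 2 * pt(abs(t_stats), df = df.residual(fit), lower.tail = FALSE)

# Both quantities can be compared against what summary() reports.
same_t <- isTRUE(all.equal(t_stats, tab[, "t value"]))
same_p <- isTRUE(all.equal(p_values, tab[, "Pr(>|t|)"]))
```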
Note that if grouping is done and there are multiple lm.madlib
objects in the final result, each one of them contains the same copy
of model.
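
To make the idea of one fit per distinct grouping value concrete, here is a small base R analogue. It is a sketch only: it fits each group locally with stats::lm on mtcars instead of in the database, and the variable names grps and coef_mat are chosen to echo the items described above, not taken from the package.

```r
# Base R stand-in for a grouped fit of mpg on wt, grouped by cyl.
groups <- split(mtcars, mtcars$cyl)   # one data frame per distinct cyl value
fits   <- lapply(groups, function(d) lm(mpg ~ wt, data = d))

# Number of groups, analogous to the grps item.
grps <- length(fits)

# One row of coefficients per group, one column per term (intercept included).
coef_mat <- do.call(rbind, lapply(fits, coef))

# Rows per group; together the groups cover the whole data set.
rows_by_group <- vapply(groups, nrow, integer(1))
```

Here the row names of coef_mat are the distinct values of cyl, which corresponds to the role the grouping column items play in the result list.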