omega: Calculate the omega estimate of factor saturation

Description

McDonald has proposed coefficient omega as an estimate of the general factor saturation of a test. One way to find omega is to do a factor analysis of the original data set, rotate the factors obliquely, do a Schmid Leiman transformation, and then find omega. This function estimates omega as suggested by McDonald by using hierarchical factor analysis (following Jensen).

Usage

omega(m, nfactors, pc = "mle",key = NULL, flip=TRUE, digits=2,title="Omega",sl=TRUE,labels=NULL, plot=TRUE,rotate="oblimin", ...)

Arguments

A correlation matrix or a data.frame/matrix of data

nfactors

Number of factors believed to be group factors

pc="pa" for principal axes, pc="pc" for principal components, pc="mle" for maximum likelihood.

key

a vector of +/- 1s to specify the direction of scoring of items. The default is to assume all items are positively keyed, but if some items are reversed scored, then key should be specified.

flip

If flip is TRUE, then items are automatically flipped to have positive correlations on the general factor. Items that have been reversed are shown with a - sign.

digits

if specified, round the output to digits

title

Title for this analysis

If plotting the results, should the Schmid Leiman solution be shown or should the hierarchical solution be shown? (default sl=TRUE)

labels

If plotting, what labels should be applied to the variables

plot

plot=TRUE (default) calls omega.graph, plot =FALSE does not

rotate

What rotation to apply? The default is oblimin, the alternative is simplimax.

...

Allows additional parameters to be passed through to the factor routines

Value

omega hierarchicalThe $\omega_h$ coefficient
omega totalThe $omega_t$ coefficient
alphaCronbach's $\alpha$
schmidThe Schmid Leiman transformed factor matrix and associated matrices
schmid$slThe g factor loadings as well as the residualized factors
schmid$orthogVarimax rotated solution of the original factors
schmid$obliqueThe oblimin transformed factors
schmid$fcorthe correlation matrix of the oblique factors
schmid$gloadingThe loadings on the higher order, g, factor of the oblimin factors
keyA vector of -1 or 1 showing which direction the items were scored.

Details

``Many scales are assumed by their developers and users to be primarily a measure of one latent variable. When it is also assumed that the scale conforms to the effect indicator model of measurement (as is almost always the case in psychological assessment), it is important to support such an interpretation with evidence regarding the internal structure of that scale. In particular, it is important to examine two related properties pertaining to the internal structure of such a scale. The first property relates to whether all the indicators forming the scale measure a latent variable in common.

The second internal structural property pertains to the proportion of variance in the scale scores (derived from summing or averaging the indicators) accounted for by this latent variable that is common to all the indicators (Cronbach, 1951; McDonald, 1999; Revelle, 1979). That is, if an effect indicator scale is primarily a measure of one latent variable common to all the indicators forming the scale, then that latent variable should account for the majority of the variance in the scale scores. Put differently, this variance ratio provides important information about the sampling fluctuations when estimating individuals' standing on a latent variable common to all the indicators arising from the sampling of indicators (i.e., when dealing with either Type 2 or Type 12 sampling, to use the terminology of Lord, 1956). That is, this variance proportion can be interpreted as the square of the correlation between the scale score and the latent variable common to all the indicators in the infinite universe of indicators of which the scale indicators are a subset. Put yet another way, this variance ratio is important both as reliability and a validity coefficient. This is a reliability issue as the larger this variance ratio is, the more accurately one can predict an individual's relative standing on the latent variable common to all the scale's indicators based on his or her observed scale score. At the same time, this variance ratio also bears on the construct validity of the scale given that construct validity encompasses the internal structure of a scale." (Zinbarg, Yovel, Revelle, and McDonald, 2006).

McDonald has proposed coefficient omega (hierarchical ($\omega_h$) as an estimate of the general factor saturation of a test. Zinbarg, Revelle, Yovel and Li (2005) http://personality-project.org/revelle/publications/zinbarg.revelle.pmet.05.pdf compare McDonald's $\omega_h$ to Cronbach's $\alpha$ and Revelle's $\beta$ They conclude that $\omega_h$ is the best estimate. (See also Zinbarg et al., 2006)

One way to find $\omega_h$ is to do a factor analysis of the original data set, rotate the factors obliquely, factor that correlation matrix, do a Schmid-Leiman (schmid) transformation to find general factor loadings, and then find $\omega_h$. Here we present code to do that.

$\omega_h$ differs as a function of how the factors are estimated. Three options are available, pc="pa" does a principle axes factor analysis (factor.pa), pc="mle" uses the factanal function, and pc="pc" does a principal components analysis (principal).

For ability items, it is typically the case that all items will have positive loadings on the general factor. However, for non-cognitive items it is frequently the case that some items are to be scored positively, and some negatively. Although probably better to specify which directions the items are to be scored by specifying a key vector, if flip =TRUE (the default), items will be reversed so that they have positive loadings on the general factor. The keys are reported so that scores can be found using the score.items function.

Output from omega will be shown graphically using the omega.graph function. This requires Rgraphviz to be installed. If Rgraphviz is not available, select plot=FALSE.

$\beta$, an alternative to $\omega$, is defined as the worst split half reliability. It can be estimated by using ICLUST (a hierarchical clustering algorithm originally developed for main frames and written in Fortran and that is now available in R. (For a very complimentary review of why the ICLUST algorithm is useful in scale construction, see Cooksey and Soutar, 2005).

The omega function uses exploratory factor analysis to estimate the $\omega_h$ coefficient. It is important to remember that ``A recommendation that should be heeded, regardless of the method chosen to estimate $\omega_h$, is to always examine the pattern of the estimated general factor loadings prior to estimating $\omega_h$. Such an examination constitutes an informal test of the assumption that there is a latent variable common to all of the scale's indicators that can be conducted even in the context of EFA. If the loadings were salient for only a relatively small subset of the indicators, this would suggest that there is no true general factor underlying the covariance matrix. Just such an informal assumption test would have afforded a great deal of protection against the possibility of misinterpreting the misleading $\omega_h$ estimates occasionally produced in the simulations reported here." (Zinbarg et al., 2006, p 137).

A simple demonstration of the problem of an omega estimate reflecting just one of two group factors can be found in the last example.

Although omega is uniquely defined only for cases where 3 or more subfactors are extracted, it is sometimes desired to have a two factor solution. This is done by forcing the schmid extraction to treat the two subfactors as having equal loadings. See Zinbarg et al., 2007.

In addition to $\omega_h$, another of McDonald's coefficients is $\omega_t$. This is an estimate of the total reliability of a test.

McDonald's $\omega_t$, which is similar to Guttman's $\lambda_6$, guttman but uses the estimates of uniqueness ($u^2$ from factor analysis to find $e_j^2$. This is based on a decomposition of the variance of a test score, $V_x$ into four parts: that due to a general factor, $\vec{g}$, that due to a set of group factors, $\vec{f}$, (factors common to some but not all of the items), specific factors, $\vec{s}$ unique to each item, and $\vec{e}$, random error. (Because specific variance can not be distinguished from random error unless the test is given at least twice, some combine these both into error).

Letting $\vec{x} = \vec{cg} + \vec{Af} + \vec {Ds} + \vec{e}$ then the communality of item$_j$, based upon general as well as group factors, $h_j^2 = c_j^2 + \sum{f_{ij}^2}$ and the unique variance for the item $u_j^2 = \sigma_j^2 (1-h_j^2)$ may be used to estimate the test reliability. That is, if $h_j^2$ is the communality of item$_j$, based upon general as well as group factors, then for standardized items, $e_j^2 = 1 - h_j^2$ and $$\omega_t = \frac{\vec{1}\vec{cc'}\vec{1} + \vec{1}\vec{AA'}\vec{1}'}{V_x} = 1 - \frac{\sum(1-h_j^2)}{V_x} = 1 - \frac{\sum u^2}{V_x}$$ Because $h_j^2 \geq r_{smc}^2$, $\omega_t \geq \lambda_6$.

It is important to distinguish here between the two $\omega$ coefficients of McDonald, 1978 and Equation 6.20a of McDonald, 1999, $\omega_t$ and $\omega_h$. While the former is based upon the sum of squared loadings on all the factors, the latter is based upon the sum of the squared loadings on the general factor. $$\omega_h = \frac{ \vec{1}\vec{cc'}\vec{1}}{V_x}$$

References

http://personality-project.org/r/r.omega.html Revelle, W. (1979). Hierarchical cluster analysis and the internal structure of tests. Multivariate Behavioral Research, 14, 57-74. (http://personality-project.org/revelle/publications/iclust.pdf)

Zinbarg, R.E., Revelle, W., Yovel, I., & Li. W. (2005). Cronbach's Alpha, Revelle's Beta, McDonald's Omega: Their relations with each and two alternative conceptualizations of reliability. Psychometrika. 70, 123-133. http://personality-project.org/revelle/publications/zinbarg.revelle.pmet.05.pdf

Zinbarg, R., Yovel, I. & Revelle, W. (2007). Estimating omega for structures containing two group factors: Perils and prospects. Applied Psychological Measurement. 31 (2), 135-157. Zinbarg, R., Yovel, I., Revelle, W. & McDonald, R. (2006). Estimating generalizability to a universe of indicators that all have one attribute in common: A comparison of estimators for omega. Applied Psychological Measurement, 30, 121-144. DOI: 10.1177/0146621605278814 http://apm.sagepub.com/cgi/reprint/30/2/121

Examples

Run this code

test.data <- Harman74.cor$cov
my.omega <- omega(test.data,3)       
print(my.omega,digits=2)
#
#create 9 variables with a hierarchical structure
jen.data <- make.hierarchical()
#with correlations of
jen.data
#find omega 
jen.omega <- omega(jen.data,digits=2)
jen.omega

#create 8 items with a two factor solution, showing the use of the flip option
#sim2 <- item.sim(8)
#omega(sim2)   #an example of misidentification-- remember to look at the loadings matrices.
#apply omega to analyze 6 mental ability tests 
data(ability.cov)   #has a covariance matrix
omega(ability.cov$cov)

Run the code above in your browser using DataLab