describe, pairs.panels, error.bars) are useful for basic descriptive analyses. Psychometric applications include routines for Very Simple Structure (VSS), Item Cluster Analysis (ICLUST), and principal axes factor analysis (factor.pa). There are also functions to apply the Schmid Leiman transformation (schmid), which turns a hierarchical factor structure into a bifactor solution, to graph both structures (omega.graph), and to calculate the reliability coefficients alpha (score.items), beta (ICLUST), and McDonald's omega (omega and omega.graph).
Additional functions make for more convenient descriptions of item characteristics. Functions under development include 1 and 2 parameter Item Response measures.
A number of procedures have been developed as part of the Synthetic Aperture Personality Assessment (SAPA) project. These routines form and analyze composite scales by adding within and between cluster/scale item correlations, which is equivalent to scoring the raw data. They include extracting clusters from factor loading matrices (factor2cluster), synthetically forming clusters from correlation matrices (cluster.cor), and finding multiple correlations from correlation matrices (mat.regress).
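The idea of finding a multiple correlation from a correlation matrix alone can be sketched in base R. This is only an illustration of the standard formula (standardized weights are Rxx^-1 rxy), not the code used by mat.regress; the function name regress_from_cor and the example matrix are hypothetical.

```r
# Base R sketch (hypothetical helper, not psych's mat.regress):
# standardized regression weights and R^2 from a correlation matrix.
R <- matrix(c(1.0, 0.3, 0.5,
              0.3, 1.0, 0.4,
              0.5, 0.4, 1.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("x1", "x2", "y"), c("x1", "x2", "y")))

regress_from_cor <- function(R, x, y) {
  Rxx <- R[x, x, drop = FALSE]   # predictor intercorrelations
  rxy <- R[x, y, drop = FALSE]   # predictor-criterion correlations
  beta <- solve(Rxx, rxy)        # standardized weights: Rxx^-1 rxy
  R2 <- sum(beta * rxy)          # squared multiple correlation
  list(beta = beta, R2 = R2)
}

fit <- regress_from_cor(R, x = c("x1", "x2"), y = "y")
fit$beta
fit$R2
```

No raw data are needed here, which is why this kind of calculation suits synthetically formed matrices.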
The most recent development version of the package is always available for download as a source file from the repository at
read.clipboard), simple descriptive statistics (describe), and SPLOM plots combined with correlations (pairs.panels, adapted from the help files of pairs). The VSS routines allow for testing the number of factors (VSS), showing plots of goodness of fit (VSS.plot), and estimating the number of factors/components to extract by examining the scree plot (VSS.scree) or comparing it with the scree of an equivalent matrix of random numbers (VSS.parallel).
In addition, there are routines for hierarchical factor analysis using Schmid Leiman transformations (omega, omega.graph) as well as Item Cluster Analysis (ICLUST, ICLUST.graph).
The more important functions in the package are for the analysis of multivariate data, with an emphasis upon those functions useful in scale construction of item composites.
When given a set of items from a personality inventory, one goal is to combine these into higher level item composites. This leads to several questions:
1) What are the basic properties of the data? describe reports basic summary statistics (mean, sd, median, mad, range, minimum, maximum, skew, kurtosis, standard error) for vectors, columns of matrices, or data.frames. describe.by provides descriptive statistics organized by a grouping variable. pairs.panels shows scatter plot matrices (SPLOMs) as well as histograms and the Pearson correlation for scales or items. error.bars plots variable means with associated confidence intervals.
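As a rough illustration of the statistics listed above, the following base R sketch computes them for a single vector. The function name describe_sketch is hypothetical, and the moment-based skew and kurtosis formulas are one common choice among several; the package's describe may use different definitions.

```r
# Base R sketch of per-variable summary statistics (not psych's describe).
describe_sketch <- function(x) {
  x <- x[!is.na(x)]              # drop missing values
  n <- length(x)
  m <- mean(x)
  s <- sd(x)
  dev <- x - m
  c(n = n, mean = m, sd = s,
    median = median(x), mad = mad(x),
    min = min(x), max = max(x), range = max(x) - min(x),
    skew = sum(dev^3) / (n * s^3),          # moment-based skew (assumed form)
    kurtosis = sum(dev^4) / (n * s^4) - 3,  # excess kurtosis (assumed form)
    se = s / sqrt(n))                       # standard error of the mean
}

d <- describe_sketch(c(1, 2, 3, 4, 10, NA))
round(d, 2)
```

Applying the function over the columns of a data.frame with sapply gives a table with one column per variable.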
2) What is the most appropriate number of item composites to form? After finding either standard Pearson correlations, or tetrachoric or polychoric correlations using a wrapper (poly.mat) for John Fox's hetcor function, the dimensionality of the correlation matrix may be examined. The number of factors/components problem is a standard question of factor analysis, cluster analysis, or principal components analysis. Unfortunately, there is no agreed upon answer. The Very Simple Structure (VSS) set of procedures has been proposed as one answer to the question of the optimal number of factors. Other procedures (VSS.scree, VSS.parallel, and fa.parallel) also address this question.
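The logic of comparing an observed scree with that of random data can be sketched in base R as follows. This is a simplified illustration of parallel analysis on principal components, not the code behind VSS.parallel or fa.parallel; the simulated items and all variable names are hypothetical.

```r
# Base R sketch of parallel analysis (not psych's implementation).
set.seed(42)
n <- 200   # subjects
p <- 6     # items

# Hypothetical data: two uncorrelated factors, three items each
f1 <- rnorm(n)
f2 <- rnorm(n)
items <- cbind(sapply(1:3, function(i) 0.7 * f1 + rnorm(n, sd = 0.7)),
               sapply(1:3, function(i) 0.7 * f2 + rnorm(n, sd = 0.7)))

# Eigenvalues of the observed correlation matrix
observed <- eigen(cor(items), symmetric = TRUE, only.values = TRUE)$values

# Mean eigenvalues from random normal data of the same size
n_sims <- 50
random <- rowMeans(replicate(n_sims, {
  eigen(cor(matrix(rnorm(n * p), n, p)),
        symmetric = TRUE, only.values = TRUE)$values
}))

# Count the leading eigenvalues that exceed their random counterparts
n_components <- sum(cumprod(observed > random))
n_components
```

Plotting observed and random against their index gives the familiar pair of scree lines; the suggested number of components is where the observed line drops below the random one.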
3) What are the best composites to form? Although this may be answered using principal components (principal) or factor analysis (factor.pa), with the results shown graphically (fa.graph), it is sometimes more useful to address this question using cluster analytic techniques. (Better yet is to use maximum likelihood factor analysis using factanal from the stats package.) Previous versions of ICLUST (e.g., Revelle, 1979) have been shown to be particularly successful at doing this. Graphical output from ICLUST.graph uses the Graphviz dot language and allows one to write files suitable for Graphviz.
Graphical organizations of cluster and factor analysis output can be done using cluster.plot, which plots items by cluster/factor loadings and assigns items to that dimension with the highest loading.
4) How well does a particular item composite reflect a single construct? This is a question of reliability and general factor saturation. Several solutions to this problem have been proposed: (Cronbach's) alpha (score.items), (Revelle's) beta (ICLUST), and (McDonald's) omega (omega). Functions to estimate all three of these are included in psych.
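For the first of these coefficients, the textbook formula alpha = k/(k-1) * (1 - sum of item variances / total variance) can be written directly in base R. The function name cronbach_alpha and the example matrix are hypothetical; this is not the code used by score.items, and beta and omega require more machinery than is shown here.

```r
# Base R sketch of Cronbach's alpha from a covariance or correlation matrix.
cronbach_alpha <- function(C) {
  k <- ncol(C)                                 # number of items
  (k / (k - 1)) * (1 - sum(diag(C)) / sum(C))  # standard alpha formula
}

# Hypothetical 4-item scale with all inter-item correlations equal to 0.5
C <- matrix(0.5, nrow = 4, ncol = 4)
diag(C) <- 1
alpha_example <- cronbach_alpha(C)
alpha_example   # 0.8
```

With a correlation matrix as input this is the standardized alpha, which for equal correlations matches the Spearman-Brown value 4(0.5) / (1 + 3(0.5)) = 0.8.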
5) For some applications, data matrices are synthetically combined from sampling different items for different people. So-called Synthetic Aperture Personality Assessment (SAPA) techniques allow the formation of large correlation or covariance matrices even though no one person has taken all of the items. To analyze such data sets, it is easy to form item composites based upon the covariance matrix of the items rather than the original data set. These matrices may then be analyzed using a number of functions (e.g., cluster.cor, factor.pa, ICLUST, principal, mat.regress, and factor2cluster).
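Forming composites from the item correlation matrix rather than from raw scores amounts to a pair of matrix products. The base R sketch below shows the idea with a hypothetical 4-item matrix and keys; it is not the cluster.cor source, which also reports further statistics.

```r
# Base R sketch: correlations between composites from item correlations.
item_names <- c("i1", "i2", "i3", "i4")
R <- matrix(c(1.0, 0.6, 0.3, 0.3,
              0.6, 1.0, 0.3, 0.3,
              0.3, 0.3, 1.0, 0.6,
              0.3, 0.3, 0.6, 1.0),
            nrow = 4, byrow = TRUE,
            dimnames = list(item_names, item_names))

# Hypothetical keys: rows are items, columns are scales (1 = item is in scale)
keys <- matrix(c(1, 1, 0, 0,
                 0, 0, 1, 1),
               ncol = 2,
               dimnames = list(item_names, c("A", "B")))

scale_cov <- t(keys) %*% R %*% keys   # sums of within/between item correlations
scale_cor <- cov2cor(scale_cov)       # rescale to correlations
scale_cor["A", "B"]   # 1.2 / 3.2 = 0.375
```

The diagonal of scale_cov holds the composite variances (sums of within-scale correlations) and the off-diagonal holds the between-scale sums, so no person needs to have answered every item.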
6) More typically, one has a raw data set to analyze. score.items will score data sets on multiple scales, reporting the scale scores, item-scale and scale-scale correlations, as well as coefficient alpha and alpha-1. Using a "keys" matrix, scales can have overlapping or independent items. score.multiple.choice scores multiple choice items or converts multiple choice items to dichotomous (0/1) format for other functions.
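How a keys matrix turns raw responses into scale scores can be sketched in base R. The data, item names, and keys below are hypothetical, and the sketch simply sums keyed items; score.items itself handles reverse keying and missing data more carefully and reports additional statistics.

```r
# Base R sketch of scoring with a keys matrix (not psych's score.items).
items <- matrix(c(5, 1, 4,
                  4, 2, 5,
                  2, 4, 1),
                nrow = 3, byrow = TRUE,
                dimnames = list(NULL, c("q1", "q2", "q3")))

# Keys: 1 = positively keyed, -1 = negatively keyed, 0 = not in the scale
keys <- matrix(c(1, -1, 0,
                 0,  0, 1),
               ncol = 2,
               dimnames = list(c("q1", "q2", "q3"), c("A", "B")))

scores <- items %*% keys        # one row per person, one column per scale
item_scale <- cor(items, scores) # item-scale correlations
scores
```

An item that belongs to two scales simply has a nonzero entry in two columns of keys, which is how overlapping scales are expressed.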
An additional set of functions generates simulated data to meet certain structural properties. item.sim creates simple structure data, circ.sim produces circumplex structured data, and item.dichot produces circumplex or simple structured data for dichotomous items. These item structures are useful for understanding the effects of skew and differential item endorsement on factor and cluster analytic solutions.
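The effect of differential item endorsement can be seen in a small base R simulation. This sketch is not item.sim or item.dichot; the loadings, cut points, and variable names are hypothetical choices made for illustration.

```r
# Base R sketch: dichotomizing items attenuates their correlation.
set.seed(1)
n <- 1000
f <- rnorm(n)                                 # common factor
x <- 0.7 * f + rnorm(n, sd = sqrt(1 - 0.49))  # two items loading 0.7
y <- 0.7 * f + rnorm(n, sd = sqrt(1 - 0.49))

r_continuous <- cor(x, y)   # population value is 0.49

# Both items cut at 0 (about 50% endorsement each)
r_even <- cor(as.numeric(x > 0), as.numeric(y > 0))

# Items cut at 1 and -1 (about 16% versus 84% endorsement)
r_skewed <- cor(as.numeric(x > 1), as.numeric(y > -1))

c(continuous = r_continuous, even = r_even, skewed = r_skewed)
```

Both dichotomized correlations fall below the continuous one, and the drop is typically larger when endorsement rates differ, which is the kind of distortion that can affect factor and cluster solutions.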
Three data sets are included: bfi represents 25 personality items thought to represent five factors of personality, iqitems has 14 multiple choice IQ items, and sat.act has data on self-reported test scores by age and gender.
psych A package for personality, psychometric, and psychological research.
Useful data entry and descriptive statistics
describe Basic descriptive statistics useful for psychometrics
describe.by Find summary statistics by groups
read.clipboard shortcut for reading from the clipboard
read.clipboard.csv shortcut for reading comma delimited files from clipboard
pairs.panels SPLOM and correlations for a data matrix
multi.hist Histograms of multiple variables arranged in matrix form
skew Calculate skew for a vector, each column of a matrix, or data.frame
kurtosi Calculate kurtosis for a vector, each column of a matrix or dataframe
error.crosses Two way error bars
geometric.mean Find the geometric mean of a vector or columns of a data.frame
harmonic.mean Find the harmonic mean of a vector or columns of a data.frame
Data reduction through cluster and factor analysis
factor.pa Do a principal Axis factor analysis
fa.graph Show the results of a factor analysis or principal components analysis graphically
principal Do an eigen value decomposition to find the principal components of a matrix
fa.parallel Scree test and Parallel analysis
ICLUST Apply the ICLUST algorithm
ICLUST.graph Graph the output from ICLUST using the dot language
ICLUST.rgraph Graph the output from ICLUST using rgraphviz
poly.mat Find the polychoric correlations for items (uses J. Fox's hetcor)
omega Calculate the omega estimate of factor saturation (requires the GPArotation package)
omega.graph Draw a hierarchical or SL orthogonalized solution
schmid Apply the Schmid Leiman transformation to a correlation matrix
score.items Combine items into multiple scales and find alpha
VSS Apply the Very Simple Structure criterion to determine the appropriate number of factors.
VSS.parallel Do a parallel analysis to determine the number of factors for a random matrix
VSS.plot Plot VSS output
VSS.scree Show the scree plot of the factor/principal components
VSS.simulate Generate simulated data for the factor model
make.hierarchical Generate simulated correlation matrices with hierarchical structure
Procedures particularly useful for Synthetic Aperture Personality Assessment
alpha.scale Find coefficient alpha for a scale (see also score.items)
correct.cor Correct a correlation matrix for unreliability
count.pairwise Count the number of complete cases when doing pair wise correlations
cluster.cor find correlations of composite variables from larger matrix
cluster.loadings find correlations of items with composite variables from a larger matrix
eigen.loadings Find the loadings when doing an eigen value decomposition
factor.pa Do a principal Axis factor analysis
factor2cluster extract cluster definitions from factor loadings
factor.congruence Factor congruence coefficient
factor.fit How well does a factor model fit a correlation matrix
factor.model Reproduce a correlation matrix based upon the factor model
factor.residuals Residuals = data - model
factor.rotate "Hand rotate" factors
mat.regress multiple regression from matrix input
principal Do an eigen value decomposition to find the principal components of a matrix
Functions for generating simulated data sets
circ.sim Generate a two dimensional circumplex item structure
item.sim Generate a two dimensional simple structure with particular item characteristics
congeneric.sim Generate a one factor congeneric reliability structure
psycho.demo Create artificial data matrices for teaching purposes
Miscellaneous functions
fisherz Apply the Fisher r to z transform
paired.r Test for the difference of two paired correlations
phi2poly Given a phi coefficient, what is the polychoric correlation
poly.mat Use John Fox's hetcor to create a matrix of correlations from a data.frame or matrix of integer values
polychor.matrix Use John Fox's polycor to create a matrix of correlations (not yet very useful)
Functions that are under development and not recommended for casual use
irt.item.diff.rasch IRT estimate of item difficulty with assumption that theta = 0
irt.person.rasch Item Response Theory estimates of theta (ability) using a Rasch like model
test.psych Run a test of the major functions on 5 different data sets. Primarily for development purposes, although the output can be used as a demo of the various functions.
#See the separate man pages
test.psych()