When computing cor(x, use = "pairwise"), it helps to know the number of cases behind each pairwise correlation. This is particularly useful when doing SAPA type analyses. More importantly, when some pairs are missing, imputed values can be supplied so that further analyses may be done. This matters for the Massively Missing Completely at Random (MMCAR) designs used by the SAPA project. The specific pairs that are missing may be identified by pairwiseZero. Summaries of the counts are given by pairwiseDescribe.
pairwiseCount(x, y = NULL,diagonal=TRUE)
pairwiseDescribe(x,y,diagonal=FALSE,...)
pairwiseZero(x,y=NULL, min=0, short=TRUE)
pairwiseImpute(keys,R,fix=FALSE)
pairwiseReport(x,y=NULL,cut=0,diagonal=FALSE,...)
pairwiseSample(x,y=NULL,diagonal=FALSE,size=100,...)
pairwisePlot(x,y=NULL,upper=TRUE,diagonal=TRUE,labels=TRUE,show.legend=TRUE,n.legend=10,
colors=FALSE,gr=NULL,minlength=6,xlas=1,ylas=2,
main="Relative Frequencies",count=TRUE,...)
count.pairwise(x, y = NULL,diagonal=TRUE) #deprecated
x: An input matrix, typically a data matrix ready to be correlated.
y: An optional second input matrix.
diagonal: If TRUE, report the diagonal; otherwise fill the diagonal with NA.
...: Other parameters to pass to describe.
min: Count the number of item pairs with <= min entries.
short: Show the table of the item pairs that have entries <= min.
keys: A keys.list specifying which items belong to which scale.
R: A correlation matrix to be described or imputed.
cut: Report the item pairs and numbers with cell sizes less than cut.
fix: If TRUE, replace all NA correlations with the mean correlation for that within or between scale.
upper: Should the upper off-diagonal matrix be drawn, or left blank?
labels: If NULL, use column and row names; otherwise use labels.
show.legend: A legend (key) to the colors is shown on the right hand side.
n.legend: How many categories should be labelled in the legend?
colors: Defaults to FALSE, which uses a grey scale. colors=TRUE uses colors from colorRampPalette, running from red through white to blue.
minlength: If not NULL, the maximum number of characters to use in row/column labels.
xlas: Orientation of the x axis labels (1 = horizontal, 0 = parallel to axis, 2 = perpendicular to axis).
ylas: Orientation of the y axis labels (1 = horizontal, 0 = parallel to axis, 2 = perpendicular to axis).
main: A title. Defaults to "Relative Frequencies".
gr: A color gradient: e.g., gr <- colorRampPalette(c("#B52127", "white", "#2171B5")) will produce slightly more pleasing (to some) colors. See the next to last example of corPlot.
count: Should we count the number of pairwise observations using pairwiseCount, or just plot the counts for a matrix?
size: The number of variables to sample in pairwiseSample.
A matrix of counts of pairwise observations (if pairwiseCount)
The average correlation value of the observed correlations within/between scales
The number of observed correlations within/between each scale
The percentage of complete data by scale
The original correlation matrix with NA values replaced by the mean correlation for items within/between the appropriate scale.
When using the Massively Missing Completely at Random (MMCAR) designs of the SAPA project, it is important to count the number of pairwise observations (pairwiseCount). Pairs with 1 or fewer observations produce NA values for correlations, which makes subsequent factor analyses (fa), reliability analyses (omega), or scoreOverlap impossible.
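The counting step can be illustrated in base R. This is a minimal sketch, not the psych source: it assumes that the pairwise count for two variables is simply the number of rows in which both are non-missing, and the simulated data and variable names are made up for illustration.

```r
# Base-R sketch: count pairwise complete observations (illustrative only).
set.seed(42)
x <- matrix(rnorm(60), ncol = 4,
            dimnames = list(NULL, paste0("V", 1:4)))
x[x < -0.5] <- NA                      # introduce missing values

observed <- (!is.na(x)) + 0            # 1 where a value is present, 0 where NA
n_pairs  <- crossprod(observed)        # t(observed) %*% observed

n_pairs                                # diagonal holds the per-variable n

# Cross-check one cell against a direct count of jointly observed rows.
n_pairs[1, 2] == sum(!is.na(x[, 1]) & !is.na(x[, 2]))
```

Any off-diagonal cell of n_pairs that is 0 or 1 marks a pair for which cor(x, use = "pairwise") cannot return a usable correlation.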
To identify item pairs with counts below a certain value, pairwiseReport reports the names of those pairs with fewer than 'cut' observations. By default, it just reports the number of offending items. With short=FALSE, the printed output lists the items with n.obs < cut. Even more detail is available in the returned objects.
The specific pairs that have counts <= min in any particular table of the pairwise counts may be given by pairwiseZero.
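As a rough base-R illustration of what locating sparse pairs involves (a sketch only; the threshold name cut_n, the toy data, and the output layout are assumptions and do not mirror the objects returned by pairwiseReport or pairwiseZero):

```r
# Base-R sketch: list variable pairs whose pairwise count falls below a threshold.
set.seed(1)
x <- matrix(rnorm(40), ncol = 4,
            dimnames = list(NULL, c("a", "b", "c", "d")))
x[1:5, "a"]  <- NA
x[6:10, "b"] <- NA                     # a and b are never observed together

counts <- crossprod((!is.na(x)) + 0)   # pairwise counts
cut_n  <- 2                            # hypothetical threshold

# Look only at the upper triangle so each pair is reported once.
low <- which(counts < cut_n & upper.tri(counts), arr.ind = TRUE)
low_pairs <- data.frame(item1 = rownames(counts)[low[, "row"]],
                        item2 = colnames(counts)[low[, "col"]],
                        n     = counts[low])
low_pairs
```

Here the only flagged pair is a with b, since every row is missing one of the two.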
To remedy the problem of missing correlations, we impute the missing correlations using pairwiseImpute.
The technique takes advantage of the scale-based structure of SAPA items. Items within a scale (e.g., Letter Number Series, similar to the ability items) are imputed to correlate with items from another scale (e.g., Matrix Reasoning) at the mean of the observed inter-item correlations between those two scales. The average correlations within and between scales are reported by pairwiseImpute, and if the fix parameter is set to TRUE, the imputed correlation matrix is returned.
Alternative methods of imputing these correlations are not yet implemented.
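The idea of scale-based imputation can be sketched in base R. This is an illustration under stated assumptions, not the pairwiseImpute source: the helper impute_by_scale, the item names, and the toy correlations are all hypothetical, and each item is assumed to belong to exactly one scale.

```r
# Base-R sketch: replace NA correlations with the mean observed correlation
# of the corresponding within- or between-scale block (illustrative only).
items <- c("ln1", "ln2", "mx1", "mx2")
R <- matrix(c(1.0, 0.5, 0.3,  NA,
              0.5, 1.0, 0.2, 0.4,
              0.3, 0.2, 1.0, 0.6,
               NA, 0.4, 0.6, 1.0),
            nrow = 4, byrow = TRUE, dimnames = list(items, items))
keys <- list(letters = c("ln1", "ln2"), matrix = c("mx1", "mx2"))

impute_by_scale <- function(R, keys) {
  imputed <- R
  for (a in names(keys)) {
    for (b in names(keys)) {
      rows  <- keys[[a]]
      cols  <- keys[[b]]
      block <- R[rows, cols, drop = FALSE]
      observed <- block
      # Within a scale, leave the unit diagonal out of the average.
      if (a == b) diag(observed) <- NA
      block_mean <- mean(observed, na.rm = TRUE)  # NaN if the block is all NA
      block[is.na(block)] <- block_mean
      imputed[rows, cols] <- block
    }
  }
  imputed
}

R_imputed <- impute_by_scale(R, keys)
R_imputed["ln1", "mx2"]   # mean of the observed letters-by-matrix correlations
```

A block with no observed correlations at all has no mean to borrow, so this sketch would leave NaN there; how such a case is handled in practice is outside what the sketch covers.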
The time to count cell sizes grows linearly with the number of subjects and with the square of the number of items. This becomes prohibitive for data sets with a large number of items. pairwiseSample will take samples of size=size from these bigger data sets and then return basic descriptive statistics of these counts, including the mean, the median, and the .05, .25, .5, .75 and .95 quantiles.
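The sampling strategy can be approximated in base R as follows. This is a sketch under assumptions (one random draw of columns, simulated data, made-up object names) and does not reproduce the internals or the output format of pairwiseSample.

```r
# Base-R sketch: estimate the distribution of pairwise counts from a
# random subset of variables instead of the full item set.
set.seed(123)
n_subj  <- 200
n_items <- 50
x <- matrix(rnorm(n_subj * n_items), ncol = n_items)
x[sample(length(x), 0.6 * length(x))] <- NA    # 60% of cells set to missing

size <- 10                                     # number of variables to sample
cols <- sample(ncol(x), size)
counts   <- crossprod((!is.na(x[, cols])) + 0) # counts for the sampled items only
off_diag <- counts[lower.tri(counts)]          # one value per variable pair

c(mean = mean(off_diag),
  quantile(off_diag, probs = c(.05, .25, .5, .75, .95)))
```

Because the count matrix is only size by size, the cost depends on the sample size rather than on the full number of items.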
# NOT RUN {
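library(psych)  # provides the pairwise* functions and cs(); the last example also needs psychTools installed
x <- matrix(rnorm(900),ncol=6)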
y <- matrix(rnorm(450),ncol=3)
x[x < 0] <- NA
y[y > 1] <- NA
pairwiseCount(x)
pairwiseCount(y)
pairwiseCount(x,y)
pairwiseCount(x,diagonal=FALSE)
pairwiseDescribe(x,quant=c(.1,.25,.5,.75,.9))
#examine the structure of the ability data set
keys <- list(ICAR16=colnames(psychTools::ability),reasoning =
cs(reason.4,reason.16,reason.17,reason.19),
letters=cs(letter.7, letter.33,letter.34,letter.58),
matrix=cs(matrix.45,matrix.46,matrix.47,matrix.55),
rotate=cs(rotate.3,rotate.4,rotate.6,rotate.8))
pairwiseImpute(keys,psychTools::ability)
# }