indscal: Individual Differences Scaling

Description

Given a 3-way array X = array(x,dim=c(J,J,K)) with X[,,k] denoting the k-th subject's dissimilarity matrix rating J objects, the INDSCAL model can be written as c{ Z[i,j,k] = sum B[i,r]*B[j,r]*C[k,r] + E[i,j,k] } where Z is the array of scalar products obtained from X, B = matrix(b,J,R) are the object weights, C = matrix(c,K,R) are the non-negative subject weights, and E = array(e,dim=c(J,J,K)) is the 3-way residual array. The summation is for r = seq(1,R).

Weight matrices are estimated using an alternating least squares algorithm with optional constraints.

Usage

indscal(X,nfac,nstart=10,const=NULL,maxit=500,
        type=c("dissimilarity","similarity"),
        ctol=10^-4,parallel=FALSE,cl=NULL,
        output=c("best","all"))

Arguments

Three-way data array with dim=c(J,J,K) where X[,,k] is dissimilarity matrix. Can also input a list of (dis)similarity matrices or objects output by dist.

nfac

Number of factors.

nstart

Number of random starts.

const

Constraints for Modes B and C. See Note.

maxit

Maximum number of iterations.

type

Character indicating if X contains dissimilarity data (default) or similarity data.

ctol

Convergence tolerance.

parallel

Logical indicating if parLapply should be used. See Examples.

Cluster created by makeCluster. Only used when parallel=TRUE.

output

Output the best solution (default) or output all nstart solutions.

Value

If output="best", returns an object of class "indscal" with the following elements:
BMode B weight matrix.
CMode C weight matrix.
RsqR-squared value.
iterNumber of iterations.
cflagConvergence flag.
constSame as input.
strainMDS loss function.
Otherwise returns a list of length nstart where each element is an object of class "indscal".

Warnings

The ALS algorithm can perform poorly if the number of factors nfac is set too large.

Default is unconstrained ALS update, which may produce negative (invalid) Mode C weights. Use const=c(0,2) to force non-negativity via fnnls.

References

Bro, R., & De Jong, S. (1997). A fast non-negativity-constrained least squares algorithm. Journal of Chemometrics, 11, 393-401.

Carroll, J. D., & Chang, J-J. (1970). Analysis of individual differences in multidimensional scaling via an n-way generalization of "Eckart-Young" decmoposition. Psychometrika, 35, 283-319.

Harshman, R. A. (1970). Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multimodal factor analysis. UCLA Working Papers in Phonetics, 16, 1-84.

Harshman, R. A., & Lundy, M. E. (1994). PARAFAC: Parallel factor analysis. Computational Statistics and Data Analysis, 18, 39-72.

Examples

Run this code

##########   array example   ##########

# create random data array with INDSCAL structure
set.seed(3)
mydim <- c(50,5,10)
nf <- 2
X <- array(0,dim=c(rep(mydim[2],2),mydim[3]))
for(k in 1:mydim[3]) {
  X[,,k] <- as.matrix(dist(t(matrix(rnorm(prod(mydim[1:2])),mydim[1],mydim[2]))))
}

# fit INDSCAL model (unconstrained)
imod <- indscal(X,nfac=nf,nstart=1)
imod$Rsq

# check solution
Xhat <- fitted(imod)
sum((array(apply(X,3,ed2sp),dim=dim(X))-Xhat)^2)
imod$strain

# reorder and resign factors
imod$B[1:4,]
imod <- reorder(imod, 2:1)
imod$B[1:4,]
imod <- resign(imod, newsign=c(1,-1))
imod$B[1:4,]
sum((array(apply(X,3,ed2sp),dim=dim(X))-Xhat)^2)
imod$strain

# rescale factors
colSums(imod$B^2)
colSums(imod$C^2)
imod <- rescale(imod, mode="C")
colSums(imod$B^2)
colSums(imod$C^2)
sum((array(apply(X,3,ed2sp),dim=dim(X))-Xhat)^2)
imod$strain


##########   list example   ##########

# create random data array with INDSCAL structure
set.seed(4)
mydim <- c(100,8,20)
nf <- 3
X <- vector("list",mydim[3])
for(k in 1:mydim[3]) {
  X[[k]] <- dist(t(matrix(rnorm(prod(mydim[1:2])),mydim[1],mydim[2])))
}

# fit INDSCAL model (orthogonal B, non-negative C)
imod <- indscal(X,nfac=nf,nstart=1,const=c(1,2))
imod$Rsq

# check solution
Xhat <- fitted(imod)
sum((array(unlist(lapply(X,ed2sp)),dim=mydim[c(2,2,3)])-Xhat)^2)
imod$strain
crossprod(imod$B)


##########   parallel computation   ##########

# create random data array with INDSCAL structure
set.seed(3)
mydim <- c(50,5,10)
nf <- 2
X <- array(0,dim=c(rep(mydim[2],2),mydim[3]))
for(k in 1:mydim[3]) {
  X[,,k] <- as.matrix(dist(t(matrix(rnorm(prod(mydim[1:2])),mydim[1],mydim[2]))))
}

# fit INDSCAL model (10 random starts -- sequential computation)
set.seed(1)
system.time({imod <- indscal(X,nfac=nf)})
imod$Rsq

# fit INDSCAL model (10 random starts -- parallel computation)
set.seed(1)
cl <- makeCluster(detectCores())
ce <- clusterEvalQ(cl,library(multiway))
system.time({imod <- indscal(X,nfac=nf,parallel=TRUE,cl=cl)})
imod$Rsq
stopCluster(cl)

Run the code above in your browser using DataLab