STPGA (version 5.2.1)

CRITERIA: Optimality Criteria

Description

These are some default design criteria to be minimized. The table in the Details section gives the formula for each design criterion and its type. Note that the inputs for these functions come in three syntax flavors, namely Type-X, Type-D and Type-K. Users can define and use their own design criteria as long as they have the Type-X syntax, as shown in the Examples.

Usage

AOPT(Train, Test, P, lambda = 1e-05, C=NULL)
CDMAX(Train, Test, P, lambda = 1e-05, C=NULL)
CDMAX0(Train, Test, P, lambda = 1e-05, C=NULL)
CDMAX2(Train, Test, P, lambda = 1e-05, C=NULL)
CDMEAN(Train, Test, P, lambda = 1e-05, C=NULL)
CDMEAN0(Train, Test, P, lambda = 1e-05, C=NULL)
CDMEAN2(Train, Test, P, lambda = 1e-05, C=NULL)
CDMEANMM(Train, Test, Kinv, K, lambda = 1e-05, C=NULL, Vg=NULL, Ve=NULL)
DOPT(Train, Test, P, lambda = 1e-05, C=NULL)
EOPT(Train, Test, P, lambda = 1e-05, C=NULL)
GAUSSMEANMM(Train, Test, Kinv, K, lambda = 1e-05, C=NULL, Vg=NULL, Ve=NULL)
GOPTPEV(Train, Test, P, lambda = 1e-05, C=NULL)
GOPTPEV2(Train, Test, P, lambda = 1e-05, C=NULL)
PEVMAX(Train, Test, P, lambda = 1e-05, C=NULL)
PEVMAX0(Train, Test, P, lambda = 1e-05, C=NULL)
PEVMAX2(Train, Test, P, lambda = 1e-05, C=NULL)
PEVMEAN(Train, Test, P, lambda = 1e-05, C=NULL)
PEVMEAN0(Train, Test, P, lambda = 1e-05, C=NULL)
PEVMEAN2(Train, Test, P, lambda = 1e-05, C=NULL)
PEVMEANMM(Train, Test, Kinv, K, lambda = 1e-05, C=NULL, Vg=NULL, Ve=NULL)
dist_to_test(Train, Test, Dst, lambda, C)
dist_to_test2(Train, Test, Dst, lambda, C)
neg_dist_in_train(Train, Test, Dst, lambda, C)
neg_dist_in_train2(Train, Test, Dst, lambda, C)

Arguments

Train

vector of identifiers for individuals in the training set

Test

vector of identifiers for individuals in the test set

P

(Only for Type-X) \(n \times k\) matrix of the first \(k\) PCs of the predictor variables. The row names of the matrix must include the identifiers of all candidate and test individuals (their union).

Dst

(Only for Type-D) \(n \times n\) symmetric distance matrix with row and column names.

Kinv

(Only for Type-K) \(n \times n\) symmetric matrix (inverse of the relationship matrix K between n individuals) with row and column names.

K

(Only for Type-K) \(n \times n\) symmetric matrix (the relationship matrix K between n individuals).

lambda

scalar shrinkage parameter (\(\lambda>0\)).

C

Contrast Matrix.

Vg

(Only for Type-K criteria; accepted by CDMEANMM, GAUSSMEANMM and PEVMEANMM) covariance matrix between traits generated by the relationship K (multi-trait version).

Ve

(Only for Type-K criteria; accepted by CDMEANMM, GAUSSMEANMM and PEVMEANMM) residual covariance matrix for the traits (multi-trait version).
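
The three input flavors described above can be prepared with base R. The following is only a sketch under stated assumptions: the iris data, the choice of k = 2 PCs, and the simple similarity matrix with a small ridge standing in for a relationship matrix are illustrative choices, not part of STPGA.

```r
# Identifiers shared by all three input types
X <- as.matrix(iris[, 1:4])
ids <- paste(iris[, 5], rep(1:50, 3), sep = "_")
rownames(X) <- ids

# Type-X: scores on the first k principal components (k = 2 is arbitrary)
k <- 2
P <- prcomp(X, center = TRUE, scale. = TRUE)$x[, 1:k, drop = FALSE]

# Type-D: symmetric distance matrix (Euclidean, the default of dist())
Dst <- as.matrix(dist(X))

# Type-K: a simple similarity matrix used here in place of a relationship
# matrix. The ridge of 1e-3 is an assumption that makes K invertible.
Xs <- scale(X)
K <- tcrossprod(Xs) / ncol(Xs) + 1e-3 * diag(nrow(Xs))
Kinv <- chol2inv(chol(K))        # inverse via the Cholesky factor
dimnames(Kinv) <- dimnames(K)    # chol2inv() drops the names

# All inputs carry the same identifiers as row (and column) names
stopifnot(identical(rownames(P), ids),
          identical(rownames(Dst), ids), identical(colnames(Dst), ids),
          identical(rownames(K), ids), identical(rownames(Kinv), ids))
```

Keeping one vector of identifiers and reusing it for every matrix avoids mismatches between Train, Test and the row names.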

Value

value of the criterion.
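
Because each criterion returns a single number to be minimized, values can be compared directly across training sets or settings. As a small sketch (simulated data; the helper `pev_mean` is hypothetical and follows the PEVMEAN formula from Details with C omitted), this shows how the returned value changes with lambda for a fixed training set:

```r
set.seed(1)
# Simulated Type-X input: 60 individuals, 3 PCs (illustrative only)
P <- matrix(rnorm(60 * 3), nrow = 60, ncol = 3,
            dimnames = list(paste0("id", 1:60), NULL))
Train <- rownames(P)[1:20]
Test <- rownames(P)[41:60]

# Hypothetical helper: mean(diag(PTest (PTrain'PTrain + lambda*I)^{-1} PTest'))
pev_mean <- function(lambda) {
  PTrain <- P[Train, , drop = FALSE]
  PTest <- P[Test, , drop = FALSE]
  A <- solve(crossprod(PTrain) + lambda * diag(ncol(P)))
  mean(diag(PTest %*% A %*% t(PTest)))
}

lambdas <- c(1e-5, 1e-2, 1, 10)
vals <- sapply(lambdas, pev_mean)
data.frame(lambda = lambdas, criterion = vals)
```

Under this formula a larger lambda shrinks the inverse and therefore lowers the value, so criterion values are only comparable when computed with the same lambda.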

Details

Criterion name, formula (or description) and Type:

AOPT

\(trace[C(P'_{Train}P_{Train}+\lambda I)^{-1}C']\) (Type-X)

CDMAX

\(max[diag(CP_{Test}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Test}C')/diag(CP_{Test}P'_{Test}C')]\) (Type-X)

CDMAX0

\(max[diag(CP_{Train}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Train}C')/diag(CP_{Train}P'_{Train}C')]\) (Type-X)

CDMAX2

\(max[diag(CP_{Test}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Train}P_{Train}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Test}C')/diag(CP_{Test}P'_{Test}C')]\) (Type-X)

CDMEAN

\(mean[diag(CP_{Test}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Test}C')/diag(CP_{Test}P'_{Test}C')]\) (Type-X)

CDMEAN0

\(mean[diag(CP_{Train}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Train}C')/diag(CP_{Train}P'_{Train}C')]\) (Type-X)

CDMEAN2

\(mean[diag(CP_{Test}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Train}P_{Train}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Test}C')/diag(CP_{Test}P'_{Test}C')]\) (Type-X)

CDMEANMM

\(-mean[diag(CZ_{Test}(K-\lambda(Z'_{Train}MZ_{Train}+\lambda Kinv)^{-1})Z'_{Test}C')/diag(CZ_{Test}KZ'_{Test}C')]\) (Type-K)

DOPT

\(logdet(C(P'_{Train}P_{Train}+\lambda I)^{-1}C')\) (Type-X)

EOPT

\(max(eigenval(C(P'_{Train}P_{Train}+\lambda I)^{-1}C'))\) (Type-X)

GAUSSMEANMM

\(-mean(diag(Z_{Test}KZ'_{Test}-Z_{Test}KZ'_{Train}(Z_{Train}KZ'_{Train}+\lambda I_{ntrain})^{-1}Z_{Train}KZ'_{Test}))\) (Type-K)

GOPTPEV

\(max(eigenval(CP_{Test}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Test}C'))\) (Type-X)

GOPTPEV2

\(mean(eigenval(CP_{Test}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Test}C'))\) (Type-X)

PEVMAX

\(max(diag(CP_{Test}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Test}C'))\) (Type-X)

PEVMAX0

\(max(diag(CP_{Train}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Train}C'))\) (Type-X)

PEVMAX2

\(max(diag(CP_{Test}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Train}P_{Train}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Test}C'))\) (Type-X)

PEVMEAN

\(mean(diag(CP_{Test}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Test}C'))\) (Type-X)

PEVMEAN0

\(mean(diag(CP_{Train}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Train}C'))\) (Type-X)

PEVMEAN2

\(mean(diag(CP_{Test}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Train}P_{Train}(P'_{Train}P_{Train}+\lambda I)^{-1}P'_{Test}C'))\) (Type-X)

PEVMEANMM

\(mean(diag(CZ_{Test}(Z'_{Train}MZ_{Train}+\lambda Kinv)^{-1}Z'_{Test}C'))\) (Type-K)

dist_to_test

maximum distance from training set to test set based on Dst (Type-D)

dist_to_test2

mean distance from training set to test set based on Dst (Type-D)

neg_dist_in_train

negative of minimum distance between pairs in the training set based on Dst (Type-D)

neg_dist_in_train2

negative of mean distance between distinct pairs in the training set based on Dst (Type-D)
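
To make the Type-X formulas concrete, here is a minimal base R sketch that evaluates a few of them on simulated data, with C omitted (treated as the identity). The data, sizes and variable names are illustrative assumptions. This transcribes the formulas above and is not STPGA's implementation, which may differ in details such as sign or scaling.

```r
set.seed(42)
n <- 30; k <- 3
# Simulated P: n individuals by k PCs, with identifiers as row names
P <- matrix(rnorm(n * k), n, k, dimnames = list(paste0("id", 1:n), NULL))
Train <- rownames(P)[1:10]
Test <- rownames(P)[21:30]
lambda <- 1e-5

PTrain <- P[Train, , drop = FALSE]
PTest <- P[Test, , drop = FALSE]
A <- solve(crossprod(PTrain) + lambda * diag(k))   # (P'P + lambda*I)^{-1}

# A-, D- and E-optimality: trace, log-determinant and largest eigenvalue of A
aopt <- sum(diag(A))
dopt <- as.numeric(determinant(A, logarithm = TRUE)$modulus)
eopt <- max(eigen(A, symmetric = TRUE, only.values = TRUE)$values)

# PEV-type criteria on the test set: diag(PTest A PTest')
pev <- diag(PTest %*% A %*% t(PTest))
pevmean <- mean(pev)
pevmax <- max(pev)

# CD-type ratio as written in the CDMEAN row: PEV divided by diag(PTest PTest')
cdmean <- mean(pev / diag(tcrossprod(PTest)))

round(c(AOPT = aopt, DOPT = dopt, EOPT = eopt,
        PEVMEAN = pevmean, PEVMAX = pevmax, CDMEAN = cdmean), 4)
```

The matrix A is shared by all of these criteria, so computing it once per candidate training set keeps the evaluation cheap.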

Examples

# NOT RUN {
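# Example of a user-defined criterion (Type-X syntax):
# 1- PEVmax
STPGAUSERFUNC <- function(Train, Test, P, lambda = 1e-6, C = NULL) {
  # rows of P for the training and test individuals
  # (drop = FALSE keeps a matrix even when a single row is selected)
  PTrain <- P[rownames(P) %in% Train, , drop = FALSE]
  PTest <- P[rownames(P) %in% Test, , drop = FALSE]
  # apply the contrast matrix, if one is given
  if (!is.null(C)) { PTest <- C %*% PTest }
  # PTest (PTrain'PTrain + lambda*I)^{-1} PTrain'
  PEV <- PTest %*% solve(crossprod(PTrain) + lambda * diag(ncol(PTrain)), t(PTrain))
  PEVmax <- max(diag(tcrossprod(PEV)))
  return(PEVmax)
}
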
###### Here is an example of usage
library(STPGA)
data(iris)
# We will try to estimate petal width from
# the variables sepal length, sepal width and petal length.
X <- as.matrix(iris[, 1:4])
distX <- as.matrix(dist(X))
rownames(distX) <- colnames(distX) <- rownames(X) <- paste(iris[, 5], rep(1:50, 3), sep = "_")
# Test data: 25 iris plants selected at random from the virginica species;
# the candidates are the plants of the setosa and versicolor species.
candidates <- rownames(X)[1:100]
test <- sample(setdiff(rownames(X), candidates), 25)
# We want to select 25 examples using the criterion defined in STPGAUSERFUNC.
# Increase niterations and npop substantially for better convergence.
ListTrain <- GenAlgForSubsetSelection(P = distX, Candidates = candidates,
  Test = test, ntoselect = 25, npop = 50,
  nelite = 5, mutprob = .8, niterations = 30,
  lambda = 1e-5, errorstat = "STPGAUSERFUNC", plotiters = TRUE)
# }
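
A criterion value is easier to interpret next to a baseline. The sketch below, which does not call STPGA, evaluates a user-defined Type-X criterion on randomly drawn training sets. The function `pevmean_user`, the seed and the number of replicates are illustrative assumptions. An optimized training set would be expected to score below most of these random draws.

```r
# Hypothetical user-defined criterion in Type-X syntax (PEVMEAN-like)
pevmean_user <- function(Train, Test, P, lambda = 1e-5, C = NULL) {
  PTrain <- P[rownames(P) %in% Train, , drop = FALSE]
  PTest <- P[rownames(P) %in% Test, , drop = FALSE]
  if (!is.null(C)) PTest <- C %*% PTest
  A <- solve(crossprod(PTrain) + lambda * diag(ncol(PTrain)))
  mean(diag(PTest %*% A %*% t(PTest)))
}

X <- as.matrix(iris[, 1:4])
rownames(X) <- paste(iris[, 5], rep(1:50, 3), sep = "_")
candidates <- rownames(X)[1:100]

set.seed(123)  # arbitrary seed, only for reproducibility of the sketch
test <- sample(setdiff(rownames(X), candidates), 25)

# Criterion values for 200 random training sets of size 25
random_scores <- replicate(
  200, pevmean_user(sample(candidates, 25), test, X)
)
summary(random_scores)
```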
