Carmone, Kara and Maxwell's Heuristic Identification of Noisy Variables (HINoV) method for symbolic data
HINoV.SDA(table.Symbolic, u=NULL, distance="H", Index="cRAND",method="pam",...)
m x m symmetric matrix (m - number of variables). Matrix contains pairwise adjusted Rand (or Rand) indices for partitions formed by the j-th variable with partitions formed by the l-th variable
sum of rows of parim
ranked values of topri
in decreasing order
symbolic data table
number of clusters
symbolic distance measure as parameter type in dist_SDA
clustering method: "single", "ward", "complete", "average", "mcquitty", "median", "centroid", "pam" (default), "SClust", "DClust"
"cRAND" - adjusted Rand index (default); "RAND" - Rand index
additional argument passed to dist_SDA
function
Andrzej Dudek andrzej.dudek@ue.wroc.pl, Justyna Wilk justyna.wilk@ue.wroc.pl Department of Econometrics and Computer Science, Wroclaw University of Economics, Poland http://keii.ue.wroc.pl/symbolicDA/
For HINoV in symbolic data analysis there can be used methods based on distance matrix such as hierarchical ("single", "ward", "complete", "average", "mcquitty", "median", "centroid") and optimization methods ("pam", "DClust") and also methods based on symbolic data table ("SClust").
See file ../doc/HINoVSDA_details.pdf for further details
Bock, H.H., Diday, E. (eds.) (2000), Analysis of Symbolic Data. Explanatory Methods for Extracting Statistical Information from Complex Data, Springer-Verlag, Berlin.
Diday, E., Noirhomme-Fraiture, M. (eds.) (2008), Symbolic Data Analysis with SODAS Software, John Wiley & Sons, Chichester.
Carmone, F.J., Kara, A., Maxwell, S. (1999), HINoV: a new method to improve market segment definition by identifying noisy variables, "Journal of Marketing Research", November, vol. 36, 501-509.
Hubert, L.J., Arabie, P. (1985), Comparing partitions, "Journal of Classification", no. 1, 193-218. Available at: tools:::Rd_expr_doi("10.1007/BF01908075").
Rand, W.M. (1971), Objective criteria for the evaluation of clustering methods, "Journal of the American Statistical Association", no. 336, 846-850. Available at: tools:::Rd_expr_doi("10.1080/01621459.1971.10482356").
Walesiak, M., Dudek, A. (2008), Identification of noisy variables for nonmetric and symbolic data in cluster analysis, In: C. Preisach, H. Burkhardt, L. Schmidt-Thieme, R. Decker (Eds.), Data analysis, machine learning and applications, Springer-Verlag, Berlin, Heidelberg, 85-92. Available at: tools:::Rd_expr_doi("1007/978-3-540-78246-9_11")
DClust
, SClust
, dist_SDA
; HINoV.Symbolic
, dist.Symbolic
in clusterSim
library; hclust
in stats
library; pam
in cluster
library
# LONG RUNNING - UNCOMMENT TO RUN
#data("cars",package="symbolicDA")
#r<- HINoV.SDA(cars, u=3, distance="U_2")
#print(r$stopri)
#plot(r$stopri[,2], xlab="Variable number", ylab="topri",
#xaxt="n", type="b")
#axis(1,at=c(1:max(r$stopri[,1])),labels=r$stopri[,1])
Run the code above in your browser using DataLab