Calculates distances between symbolic objects described by interval-valued variables, multinominal variables, and multinominal variables with weights.
dist_SDA(table.Symbolic,type="U_2",subType=NULL,gamma=0.5,power=2,probType="J",
probAggregation="P_1",s=0.5,p=2,variableSelection=NULL,weights=NULL)
distance matrix of symbolic objects
table.Symbolic: symbolic data table
type: distance measure for boolean symbolic objects: H, U_2, U_3, U_4, C_1, SO_1, SO_2, SO_3, SO_4, SO_5; for mixed symbolic objects: L_1, L_2
subType: comparison function for C_1 and SO_1: D_1, D_2, D_3, D_4, D_5
gamma: gamma parameter for U_2 and U_3; gamma in [0, 0.5]
power: power parameter for U_2 and U_3; power in {1, 2, 3, ...}
probType: distance measure for probabilistic symbolic objects: J, CHI, REN, CHER, LP
probAggregation: aggregation function for J, CHI, REN, CHER, LP: P_1, P_2
s: parameter for Renyi (REN) and Chernoff (CHER) distance; s in [0, 1)
p: parameter for Minkowski (LP) metric; p=1 - manhattan distance, p=2 - euclidean distance
variableSelection: numbers of variables used for calculation, or NULL for all variables
weights: weights of variables for Minkowski (LP) metric
Andrzej Dudek andrzej.dudek@ue.wroc.pl, Justyna Wilk justyna.wilk@ue.wroc.pl Department of Econometrics and Computer Science, Wroclaw University of Economics, Poland http://keii.ue.wroc.pl/symbolicDA/
Distance measures for boolean symbolic objects:
H - Hausdorff's distance for objects described by interval-valued variables;
U_2, U_3, U_4 - Ichino-Yaguchi's distance measures for objects described by interval-valued and/or multinominal variables;
C_1, SO_1, SO_2, SO_3, SO_4, SO_5 - de Carvalho's distance measures for objects described by interval-valued and/or multinominal variables.
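For interval-valued variables, the Hausdorff and Ichino-Yaguchi ideas can be sketched in a few lines of base R. This is a minimal illustration, not the code used by dist_SDA: the helper names (hausdorff_interval, iy_phi, iy_distance) are hypothetical, intervals are assumed to be stored as c(lower, upper), and the aggregation follows the general unnormalised Ichino-Yaguchi form, which may differ in detail from the package's U_2, U_3 and U_4.

```r
# Illustrative only: hypothetical helpers, not the symbolicDA implementation.
# An interval is a numeric vector c(lower, upper).

# Hausdorff distance between two closed intervals.
hausdorff_interval <- function(a, b) {
  max(abs(a[1] - b[1]), abs(a[2] - b[2]))
}

# Ichino-Yaguchi componentwise dissimilarity for two intervals:
# phi(A, B) = |A join B| - |A meet B| + gamma * (2 * |A meet B| - |A| - |B|)
iy_phi <- function(a, b, gamma = 0.5) {
  join_len <- max(a[2], b[2]) - min(a[1], b[1])          # length of the hull
  meet_len <- max(0, min(a[2], b[2]) - max(a[1], b[1]))  # length of the overlap
  join_len - meet_len +
    gamma * (2 * meet_len - (a[2] - a[1]) - (b[2] - b[1]))
}

# Aggregate the componentwise values over all variables (Minkowski-type sum).
iy_distance <- function(x, y, gamma = 0.5, power = 2) {
  phi <- mapply(iy_phi, x, y, MoreArgs = list(gamma = gamma))
  sum(phi^power)^(1 / power)
}

# Two objects, each described by two interval-valued variables.
x <- list(c(1, 3), c(10, 20))
y <- list(c(2, 5), c(15, 30))

hausdorff_interval(x[[1]], y[[1]])  # 2
iy_phi(x[[1]], y[[1]])              # 1.5
iy_distance(x, y, gamma = 0.5, power = 2)
```

With gamma = 0.5 the componentwise value reduces to the hull length minus the mean of the two interval lengths, so identical intervals get a dissimilarity of 0.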
Distance measurement for probabilistic symbolic objects consists of two steps:
1. Calculation of the distance between objects for each variable using componentwise distance measures: J (Kullback-Leibler divergence), CHI (Chi-2 divergence), REN (Renyi's divergence), CHER (Chernoff's distance), LP (modified Minkowski metrics).
2. Calculation of the aggregate distance between objects from the componentwise distances using an objectwise distance measure: P_1 (manhattan distance), P_2 (euclidean distance).
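The two-step scheme can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's algorithm: the helper names (sym_kl, aggregate_components) are hypothetical, each object is assumed to be a list of strictly positive probability vectors (one per variable), and a symmetrised Kullback-Leibler divergence stands in for the componentwise measure; whether it matches the package's J exactly is not confirmed here.

```r
# Illustrative only: hypothetical helpers, not the symbolicDA implementation.
# Each object is a list of probability vectors, one per variable.
# All probabilities are assumed to be strictly positive.

# Step 1: componentwise dissimilarity (symmetrised Kullback-Leibler divergence).
sym_kl <- function(p, q) {
  sum(p * log(p / q)) + sum(q * log(q / p))
}

# Step 2: objectwise aggregation of the componentwise values.
aggregate_components <- function(d, how = c("manhattan", "euclidean")) {
  how <- match.arg(how)
  if (how == "manhattan") sum(d) else sqrt(sum(d^2))
}

a <- list(c(0.5, 0.5), c(0.2, 0.3, 0.5))
b <- list(c(0.9, 0.1), c(0.2, 0.3, 0.5))

d <- mapply(sym_kl, a, b)            # one value per variable
aggregate_components(d, "manhattan")
aggregate_components(d, "euclidean")
```

The second variable has identical distributions in both objects, so its componentwise value is 0 and only the first variable contributes to the aggregate.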
Distance measures for mixed symbolic objects - modified Minkowski metrics: L_1 (manhattan distance), L_2 (euclidean distance).
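The effect of the Minkowski order (p = 1 versus p = 2) and of variable weights can be seen on a small vector of per-variable differences. The function below is a hypothetical helper written for this illustration; in particular, the way weights enter the sum is an assumption and may not match how dist_SDA applies its weights argument.

```r
# Illustrative only: hypothetical helper, not the symbolicDA implementation.
# d holds per-variable differences between two objects.
minkowski_aggregate <- function(d, p = 2, weights = rep(1, length(d))) {
  sum(weights * abs(d)^p)^(1 / p)
}

d <- c(3, 4)
minkowski_aggregate(d, p = 1)                     # 7 (manhattan)
minkowski_aggregate(d, p = 2)                     # 5 (euclidean)
minkowski_aggregate(d, p = 2, weights = c(1, 0))  # 3 (second variable ignored)
```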
See file ../doc/dist_SDA.pdf for further details
NOTE: In previous versions of the package this function was called dist.SDA.
Billard L., Diday E. (eds.) (2006), Symbolic Data Analysis, Conceptual Statistics and Data Mining, John Wiley & Sons, Chichester.
Bock H.H., Diday E. (eds.) (2000), Analysis of Symbolic Data. Explanatory methods for extracting statistical information from complex data, Springer-Verlag, Berlin.
Diday E., Noirhomme-Fraiture M. (eds.) (2008), Symbolic Data Analysis with SODAS Software, John Wiley & Sons, Chichester.
Ichino, M., & Yaguchi, H. (1994), Generalized Minkowski metrics for mixed feature-type data analysis. IEEE Transactions on Systems, Man, and Cybernetics, 24(4), 698-708. doi:10.1109/21.286391.
Malerba D., Esposito F., Gioviale V., Tamma V. (2001), Comparing Dissimilarity Measures for Symbolic Data Analysis, "New Techniques and Technologies for Statistics" (ETK NTTS'01), pp. 473-481.
Malerba, D., Esposito, F., Monopoli, M. (2002), Comparing dissimilarity measures for probabilistic symbolic objects, In: A. Zanasi, C.A. Brebbia, N.F.F. Ebecken, P. Melli (Eds.), Data Mining III, "Series Management Information Systems", Vol. 6, WIT Press, Southampton, pp. 31-40.
DClust, index.G1d; dist.Symbolic in the clusterSim library
# LONG RUNNING - UNCOMMENT TO RUN
#data("cars",package="symbolicDA")
#dist<-dist_SDA(cars, type="U_3", gamma=0.3, power=2)
#print(dist)