IDmining (version 1.0.7)

MINDID_FMC: Functional Measure of Clustering Using the Morisita Estimator of ID

Description

Computes the functional m-Morisita index for a given set of threshold values.

Usage

MINDID_FMC(XY, scaleQ, m=2, thd)

Arguments

XY

An \(N \times E\) matrix, data.frame or data.table, where \(N\) is the number of data points and \(E\) is the number of variables (i.e. the input variables plus the variable measured at each measurement station). The last column must contain the measured variable. Each input variable is rescaled to the \([0,1]\) interval by the function. Typically, the input variables are the X and Y coordinates of the measurement stations, but other or additional variables can be used as well.

scaleQ

A vector containing the values of \(\ell^{-1}\) chosen by the user (see Details).

m

The value of the parameter m (by default: m=2).

thd

Either a single value or a vector. It contains the value(s) of the threshold(s).
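
The rescaling of the input variables mentioned for XY can be pictured with a short base R sketch. It assumes a plain min-max rescaling, which is one common way to map a variable onto \([0,1]\); the helper `rescale01` and the toy data are hypothetical and are not part of IDmining. MINDID_FMC performs its own rescaling, so this step is not needed before calling it.

```r
# Hypothetical helper (not part of IDmining): min-max rescaling to [0,1].
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Toy data: two coordinates (X, Y) and a measured variable (Z) in the last column.
set.seed(1)
XY <- data.frame(X = runif(50, 10, 20),
                 Y = runif(50, -5, 5),
                 Z = rnorm(50))

# Rescale only the input variables; the measured variable is left untouched.
inputs <- as.data.frame(lapply(XY[, 1:2], rescale01))
sapply(inputs, range)
```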

Value

A vector containing the value(s) of the m-Morisita slope, \(S_m\), for each threshold value.

Details

  1. \(\ell\) is the edge length of the grid cells (or quadrats). Since the input variables (and consequently the grid) are rescaled to the \([0,1]\) interval, \(\ell\) is equal to \(1\) for a grid consisting of only one cell.

  2. \(\ell^{-1}\) is the number of grid cells (or quadrats) along each axis of the Euclidean space in which the data points are embedded.

  3. \(\ell^{-1}\) is equal to \(Q^{(1/E)}\) where \(Q\) is the number of grid cells and \(E\) is the number of variables (or features).

  4. \(\ell^{-1}\) is directly related to \(\delta\) (see References).

  5. \(\delta\) is the diagonal length of the grid cells.
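
The relations listed above can be checked numerically with a few lines of base R. This is only a sketch: it takes \(E\) to be the number of input variables spanning the grid (an assumption, since the description of XY also counts the measured variable in \(E\)), and it uses the standard geometric fact that a hypercubic cell of edge \(\ell\) in \(E\) dimensions has diagonal \(\ell\sqrt{E}\). The values are illustrative.

```r
# Illustrative values, chosen for this sketch.
E      <- 2       # assumed number of input variables spanning the grid
scaleQ <- 5:25    # values of l^-1: number of cells along each axis

ell   <- 1 / scaleQ      # edge length of a cell on the [0,1] scale
Q     <- scaleQ^E        # total number of grid cells, so that l^-1 = Q^(1/E)
delta <- ell * sqrt(E)   # diagonal length of an E-dimensional cell

grid_summary <- data.frame(l_inv = scaleQ, l = ell, Q = Q, delta = delta)
head(grid_summary, 3)
```

With `scaleQ = 5` and `E = 2`, for instance, the grid has 25 cells of edge length 0.2.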

References

J. Golay, M. Kanevski, C. D. Vega Orozco and M. Leuenberger (2014). The multipoint Morisita index for the analysis of spatial patterns, Physica A 406:191–202.

J. Golay and M. Kanevski (2015). A new estimator of intrinsic dimension based on the multipoint Morisita index, Pattern Recognition 48 (12):4070–4081.

L. Telesca, J. Golay and M. Kanevski (2015). Morisita-based space-clustering analysis of Swiss seismicity, Physica A 419:40–47.

Examples

# NOT RUN {
bf    <- Butterfly(10000)
bf_SP <- bf[,c(1,2,9)]

m      <- 2
scaleQ <- 5:25
thd    <- quantile(bf_SP$Y,probs=c(0,0.1,0.2,0.3,
                                   0.4,0.5,0.6,
                                   0.7,0.8,0.9))

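# Shuffle the measured variable to break its link with the coordinates,
# then compute S_m for every threshold on each shuffled data set.
nbr_shuf    <- 100
Sm_thd_shuf <- matrix(0,length(thd),nbr_shuf)
for (i in 1:nbr_shuf){
  bf_SP_shuf      <- cbind(bf_SP[,1:2],sample(bf_SP$Y,length(bf_SP$Y)))
  Sm_thd_shuf[,i] <- MINDID_FMC(bf_SP_shuf, scaleQ, m, thd)
}
# Average S_m over the shuffles, one value per threshold
mean_shuf <- apply(Sm_thd_shuf,1,mean)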

dev.new(width=6, height=4)
matplot(1:10,Sm_thd_shuf,type="l",lty=1,col=rgb(1,0,0,0.25),
        ylim=c(-0.05,0.05),ylab=bquote(S[.(m)]),xaxt="n",
        xlab="",cex.lab=1.2)
axis(1,1:10,labels = FALSE)
text(1:10,par("usr")[3]-0.01,srt=45,adj=1,
     labels=c("0_100", "10_100","20_100","30_100",
              "40_100","50_100","60_100",
              "70_100","80_100","90_100"),xpd=TRUE,font=2,cex=1)
mtext("Thresholds",side=1,line=3.5,cex=1.2)
lines(1:10,mean_shuf,type="b",col="blue",pch=19)

legend.text <- c("Shuffled","mean")
legend.pch  <- c(NA,19)
legend.lwd  <- c(2,2)
legend.col  <- c("red","blue")
legend("topleft",legend=legend.text,pch=legend.pch,lwd=legend.lwd,
       col=legend.col,ncol=1,text.col="black",cex=1,box.lwd=1,bg="white")
# }
