
gSeg (version 1.0)

gseg2: Graph-Based Change-Point Detection for Changed Interval

Description

This function finds an interval in the sequence whose underlying distribution differs from that of the rest of the sequence. It provides four graph-based test statistics.

Usage

gseg2(n, E, statistics=c("all","o","w","g","m"), l0=0.05*n, l1=0.95*n, pval.appr=TRUE,
 skew.corr=TRUE, pval.perm=FALSE, B=100)

Arguments

n

The number of observations in the sequence.

E

The edge matrix (a "number of edges" by 2 matrix) for the similarity graph. Each row contains the node indices of an edge.

statistics

The scan statistic to be computed, given as a character string indicating the type of scan statistic desired. The default is "all".

"all": specifies to compute all of the scan statistics: original, weighted, generalized, and max-type;

"o", "ori" or "original": specifies the original edge-count scan statistic;

"w" or "weighted": specifies the weighted edge-count scan statistic;

"g" or "generalized": specifies the generalized edge-count scan statistic; and

"m" or "max": specifies the max-type edge-count scan statistic.

l0

The minimum length of the interval to be considered as a changed interval.

l1

The maximum length of the interval to be considered as a changed interval.

pval.appr

If TRUE, the function outputs the p-value approximation based on asymptotic properties.

skew.corr

This argument is used only when pval.appr=TRUE. If skew.corr is TRUE, the p-value approximation incorporates skewness correction.

pval.perm

If TRUE, the function outputs the p-value from B permutations, where B is another argument that you can specify. Permutation can be time consuming, so use this argument with caution.

B

This argument is useful only when pval.perm=TRUE. The default value for B is 100.
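
The following is a minimal sketch of how an edge matrix in the shape expected for E (one row per edge, two columns of node indices) might be assembled, together with the default values of l0 and l1 from the Usage line. The simulated data and the nearest-neighbour graph are assumptions made for illustration only; see '?gSeg' for how the edge matrices in the package examples were constructed.

```r
# Hypothetical data: 200 bivariate observations (not the package's Example data).
set.seed(1)
n <- 200
y <- matrix(rnorm(n * 2), ncol = 2)

# Pairwise Euclidean distances; set the diagonal to Inf to exclude self-matches.
d <- as.matrix(dist(y))
diag(d) <- Inf

# Connect each observation to its nearest neighbour (an illustrative graph choice).
nn <- unname(apply(d, 1, which.min))
E <- cbind(seq_len(n), nn)

# Treat edges as undirected: order each pair, then drop duplicate rows.
E <- unique(cbind(pmin(E[, 1], E[, 2]), pmax(E[, 1], E[, 2])))

# Default interval-length bounds, as given in the Usage line.
l0 <- 0.05 * n   # 10
l1 <- 0.95 * n   # 190
```

With the gSeg package loaded, a matrix of this shape could then be passed as the E argument, for example gseg2(n, E, statistics="g", l0=l0, l1=l1).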

Value

Returns a list scanZ with tauhat, Zmax, and a matrix of the scan statistics for each type of scan statistic specified. See below for more details.

tauhat

An estimate of the two ends of the changed interval.

Zmax

The test statistic (maximum of the scan statistics).

Z

A matrix of the original scan statistics (standardized counts). This output exists only if the statistic specified is "all" or "o".

Zw

A matrix of the weighted scan statistics (standardized counts). This output exists only if the statistic specified is "all" or "w".

S

A matrix of the generalized scan statistics (standardized counts). This output exists only if the statistic specified is "all" or "g".

M

A matrix of the max-type scan statistics (standardized counts). This output exists only if the statistic specified is "all" or "m".

R

A matrix of raw counts of the original scan statistic. This output exists only if the statistic specified is "all" or "o".

Rw

A matrix of raw counts of the weighted scan statistic. This output exists only if the statistic specified is "all" or "w".

pval.appr

The approximated p-value based on asymptotic theory for each type of statistic specified.

pval.perm

This output exists only when the argument pval.perm is TRUE. It is the permutation p-value from B permutations and appears for each type of statistic specified (the same applies to perm.curve, perm.maxZs, and perm.Z).

perm.curve

A B by 2 matrix with the first column being critical values corresponding to the p-values in the second column.

perm.maxZs

A sorted vector recording the test statistics in the B permutations.

perm.Z

A B by n-squared matrix with each row being the vectorized scan statistics from each permutation run.
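
As a rough illustration of how the outputs above relate to one another, the sketch below takes a toy matrix of scan statistics, reads off its maximum and the interval at which it occurs, and compares that maximum with a vector of permuted maxima. All values are simulated, the indexing convention (entry [t1, t2] for an interval starting at t1 and ending at t2) is an assumption, and the p-value formula shown is the usual permutation proportion rather than code taken from the package.

```r
# Toy matrix of scan statistics for n = 6 (hypothetical values).
set.seed(2)
n <- 6
Z <- matrix(NA_real_, n, n)
Z[upper.tri(Z)] <- rnorm(sum(upper.tri(Z)))

# Test statistic: the maximum over all candidate intervals.
Zmax <- max(Z, na.rm = TRUE)

# Assumed convention: row index = start of interval, column index = end.
tauhat <- unname(which(Z == Zmax, arr.ind = TRUE)[1, ])

# Stand-in for a sorted vector of maxima from B permutations.
B <- 100
perm_max <- sort(rnorm(B, mean = 1))

# Permutation p-value: share of permuted maxima at least as large as Zmax.
pval <- mean(perm_max >= Zmax)
```
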

See Also

gSeg, gseg1, gseg2_discrete

Examples

# NOT RUN {
data(Example)
# Five examples; each is a sequence of length n.
# Ei (i=1,...,5): an edge matrix representing a similarity graph constructed on the
# observations in the ith sequence.  
# Check '?gSeg' to see how the Ei's were constructed.

## E5 is an edge matrix representing a similarity graph.
# It is constructed on a sequence of length n=200 with a change in both mean
# and variance on an interval (tau1 = 155, tau2 = 185).
r5=gseg2(n,E5,statistics="all")

# }
