
TraMineR (version 2.2-10)

seqecmpgroup: Identifying discriminating subsequences

Description

Identify and sort the most discriminating subsequences by their discriminating power.

Usage

seqecmpgroup(subseq, group, method="chisq", pvalue.limit=NULL,
             weighted = TRUE)

Value

An object of type subseqelistchisq (a subtype of subseqelist) with the following elements:

subseq

Sorted list of found discriminating subsequences

eseq

The event sequence object on which the tests were computed

constraint

Time constraints used for searching the subsequences (see seqeconstraint)

labels

Levels (value labels) of the target group variable

type

Type of test used

data

A data frame with the columns support and index (the original rank of the subsequence, i.e., its position in the input subseq), plus a pair of frequency and Pearson residual columns for each group

Arguments

subseq

A subseqelist object (list of subsequences) such as produced by seqefsub

group

Group membership, i.e., a variable or factor defining the groups which we want to discriminate

method

The discrimination method; one of "bonferroni" or "chisq"

pvalue.limit

Can be used to filter the results. Only subsequences with a p-value lower than this parameter are selected. If NULL all subsequences are returned (regardless of their p-values).

weighted

Logical. If TRUE, seqecmpgroup uses the weights specified in subseq (see seqefsub).
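To illustrate the pvalue.limit argument, the following sketch compares a filtered and an unfiltered call on the actcal data used in the Examples section. The 0.05 threshold and the object names discr.all and discr.sig are arbitrary choices made for illustration, and the snippet assumes the TraMineR package is installed.

```r
library(TraMineR)

# Same setup as in the Examples section
data(actcal.tse)
data(actcal)
actcal.eseq <- seqecreate(actcal.tse)
fsubseq <- seqefsub(actcal.eseq, pmin.support = 0.01)

# pvalue.limit = NULL (the default): every subsequence is returned
discr.all <- seqecmpgroup(fsubseq, group = actcal$sex)

# Keep only subsequences with a p-value below 0.05 (illustrative threshold)
discr.sig <- seqecmpgroup(fsubseq, group = actcal$sex, pvalue.limit = 0.05)

# The filtered result cannot contain more subsequences than the full one
nrow(discr.all$data)
nrow(discr.sig$data)

# Elements described under Value
discr.sig$labels
discr.sig$type
head(discr.sig$data)
```

Because filtering only removes rows, nrow(discr.sig$data) should never exceed nrow(discr.all$data).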

Author

Matthias Studer (with Gilbert Ritschard for the help page)

Details

Two discrimination tests are implemented: chisq, the Pearson chi-squared test of independence, and bonferroni, the same test with a Bonferroni correction for multiple testing.
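The idea behind the two methods can be sketched in base R on made-up data: for each subsequence, cross-tabulate its presence against the group variable, run a Pearson chi-squared test, and then adjust the p-values with a Bonferroni correction. This is only an illustration, not TraMineR's internal code. The toy data, the names present and group, the absence of a continuity correction, and the final ordering by p-value are all assumptions.

```r
# Toy data: presence (TRUE/FALSE) of three hypothetical subsequences
# A, B and C in 200 sequences split into two groups
set.seed(1)
n <- 200
group <- factor(rep(c("man", "woman"), each = n / 2))
present <- data.frame(
  A = rbinom(n, 1, ifelse(group == "man", 0.6, 0.3)) == 1,
  B = rbinom(n, 1, 0.5) == 1,
  C = rbinom(n, 1, ifelse(group == "man", 0.2, 0.4)) == 1
)

# Pearson chi-squared test of independence (presence x group),
# one test per subsequence; no continuity correction (assumption)
pvals <- sapply(present, function(x) {
  tab <- table(factor(x, levels = c(FALSE, TRUE)), group)
  chisq.test(tab, correct = FALSE)$p.value
})

# Bonferroni correction: multiply each p-value by the number of tests,
# capped at 1
pvals.bonf <- p.adjust(pvals, method = "bonferroni")

# Most discriminating subsequences first (smallest p-value)
result <- data.frame(p.value = pvals, p.bonferroni = pvals.bonf)
result <- result[order(result$p.value), ]
print(result)
```

The Bonferroni-adjusted p-values are never smaller than the raw ones, so combining method="bonferroni" with pvalue.limit gives a stricter selection than method="chisq" with the same limit.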

References

Studer, M., Müller, N.S., Ritschard, G. & Gabadinho, A. (2010), "Classer, discriminer et visualiser des séquences d'événements", In Extraction et gestion des connaissances (EGC 2010), Revue des nouvelles technologies de l'information RNTI. Vol. E-19, pp. 37-48.

Ritschard, G., Bürgin, R., and Studer, M. (2014), "Exploratory Mining of Life Event Histories", In McArdle, J.J. & Ritschard, G. (eds) Contemporary Issues in Exploratory Data Mining in the Behavioral Sciences. Series: Quantitative Methodology, pp. 221-253. New York: Routledge.

See Also

plot.subseqelistchisq to plot the results.

Examples

library(TraMineR)

## Create the event sequence object from the time-stamped event data
data(actcal.tse)
actcal.eseq <- seqecreate(actcal.tse)

## Search for frequent subsequences, i.e., those with a support of
## at least 1% (20 of the 2000 sequences)
fsubseq <- seqefsub(actcal.eseq, pmin.support = 0.01)

## Search for the subsequences that best discriminate between men and women
data(actcal)
discr <- seqecmpgroup(fsubseq, group = actcal$sex, method = "bonferroni")

## Print the six most discriminating subsequences
print(discr[1:6])

## Plot the six most discriminating subsequences
plot(discr[1:6])
