TraMineR (version 2.2-10)

seqdiff: Position-wise discrepancy analysis between groups of sequences

Description

The function analyses how the differences between groups of sequences evolve along the sequence positions. It runs a series of discrepancy analyses on sliding windows.

Usage

seqdiff(seqdata, group, cmprange = c(0, 1),
  seqdist.args = list(method = "LCS", norm = "auto"), with.missing = FALSE,
  weighted = TRUE, squared = FALSE, seqdist_arg)

Value

A seqdiff object, with the following items:

stat

A data.frame with five statistics (Pseudo F, Pseudo Fbf, Pseudo R2, Bartlett, and Levene) for each time stamp of the sequence (see dissassoc)

discrepancy

A data.frame with, at each time position \(t\), the discrepancy within the whole set of sequences and within each group (defined by the group variable).
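As a brief illustration of the two items described above, the returned object can be inspected like any list. This is only a sketch: it reuses the mvad data and the call from the Examples section, and the exact column names of the two data frames are not assumed here.

```r
library(TraMineR)

## Same data and call as in the Examples section
data(mvad)
mvad.seq <- seqdef(mvad[1:100, 17:28])
mvad.diff <- seqdiff(mvad.seq, group = mvad$gcse5eq[1:100], cmprange = c(-2, 2))

## One row per analysed position, one column per statistic
head(mvad.diff$stat)

## Discrepancy of the whole set and of each group, per position
head(mvad.diff$discrepancy)
```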

Arguments

seqdata

a state sequence object created with the seqdef function.

group

The group variable.

cmprange

Vector of two integers: time range of the sliding windows. The comparison at position \(t\) is computed on the window (\(t\) + cmprange[1], \(t\) + cmprange[2]).

seqdist.args

List of arguments passed to seqdist for computing the distances.

with.missing

Logical. If TRUE, missing values are considered as an additional state. If FALSE, subsequences with missing values are removed from the analysis.

weighted

Logical. If TRUE, seqdiff uses the weights specified in seqdata.

squared

Logical. If TRUE, the dissimilarities are squared for computing the discrepancy.

seqdist_arg

Deprecated. Use seqdist.args instead.

Author

Matthias Studer (with Gilbert Ritschard for the help page)

Details

The function analyses how the part of discrepancy explained by the group variable evolves along the position axis. It successively runs discrepancy analyses within a sliding time-window whose range is set by cmprange. At each position \(t\), the method uses seqdist to compute a distance matrix over the time-window (\(t\) + cmprange[1], \(t\) + cmprange[2]) and then derives the explained discrepancy on that window with dissassoc.
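The window arithmetic can be sketched in base R. The helper window_bounds below is hypothetical and not part of TraMineR. It also assumes that only positions whose whole window fits inside the sequence are analysed, which this page does not state explicitly.

```r
## Hypothetical helper (not a TraMineR function): lists the window
## (t + cmprange[1], t + cmprange[2]) for each analysed position t.
window_bounds <- function(seq_length, cmprange = c(0, 1)) {
  stopifnot(length(cmprange) == 2, cmprange[1] <= cmprange[2])
  ## Assumption: positions whose window would leave the sequence are skipped
  first_t <- 1 - min(cmprange[1], 0)
  last_t <- seq_length - max(cmprange[2], 0)
  stopifnot(first_t <= last_t)
  t <- first_t:last_t
  data.frame(t = t, start = t + cmprange[1], end = t + cmprange[2])
}

## Centered windows of length 5 on sequences of length 12
w <- window_bounds(12, cmprange = c(-2, 2))
print(w)
```

Under that assumption, a sequence of length 12 with cmprange = c(-2, 2) yields windows for positions 3 to 10, each covering five consecutive columns.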

There are print and plot methods for the returned value.

References

Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2011). Discrepancy analysis of state sequences, Sociological Methods and Research, Vol. 40(3), 471-510, doi:10.1177/0049124111415372.

Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2010). Discrepancy analysis of complex objects using dissimilarities. In F. Guillet, G. Ritschard, D. A. Zighed and H. Briand (Eds.), Advances in Knowledge Discovery and Management, Studies in Computational Intelligence, Volume 292, pp. 3-19. Berlin: Springer.

Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2009). Analyse de dissimilarités par arbre d'induction. In EGC 2009, Revue des Nouvelles Technologies de l'Information, Vol. E-15, pp. 7-18.

See Also

dissassoc to analyse the association of the group variable with the whole sequence
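For comparison with the position-wise analysis, a whole-sequence analysis might look like the sketch below. It reuses the mvad data from the Examples section and the default distance settings of seqdiff (LCS with norm = "auto"). The value R = 100 and the seed are arbitrary choices made here to keep the permutation test quick.

```r
library(TraMineR)

data(mvad)
mvad.seq <- seqdef(mvad[1:100, 17:28])

## Distances computed on the full sequences rather than on a window
mvad.dist <- seqdist(mvad.seq, method = "LCS", norm = "auto")

## Association between the group variable and the whole sequences;
## R is the number of permutations (100 is an arbitrary choice here)
set.seed(1)
mvad.assoc <- dissassoc(mvad.dist, group = mvad$gcse5eq[1:100], R = 100)
print(mvad.assoc)
```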

Examples

library(TraMineR)

## Define a state sequence object
data(mvad)
## First 12 months of first 100 trajectories
mvad.seq <- seqdef(mvad[1:100, 17:28])

## Position-wise discrepancy analysis using
##  centered sliding windows of length 5.
mvad.diff <- seqdiff(mvad.seq, group=mvad$gcse5eq[1:100], cmprange=c(-2,2))
print(mvad.diff)
plot(mvad.diff, stat=c("Pseudo R2", "Levene"))
plot(mvad.diff, stat="discrepancy")