seqsubsn: Number of distinct subsequences in a sequence.
Description
Computes the number of distinct subsequences in a sequence using Elzinga's algorithm.
Usage
seqsubsn(seqdata, DSS=TRUE, with.missing=FALSE)
Value
Vector with the number of distinct subsequences for each sequence in the input state sequence object.
Arguments
seqdata
a state sequence object as defined by the seqdef function.
DSS
if TRUE, the sequences of Distinct Successive States (DSS, see seqdss) are first extracted (e.g., the DSS contained in 'D-D-D-D-A-A-A-A-A-A-A-D' is 'D-A-D'), and the number of distinct subsequences in the DSS is computed. If FALSE, the number of distinct subsequences is computed from sequences as they appear in the input sequence object. Hence the number of distinct subsequences is in most cases much higher with the DSS=FALSE option.
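The effect of the DSS option can be illustrated on the example sequence above with a minimal base-R sketch. The helpers `dss` and `count_distinct_subseq` are hypothetical names introduced here for illustration; they are not part of the package and are not the code used by seqsubsn. The sketch uses the standard dynamic-programming recurrence for counting distinct subsequences and assumes that the empty subsequence is counted.

```r
# Hypothetical helper: collapse runs of identical successive states (base R only).
dss <- function(states) rle(states)$values

# Hypothetical helper: count distinct subsequences of a character vector,
# empty subsequence included (an assumption of this sketch).
count_distinct_subseq <- function(states) {
  n <- length(states)
  counts <- numeric(n + 1)  # counts[i + 1]: distinct subsequences of the first i states
  counts[1] <- 1            # only the empty subsequence
  last_seen <- list()       # last position at which each state occurred
  for (i in seq_len(n)) {
    s <- states[i]
    # Appending s doubles the count ...
    counts[i + 1] <- 2 * counts[i]
    # ... minus the subsequences already produced by the previous occurrence of s.
    prev <- last_seen[[s]]
    if (!is.null(prev)) counts[i + 1] <- counts[i + 1] - counts[prev]
    last_seen[[s]] <- i
  }
  counts[n + 1]
}

full <- strsplit("D-D-D-D-A-A-A-A-A-A-A-D", "-")[[1]]
dss(full)                          # "D" "A" "D"
count_distinct_subseq(dss(full))   # 7
count_distinct_subseq(full)        # 76
```

In this sketch the full sequence yields 76 distinct subsequences against 7 for its DSS, which shows why the count is usually much higher with DSS=FALSE.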
with.missing
logical: should non-void missing values be treated as a regular state?
If FALSE (default) missing values are ignored.
Author
Alexis Gabadinho (with Gilbert Ritschard for the help page)
Details
The function first searches for missing states in the sequences and, if any are found, adds the missing state to the alphabet used for extracting the distinct subsequences. A missing state in a sequence is treated as the occurrence of an additional symbol of the alphabet, and two or more consecutive missing states are treated as two or more occurrences of the same state. When DSS=TRUE, the seqdss function is called with with.missing=TRUE.
Examples
library(TraMineR)
data(actcal)
actcal.seq <- seqdef(actcal, 13:24)

## Number of subsequences with DSS=TRUE
seqsubsn(actcal.seq[1:10, ])

## Number of subsequences with DSS=FALSE
seqsubsn(actcal.seq[1:10, ], DSS=FALSE)