PST (version 0.94.1)

generate: Generate sequences using a probabilistic suffix tree

Description

Generate artificial state sequences from a fitted probabilistic suffix tree (PST).

Usage

# S4 method for PSTf
generate(object, l, n, s1, p1, method, L, cnames)

Value

A state sequence object (an object of class stslist) containing n sequences. This object can be passed as argument to all the functions for visualization and analysis provided by the TraMineR package.

Arguments

object

a probabilistic suffix tree, i.e., an object of class "PSTf" as returned by the pstree, prune or tune function.

l

integer. Length of the sequence(s) to generate.

n

integer. Number of sequences to generate.

s1

character. The first state of each sequence. The vector should have length n. If specified, the first state of each sequence is taken from s1 instead of being randomly generated.

p1

numeric. An optional probability vector for generating the first state of the sequence(s). If specified, the first state is randomly generated using the probability distribution in p1 instead of the probability distribution taken from the root node of object.

method

character. If method="pmax", the state with the highest probability is chosen at each position. If method="prob", the state at each position is drawn at random from the corresponding probability distribution taken from object.

L

integer. Maximum depth used to extract the probability distributions from the PST object.

cnames

character. Optional column (position) names for the returned state sequence object. By default, the names of the sequence object to which the model was fitted are used (slot "data" of the PST).
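
The difference between the two values of method can be illustrated with base R alone. The sketch below is not the package's code: the distribution probs is a hypothetical next-state distribution over the alphabet {a, b}, standing in for one retrieved from a fitted tree.

```r
# Hypothetical next-state distribution over the alphabet {a, b};
# in PST this would come from a node of the fitted tree.
probs <- c(a = 0.7, b = 0.3)

# Analogue of method = "pmax": always take the most probable state.
pick_pmax <- names(probs)[which.max(probs)]

# Analogue of method = "prob": draw one state at random from the distribution.
set.seed(1)
pick_prob <- sample(names(probs), size = 1, prob = probs)

# Repeated random draws should roughly recover the distribution.
draws <- sample(names(probs), size = 10000, replace = TRUE, prob = probs)
freq_a <- mean(draws == "a")
```

With the "pmax" analogue the choice is deterministic, so every sequence generated from the same starting state would be identical; the "prob" analogue yields varied sequences whose state frequencies approach the model's probabilities.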

Author

Alexis Gabadinho

Details

As a probabilistic suffix tree (PST) represents a generative model, it can be used to generate artificial sequence data sets. Sequences are built by generating the state at each successive position. The process is similar to sequence prediction (see predict), except that the conditional probability distributions retrieved from the PST are used to generate a symbol instead of computing the probability of an existing state. For more details, see Gabadinho & Ritschard (2016).
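
The position-by-position process described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: toy_pst, lookup_dist and generate_toy are hypothetical names, the model is a hand-written list rather than a PSTf object, and the convention of labelling the empty (root) context "e" and joining states with "-" is chosen here for the example only.

```r
# Toy model (hypothetical): each context, written oldest state first and
# joined by "-", maps to a next-state distribution. "e" is the root context.
toy_pst <- list(
  "e"   = c(a = 0.5, b = 0.5),
  "a"   = c(a = 0.8, b = 0.2),
  "b"   = c(a = 0.3, b = 0.7),
  "a-b" = c(a = 0.1, b = 0.9)
)

# Return the distribution attached to the longest suffix of `history`
# (at most L states long) that is stored in the model; fall back to the root.
lookup_dist <- function(model, history, L) {
  k <- min(length(history), L)
  while (k > 0) {
    key <- paste(tail(history, k), collapse = "-")
    if (!is.null(model[[key]])) return(model[[key]])
    k <- k - 1
  }
  model[["e"]]
}

# Build one sequence of length l by drawing each state from the
# distribution conditional on the states generated so far.
generate_toy <- function(model, l, L) {
  seq_out <- character(0)
  for (pos in seq_len(l)) {
    probs <- lookup_dist(model, seq_out, L)
    seq_out <- c(seq_out, sample(names(probs), size = 1, prob = probs))
  }
  seq_out
}

set.seed(42)
x <- generate_toy(toy_pst, l = 10, L = 2)
```

The first state is drawn from the root distribution because no context exists yet, which is the step that the p1 and s1 arguments of generate override. Lowering L in the sketch shortens the memory used at each position.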

References

Gabadinho, A. & Ritschard, G. (2016). Analyzing State Sequences with Probabilistic Suffix Trees: The PST R Package. Journal of Statistical Software, 72(3), pp. 1-39.

Examples

library(PST)  # PST depends on TraMineR, which provides seqdef()
data(s1)
s1.seq <- seqdef(s1)
S1 <- pstree(s1.seq, L=3)

## Generating 10 sequences
generate(S1, n=10, l=10, method="prob")

## First state is generated with p(a)=0.9 and p(b)=0.1
generate(S1, n=10, l=10, method="prob", p1=c(0.9, 0.1))