query: Retrieve counts or next symbol probability distribution

Description

Retrieve counts or next symbol probability distribution from a node of a probabilistic suffix tree

Usage

# S4 method for PSTf
query(object, context, state, output = "prob", exact = FALSE)

Value

An object of class cprobd, for which a round method is available.

Arguments

object: A probabilistic suffix tree, i.e. an object of class "PSTf", as returned by the pstree, prune or tune function.
context: Character. The string labelling the node to retrieve. States must be separated by '-' as for example in 'a-a-b'. If the node labelled with this string does not exist in the tree, the node labelled with the longest suffix is searched for, and so on until an existing node is found.
state: Character. If specified, the probability of this state is returned instead of the whole distribution.
output: Character. If output="prob", the probability distribution (or the probability of a single symbol if state is specified) is returned. If output="counts", the counts on which the probability distribution is calculated are returned. If output="all", the node itself is returned, that is, an object of class PSTr.
exact: Logical. If TRUE, the information is returned only if the node labelled with context is present in the tree. That is, the longest suffix of context is not searched for if context is not in the tree.

Author

Alexis Gabadinho

Details

The PST is searched for the node labelled with context. If exact=FALSE and that node does not exist, the PST is searched for the node labelled with the longest suffix of context, then the next longest, and so on until a node corresponding to a suffix of context is found or the root node is reached. For more details, see Gabadinho & Ritschard (2016).
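
The suffix search described above can be illustrated without the package. The following is a minimal sketch, not the PST implementation: toy_tree is a plain named list standing in for a tree, lookup_suffix is a hypothetical helper, and the root label "e" is an assumption made for this sketch.

```r
# Toy stand-in for a PST: context labels mapped to next-symbol
# probability vectors. The root label "e" is an assumption of this sketch.
toy_tree <- list(
  "e"     = c(a = 0.5, b = 0.5),
  "a"     = c(a = 0.6, b = 0.4),
  "b-a"   = c(a = 0.7, b = 0.3),
  "b-b-a" = c(a = 0.2, b = 0.8)
)

# Hypothetical helper: look up `context`; if it is absent and exact = FALSE,
# drop the leftmost (oldest) state and retry, ending at the root.
lookup_suffix <- function(tree, context, exact = FALSE, root = "e") {
  states <- strsplit(context, "-", fixed = TRUE)[[1]]
  while (length(states) > 0) {
    label <- paste(states, collapse = "-")
    if (!is.null(tree[[label]])) {
      return(list(node = label, prob = tree[[label]]))
    }
    if (exact) return(NULL)   # no fallback to shorter suffixes
    states <- states[-1]      # next longest suffix
  }
  list(node = root, prob = tree[[root]])
}

# 'a-b-b-a' is absent from the toy tree, so the search falls back to 'b-b-a'
lookup_suffix(toy_tree, "a-b-b-a")$node
```

With exact = TRUE the helper returns NULL for 'a-b-b-a' because only the full label is tried. This mirrors the description of the exact argument, but what query itself returns in that case is not specified here.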

References

Gabadinho, A. & Ritschard, G. (2016). Analyzing State Sequences with Probabilistic Suffix Trees: The PST R Package. Journal of Statistical Software, 72(3), pp. 1-39.

Examples

## Load the package, then fit a PST of maximal depth 3 on the s1 sequence
library(PST)
data(s1)
s1 <- seqdef(s1)
S1 <- pstree(s1, L=3)
## Retrieving from the node labelled 'a-a-a'
query(S1, "a-a-a")

## The node 'a-b-b-a' is not present in the tree, so the next symbol
## probability is retrieved from the node labelled 'b-b-a' (the longest
## suffix)
query(S1, "a-b-b-a")
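
The remaining arguments can be exercised in the same way. This is a hedged sketch that uses only the arguments listed in the Usage section. It assumes that 'a' belongs to the alphabet of s1 and that the node 'a-a-a' exists in the fitted tree, as the first example suggests. Outputs are not shown.

```r
library(PST)
data(s1)
s1 <- seqdef(s1)
S1 <- pstree(s1, L = 3)

## Probability of the single state 'a' after the context 'a-a-a'
p_a <- query(S1, "a-a-a", state = "a")
p_a

## Counts on which the distribution at that node is based
counts_aaa <- query(S1, "a-a-a", output = "counts")
counts_aaa

## The node itself (an object of class PSTr, per the output argument)
node_aaa <- query(S1, "a-a-a", output = "all")

## Exact lookup: no fallback to a shorter suffix of the context
query(S1, "a-a-a", exact = TRUE)
```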
