The function uses initialized engines for word segmentation. You
can initialize multiple engines simultaneously using worker().
Public settings of workers can be read and modified using $,
such as WorkerName$symbol = T. Some private settings are fixed
when the engine is initialized, and you can get them by
WorkerName$PrivateVarible.
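As a minimal sketch of the paragraph above, the following creates two engines side by side and changes one public setting. The variable names mix_engine and hmm_engine and the sample sentence are illustrative assumptions, not part of the package.

```r
library(jiebaR)

# Two independent engines can exist at the same time.
# The variable names here are illustrative only.
mix_engine <- worker()              # default engine
hmm_engine <- worker(type = "hmm")  # engine that uses the HMM model

# Public settings are read and written with `$`.
mix_engine$symbol <- TRUE           # keep symbols in the result
print(mix_engine$symbol)

# Segment the same sentence with each engine.
print(segment("这是一个测试句子", mix_engine))
print(segment("这是一个测试句子", hmm_engine))
```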
segment(code, jiebar, mod = NULL)
code: A Chinese sentence or the path to a text file.
jiebar: A jiebaR worker.
mod: Changes the default result type. The value can be "mix", "hmm", "query", "full" or "mp".
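The arguments above might be used as in the sketch below. It assumes a default worker and an installed version of jiebaR in which mod accepts the values listed above. The sample sentence is an arbitrary choice.

```r
library(jiebaR)

engine <- worker()            # default worker
sentence <- "我爱北京天安门"   # arbitrary sample sentence

# Use the engine's own default result type.
default_result <- segment(sentence, engine)

# Ask for a different result type through the mod argument.
mp_result <- segment(sentence, engine, mod = "mp")

print(default_result)
print(mp_result)
```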
Four kinds of models are described here:
The Maximum probability segmentation model uses a Trie tree to construct
a directed acyclic graph and applies a dynamic programming algorithm. It
is the core segmentation algorithm. dict and user
should be provided when initializing the jiebaR worker.
The Hidden Markov Model uses an HMM to determine the status set and
observed set of words. The default HMM model is based on the People's Daily
language library. hmm should be provided when initializing the
jiebaR worker.
The MixSegment model uses both the Maximum probability segmentation model
and the Hidden Markov Model to construct the segmentation. dict, hmm
and user should be provided when initializing the jiebaR worker.
The QuerySegment model uses MixSegment to construct the segmentation and then
enumerates all the possible long words in the dictionary. dict, hmm
and qmax should be provided when initializing the jiebaR worker.
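To compare the models described above, one worker per model type can be created and run on the same input. This is only a sketch: the list name engines and the sample sentence are assumptions, and the dict, hmm and user arguments are left at the package defaults rather than set explicitly.

```r
library(jiebaR)

sentence <- "我爱北京天安门"  # arbitrary sample sentence

# One engine per model; dictionary and HMM paths fall back to the
# defaults shipped with the package.
engines <- list(
  mp    = worker(type = "mp"),                # Maximum probability
  hmm   = worker(type = "hmm"),               # Hidden Markov Model
  mix   = worker(type = "mix"),               # MixSegment
  query = worker(type = "query", qmax = 20)   # QuerySegment
)

# Segment the same sentence with every engine and print the results.
results <- lapply(engines, function(engine) segment(sentence, engine))
print(results)
```

The outputs may differ between models, so printing them side by side is a quick way to choose a worker type for a given corpus.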
The operator <= is provided as a shorthand for this function: WorkerName <= code.
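A short sketch of the <= form next to the equivalent segment() call. The variable names and the sample sentence are illustrative assumptions.

```r
library(jiebaR)

engine <- worker()
sentence <- "江州市长江大桥参加了长江大桥的通车仪式"  # arbitrary sample

# Function form.
via_function <- segment(sentence, engine)

# Operator form: the worker goes on the left, the text on the right.
via_operator <- engine <= sentence

print(via_function)
print(via_operator)
```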