segmentCN

A character vector of Chinese sentences.

strwords

The analyzer to use: one of 'default', 'jiebaR', 'hmm', 'fmm' and 'coreNLP'. The default is 'hmm'.

analyzer

Whether to recognise the nature (part of speech) of each word.

nature

Whether to remove symbols from the sentence. The default is TRUE, which means no symbols are kept.

nosymbol

The default is a character vector, but 'tm' can be chosen instead
to output a single space-separated string that can be used by <code><a rd-options="tm" href="/link/Corpus?package=Rwordseg&version=0.3-2&to=tm" data-mini-rdoc="tm::Corpus">Corpus</a></code> directly.

returnType
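
The difference between the two return types can be illustrated with base R alone. The sketch below does not call segmentCN; the token vector is a hypothetical segmentation result, and the space-joined string only approximates what the 'tm' option is described as producing.

```r
# Hypothetical segmentation result: one element per word (the default shape).
tokens <- c("我", "爱", "北京", "天安门")

# Assumed shape of the 'tm' output: the same words joined by single spaces,
# so that whitespace-based tokenisers can split the text again.
tm_string <- paste(tokens, collapse = " ")

print(tokens)
print(tm_string)
```

A space-separated string is convenient because most text-mining tools written for Western languages split on whitespace.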

A function to segment Chinese text into words.

Provides interfaces and useful tools for Chinese word segmentation. Implements a segmentation algorithm based on the Hidden Markov Model (HMM) in native R code. The methods for the HHMM-based Chinese lexical analyzer are described in Hua-Ping Zhang et al. (2003) <doi:10.3115/1119250.1119280>.

Jian Li

Rwordseg

Chinese Word Segmentation

segmentCN function


segmentCN: Segment Chinese text.

Description

Usage

Arguments

Value

Examples
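
A minimal usage sketch, assuming the Rwordseg package is installed and that segmentCN accepts the arguments listed above by name. The sample sentence is hypothetical, and no output is shown because the result depends on the analyzer and its dictionary.

```r
# Runs only when Rwordseg is available; otherwise the block is skipped.
if (requireNamespace("Rwordseg", quietly = TRUE)) {
  library(Rwordseg)

  text <- "我爱北京天安门"  # hypothetical sample sentence

  # Segment with the HMM analyzer, which the page gives as the default.
  words <- segmentCN(text, analyzer = "hmm")
  print(words)

  # Also ask for the nature of each word.
  tagged <- segmentCN(text, analyzer = "hmm", nature = TRUE)
  print(tagged)

  # Return a single space-separated string for use with tm.
  tm_ready <- segmentCN(text, analyzer = "hmm", returnType = "tm")
  print(tm_ready)
}
```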