Rwordseg (version 0.3-2)

segmentCN: Segment Chinese text.

Description

A function to segment Chinese text into words.

Usage

segmentCN(strwords, analyzer = c("default", "hmm", "jiebaR", "fmm",
  "coreNLP"), nature = FALSE, nosymbol = TRUE,
  returnType = c("vector", "tm"), ...)

Arguments

strwords

A character vector of Chinese sentences.

analyzer

One of 'default', 'jiebaR', 'hmm', 'fmm' and 'coreNLP'. Default is 'hmm'.

nature

Whether to recognise the nature (part of speech) of the words.

nosymbol

Whether to remove symbols from the sentence. The default, TRUE, means that no symbols are kept.

returnType

The default, 'vector', returns a character vector of words. Choose 'tm' to output a single string with the words separated by spaces, so that it can be used by Corpus directly.

...

Other arguments.

Value

A vector of segmented words (a list if the input is a vector of several strings).

Examples

# NOT RUN {
segmentCN("hello world!")

# }
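
The arguments described above can be combined as in the following sketch. It is a hedged illustration, not output from the package authors: the sample sentences are arbitrary, only the arguments listed under Usage are used, and the calls are skipped when Rwordseg is not installed. No expected output is shown because the result depends on the analyzer and its dictionaries.

```r
# Arbitrary sample input chosen for illustration (not taken from the package docs).
sentences <- c("今天天气很好。", "我们去公园散步！")

if (requireNamespace("Rwordseg", quietly = TRUE)) {
  library(Rwordseg)

  # A single string in; per the Value section, a vector of words out.
  words <- segmentCN(sentences[1])
  print(words)

  # Several strings in; per the Value section, the result is a list.
  by_sentence <- segmentCN(sentences)
  str(by_sentence)

  # nosymbol = FALSE asks the function to keep symbols such as punctuation.
  with_symbols <- segmentCN(sentences[1], nosymbol = FALSE)
  print(with_symbols)

  # returnType = "tm" requests space-separated text instead of a word vector.
  for_tm <- segmentCN(sentences[1], returnType = "tm")
  print(for_tm)
} else {
  message("Rwordseg is not installed; skipping the segmentation calls.")
}
```

The `requireNamespace()` guard keeps the script runnable on machines without the package. The `analyzer` argument is left at its default here because the other analyzers may need additional packages.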
