chinese.misc (version 0.2.3)

Miscellaneous Tools for Chinese Text Mining and More

Description

The package aims to make Chinese text mining easier, faster, and more robust to errors. A document-term matrix can be generated with a single line of code; encoding detection, word segmentation, and stop-word removal are handled automatically. Some other convenience tools are also supplied.
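As a rough illustration of the one-line workflow, the sketch below builds a document-term matrix from a character vector with corp_or_dtm. It assumes chinese.misc and its dependencies (such as jiebaR and tm) are installed; the sample sentences are made up for illustration, and all other arguments are left at their defaults.

```r
# Assumption: chinese.misc and its dependencies are already installed.
library(chinese.misc)

# Four short made-up documents; Chinese text is passed in the same way.
x <- c(
  "Hello, what do you want to drink?",
  "drink a bottle of milk",
  "drink a cup of coffee",
  "drink some water"
)

# from = "v" says x holds the texts themselves rather than file or folder names;
# type = "dtm" asks for a document-term matrix instead of a corpus.
dtm <- corp_or_dtm(x, from = "v", type = "dtm")

# One row per document, one column per retained term.
dim(dtm)
```

To read texts from files or folders instead, the same call is expected to take their names with a different `from` value; check the corp_or_dtm help page for the exact options.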

Install

install.packages('chinese.misc')

Monthly Downloads

476

Version

0.2.3

License

GPL-3

Maintainer

Jiang Wu

Last Published

September 11th, 2020

Functions in chinese.misc (0.2.3)

VCR: Copy and Paste from Excel-Like Files
as.numeric2: An Enhanced Version of as.numeric
V: Copy and Paste from Excel-Like Files
VC: Copy and Paste from Excel-Like Files
DEFAULT_control1: A Default Value for corp_or_dtm 1
as.character2: An Enhanced Version of as.character
VR: Copy and Paste from Excel-Like Files
VRC: Copy and Paste from Excel-Like Files
DEFAULT_cutter: A Default Cutter
DEFAULT_control2: A Default Value for corp_or_dtm 2
dictionary_dtm: Making DTM/TDM for Groups of Words
get_tmp_chi_locale: Check The Locale Functions are to Assume
seg_file: Convenient Tool to Segment Chinese Texts
get_tag_word: Extract Words of Some Certain Tags through Pos-Tagging
is_character_vector: A Convenient Version of is.character
dir_or_file: Collect Full Filenames from a Mix of Directories and Files
create_ttm: Create Term-Term Matrix (Term-Cooccurrence Matrix)
chinese.misc-package: Miscellaneous Tools for Chinese Text Mining and More
csv2txt: Write Texts in CSV into Many TXT/RTF Files
m2doc: Rewrite Terms and Frequencies into Many Files
slim_text: Remove Words through Speech Tagging
corp_or_dtm: Create Corpus or Document Term Matrix with 1 Line
output_dtm: Convert or Write DTM/TDM Object Quickly
is_positive_integer: A Convenient Version of is.integer
make_stoplist: Input a Filename and Return a Vector of Stop Words
scancn: Read a Text File by Auto-Detecting Encoding
txt2csv: Write Many Separated Files into a CSV
m3m: Convert Objects among matrix, dgCMatrix, simple_triplet_matrix, DocumentTermMatrix, TermDocumentMatrix
sparse_left: Check How many Words are Left under Certain Sparse Values
sort_tf: Find High Frequency Terms
tf2doc: Transform Terms and Frequencies into a Text
match_pattern: Extract Strings by Regular Expression Quickly
topic_trend: Simple Rise or Fall Trend of Several Years
word_cor: Word Correlation in DTM/TDM
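
For readers unfamiliar with the term-term (term-cooccurrence) matrix named in the create_ttm entry, the base R sketch below shows one common definition: cell (i, j) counts the documents in which terms i and j both appear. This is only an illustration on made-up, pre-segmented documents; it does not call create_ttm and is not claimed to reproduce how the package counts co-occurrence.

```r
# Hypothetical pre-segmented documents (one character vector of terms each).
docs <- list(
  c("drink", "milk"),
  c("drink", "coffee"),
  c("drink", "milk", "coffee")
)

# Vocabulary: every distinct term across all documents.
terms <- sort(unique(unlist(docs)))

# Binary document-term matrix: rows are documents, columns are terms,
# 1 if the term occurs in the document and 0 otherwise.
dtm <- t(sapply(docs, function(d) as.integer(terms %in% d)))
colnames(dtm) <- terms

# crossprod(dtm) is t(dtm) %*% dtm: off-diagonal cells count documents
# containing both terms, diagonal cells count documents containing the term.
ttm <- crossprod(dtm)
ttm
```

Binarizing the matrix first makes each cell a document count; skipping that step would weight co-occurrence by raw term frequency instead.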