stringi (version 1.1.3)

stri_opts_brkiter: Generate a List with BreakIterator Settings

Description

A convenience function to tune the ICU BreakIterator's behavior in some text boundary analysis functions; see stringi-search-boundaries.

Usage

stri_opts_brkiter(type, locale, skip_word_none, skip_word_number,
  skip_word_letter, skip_word_kana, skip_word_ideo, skip_line_soft,
  skip_line_hard, skip_sentence_term, skip_sentence_sep, ...)

Arguments

type
single string; break iterator type, one of character, line_break, sentence, or word; see stringi-search-boundaries
locale
single string; NULL or "" selects the default locale
skip_word_none
logical; perform no action for "words" that do not fit into any other categories
skip_word_number
logical; perform no action for words that appear to be numbers
skip_word_letter
logical; perform no action for words that contain letters, excluding hiragana, katakana, or ideographic characters
skip_word_kana
logical; perform no action for words containing kana characters
skip_word_ideo
logical; perform no action for words containing ideographic characters
skip_line_soft
logical; perform no action for soft line breaks, i.e. positions at which a line break is acceptable but not required
skip_line_hard
logical; perform no action for hard, or mandatory line breaks
skip_sentence_term
logical; perform no action for sentences ending with a sentence terminator (".", "?", "!", etc.), possibly followed by a hard separator (CR, LF, PS, etc.)
skip_sentence_sep
logical; perform no action for sentences that do not contain an ending sentence terminator, but are ended by a hard separator or end of input
...
any other arguments to this function are purposely ignored
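
As a rough illustration of how the skip_* arguments combine, the sketch below counts word-type text segments twice: once skipping only the "none" category, and once also skipping number-like tokens. The sample string and variable names are made up for this example; stri_count_boundaries is one of the related functions listed under See Also.

```r
library(stringi)

x <- "10 apples and 20 pears"

# Skip only the "none" category (spaces, punctuation); numbers are still counted.
opts_words <- stri_opts_brkiter(type = "word", skip_word_none = TRUE)
n_words <- stri_count_boundaries(x, opts_brkiter = opts_words)

# Additionally skip tokens that appear to be numbers.
opts_no_numbers <- stri_opts_brkiter(type = "word",
  skip_word_none = TRUE, skip_word_number = TRUE)
n_no_numbers <- stri_count_boundaries(x, opts_brkiter = opts_no_numbers)

print(c(words = n_words, without_numbers = n_no_numbers))
```

The second count should be smaller, since "10" and "20" fall into the number category and are no longer reported.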

Value

Returns a named list object. Omitted skip_* values act as if they had been set to FALSE.
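
A minimal sketch for inspecting the returned object, assuming only the two settings shown are supplied. The variable name opts is arbitrary.

```r
library(stringi)

# Build a settings list for word-level boundary analysis.
opts <- stri_opts_brkiter(type = "word", skip_word_none = TRUE)

# Inspect the structure: a plain named list holding the supplied settings.
str(opts)
```

The result is an ordinary R list, so it can be stored and reused across several calls through their opts_brkiter argument.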

Details

The skip_* family of settings may be used to suppress special actions on particular types of text boundaries, e.g. when calling the stri_locate_all_boundaries and stri_split_boundaries functions.
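
The effect of a skip_* setting can be sketched with stri_split_boundaries. The input string below is an arbitrary example, and the settings are passed explicitly through opts_brkiter.

```r
library(stringi)

x <- "The quick brown fox"

# Default word iteration: whitespace runs are returned as separate pieces.
all_tokens <- stri_split_boundaries(x,
  opts_brkiter = stri_opts_brkiter(type = "word"))

# With skip_word_none = TRUE, pieces in the "none" category are dropped.
words_only <- stri_split_boundaries(x,
  opts_brkiter = stri_opts_brkiter(type = "word", skip_word_none = TRUE))

print(all_tokens[[1]])
print(words_only[[1]])
```

Comparing the two printed vectors shows which pieces the iterator classified as belonging to no other category.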

References

ubrk.h File Reference -- ICU4C API Documentation, http://icu-project.org/apiref/icu4c/ubrk_8h.html

Boundary Analysis -- ICU User Guide, http://userguide.icu-project.org/boundaryanalysis

See Also

Other text_boundaries: stri_count_boundaries, stri_extract_all_boundaries, stri_locate_all_boundaries, stri_split_boundaries, stri_split_lines, stri_trans_tolower, stri_wrap, stringi-search-boundaries, stringi-search