nchar takes a character vector as an argument and returns a vector whose elements contain the sizes of the corresponding elements of x. nzchar is a fast way to find out if elements of a character vector are non-empty strings.
nchar(x, type = "chars", allowNA = FALSE, keepNA = FALSE)
nzchar(x, keepNA = FALSE)
x: a character vector, or a vector to be coerced to a character vector.

type: character string, partially matched to one of c("bytes", "chars", "width"). See Details.

allowNA: logical: should NA be returned for invalid multibyte strings or "bytes"-encoded strings (rather than throwing an error)?

keepNA: logical: should NA be returned wherever x is NA? If false (the implicit or explicit default for nzchar() and, for R versions <= 3.2.x, for nchar()), nchar() returns 2, as that is the number of printing characters used when strings are written to output, and nzchar() is TRUE. From R version 3.3.0 on, for nchar() only, the default will be NA, which means to use keepNA = TRUE unless type is "width". It used to be (implicitly) hard-coded to FALSE in earlier R versions.
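As a brief illustration of how keepNA treats missing strings, the sketch below passes keepNA explicitly instead of relying on a default, because the default for nchar() differs between R versions. It assumes an R version in which both nchar() and nzchar() accept keepNA; the variable name s is only for illustration.

```r
# A single missing string; NA_character_ is the character-typed NA.
s <- NA_character_

nchar(s, keepNA = FALSE)   # 2: the printed form "NA" has two characters
nchar(s, keepNA = TRUE)    # NA

# With keepNA = NA, the result depends on 'type':
nchar(s, type = "chars", keepNA = NA)   # NA
nchar(s, type = "width", keepNA = NA)   # 2

nzchar(s, keepNA = FALSE)  # TRUE
nzchar(s, keepNA = TRUE)   # NA
```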
For nchar, an integer vector giving the sizes of each element. For missing values (NA, i.e., NA_character_), nchar() returns NA_integer_ if keepNA is true, and 2, the number of printing characters, if false.

type = "width" gives (an approximation to) the number of columns used in printing each element in a terminal font, taking into account double-width, zero-width and composing characters.

If allowNA = TRUE and an element is detected as invalid in a multi-byte character set such as UTF-8, its number of characters and the width will be NA. Otherwise the number of characters will be non-negative, so !is.na(nchar(x, "chars", TRUE)) is a test of validity.

A character string marked with "bytes" encoding (see Encoding) has a number of bytes, but neither a known number of characters nor a width, so the latter two types are NA if allowNA = TRUE, otherwise an error.

Names, dims and dimnames are copied from the input.

For nzchar, a logical vector of the same length as x, true if and only if the element has non-zero length; if the element is NA, nzchar() is true when keepNA is false, as by default, and NA otherwise.
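The validity test and the "bytes" case can be tried out with a small sketch. The strings are built from raw bytes and marked explicitly with Encoding() so that the result should not depend on the session locale; the names bad, valid and b are illustrative only.

```r
# Build a two-byte string that is not valid UTF-8, then mark it as UTF-8
# so the check does not depend on the current locale.
bad <- rawToChar(as.raw(c(0xff, 0xfe)))
Encoding(bad) <- "UTF-8"
x <- c("abc", bad)

# TRUE where the element is a valid character string, FALSE otherwise.
valid <- !is.na(nchar(x, "chars", allowNA = TRUE))
valid               # TRUE FALSE

# The byte count is available even for the invalid element.
nchar(x, "bytes")   # 3 2

# A non-ASCII string marked as "bytes" has a byte count but no
# character count.
b <- rawToChar(as.raw(c(0xc3, 0xa9)))
Encoding(b) <- "bytes"
nchar(b, "bytes")                   # 2
nchar(b, "chars", allowNA = TRUE)   # NA
```

Without allowNA = TRUE, the calls with type "chars" on bad and b would signal an error instead of returning NA.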
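The size of a character string can be measured in one of three ways (corresponding to the type argument):

bytes: the number of bytes needed to store the string.
chars: the number of human-readable characters.
width: the number of columns cat will use to print the string in a monospaced font. The same as chars if this cannot be calculated.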
These will often be the same, and almost always will be in single-byte locales (but note how type may influence NA treatment for keepNA = NA). There will be differences between the first two with multibyte character sequences, e.g. in UTF-8 locales.
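A short sketch of how the three measures can differ for a multibyte character. It uses intToUtf8() to obtain a UTF-8 encoded string, and the choice of U+4E2D (a CJK ideograph) is only an example; the width result is left without a stated value because it depends on how the character is displayed.

```r
# intToUtf8() returns a UTF-8 encoded string.
# U+4E2D is a CJK ideograph: one character, three bytes in UTF-8.
s <- intToUtf8(0x4E2D)

nchar(s, type = "bytes")   # 3
nchar(s, type = "chars")   # 1
nchar(s, type = "width")   # display columns; wide CJK characters usually take two

# For plain ASCII text the byte and character counts agree.
nchar("abc", type = "bytes") == nchar("abc", type = "chars")   # TRUE
```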
The internal equivalent of the default method of as.character is performed on x (so there is no method dispatch). If you want to operate on non-vector objects, you will need to pass them through deparse first.
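A minimal sketch of the deparse route for a non-vector object, here an unevaluated call; the names e and txt are illustrative only.

```r
# A language object (an unevaluated call) is not a vector.
e <- quote(x + y)

# deparse() turns it into a character vector, which nchar() can measure.
txt <- deparse(e)
txt          # "x + y"
nchar(txt)   # 5
```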
strwidth giving width of strings for plotting; paste, substr, strsplit.
x <- c("asfef", "qwerty", "yuiop[", "b", "stuff.blah.yech")
nchar(x)
# 5 6 6 1 15
nchar(deparse(mean))
# 18 17 <-- unless mean differs from base::mean
x[3] <- NA; x
nchar(x, keepNA= TRUE) # 5 6 NA 1 15
nchar(x, keepNA=FALSE) # 5 6 2 1 15
stopifnot(identical(nchar(x, "w", keepNA = NA),
                    nchar(x, keepNA = FALSE)),
          identical(is.na(x), is.na(nchar(x, keepNA = NA))))