Quotes: Quotes

Description

Descriptions of the various uses of quoting in R.

Arguments

Character constants

Single and double quotes delimit character constants. They can be used interchangeably but double quotes are preferred (and character constants are printed using double quotes), so single quotes are normally only used to delimit character constants containing double quotes. Backslash is used to start an escape sequence inside character constants. Escaping a character not in the following table is an error. Single quotes need to be escaped by backslash in single-quoted strings, and double quotes in double-quoted strings.

newline

carriage return

tab

backspace

alert (bell)

form feed

vertical tab

backslash \

ASCII apostrophe '

ASCII quotation mark "

ASCII grave accent (backtick) `

\nnn

character with given octal code (1, 2 or 3 digits)

\xnn

character with given hex code (1 or 2 hex digits)

\unnnn

Unicode character with given code (1--4 hex digits)

\Unnnnnnnn

Unicode character with given code (1--8 hex digits)

Alternative forms for the last two are \u{nnnn} and \U{nnnnnnnn}. All except the Unicode escape sequences are also supported when reading character strings by scan and read.table if allowEscapes = TRUE. Unicode escapes can be used to enter Unicode characters not in the current locale's charset (when the string will be stored internally in UTF-8). The parser does not allow the use of both octal/hex and Unicode escapes in a single string. These forms will also be used by print.default when outputting non-printable characters (including backslash). Embedded nuls are not allowed in character strings, so using escapes (such as \0) for a nul will result in the string being truncated at that point (usually with a warning).

Names and Identifiers

Identifiers consist of a sequence of letters, digits, the period (.) and the underscore. They must not start with a digit nor underscore, nor with a period followed by a digit. Reserved words are not valid identifiers. The definition of a letter depends on the current locale, but only ASCII digits are considered to be digits. Such identifiers are also known as syntactic names and may be used directly in R code. Almost always, other names can be used provided they are quoted. The preferred quote is the backtick (`), and deparse will normally use it, but under many circumstances single or double quotes can be used (as a character constant will often be converted to a name). One place where backticks may be essential is to delimit variable names in formulae: see formula.

Details

Three types of quotes are part of the syntax of R: single and double quotation marks and the backtick (or back quote, `). In addition, backslash is used to escape the following character inside character constants.

Examples

Run this code


'single quotes can be used more-or-less interchangeably'
"with double quotes to create character vectors"

## Single quotes inside single-quoted strings need backslash-escaping.
## Ditto double quotes inside double-quoted strings.
##
identical('"It\'s alive!", he screamed.',
          "\"It's alive!\", he screamed.") # same

## Backslashes need doubling, or they have a special meaning.
x <- "In ALGOL, you could do logical AND with /\\."
print(x)      # shows it as above ("input-like")
writeLines(x) # shows it as you like it ;-)

## Single backslashes followed by a letter are used to denote
## special characters like tab(ulator)s and newlines:
x <- "long\tlines can be\nbroken with newlines"
writeLines(x) # see also ?strwrap

## Backticks are used for non-standard variable names.
## (See make.names and ?Reserved for what counts as
## non-standard.)
`x y` <- 1:5
`x y`
d <- data.frame(`1st column` = rchisq(5, 2), check.names = FALSE)
d$`1st column`

## Backslashes followed by up to three numbers are interpreted as
## octal notation for ASCII characters.
"\110\145\154\154\157\40\127\157\162\154\144\41"

## \x followed by up to two numbers is interpreted as
## hexadecimal notation for ASCII characters.
(hw1 <- "\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x21")

## Mixing octal and hexadecimal in the same string is OK
(hw2 <- "\110\x65\154\x6c\157\x20\127\x6f\162\x6c\144\x21")

## \u is also hexadecimal, but supported up to 4 numbers,
## using Unicode specification.  In the previous example,
## you can simply replace \x with \u.
(hw3 <- "\u48\u65\u6c\u6c\u6f\u20\u57\u6f\u72\u6c\u64\u21")

## The last three are all identical to
hw <- "Hello World!"
stopifnot(identical(hw, hw1), identical(hw1, hw2), identical(hw2, hw3))

## Using Unicode makes more sense for non-latin characters.
(nn <- "\u0126\u0119\u1114\u022d\u2001\u03e2\u0954\u0f3f\u13d3\u147b\u203c")

## Mixing \x and \u throws a _parse_ error (which is not catchable!)
## Not run: 
#   "\x48\u65\x6c\u6c\x6f\u20\x57\u6f\x72\u6c\x64\u21"
# ## End(Not run)
##   -->   Error: mixing Unicode and octal/hex escapes .....

## \U works like \u, but supports up to eight numbers.
## So we can replace \u with \U in the previous example.
n2 <- "\U0126\U0119\U1114\U022d\U2001\U03e2\U0954\U0f3f\U13d3\U147b\U203c"
stopifnot(identical(nn, n2))

## Under systems supporting multi-byte locales (and not Windows),
## \U also supports the rarer characters outside the usual 16^4 range.
## See the R language manual,
## https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Literal-constants
## and bug 16098 https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16098
"\U1d4d7" # On Windows this gives the incorrect value of "\Ud4d7"

## nul characters (for terminating strings in C) are not allowed (parse errors)
## Not run: 
#   "foo\0bar"     # Error: nul character not allowed (line 1)
#   "foo\u0000bar" # same error
# ## End(Not run)

Run the code above in your browser using DataLab