Resulting names are unique and consist only of the _
character, numbers, and letters.
Capitalization preferences can be specified using the case
parameter.
Accented characters are transliterated to ASCII. For example, an "o" with a German umlaut over it becomes "o", and the Spanish character "enye" becomes "n".
This function takes and returns a data.frame, for ease of piping with `%>%`. For the underlying function that works on a character vector of names, see make_clean_names.
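As a minimal sketch of the behavior described above, the snippet below cleans a small data.frame whose column names contain spaces, an accented character, and a duplicate. The object names `messy_df` and `clean_df` and the column names are hypothetical and chosen only for illustration.

```r
library(janitor)

# Hypothetical data.frame with messy names (illustrative only).
messy_df <- as.data.frame(matrix(1:8, ncol = 4))
names(messy_df) <- c("First Name", "a\u00f1o", "REPEAT VALUE", "REPEAT VALUE")

clean_df <- clean_names(messy_df)
names(clean_df)

# The documented guarantees can be checked directly:
all(grepl("^[A-Za-z0-9_]+$", names(clean_df)))  # only letters, numbers, and _
anyDuplicated(names(clean_df)) == 0             # names are unique
```

The data itself is untouched; only the names change, so the result can be passed straight on to the next step of a pipeline.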
clean_names(dat, ...)
# S3 method for data.frame
clean_names(dat, ...)
# S3 method for default
clean_names(dat, ...)
# S3 method for sf
clean_names(dat, ...)
# S3 method for tbl_graph
clean_names(dat, ...)
dat
The input data.frame.
...
Arguments passed on to make_clean_names
case
The desired target case (default is "snake") will be passed to snakecase::to_any_case(), with the exception of "old_janitor", which exists only to support legacy code (it preserves the behavior of clean_names() prior to the addition of the "case" argument, i.e. janitor versions <= 0.3.1). "old_janitor" is not intended for new code. See to_any_case for a wide variety of supported cases, including "sentence" and "title" case.
replace
A named character vector where the name is replaced by the value.
ascii
Convert the names to ASCII (TRUE, default) or not (FALSE).
use_make_names
Should make.names() be applied to ensure that the output is usable as a name without quoting? (Avoiding make.names() ensures that the output is locale-independent, but quoting may be required.)
sep_in
Short for separator input. If a character, it is interpreted as a regular expression (wrapped internally into stringr::regex()). The default value is a regular expression that matches any sequence of non-alphanumeric values. All matches will be replaced by underscores (in addition to "_" and " ", for which this is always true, even if NULL is supplied). These underscores are used internally to split the strings into substrings and specify the word boundaries.
transliterations
A character vector (if not NULL). The entries of this argument need to be elements of stringi::stri_trans_list() (like "Latin-ASCII", which is often useful) or names of lookup tables (currently only "german" is supported). The letters of the input string are transliterated in the order of the entries, either via stringi::stri_trans_general() or by replacement with the matches of the lookup table. When named character elements are supplied as part of `transliterations`, anything that matches the names is replaced by the corresponding value.
Use this feature with care when case = "parsed", case = "internal_parsing" or case = "none", since for upper case letters that have transliterations/replacements of length 2, the second letter will be transliterated to lowercase (for example Oe, Ae, Ss), which might not always be what is intended. In this case you can make use of the option to supply named elements and specify the transliterations yourself.
parsing_option
An integer that determines the parsing option.
1: "RRRStudio" -> "RRR_Studio"
2: "RRRStudio" -> "RRRS_tudio"
3: "RRRStudio" -> "RRRSStudio". This will become, for example, "Rrrstudio" when we convert to lower camel case.
-1, -2, -3: These parsing options suppress the conversion after non-alphanumeric values.
0: no parsing
numerals
A character specifying the alignment of numerals ("middle", "left", "right", "asis" or "tight"). For example, numerals = "left" ensures that no output separator is in front of a digit.
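The arguments listed above are forwarded through `...` to make_clean_names. The sketch below shows how `case` and `replace` might be supplied from a clean_names() call; the data.frame `df` and its column names are hypothetical, and the variable names holding the results are introduced here only for illustration.

```r
library(janitor)

# Hypothetical data.frame (illustrative only).
df <- data.frame(`first name` = 1, `Total Cost` = 2, check.names = FALSE)

# Default target case is "snake".
snake_names <- names(clean_names(df))

# Other cases are handed on to snakecase::to_any_case().
camel_names     <- names(clean_names(df, case = "small_camel"))
screaming_names <- names(clean_names(df, case = "screaming_snake"))

# replace: a named character vector; each name is replaced by its value
# before the names are cleaned.
replaced_names <- names(clean_names(df, replace = c("Cost" = "price")))

snake_names
camel_names
screaming_names
replaced_names
```

Naming the arguments explicitly (case = ..., replace = ...) keeps the call readable and avoids relying on positional matching through `...`.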
Returns the data.frame with clean names.
clean_names() is intended to be used on data.frames and data.frame-like objects. For this reason there are methods to support using clean_names() on sf and tbl_graph (from tidygraph) objects. For cleaning named lists and vectors, consider using make_clean_names().
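For objects that are not data.frames, one possible approach is to clean the names yourself with make_clean_names() and assign them back. The named list `settings` below is hypothetical and serves only as an illustration.

```r
library(janitor)

# Hypothetical named list (illustrative only).
settings <- list(`Max Value` = 10, `min-value` = 1)

# make_clean_names() works on a character vector of names,
# so the cleaned names can be assigned back to the list.
names(settings) <- make_clean_names(names(settings))
names(settings)
```

The same pattern applies to named vectors, since names() and names<-() behave the same way there.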
# not run:
# clean_names(poorly_named_df)

# or pipe in the input data.frame:
# poorly_named_df %>% clean_names()

# if you prefer camelCase variable names:
# poorly_named_df %>% clean_names(case = "small_camel")

# not run:
# library(readxl)
# read_excel("messy_excel_file.xlsx") %>% clean_names()