base (version 3.5.1)

Sys.glob: Wildcard Expansion on File Paths

Description

Function to do wildcard expansion (also known as ‘globbing’) on file paths.

Usage

Sys.glob(paths, dirmark = FALSE)

Arguments

paths

character vector of patterns for relative or absolute filepaths. Missing values will be ignored.

dirmark

logical: should matches to directories from patterns that do not already end in / or \ have a slash appended? May not be supported on all platforms.
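As a minimal sketch of the two arguments, the following builds a small scratch directory and compares the results with and without dirmark. The directory and file names are hypothetical, chosen only for illustration, and whether a slash is actually appended depends on platform support as noted above.

```r
# Scratch directory under tempdir(); all names here are hypothetical.
d <- file.path(tempdir(), "glob-dirmark")
dir.create(file.path(d, "sub"), recursive = TRUE, showWarnings = FALSE)
invisible(file.create(file.path(d, "notes.txt")))

pattern <- file.path(d, "*")
plain   <- Sys.glob(pattern)                  # directory entries returned as-is
marked  <- Sys.glob(pattern, dirmark = TRUE)  # directories may gain a trailing slash
with_na <- Sys.glob(c(NA, pattern))           # the missing value is skipped

print(basename(plain))
print(marked)
print(identical(with_na, plain))
```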

Value

A character vector of matched file paths. Matches are returned in the order of the elements of paths; within the matches for a single element the order is system-specific. It is normally collated in either the current locale or in byte (ASCII) order; however, on Windows collation is in the order of Unicode points.

Directory errors are normally ignored, so the matches are to accessible file paths (but not necessarily accessible files).
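The ordering rule can be illustrated with two patterns passed in a single call. This is only a sketch with hypothetical file names: the matches for the first pattern come first, while the order within one pattern's matches remains system-specific.

```r
# Scratch directory under tempdir(); all names here are hypothetical.
d <- file.path(tempdir(), "glob-order")
dir.create(d, showWarnings = FALSE)
invisible(file.create(file.path(d, c("b.txt", "a.txt", "z.csv"))))

# The .csv pattern is listed first, so its match precedes the .txt matches,
# even though "z.csv" would sort after both .txt names.
res <- Sys.glob(file.path(d, c("*.csv", "*.txt")))
print(basename(res))
```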

Details

This expands wildcards in file paths. For precise details, see your system's documentation on the glob system call. There is a POSIX 1003.2 standard (see http://pubs.opengroup.org/onlinepubs/9699919799/functions/glob.html) but some OSes will go beyond this. The R implementation will always do tilde expansion.
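A quick way to see tilde expansion is to glob a bare tilde and compare it with path.expand. This is a sketch only; the result depends on the home directory of the user running it, and the glob returns an empty vector if that directory does not exist.

```r
home_glob     <- Sys.glob("~")     # tilde is expanded before globbing
home_expanded <- path.expand("~")  # the expansion on its own
print(home_glob)
print(home_expanded)
```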

All systems should interpret * (match zero or more characters), ? (match a single character) and (probably) [ (begin a character class or range). The handling of paths ending with a separator is system-dependent. On a POSIX-2008 compliant OS they will match directories (only), but as they are not valid filepaths on Windows, they match nothing there. (Earlier POSIX standards allowed them to match files.)
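The three basic wildcards can be compared side by side on a handful of files. The file names below are hypothetical and exist only to make the differences between the patterns visible.

```r
# Scratch directory under tempdir(); all names here are hypothetical.
d <- file.path(tempdir(), "glob-basic")
dir.create(d, showWarnings = FALSE)
invisible(file.create(file.path(d, c("run1.log", "run2.log", "run10.log", "runA.log"))))

star  <- Sys.glob(file.path(d, "run*.log"))     # * : zero or more characters
quest <- Sys.glob(file.path(d, "run?.log"))     # ? : exactly one character, so not run10.log
cls   <- Sys.glob(file.path(d, "run[12].log"))  # [ : one character from the listed set

print(basename(star))
print(basename(quest))
print(basename(cls))
```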

The rest of these details are indicative (and based on the POSIX standard).

If a filename starts with . this may need to be matched explicitly: for example Sys.glob("*.RData") may or may not match .RData but will not usually match .aa.RData. Note that this is platform-dependent: e.g. on Solaris Sys.glob("*.*") matches both . and .. (the current and parent directory entries).
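To check how a given platform treats leading dots, one can create a few dot-files and compare a pattern that starts with * against one that starts with an explicit dot. The file names are hypothetical. The first result is deliberately left without an expected value, since it is platform-dependent.

```r
# Scratch directory under tempdir(); all names here are hypothetical.
d <- file.path(tempdir(), "glob-dot")
dir.create(d, showWarnings = FALSE)
invisible(file.create(file.path(d, c(".RData", ".aa.RData", "session.RData"))))

implicit <- Sys.glob(file.path(d, "*.RData"))  # dot-files may or may not appear here
explicit <- Sys.glob(file.path(d, ".*RData"))  # leading dot matched explicitly

print(basename(implicit))
print(basename(explicit))
```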

[ begins a character class. If the first character in [...] is not !, this is a character class which matches a single character against any of the characters specified. The class cannot be empty, so ] can be included provided it is first. If the first character is !, the character class matches a single character which is none of the specified characters. Whether . in a character class matches a leading . in the filename is OS-dependent.
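The character class rules above can be tried on a small set of files, including one whose name contains a literal ]. This is a sketch with hypothetical file names, showing a plain class, a negated class, and a class with ] placed first.

```r
# Scratch directory under tempdir(); all names here are hypothetical.
d <- file.path(tempdir(), "glob-class")
dir.create(d, showWarnings = FALSE)
invisible(file.create(file.path(d, c("fa.txt", "fb.txt", "fc.txt", "f].txt"))))

listed  <- Sys.glob(file.path(d, "f[ab].txt"))   # one of a, b
negated <- Sys.glob(file.path(d, "f[!ab].txt"))  # any single character except a, b
bracket <- Sys.glob(file.path(d, "f[]a].txt"))   # ] placed first is an ordinary member

print(basename(listed))
print(basename(negated))
print(basename(bracket))
```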

Character classes can include ranges such as [A-Z]: include - as a character by having it first or last in a class. (The interpretation of ranges should be locale-specific, so the example is not a good idea in an Estonian locale.)
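A digit range avoids most of the locale concerns mentioned above, so it makes a safer illustration than [A-Z]. The sketch below uses hypothetical file names, and also shows - placed first in the class so that it is treated as an ordinary character.

```r
# Scratch directory under tempdir(); all names here are hypothetical.
d <- file.path(tempdir(), "glob-range")
dir.create(d, showWarnings = FALSE)
invisible(file.create(file.path(d, c("v1.txt", "v5.txt", "v-.txt", "vx.txt"))))

digits      <- Sys.glob(file.path(d, "v[0-9].txt"))   # range of digits
with_hyphen <- Sys.glob(file.path(d, "v[-0-9].txt"))  # leading - is a literal hyphen

print(basename(digits))
print(basename(with_hyphen))
```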

One can remove the special meaning of ?, * and [ by preceding them by a backslash (except within a character class).

The glob system call is not part of Windows, and we supply a partial emulation there. The remaining details describe that emulation.

In the Windows emulation, wildcards are * (match zero or more characters) and ? (match a single character). If a filename starts with . it must be matched explicitly on Windows (as noted above, this is platform-dependent).

Character classes beginning with [ follow the rules given above, including negation with !, placing ] first to include it, and ranges such as [A-Z] with - first or last to include it as a character. Whether . in a character class matches a leading . in the filename is OS-dependent. In the current implementation ranges are in the numeric order of Unicode code points.

One can remove the special meaning of ?, * and [ by preceding them by a backslash (except within a character class). Note that on Windows ? and * are not valid in file names, so this is mainly for consistency with other platforms.
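Since ? and * cannot appear in Windows file names, a name containing a literal [ is a convenient way to try backslash escaping. The sketch below uses hypothetical file names and assumes forward-slash separators (obtained here with normalizePath) so that the only backslash in the pattern is the escape. Note that the backslash must be doubled inside an R string.

```r
# Scratch directory with forward-slash separators; all names here are hypothetical.
d <- file.path(normalizePath(tempdir(), winslash = "/"), "glob-escape")
dir.create(d, showWarnings = FALSE)
invisible(file.create(file.path(d, c("a[1].txt", "a1.txt"))))

as_class   <- Sys.glob(file.path(d, "a[1].txt"))    # [1] is a class, so this finds a1.txt
as_literal <- Sys.glob(file.path(d, "a\\[1].txt"))  # escaped [ is taken literally

print(basename(as_class))
print(basename(as_literal))
```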

File paths in Windows are interpreted with separator \ or /. Paths with a drive but relative (such as c:foo\bar) are tricky, but an attempt is made to handle them correctly. An attempt is made to handle UNC paths starting with a double backslash. UTF-8-encoded paths not valid in the current locale can be used.

See Also

path.expand.

Quotes for handling backslashes in character strings.

Examples

# NOT RUN {
# List the .rdx files in the R directory of each package under R.home()/library
Sys.glob(file.path(R.home(), "library", "*", "R", "*.rdx"))
# }