
sos (version 2.1-8)

findFn: Search Help Pages

Description

Returns a data.frame from RSiteSearch(string, "function") that can be sorted and subsetted by user specifications and viewed in an HTML table. By default, packages with the most matches (Count) come first. Ties are broken by the highest match score in the package (MaxScore), then by the sum of the match scores for all hits in the package (TotalScore), and so on through the remaining sortby columns.

Usage

findFn(string, maxPages = 100, sortby = NULL, 
        verbose = 1, ...)

Value

An object of class c('findFn', 'data.frame') with columns and attributes as follows:

Columns

  • Count Total number of matches downloaded in this package

  • MaxScore maximum of the Score over all help pages selected within each Package. See Score below or the Namazu website (link below) for more information on how the score is determined.

  • TotalScore sum of the Score over all help pages selected within each Package. See Score below or the Namazu website (link below) for more information on how the score is determined.

  • Package Name of the package containing a help page meeting the search criteria

  • Function Name of the help page found that meets the indicated search criterion.

  • Date Date of the help page

  • Score Score returned by RSiteSearch, discussed in the Namazu website (link below).

  • Description Title of the help page

  • Link Uniform Resource Locator (URL) for the help page

Attributes

  • matches an integer = total number of matches found by the search. This typically will exceed the number of rows found, because the search algorithm sometimes finds things that are not help pages for packages.

  • PackageSummary a data.frame with one row for each package and columns Package, Count, MaxScore, TotalScore, and Date, sorted as specified by the sortby argument.

  • string the string argument in the call.

  • call the matched call
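The attributes listed above are ordinary R attributes, so they can be read with attr(). The sketch below uses a small mock object rather than a live search result; the object z_mock and every value in it are invented for illustration, and it is kept as a plain data.frame so that it runs without sos or a network connection.

```r
# Mock stand-in for a findFn result; all values are invented for illustration.
# Deliberately a plain data.frame, so printing does not depend on sos.
z_mock <- structure(
  data.frame(
    Package  = c("mgcv", "fda"),
    Function = c("gam", "fd"),
    Score    = c(7, 2),
    stringsAsFactors = FALSE
  ),
  matches = 5L,      # total matches reported by the (imaginary) search
  string  = "spline" # the search string
)

attr(z_mock, "matches")                  # total matches found
attr(z_mock, "string")                   # the search string
attr(z_mock, "matches") >= nrow(z_mock)  # matches may exceed the rows kept
```

With a real result z returned by findFn, the same attr(z, "matches") pattern should apply.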

Arguments

string

A character string. See RSiteSearch.

maxPages

The maximum number of pages to download assuming 20 links per page.
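As a rough illustration of what maxPages implies, the arithmetic below converts pages to links at 20 links per page. The helper pages_needed is hypothetical, and the rule that the number of pages fetched is the smaller of maxPages and the pages needed to cover all matches is an assumption, not something taken from the sos source.

```r
links_per_page <- 20L  # stated in the argument description
maxPages       <- 100L # default from Usage

# Upper bound on the number of links downloaded with the default maxPages
max_links <- maxPages * links_per_page

# Hypothetical helper (an assumption about the behaviour, not sos code):
# fetch enough pages to cover all matches, capped at maxPages.
pages_needed <- function(n_matches, maxPages = 100L, per_page = 20L) {
  min(maxPages, ceiling(n_matches / per_page))
}

max_links           # 2000
pages_needed(45)    # 3
pages_needed(5000)  # 100
```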

sortby

a character vector specifying how the data.frame returned should be sorted. Default = c('Count', 'MaxScore', 'TotalScore', 'Package', 'Score', 'Function') to sort descending on numerics and ascending on alphanumerics. Specifying sortby = c('c', 't', 'm') is equivalent to c('Count', 'TotalScore', 'MaxScore', 'Package', 'Score', 'Function').
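One way to picture the abbreviation rule is case-insensitive partial matching against the default vector, with the unmentioned defaults appended in their original order. The helper expand_sortby below is hypothetical and only a sketch of that idea; it is not the code findFn uses.

```r
default_sortby <- c("Count", "MaxScore", "TotalScore",
                    "Package", "Score", "Function")

# Hypothetical helper, not part of sos: expand abbreviated sortby entries.
expand_sortby <- function(sortby = NULL, default = default_sortby) {
  if (is.null(sortby)) return(default)
  # Case-insensitive partial match of each entry against the default names
  idx <- pmatch(tolower(sortby), tolower(default))
  if (anyNA(idx)) stop("unrecognised or ambiguous sortby entry")
  # Requested columns first, then the remaining defaults in original order
  c(default[idx], default[-idx])
}

expand_sortby(c("c", "t", "m"))
# "Count" "TotalScore" "MaxScore" "Package" "Score" "Function"
```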

verbose

an integer: if 0, no output is printed to the console. The default 1 displays an initial line with the number of pages to be retrieved and the number of matches obtained; if the number of matches to be downloaded is less, this also is displayed on the initial line. This is followed by a second line counting the pages downloaded.

If greater than 1, additional information is provided on the download process.

...

ignored

Author

Spencer Graves, Sundar Dorai-Raj, Romain Francois. Duncan Murdoch suggested the "???" alias for findFn and contributed the code for it.

Special thanks to Gennadiy Starostin, Vienna University of Economics and Business (Wirtschaftsuniversitaet Wien), who in early 2021 took over maintenance of the RSiteSearch data base, updated its structure, and rewrote findFn to match.

Special thanks to Jonathan Baron and Andy Liaw. Baron maintained the RSiteSearch data base for many years. Liaw and Baron created the RSiteSearch function in the utils package. Thanks also to Patrice Kiener of InModelia in Paris, France, who helped me fix some syntax problems stemming from changes in how an itemized list is described in a *.Rd file.

Details

findFn searches the help pages of packages covered by the RSiteSearch archives. To restrict the search to only packages installed locally, use help.search.

1. Access the RSiteSearch engine with string, restricting to "functions", storing Score, Package, Function, Date, Description, and Link in a data.frame.

2. Compute Count, MaxScore and TotalScore for each Package accessed. Combine them in a matrix PackageSummary.

3. Sort PackageSummary in the order defined by the occurrence of c('Count', 'MaxScore', 'TotalScore', 'Package') in sortby.

4. Merge PackageSummary with the data.frame of search matches.

5. Sort the combined data.frame as defined by sortby.

6. Make the result have class c("findFn", "data.frame") and add attributes matches, PackageSummary, string, and call.

7. Done.
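Steps 2 through 5 above can be sketched in base R on a small mock table of hits. This is only an illustration of the summarise, sort, merge, sort sequence under the default sortby; the data are invented, PackageSummary is built here as a data.frame for convenience, and none of this is the actual findFn source.

```r
# Mock search hits; the values are invented purely for illustration.
hits <- data.frame(
  Score    = c(10, 4, 7, 7, 2),
  Package  = c("splines2", "mgcv", "mgcv", "fda", "fda"),
  Function = c("bSpline", "gam", "smooth.construct",
               "create.bspline.basis", "fd"),
  stringsAsFactors = FALSE
)

# Step 2: Count, MaxScore and TotalScore for each package.
pkgs <- sort(unique(hits$Package))
PackageSummary <- data.frame(
  Package    = pkgs,
  Count      = as.integer(table(hits$Package)[pkgs]),
  MaxScore   = as.numeric(tapply(hits$Score, hits$Package, max)[pkgs]),
  TotalScore = as.numeric(tapply(hits$Score, hits$Package, sum)[pkgs]),
  stringsAsFactors = FALSE
)

# Step 3: sort descending on numerics, ascending on Package.
ord <- with(PackageSummary, order(-Count, -MaxScore, -TotalScore, Package))
PackageSummary <- PackageSummary[ord, ]

# Step 4: merge the package summary with the individual hits.
combined <- merge(PackageSummary, hits, by = "Package")

# Step 5: sort the combined table using the full default sortby order.
ord2 <- with(combined,
             order(-Count, -MaxScore, -TotalScore, Package, -Score, Function))
combined <- combined[ord2, ]
rownames(combined) <- NULL

PackageSummary$Package  # "mgcv" "fda" "splines2"
combined
```

In the mock data, mgcv and fda tie on Count and MaxScore, so TotalScore decides that mgcv comes first.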

References

http://www.namazu.org/doc/tips.html.en#weight - reference on determining Score

See Also

help.search, to search only installed packages; RSiteSearch; download.file.

findFn searches only "Target: Functions" from the RSiteSearch site, ignoring the R-help archives.

For alternative R search capabilities, see:

* "Searching R Packages" on Wikiversity

* Julia Silge, John C. Nash, and Spencer Graves (2018) Navigating the R Package Universe, The R Journal, 10(2) 558-563.

* https://search.r-project.org for a list of alternative R search capabilities, each of which may be best for different types of inquiries.

* findFunction for a completely different function with a similar name.

Examples

library(sos)  # provides findFn, the ??? alias, matches() and CRAN()

# Skip these tests on CRAN,
# because they take more than 5 seconds
if(!CRAN()){

z <- try(findFn("spline", maxPages = 2))
# alternative
zq <- try(???spline(2))

# Confirm z == zq except for 'call'
attr(z, 'call') <- NULL
attr(zq, 'call') <- NULL

if(!(inherits(z, "try-error") ||
     inherits(zq, "try-error"))){

stopifnot(
all.equal(z, zq)
)

# To search for 2 terms, not necessarily together:
RSS <- try(findFn('RSiteSearch function', 1))
matches(RSS)

# To search for an exact string, use braces:
RSS. <- try(findFn('{RSiteSearch function}', 1))
matches(RSS.) # list(nrow = 0, matches = 0)

# example in which resulting page has some unicode characters
Lambert <- try(findFn("Lambert"))
Lambert

# Example that "found  2  link(s) without dates" on 2021-06-26
webScr <- try(findFn('web scraping'))

# Example that "found 0 matches" on 2021-09-06
try(findFn('{open history map}'))
}
}
