Filters are arguments of the form field logical value that are used
to narrow down the records returned by a specific query.
For example, it is common for users to request records from a particular year
(year == 2020), or to return all records except for fossils
(basisOfRecord != "FossilSpecimen").
The result of galah_filter can be passed to the filter
argument in atlas_occurrences(), atlas_species() or
atlas_counts(). galah_filter uses non-standard evaluation (NSE),
and is designed to be as compatible as possible with dplyr::filter
syntax.
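As a rough illustration of what non-standard evaluation means here, the base R sketch below captures filter expressions without evaluating them and splits each one into field, logical and value. The helper capture_filters() is hypothetical and written only for this example; it is not how galah_filter is implemented.

```r
# Hypothetical helper (not part of galah): capture each argument as an
# unevaluated expression and split it into field, logical and value.
capture_filters <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]
  do.call(rbind, lapply(exprs, function(e) {
    data.frame(
      field = deparse(e[[2]]),    # left-hand side, e.g. year
      logical = deparse(e[[1]]),  # operator, e.g. ==
      value = deparse(e[[3]]),    # right-hand side, e.g. 2020
      stringsAsFactors = FALSE
    )
  }))
}

# `year` and `basisOfRecord` do not exist as objects; NSE means they
# are read as field names rather than looked up and evaluated.
parsed <- capture_filters(year == 2020, basisOfRecord != "FossilSpecimen")
print(parsed)
```

Because the expressions are captured rather than evaluated, the call works even though no object named year is defined in the session.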
galah_filter(..., profile = NULL)
...: filters, in the form field logical value
profile: string: (optional) a data quality profile to apply to the
records. See show_all_profiles() for valid profiles. By default
no profile is applied.
An object of class data.frame and galah_filter,
containing filter values.
Create a custom filter for records of interest
filters <- galah_filter(
    basisOfRecord == "HumanObservation",
    year >= 2010,
    stateProvince == "New South Wales")
Add the default ALA data quality profile
filters <- galah_filter(
    basisOfRecord == "HumanObservation",
    year >= 2020,
    stateProvince == "New South Wales",
    profile = "ALA")
Use filters to exclude particular values
filter <- galah_filter(year >= 2010 & year != 2021)
atlas_counts(filter = filter)
#> # A tibble: 1 x 1
#>      count
#>      <int>
#> 1 43916661
Separating statements with a comma is equivalent to an AND statement
galah_filter(year >= 2010 & year < 2020)
# is the same as:
galah_filter(year >= 2010, year < 2020)
All statements must include the field name
galah_filter(year == 2010 | year == 2021) # this works (note double equals)
galah_filter(year == c(2010, 2021)) # same as above
galah_filter(year == 2010 | 2021) # this fails
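One way to see why every statement needs its own field name is to look at how R parses the expression. The sketch below uses only base R's quote() and does not call galah; it shows what R hands over on each side of the OR, and is not a description of galah's internal checks.

```r
# Inspect how R parses the two statements (nothing is evaluated).
ok  <- quote(year == 2010 | year == 2021)
bad <- quote(year == 2010 | 2021)

# Both are calls to `|`; look at the right-hand operand of each.
rhs_ok  <- ok[[3]]   # the call year == 2021: field, logical and value
rhs_bad <- bad[[3]]  # the bare number 2021: no field to filter on

print(rhs_ok)
print(rhs_bad)
```

In the second case the right-hand side of the OR is just a number, so there is no field logical value triple to turn into a query.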
It is possible to use an object to specify required values
# Numeric example
year_value <- 2010
galah_call() %>%
    galah_filter(year > year_value) %>%
    atlas_counts()
#> # A tibble: 1 x 1
#>      count
#>      <int>
#> 1 42816943
# Categorical example
basis_of_record <- c("HumanObservation", "MaterialSample")
galah_call() %>%
    galah_filter(basisOfRecord == basis_of_record) %>%
    atlas_counts()
#> # A tibble: 1 x 1
#>      count
#>      <int>
#> 1 82809464
Solr supports range queries on text as well as numbers. The following
queries all Australian States and Territories that sort alphabetically at or after "Tasmania"
galah_call() %>%
    galah_filter(cl22 >= "Tasmania") %>%
    atlas_counts()
#> # A tibble: 1 x 1
#>      count
#>      <int>
#> 1 30230213
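To get a feel for which values a text range such as >= "Tasmania" selects, the comparison can be tried locally on a character vector. This is only a sketch: the list of state and territory names is an assumption about the values stored in cl22, and R compares strings using the session's locale, which may not match Solr's ordering in every case.

```r
# Assumed values for the states and territories field (not fetched
# from the atlas).
states <- c(
  "Australian Capital Territory", "New South Wales",
  "Northern Territory", "Queensland", "South Australia",
  "Tasmania", "Victoria", "Western Australia"
)

# Keep names that sort at or after "Tasmania"; >= is inclusive.
selected <- states[states >= "Tasmania"]
print(selected)
```

Note that "Tasmania" itself is included, because the comparison is greater than or equal to.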
All statements passed to galah_filter() (except the profile
argument) take the form of field - logical - value. Permissible examples include:
= or == (e.g. year = 2020)
!= (e.g. year != 2020)
> or >= (e.g. year >= 2020)
< or <= (e.g. year <= 2020)
OR statements (e.g. year == 2018 | year == 2020)
AND statements (e.g. year >= 2000 & year <= 2020)
In some cases R will fail to parse inputs with a single equals sign
(=), particularly where statements are separated by & or
|. This problem can be avoided by using a double-equals (==) instead.
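The parsing problem can be reproduced without contacting the atlas by asking R to parse the call as text. In this sketch parses() is a hypothetical helper defined only for the example; parse() reads the code without running it, so galah does not need to be loaded.

```r
# Hypothetical helper: TRUE if R can parse the code, FALSE otherwise.
parses <- function(code) {
  !inherits(try(parse(text = code), silent = TRUE), "try-error")
}

# A single equals sign after | is rejected by the R parser, because
# the first `year =` is read as a named argument.
single <- parses("galah_filter(year = 2018 | year = 2020)")

# The double-equals form is ordinary comparison syntax.
double <- parses("galah_filter(year == 2018 | year == 2020)")

print(c(single = single, double = double))
```

The failure happens before galah_filter is ever called, which is why it cannot be worked around inside the function and == is the safer choice.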
See search_taxa() and galah_geolocate() for other ways to restrict
the information returned by atlas_occurrences() and related functions. Use
search_fields() to find fields that
you can filter by, and search_field_values() to find which values
are available for those fields.