'filters' are arguments of the form field logical value that are used to narrow down the number of records returned by a specific query. For example, it is common for users to request records from a particular year (year == 2020), or to return all records except for fossils (basisOfRecord != "FossilSpecimen").
The result of galah_filter can be passed to the filters argument in atlas_occurrences(), atlas_species() or atlas_counts(). galah_filter uses non-standard evaluation (NSE), and is designed to be as compatible as possible with dplyr::filter syntax.
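As a rough illustration of what non-standard evaluation means here, the sketch below uses only base R to capture filter expressions without evaluating them. The helper capture_filters() is hypothetical and is not part of galah; the package's own implementation may work differently.

```r
# Hypothetical helper (not part of galah): capture the arguments passed
# via ... as unevaluated expressions and return them as strings.
capture_filters <- function(...) {
  # substitute() returns the call list(...) with the arguments unevaluated;
  # dropping the first element removes the function name `list`.
  exprs <- as.list(substitute(list(...)))[-1]
  unname(vapply(
    exprs,
    function(e) paste(deparse(e), collapse = " "),
    character(1)
  ))
}

# `year` and `basisOfRecord` are never looked up as R objects here
captured <- capture_filters(year >= 2010, basisOfRecord == "HumanObservation")
print(captured)
```

Because the expressions are captured rather than evaluated, field names such as year do not need to exist as objects in the calling environment.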
galah_filter(..., profile = NULL)
...: filters, in the form field logical value
profile: string (optional): a data quality profile to apply to the records. See show_all_profiles() for valid profiles. By default no profile is applied.
An object of class data.frame and galah_filter, containing filter values.
Create a custom filter for records of interest
filters <- galah_filter(
  basisOfRecord == "HumanObservation",
  year >= 2010,
  stateProvince == "New South Wales")
Add the default ALA data quality profile
filters <- galah_filter(
  basisOfRecord == "HumanObservation",
  year >= 2020,
  stateProvince == "New South Wales",
  profile = "ALA")
Use filters to exclude particular values
filter <- galah_filter(year >= 2010 & year != 2021)
atlas_counts(filter = filter)
#> # A tibble: 1 x 1
#>      count
#>      <int>
#> 1 43916661
Separating statements with a comma is equivalent to an AND statement
galah_filter(year >= 2010 & year < 2020)
# is the same as:
galah_filter(year >= 2010, year < 2020)
All statements must include the field name
galah_filter(year == 2010 | year == 2021) # this works (note double equals)
galah_filter(year == c(2010, 2021)) # same as above
galah_filter(year == 2010 | 2021) # this fails
It is possible to use an object to specify required values
# Numeric example
year_value <- 2010
galah_call() %>%
  galah_filter(year > year_value) %>%
  atlas_counts()
#> # A tibble: 1 x 1
#>      count
#>      <int>
#> 1 42816943
# Categorical example
basis_of_record <- c("HumanObservation", "MaterialSample")
galah_call() %>%
  galah_filter(basisOfRecord == basis_of_record) %>%
  atlas_counts()
#> # A tibble: 1 x 1
#>      count
#>      <int>
#> 1 82809464
Solr supports range queries on text as well as numbers. The following query counts records from all Australian States and Territories that sort alphabetically at or after "Tasmania"
galah_call() %>%
  galah_filter(cl22 >= "Tasmania") %>%
  atlas_counts()
#> # A tibble: 1 x 1
#>      count
#>      <int>
#> 1 30230213
All statements passed to galah_filter() (except the profile argument) take the form of field - logical - value. Permissible examples include:
= or == (e.g. year = 2020)
!= (e.g. year != 2020)
> or >= (e.g. year >= 2020)
< or <= (e.g. year <= 2020)
OR statements (e.g. year == 2018 | year == 2020)
AND statements (e.g. year >= 2000 & year <= 2020)
In some cases R will fail to parse inputs with a single equals sign (=), particularly where statements are separated by & or |. This problem can be avoided by using a double-equals (==) instead.
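The parsing problem can be checked without calling galah at all, because it arises in R's parser before any function runs. The sketch below is a minimal base R check under that assumption; parses() is a hypothetical helper written for this illustration, and the code strings are only parsed, never evaluated.

```r
# Hypothetical helper (not part of galah): TRUE if `code` is syntactically
# valid R, FALSE if the parser rejects it. Nothing is evaluated.
parses <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

# Single equals combined with | is rejected by the parser, since a second
# `=` inside one function argument is not valid R syntax
single_equals <- parses("galah_filter(year = 2010 | year = 2021)")

# Double equals is an ordinary comparison operator, so this parses
double_equals <- parses("galah_filter(year == 2010 | year == 2021)")

print(c(single = single_equals, double = double_equals))
```

Since the first form is rejected before galah_filter() is ever called, the function has no opportunity to reinterpret the single equals sign, which is why == is the safer choice in compound statements.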
See search_taxa() and galah_geolocate() for other ways to restrict the information returned by atlas_occurrences() and related functions. Use search_fields() to find fields that you can filter by, and search_field_values() to find what values of those filters are available.