galah is an R interface to biodiversity data hosted by
the ‘living atlases’, a set of organisations that share a common
codebase and act as nodes of the Global Biodiversity Information
Facility (GBIF). These organisations
collate and store observations of individual life forms using the ‘Darwin Core’ data standard.
galah enables users to locate and download species
observations, taxonomic information, record counts, or associated media
such as images or sounds. Users can restrict their queries to particular
taxa or locations, specify which columns and rows a query returns,
or limit their results to observations that meet
particular quality-control criteria. With a few minor exceptions, all
functions return a tibble as their standard format.
To install from CRAN:
install.packages("galah")

Or install the development version from GitHub:
install.packages("remotes")
remotes::install_github("AtlasOfLivingAustralia/galah")

Load the package:
library(galah)

By default, galah downloads information from the Atlas
of Living Australia (ALA). To show the full list of Atlases currently
supported by galah, use show_all(atlases).
show_all(atlases)

## # A tibble: 11 × 4
##    region         institution                                                             acronym url                         
##    <chr>          <chr>                                                                   <chr>   <chr>                       
##  1 Australia      Atlas of Living Australia                                               ALA     https://www.ala.org.au      
##  2 Austria        Biodiversitäts-Atlas Österreich                                         BAO     https://biodiversityatlas.at
##  3 Brazil         Sistemas de Informações sobre a Biodiversidade Brasileira               SiBBr   https://sibbr.gov.br        
##  4 Estonia        eElurikkus                                                              <NA>    https://elurikkus.ee        
##  5 France         Portail français d'accès aux données d'observation sur les espèces      OpenObs https://openobs.mnhn.fr/    
##  6 Global         Global Biodiversity Information Facility                                GBIF    https://gbif.org            
##  7 Guatemala      Sistema Nacional de Información sobre Diversidad Biológica de Guatemala SNIBgt  https://snib.conap.gob.gt   
##  8 Portugal       GBIF Portugal                                                           GBIF.pt https://www.gbif.pt         
##  9 Spain          GBIF Spain                                                              GBIF.es https://www.gbif.es         
## 10 Sweden         Swedish Biodiversity Data Infrastructure                                SBDI    https://biodiversitydata.se 
## 11 United Kingdom National Biodiversity Network                                           NBN     https://nbn.org.uk

Use galah_config() to set the Atlas to use. This will
automatically populate the server configuration for your selected Atlas.
By default, the atlas is Australia.
galah_config(atlas = "United Kingdom")

Functions that return data from the chosen atlas have the prefix
atlas_; e.g. to find the total number of records in the
atlas, use:
galah_config(atlas = "ALA")
atlas_counts()

## # A tibble: 1 × 1
##       count
##       <int>
## 1 112998157

To pass more complex queries, start with the
galah_call() function and pipe additional arguments to
modify the query. Modifying functions have a galah_ prefix
and support non-standard evaluation (NSE).
galah_call() |> 
  galah_filter(year >= 2020) |> 
  atlas_counts()

## # A tibble: 1 × 1
##      count
##      <int>
## 1 15111799

Alternatively, you can use a subset of dplyr verbs to
pipe your queries, assuming you start with
galah_call().
galah_call() |>
  filter(year >= 2020) |> 
  group_by(year) |>
  count()

## # A tibble: 4 × 2
##   year    count
##   <chr>   <int>
## 1 2021  7155246
## 2 2020  6418993
## 3 2022  1536146
## 4 2023     1414

To narrow the search to a particular taxonomic group, use
galah_identify() or identify(). Note that this
function only accepts scientific names and is not case sensitive. It’s
good practice to first use search_taxa() to check that the
taxa you provide return the correct taxonomic results.
search_taxa("reptilia") # Check whether taxonomic info is correct

## # A tibble: 1 × 9
##   search_term scientific_name taxon_concept_id                                                          rank  match_type kingdom  phylum   class    issues 
##   <chr>       <chr>           <chr>                                                                     <chr> <chr>      <chr>    <chr>    <chr>    <chr>  
## 1 reptilia    REPTILIA        https://biodiversity.org.au/afd/taxa/682e1228-5b3c-45ff-833b-550efd40c399 class exactMatch Animalia Chordata Reptilia noIssue

galah_call() |>
  galah_filter(year >= 2020) |> 
  galah_identify("reptilia") |> 
  atlas_counts()

## # A tibble: 1 × 1
##    count
##    <int>
## 1 100936

The most common use case for galah is to download
‘occurrence’ records: observations of plants or animals made by
contributors to the atlas. To download, first register with the relevant
atlas, then provide your registration email. For GBIF queries, you will
need to provide the email, username, and password that you have
registered with GBIF.
galah_config(email = "email@email.com")

Then you can customise the records you require and query the atlas in question:
result <- galah_call() |>
  galah_identify("Litoria") |>
  galah_filter(year >= 2020, cl22 == "Tasmania") |>
  galah_select(basisOfRecord, group = "basic") |>
  atlas_occurrences()

result |> head()

## # A tibble: 6 × 9
##   decimalLatitude decimalLongitude eventDate           scientificName    taxonConceptID                                                   recor…¹ dataR…² occur…³ basis…⁴
##             <dbl>            <dbl> <dttm>              <chr>             <chr>                                                            <chr>   <chr>   <chr>   <chr>  
## 1           -43.4             147. 2020-09-04 14:00:00 Litoria ewingii   https://biodiversity.org.au/afd/taxa/d0e897bb-e6f5-4654-a511-1c… e8045b… FrogID  PRESENT HUMAN_…
## 2           -43.4             146. 2020-01-01 13:00:00 Litoria ewingii   https://biodiversity.org.au/afd/taxa/d0e897bb-e6f5-4654-a511-1c… 44187a… FrogID  PRESENT HUMAN_…
## 3           -43.4             146. 2020-01-01 13:00:00 Litoria ewingii   https://biodiversity.org.au/afd/taxa/d0e897bb-e6f5-4654-a511-1c… bc34a7… FrogID  PRESENT HUMAN_…
## 4           -43.4             146. 2020-01-01 13:00:00 Litoria ewingii   https://biodiversity.org.au/afd/taxa/d0e897bb-e6f5-4654-a511-1c… ca4707… FrogID  PRESENT HUMAN_…
## 5           -43.4             146. 2020-01-01 13:00:00 Litoria burrowsae https://biodiversity.org.au/afd/taxa/e9f41129-f946-4e81-889b-2e… 9c71f5… FrogID  PRESENT HUMAN_…
## 6           -43.4             146. 2020-01-01 13:00:00 Litoria ewingii   https://biodiversity.org.au/afd/taxa/d0e897bb-e6f5-4654-a511-1c… 4bbaad… FrogID  PRESENT HUMAN_…
## # … with abbreviated variable names ¹recordID, ²dataResourceName, ³occurrenceStatus, ⁴basisOfRecord

Check out our other vignettes for more detail on how to use these functions.
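
As a closing sketch, the counting and download steps shown earlier can be combined so that the size of a result is checked before it is downloaded. This is only an illustration under stated assumptions: it uses the functions already introduced in this document, the `max_records` threshold is an arbitrary value chosen for the example (it is not a galah setting), the email is a placeholder, and the code needs network access plus a registered atlas account to run.

```r
library(galah)

# Assumption: replace the placeholder with the email registered with the atlas.
galah_config(email = "email@email.com")

# Build the query once so the count and the download use identical filters.
query <- galah_call() |>
  galah_identify("Litoria") |>
  galah_filter(year >= 2020, cl22 == "Tasmania")

# atlas_counts() returns a one-row tibble with a `count` column.
counts <- query |> atlas_counts()
n_records <- counts$count[1]

# Hypothetical, illustrative limit; choose a value that suits your own use.
max_records <- 50000

if (n_records <= max_records) {
  # Small enough: request the occurrence records themselves.
  result <- query |> atlas_occurrences()
} else {
  # Too many: report the count instead of starting a large download.
  message(n_records, " records match; consider narrowing the filters.")
}
```

Keeping the query in one object means that tightening a filter changes both the count and the download, so the two cannot drift apart.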