The atlas_ functions are used to return data from the
atlas chosen using galah_config(). They are:
atlas_counts
atlas_occurrences
atlas_species
atlas_media
atlas_taxonomy
The final atlas_ function, atlas_citation, is unusual in that it
does not return any new data. Instead it
provides a citation for an existing dataset (downloaded using
atlas_occurrences) that has an associated DOI. The other
functions are described below.
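As a rough sketch of how a citation might be obtained: the download has to carry a DOI before atlas_citation can describe it. The example below assumes that the installed version of galah accepts a mint_doi argument in atlas_occurrences(); the wrapper name cite_galah_download is hypothetical and only serves to keep the steps together. Calling it needs network access and an email registered with the ALA.

```r
# Hypothetical helper (not part of galah): download records with a DOI,
# then return the citation for that download.
# Assumption: atlas_occurrences() supports `mint_doi = TRUE`.
cite_galah_download <- function(email) {
  galah::galah_config(email = email, atlas = "Australia")

  occ <- galah::galah_call() |>
    galah::galah_identify("Eolophus roseicapilla") |>
    galah::atlas_occurrences(mint_doi = TRUE)

  # atlas_citation() returns no new data, only a citation for `occ`
  galah::atlas_citation(occ)
}
```

It could then be called as cite_galah_download("your_email@email.com").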
atlas_counts() provides summary counts on records in the
specified atlas, without needing to download all the records.
galah_config(atlas = "Australia")
# Total number of records in the ALA
atlas_counts()
## # A tibble: 1 × 1
##       count
##       <int>
## 1 112555050
In addition to the filter arguments, it has an optional
group_by argument, which provides counts binned by the
requested field.
galah_call() |>
  galah_group_by(kingdom) |>
  atlas_counts()
## # A tibble: 10 × 2
##    kingdom      count
##    <chr>        <int>
##  1 Animalia  85432793
##  2 Plantae   23468480
##  3 Fungi      2076156
##  4 Chromista   853644
##  5 Protista    144729
##  6 Bacteria     71362
##  7 Protozoa      3211
##  8 Eukaryota     1340
##  9 Archaea       1106
## 10 Virus          486
A common use case of atlas data is to identify which species occur in
a specified region, time period, or taxonomic group.
atlas_species() is similar to search_taxa(), in
that it returns taxonomic information and unique identifiers in a
tibble. It differs in that it only returns information at the
species level, but it is more flexible because it supports filtering:
species <- galah_call() |>
  galah_identify("Rodentia") |>
  galah_filter(stateProvince == "Northern Territory") |>
  atlas_species()
  
species |> head()
## # A tibble: 6 × 10
##   kingdom  phylum   class    order    family  genus        species                     author            species_guid       verna…¹
##   <chr>    <chr>    <chr>    <chr>    <chr>   <chr>        <chr>                       <chr>             <chr>              <chr>  
## 1 Animalia Chordata Mammalia Rodentia Muridae Mesembriomys Mesembriomys gouldii        (J.E. Gray, 1843) https://biodivers… Black-…
## 2 Animalia Chordata Mammalia Rodentia Muridae Zyzomys      Zyzomys argurus             (Thomas, 1889)    https://biodivers… Common…
## 3 Animalia Chordata Mammalia Rodentia Muridae Pseudomys    Pseudomys hermannsburgensis (Waite, 1896)     https://biodivers… Sandy …
## 4 Animalia Chordata Mammalia Rodentia Muridae Notomys      Notomys alexis              Thomas, 1922      https://biodivers… Spinif…
## 5 Animalia Chordata Mammalia Rodentia Muridae Melomys      Melomys burtoni             (Ramsay, 1887)    https://biodivers… Grassl…
## 6 Animalia Chordata Mammalia Rodentia Muridae Mus          Mus musculus                Linnaeus, 1758    https://biodivers… House …
## # … with abbreviated variable name ¹vernacular_name
To download occurrence data you will need to specify your email in
galah_config(). This email must be associated with an
active ALA account. See the config section for more information.
galah_config(email = "your_email@email.com", atlas = "Australia")
Download occurrence records for Eolophus roseicapilla:
occ <- galah_call() |>
  galah_identify("Eolophus roseicapilla") |>
  galah_filter(
    stateProvince == "Australian Capital Territory",
    year >= 2010,
    profile = "ALA"
  ) |>
  galah_select(institutionID, group = "basic") |>
  atlas_occurrences()
## Warning: One or more parsing issues, call `problems()` on your data frame for details, e.g.:
##   dat <- vroom(...)
##   problems(dat)
occ |> head()
## # A tibble: 6 × 9
##   decimalLatitude decimalLongitude eventDate           scientificName        taxonConceptID         recor…¹ dataR…² occur…³ insti…⁴
##             <dbl>            <dbl> <dttm>              <chr>                 <chr>                  <chr>   <chr>   <chr>   <lgl>  
## 1           -35.9             149. 2020-09-12 14:00:00 Eolophus roseicapilla https://biodiversity.… 17f46d… eBird … PRESENT NA     
## 2           -35.9             149. 2021-09-27 14:00:00 Eolophus roseicapilla https://biodiversity.… dbb711… eBird … PRESENT NA     
## 3           -35.9             149. 2012-01-18 13:00:00 Eolophus roseicapilla https://biodiversity.… 4f7cd7… BirdLi… PRESENT NA     
## 4           -35.9             149. 2017-03-17 13:00:00 Eolophus roseicapilla https://biodiversity.… 3236c4… eBird … PRESENT NA     
## 5           -35.9             149. 2020-11-14 13:00:00 Eolophus roseicapilla https://biodiversity.… ef2b90… eBird … PRESENT NA     
## 6           -35.8             149. 2021-04-02 13:00:00 Eolophus roseicapilla https://biodiversity.… 45a589… eBird … PRESENT NA     
## # … with abbreviated variable names ¹recordID, ²dataResourceName, ³occurrenceStatus, ⁴institutionID
In addition to text data describing individual occurrences and their
attributes, ALA stores images, sounds and videos associated with a given
record. Metadata on these records can be downloaded to R
using atlas_media() and the same set of filters as the
other data download functions.
media_data <- galah_call() |>
  galah_identify("Eolophus roseicapilla") |>
  galah_filter(
    year == 2020,
    cl22 == "Australian Capital Territory") |>
  atlas_media()
  
media_data |> head()
## # A tibble: 6 × 20
##   decimalLati…¹ decim…² eventDate           scien…³ taxon…⁴ recor…⁵ dataR…⁶ occur…⁷ multi…⁸ media…⁹ mime_…˟ size_…˟ date_…˟ date_…˟
##           <dbl>   <dbl> <dttm>              <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>     <int> <chr>   <chr>  
## 1         -35.6    149. 2020-08-04 01:50:00 Eoloph… https:… 063bb0… iNatur… PRESENT Image   2f4d32… image/… 2654217 2020-0… 2020-0…
## 2         -35.6    149. 2020-08-04 01:50:00 Eoloph… https:… 063bb0… iNatur… PRESENT Image   734074… image/… 2422643 2020-0… 2020-0…
## 3         -35.6    149. 2020-08-04 01:50:00 Eoloph… https:… 063bb0… iNatur… PRESENT Image   89171c… image/… 2212660 2020-0… 2020-0…
## 4         -35.6    149. 2020-08-04 01:50:00 Eoloph… https:… 063bb0… iNatur… PRESENT Image   e681d3… image/… 3414736 2020-0… 2020-0…
## 5         -35.5    149. 2020-08-26 01:53:00 Eoloph… https:… 286841… iNatur… PRESENT Image   1295c2… image/…  863158 2021-0… 2021-0…
## 6         -35.5    149. 2020-10-14 02:34:00 Eoloph… https:… 064a39… iNatur… PRESENT Image   f97686… image/…  955916 2020-1… 2020-1…
## # … with 6 more variables: height <int>, width <int>, creator <chr>, license <chr>, data_resource_uid <chr>, occurrence_id <chr>,
## #   and abbreviated variable names ¹decimalLatitude, ²decimalLongitude, ³scientificName, ⁴taxonConceptID, ⁵recordID,
## #   ⁶dataResourceName, ⁷occurrenceStatus, ⁸multimedia, ⁹media_id, ˟mime_type, ˟size_in_bytes, ˟date_uploaded, ˟date_taken
To actually download the media files to your computer, use collect_media().
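A minimal sketch of that step follows. It assumes that collect_media() takes the tibble returned by atlas_media() as its first argument and accepts a path argument naming the target directory; argument names have changed between galah releases, so check ?collect_media for the installed version. The wrapper download_media_files and the default directory name "media" are hypothetical.

```r
# Hypothetical helper (not part of galah): save the files described by
# an atlas_media() result into a local directory.
# Assumption: collect_media() accepts a `path` argument.
download_media_files <- function(media_data, dir = "media") {
  # Create the target directory if it does not exist yet
  if (!dir.exists(dir)) {
    dir.create(dir, recursive = TRUE)
  }
  galah::collect_media(media_data, path = dir)
}
```

With the media_data object created earlier, this could be called as download_media_files(media_data).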
atlas_taxonomy provides a way to build taxonomic trees
from one clade down to another using ALA’s internal taxonomy. Specify
which taxonomic level your tree will go down to with
galah_down_to.
classes <- galah_call() |>
  galah_identify("chordata") |>
  galah_down_to(class) |>
  atlas_taxonomy()
This function is the only one in galah that returns a
data.tree rather than a
tibble.
##                             levelName
## 1  Chordata                          
## 2   ¦--Cephalochordata               
## 3   ¦   °--Amphioxi                  
## 4   ¦--Craniata                      
## 5   ¦   °--Agnatha                   
## 6   ¦       ¦--Cephalasipidomorphi   
## 7   ¦       °--Myxini                
## 8   ¦--Tunicata                      
## 9   ¦   ¦--Appendicularia            
## 10  ¦   ¦--Ascidiacea                
## 11  ¦   °--Thaliacea                 
## 12  °--Vertebrata                    
## 13      °--Gnathostomata             
## 14          ¦--Amphibia              
## 15          ¦--Aves                  
## 16          ¦--Mammalia              
## 17          ¦--Pisces                
## 18          ¦   ¦--Actinopterygii    
## 19          ¦   ¦--Chondrichthyes    
## 20          ¦   ¦--Cephalaspidomorphi
## 21          ¦   °--Sarcopterygii     
## 22          °--Reptilia
Although the tree format is useful, converting to a
data.frame is straightforward.
data.tree::ToDataFrameTypeCol(classes, type = "rank") |> head()
##   rank_phylum  rank_subphylum rank_superclass rank_informal          rank_class
## 1    Chordata Cephalochordata            <NA>          <NA>            Amphioxi
## 2    Chordata        Craniata         Agnatha          <NA> Cephalasipidomorphi
## 3    Chordata        Craniata         Agnatha          <NA>              Myxini
## 4    Chordata        Tunicata            <NA>          <NA>      Appendicularia
## 5    Chordata        Tunicata            <NA>          <NA>          Ascidiacea
## 6    Chordata        Tunicata            <NA>          <NA>           Thaliacea
Various aspects of the galah package can be customized. To preserve
configuration for future sessions, set profile_path to a
location of a .Rprofile file.
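For instance, a persistent setup might look like the fragment below. This is a sketch only: it assumes that galah_config() in the installed version accepts profile_path alongside other options, and the path shown is just an example location. Running it is expected to write settings to that file, so point it at a file you are happy to have modified.

```r
library(galah)

# Store the email setting so that future R sessions pick it up.
# "~/.Rprofile" is an example path, not a required location.
galah_config(
  email = "your_email@email.com",
  profile_path = "~/.Rprofile"
)
```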
To download occurrence records, you will need to provide an email address registered with the ALA. You can create an account here. Once an email is registered with the ALA, it should be stored in the config:
galah_config(email = "myemail@gmail.com")
galah can cache most results to local files. This means
that if the same code is run multiple times, the second and subsequent
iterations will be faster.
By default, this caching is session-based, meaning that the local files are stored in a temporary directory that is automatically deleted when the R session is ended. This behaviour can be altered so that caching is permanent, by setting the caching directory to a non-temporary location.
galah_config(cache_directory = "example/dir")
By default, caching is turned off. To turn caching on, run:
galah_config(caching = TRUE)
ALA requires that you provide a reason when downloading occurrence
data (via the galah atlas_occurrences() function). The
reason is set as “scientific research” by default, but you can change
this using galah_config(). See
show_all_reasons() for valid download reasons.
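One possible way to look up an ID by name, rather than reading it off the table by eye, is sketched here. It assumes that show_all_reasons() returns a tibble with columns called id and name, which may differ between galah versions; the helper set_download_reason is hypothetical. Calling it needs network access.

```r
# Hypothetical helper (not part of galah): find the ID that matches a
# reason name and store it in the configuration.
# Assumption: show_all_reasons() has columns `id` and `name`.
set_download_reason <- function(reason_name) {
  reasons <- galah::show_all_reasons()
  reason_id <- reasons$id[reasons$name == reason_name]

  # Stop early if the name is missing or ambiguous
  stopifnot(length(reason_id) == 1)

  galah::galah_config(download_reason_id = reason_id)
}
```

For example, set_download_reason("scientific research") would re-select the default reason. The ID can also be passed directly: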
galah_config(download_reason_id = your_reason_id)
If things aren’t working as expected, more detail (particularly about
web requests and caching behaviour) can be obtained by setting the
verbose configuration option:
galah_config(verbose = TRUE)