The halk package implements the hierarchical age-length key (HALK) method. This method builds a series of aggregate age-length keys by pooling data across nested levels, then assigns an age to each measured length using the key from the most specific level available.
The HALK is a data-borrowing age assignment method used primarily in fisheries ecology. It extends the traditional age-length key (ALK) method by borrowing data across time, space, or any other nested level to create aggregate, nested ALKs that assign ages to fish based on empirically measured length. For example, a HALK can be created by borrowing age-length data from a single lake across time, from other waterbodies within a certain area, or from any other nested categorical level. Fish from subsequent surveys that are measured for length but not aged can then be assigned ages using the HALK.
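To make the idea of a traditional ALK concrete, here is a minimal base R sketch that does not use halk at all. It assumes an ALK is a table of length bins by age in which each row holds the proportion of fish of each age. The toy data, the 1-unit bin width, and the variable names are illustrative assumptions, not taken from the package.

```r
# Toy paired age-length data (hypothetical values, for illustration only)
laa <- data.frame(
  length = c(1.2, 1.8, 2.1, 2.6, 2.9, 3.4, 3.7, 3.9),
  age    = c(0,   0,   1,   1,   0,   1,   2,   2)
)

# Bin lengths into 1-unit intervals labelled by their lower bound
laa$bin <- floor(laa$length)

# Cross-tabulate length bin by age, then convert counts to row proportions
counts <- table(length = laa$bin, age = laa$age)
alk <- prop.table(counts, margin = 1)

# Each row now sums to 1 and gives the age composition of that length bin
print(round(alk, 2))
```

A HALK repeats this kind of calculation on data pooled at each nested level.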
A HALK is created by passing paired age-length data to the make_halk function. There are two main arguments to this function: data, which represents the paired age-length data, and levels, which is a character vector of the column names that represent the different nested levels in the HALK. For example, if the following data were used in HALK creation, you could pass any combination of spp, county, and waterbody as levels:
#> # A tibble: 6 × 5
#>   spp      county   waterbody   age length
#>   <chr>    <chr>    <chr>     <int>  <dbl>
#> 1 bluegill county_A lake_a        0    1
#> 2 bluegill county_A lake_a        0    1
#> 3 bluegill county_A lake_a        0    0.9
#> 4 bluegill county_A lake_a        0    1
#> 5 bluegill county_A lake_a        0    1.1
#> 6 bluegill county_A lake_a        0    1.1
make_halk fits a HALK based on the user-specified levels. Say that we pass spp, county, and waterbody as levels to make_halk. This will create an ALK for each waterbody, an ALK for each county, and a species-wide global ALK.
spp_county_wb_alk <- make_halk(
  wb_spp_laa_data,
  levels = c("spp", "county", "waterbody")
)
head(spp_county_wb_alk)
#> # A tibble: 6 × 4
#>   spp      county   waterbody alk
#>   <chr>    <chr>    <chr>     <list>
#> 1 bluegill county_A lake_a    <alk [10 × 10]>
#> 2 bluegill county_A lake_b    <alk [12 × 10]>
#> 3 bluegill county_A lake_c    <alk [11 × 10]>
#> 4 bluegill county_A <NA>      <alk [13 × 10]>
#> 5 bluegill county_B lake_a    <alk [11 × 10]>
#> 6 bluegill county_B lake_b    <alk [12 × 10]>
The returned tibble contains a list-column named alk that stores an ALK for each level provided to the levels argument (note that the ALK for county_A has an NA in the waterbody column indicating that this is a county-wide ALK). Each object in this list-column is simply an ALK that is created using all data from the level indicated by the respective non-NA columns in that row.
# Bluegill ALK for lake_a in county_A, from row #1 above
head(spp_county_wb_alk$alk[[1]])
#> # A tibble: 6 × 10
#>   length  age0   age1   age2   age3  age4  age5  age6  age7  age8
#>    <dbl> <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1      0     1 0      0      0          0     0     0     0     0
#> 2      1     1 0      0      0          0     0     0     0     0
#> 3      3     0 1      0      0          0     0     0     0     0
#> 4      4     0 0.918  0.0816 0          0     0     0     0     0
#> 5      5     0 0.0870 0.870  0.0435     0     0     0     0     0
#> 6      6     0 0      0.756  0.244      0     0     0     0     0
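The nesting described above, with one ALK per waterbody, one per county, and one global ALK, can be sketched in base R as follows. This is only an illustration of the pooling idea under simple assumptions; it is not how make_halk is implemented. The helper toy_alk, the toy data, and the 1-unit length bins are all hypothetical. Waterbodies are split on the county-by-waterbody combination because the same lake name (for example lake_a) can occur in more than one county.

```r
# Hypothetical helper: row-proportion ALK from a data frame with
# `length` and `age` columns, using 1-unit length bins
toy_alk <- function(df) {
  prop.table(table(length = floor(df$length), age = df$age), margin = 1)
}

# Toy paired age-length data (illustrative values only)
laa <- data.frame(
  county    = c("county_A", "county_A", "county_A", "county_A",
                "county_B", "county_B"),
  waterbody = c("lake_a", "lake_a", "lake_b", "lake_b", "lake_a", "lake_a"),
  length    = c(1.1, 2.3, 1.4, 2.8, 1.2, 2.5),
  age       = c(0, 1, 0, 0, 0, 1)
)

# Most specific level: one ALK per county-by-waterbody combination
wb_alks <- lapply(
  split(laa, list(laa$county, laa$waterbody), drop = TRUE),
  toy_alk
)

# One level up: pool all waterbodies within each county
county_alks <- lapply(split(laa, laa$county), toy_alk)

# Top level: pool every record
global_alk <- toy_alk(laa)

names(wb_alks)
names(county_alks)
```

Each higher level uses more data than the level beneath it, at the cost of being less specific to any one waterbody.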
The halk package makes it easy to assign ages to length data from a HALK using the assign_ages function. Once you have created a HALK, pass it to assign_ages along with the length data for which you want ages estimated. Make sure that your length data contain all of the columns passed to the levels argument of make_halk.
est_ages <- assign_ages(wb_spp_length_data, spp_county_wb_alk)
head(est_ages)
#> # A tibble: 6 × 7
#>   spp      county   waterbody length est.age alk       alk.n
#>   <chr>    <chr>    <chr>      <dbl>   <dbl> <chr>     <int>
#> 1 bluegill county_A lake_a       1         0 waterbody   371
#> 2 bluegill county_A lake_a       1         0 waterbody   371
#> 3 bluegill county_A lake_a       1.1       0 waterbody   371
#> 4 bluegill county_A lake_a       1.1       0 waterbody   371
#> 5 bluegill county_A lake_a       1         0 waterbody   371
#> 6 bluegill county_A lake_a       1.2       0 waterbody   371
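For intuition about how an ALK can turn a length into an estimated age, the sketch below draws one age per fish at random, with probabilities taken from the ALK row that matches the fish's length bin. This is one common way to apply an ALK and is an assumption here; the text above does not state the exact rule that assign_ages uses. The ALK values, bin width, and variable names are hypothetical.

```r
set.seed(1)  # make the random draws reproducible

# Hypothetical ALK: rows are length bins, columns are ages, rows sum to 1
alk <- matrix(
  c(1.0, 0.0, 0.0,
    0.2, 0.8, 0.0,
    0.0, 0.3, 0.7),
  nrow = 3, byrow = TRUE,
  dimnames = list(length = c("1", "2", "3"), age = c("0", "1", "2"))
)

# Lengths that were measured but not aged (illustrative values)
lengths <- c(1.4, 2.2, 2.9, 3.5)

# Look up each fish's 1-unit length bin by its lower bound
bins <- as.character(floor(lengths))

# Draw one age per fish using that bin's age proportions as probabilities
est_age <- vapply(bins, function(b) {
  as.numeric(sample(colnames(alk), size = 1, prob = alk[b, ]))
}, numeric(1), USE.NAMES = FALSE)

data.frame(length = lengths, est_age = est_age)
```

A fish in a bin where every aged fish was age 0 will always be assigned age 0, while fish in mixed bins receive ages in proportion to the key.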
Notice below that the est_ages object contains a lake (lake_x in county_A) that was not present in the original length-at-age data used to create the spp_county_wb_alk object. Ages for the lengths in lake_x in this example were assigned using the ALK from the county level (specifically, the ALK from county_A), as noted in the alk column.
head(est_ages[est_ages$waterbody == "lake_x", ])
#> # A tibble: 6 × 7
#>   spp      county   waterbody length est.age alk    alk.n
#>   <chr>    <chr>    <chr>      <dbl>   <dbl> <chr>  <int>
#> 1 bluegill county_A lake_x       1         0 county  1088
#> 2 bluegill county_A lake_x       1.3       0 county  1088
#> 3 bluegill county_A lake_x       1         0 county  1088
#> 4 bluegill county_A lake_x       1.1       0 county  1088
#> 5 bluegill county_A lake_x       1.2       0 county  1088
#> 6 bluegill county_A lake_x       1.2       0 county  1088
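The fallback seen for lake_x can be summarised as "use the most specific level that has an ALK for this record". The base R sketch below illustrates that rule only; it is not the package's code. The available list, the pick_level helper, and the "spp" label for the species-wide fallback are hypothetical names chosen for this example.

```r
# Keys for which a (hypothetical) ALK exists at each level
available <- list(
  waterbody = c("county_A/lake_a", "county_A/lake_b"),
  county    = c("county_A", "county_B")
)

# Return the name of the most specific level that has an ALK for a record
pick_level <- function(county, waterbody) {
  wb_key <- paste(county, waterbody, sep = "/")
  if (wb_key %in% available$waterbody) return("waterbody")
  if (county %in% available$county) return("county")
  "spp"  # assumed label for the species-wide ALK
}

pick_level("county_A", "lake_a")  # "waterbody": a lake-specific ALK exists
pick_level("county_A", "lake_x")  # "county": no lake_x ALK, county_A exists
pick_level("county_C", "lake_a")  # "spp": neither lake nor county ALK exists
```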