fastymd is a package for working with Year-Month-Day (YMD) style date objects. It provides extremely fast parsing of character strings and numeric values to date objects, as well as fast decomposition of these into their year, month and day components. The underlying algorithms follow Howard Hinnant's approach for converting Gregorian calendar dates to days since the UNIX epoch and vice versa.
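To give a feel for the arithmetic involved, here is a minimal base R sketch of Hinnant's published days-from-civil and civil-from-days calculations. It is an illustration only, not the package's own implementation, and the function names `days_from_civil()` and `civil_from_days()` are taken from Hinnant's write-up rather than from the fastymd API.

```r
# Sketch of Hinnant's algorithms in base R (illustrative, not fastymd's code).
# The year is treated as starting on 1 March so the leap day falls at the end,
# and time is split into 400-year "eras" of exactly 146097 days.

days_from_civil <- function(y, m, d) {
  y   <- y - (m <= 2)                  # January and February belong to the previous "March year"
  era <- y %/% 400                     # %/% floors, so negative years work too
  yoe <- y - era * 400                 # year of era, 0..399
  mp  <- (m + 9) %% 12                 # March = 0, ..., February = 11
  doy <- (153 * mp + 2) %/% 5 + d - 1  # day of the March-based year, 0..365
  doe <- yoe * 365 + yoe %/% 4 - yoe %/% 100 + doy  # day of era
  era * 146097 + doe - 719468          # shift so that 1970-01-01 is day 0
}

civil_from_days <- function(z) {
  z   <- z + 719468
  era <- z %/% 146097
  doe <- z - era * 146097
  yoe <- (doe - doe %/% 1460 + doe %/% 36524 - doe %/% 146096) %/% 365
  doy <- doe - (365 * yoe + yoe %/% 4 - yoe %/% 100)
  mp  <- (5 * doy + 2) %/% 153
  d   <- doy - (153 * mp + 2) %/% 5 + 1
  m   <- ifelse(mp < 10, mp + 3, mp - 9)
  y   <- yoe + era * 400 + (m <= 2)    # undo the March-based year shift
  data.frame(year = y, month = m, day = d)
}

days_from_civil(1970, 1, 1)  # 0
civil_from_days(0)           # year 1970, month 1, day 1
```

Both functions are vectorised and use only integer-style arithmetic, which is what makes this approach attractive for fast conversion.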
The API should hold no surprises:
library(fastymd)
cdate <- c("2025-04-16", "2025-04-17")
(res <- fymd(cdate))
#> [1] "2025-04-16" "2025-04-17"
res == as.Date(cdate)
#> [1] TRUE TRUE
get_ymd(res)
#> year month day
#> 1 2025 4 16
#> 2 2025 4 17
fymd(2025, 4, 16) == res[1L]
#> [1] TRUE
Invalid dates will return NA and a warning:
fymd(2021, 02, 29) # not a leap year
#> NAs introduced due to invalid month and/or day combinations.
#> [1] NA
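The kind of check that sits behind this behaviour can be sketched in base R as follows. The helper `is_valid_ymd()` is hypothetical and not part of fastymd; it simply applies the Gregorian leap-year rule to decide whether a month and day combination exists.

```r
# Hypothetical helper (not part of fastymd): TRUE where the year, month and
# day combination is a real date in the proleptic Gregorian calendar.
is_valid_ymd <- function(y, m, d) {
  leap <- (y %% 4 == 0 & y %% 100 != 0) | y %% 400 == 0
  month_ok <- m >= 1 & m <= 12
  days_in_month <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
  # Look up the month length only for valid months, then add the leap day
  last_day <- days_in_month[ifelse(month_ok, m, NA_integer_)] +
    (month_ok & m == 2 & leap)
  month_ok & d >= 1 & d <= last_day
}

is_valid_ymd(2021, 2, 29)  # FALSE: 2021 is not a leap year
is_valid_ymd(2024, 2, 29)  # TRUE
```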
More interesting is the handling of any input that follows a valid date. Consider the following timestamp:
timelt <- as.POSIXlt(Sys.time(), tz = "UTC")
(timestamp <- strftime(timelt, "%Y-%m-%dT%H:%M:%S%z"))
#> [1] "2025-04-24T21:51:27+0000"
By default the time element is ignored:
(res <- fymd(timestamp))
#> [1] "2025-04-24"
res == as.Date(timestamp, tz = "UTC")
#> [1] TRUE
Ignoring the trailing time element is both good and bad. For timestamps it makes perfect sense, but perhaps you have simple dates and are concerned that some are corrupted. For these we can use the strict argument:
cdate <- "2025-04-16nonsense "
fymd(cdate)
#> [1] "2025-04-16"
fymd(cdate, strict = TRUE)
#> NAs introduced due to invalid date strings.
#> [1] NA
The character method of fymd() parses input strings in a fixed year, month and day order. These values must be digits but can be separated by any non-digit character. This is similar in spirit to the fastDate() function in Simon Urbanek's fasttime package, using pure text parsing and no system calls for maximum speed.
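The parsing rule described above can be illustrated with a small base R sketch. `parse_ymd_sketch()` is a hypothetical helper, not the package's parser, and it assumes a four-digit year, two-digit month and day, and exactly one separator character between fields; fymd() itself may accept other layouts. The `strict` flag here only mimics the idea of rejecting trailing text, and no date validation is attempted.

```r
# Hypothetical illustration (not fastymd's parser): read year, month and day
# as digit runs separated by a single non-digit character.
# Assumes a "YYYY?MM?DD" layout, where ? is any non-digit.
parse_ymd_sketch <- function(x, strict = FALSE) {
  pattern <- "^([0-9]{4})[^0-9]([0-9]{2})[^0-9]([0-9]{2})"
  if (strict) pattern <- paste0(pattern, "$")  # nothing may follow the day
  hits <- regmatches(x, regexec(pattern, x))
  out <- t(vapply(hits, function(h) {
    # h holds the full match plus three groups, or is empty on no match
    if (length(h) == 4L) as.integer(h[2:4]) else rep(NA_integer_, 3L)
  }, integer(3L)))
  colnames(out) <- c("year", "month", "day")
  out
}

parse_ymd_sketch(c("2025-04-16", "2025/04/17", "2025-04-16nonsense "))
parse_ymd_sketch("2025-04-16nonsense ", strict = TRUE)  # all NA
```

A regular expression is convenient for a sketch like this, but it is far slower than a hand-written character scan, which is the style of parsing the paragraph above describes.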
For extremely fast parsing of POSIX-style timestamps you will struggle to beat the performance of fasttime. It works fantastically for timestamps that do not need validation and are within the date range supported by the package (currently 1970-01-01 through to the year 2199).
fymd() fills the, admittedly small, niche where you want fast parsing of YMD strings along with date validation and support for a wider range of dates from the proleptic Gregorian calendar (currently we support years in the range [-9999, 9999]). This additional capability does come with a small performance penalty but, hopefully, this has been kept to a minimum and the implementation remains competitive.
library(microbenchmark)
# 1970-01-01 (UNIX epoch) to "2199-01-01"
dates <- seq.Date(from = .Date(0), to = fymd("2199-01-01"), by = "day")
# comparison timings for fymd (character method)
cdates <- format(dates)
(res_c <- microbenchmark(
fasttime = fasttime::fastDate(cdates),
fastymd = fymd(cdates),
ymd = ymd::ymd(cdates),
lubridate = lubridate::ymd(cdates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fasttime 530.626 535.6445 590.102 540.0530 546.215 3573.341 100
#> fastymd 759.244 769.6580 797.582 777.3625 784.922 2186.300 100
#> ymd 4420.079 4514.3255 4617.550 4535.2805 4614.213 5878.654 100
#> lubridate 5016.447 5180.8155 6189.872 5331.4980 6651.158 36841.468 100
# comparison timings for fymd (numeric method)
ymd <- get_ymd(dates)
(res_n <- microbenchmark(
fastymd = fymd(ymd[[1]], ymd[[2]], ymd[[3]]),
lubridate = lubridate::make_date(ymd[[1]], ymd[[2]], ymd[[3]]),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 343.595 345.2425 367.1786 348.3880 370.139 462.517 100
#> lubridate 537.478 542.0665 769.6049 547.9725 1061.470 3038.528 100
# comparison timings for year getter
(res_get_year <- microbenchmark(
fastymd = get_year(dates),
ymd = ymd::year(dates),
lubridate = lubridate::year(dates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 482.164 496.652 574.9981 503.0590 507.7575 1960.116 100
#> ymd 499.396 506.810 573.2474 511.8395 517.6060 3277.326 100
#> lubridate 7609.029 7619.865 8252.9036 7625.8210 7706.4820 39390.940 100
# comparison timings for month getter
(res_get_month <- microbenchmark(
fastymd = get_month(dates),
ymd = ymd::month(dates),
lubridate = lubridate::month(dates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 449.323 464.4660 506.680 469.0645 472.5910 2012.544 100
#> ymd 532.418 537.7585 550.637 540.5080 544.4505 786.986 100
#> lubridate 8219.113 8271.7070 8832.283 8300.4755 9651.9500 11677.499 100
# comparison timings for mday getter
(res_get_mday <- microbenchmark(
fastymd = get_mday(dates),
ymd = ymd::mday(dates),
lubridate = lubridate::day(dates),
check = "equal"
))
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> fastymd 450.595 462.7430 570.538 467.9580 473.047 2562.345 100
#> ymd 537.688 542.0065 568.018 543.6995 546.725 1931.322 100
#> lubridate 7552.984 7571.8345 7757.858 7582.7800 7631.016 9653.524 100
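To put the "small performance penalty" in perspective, the median timings from the character-method benchmark above can be expressed relative to fasttime. The values below are copied by hand from that table, so they reflect that one run on one machine and will differ elsewhere.

```r
# Median timings (microseconds) copied from the character-method table above
medians <- c(
  fasttime  = 540.0530,
  fastymd   = 777.3625,
  ymd       = 4535.2805,
  lubridate = 5331.4980
)

# Each median as a multiple of the fastest entry
relative <- medians / min(medians)
round(relative, 1)
```

On these figures fymd() takes roughly 1.4 times as long as fastDate(), while the other two parsers take between 8 and 10 times as long.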