This vignette documents the shipped release-validation study for
SelectBoost.quantile. The goal is not to claim universal
superiority, but to show how the current prototype behaves against two
direct baselines:
lasso)lasso_tuned)selectboost_quantile() with tau-aware screening,
stronger tuning, complementary-pairs stability selection, capped
neighborhoods, and a hybrid support scoreThe included benchmark artifacts were generated with:
default_quantile_benchmark_scenarios()tau = c(0.25, 0.5, 0.75)selectboost_quantile(..., B = 8, step_num = 0.5, screen = "auto", tune_lambda = "cv", lambda_rule = "one_se", lambda_inflation = 1.25, complementary_pairs = TRUE, max_group_size = 15, nlambda = 8)threshold = 0.55summary_path <- system.file(
"extdata",
"validation",
"quantile_benchmark_release_summary.csv",
package = "SelectBoost.quantile"
)
raw_path <- system.file(
"extdata",
"validation",
"quantile_benchmark_release_raw.csv",
package = "SelectBoost.quantile"
)
resolve_validation_path <- function(installed_path, filename) {
if (nzchar(installed_path) && file.exists(installed_path)) {
return(installed_path)
}
candidates <- c(
file.path("inst", "extdata", "validation", filename),
file.path("..", "inst", "extdata", "validation", filename)
)
candidates <- candidates[file.exists(candidates)]
if (!length(candidates)) {
stop("Could not locate shipped validation artifact: ", filename, call. = FALSE)
}
candidates[[1]]
}
summary_path <- resolve_validation_path(summary_path, "quantile_benchmark_release_summary.csv")
raw_path <- resolve_validation_path(raw_path, "quantile_benchmark_release_raw.csv")
validation_summary <- utils::read.csv(summary_path, stringsAsFactors = FALSE)
validation_raw <- utils::read.csv(raw_path, stringsAsFactors = FALSE)
validation_summary$family <- sub("_tau_.*$", "", validation_summary$scenario)
validation_summary$is_high_dim <- grepl("^high_dim", validation_summary$scenario)
validation_summary$mean_f1 <- with(
validation_summary,
ifelse(
(2 * mean_tp + mean_fp + mean_fn) > 0,
2 * mean_tp / (2 * mean_tp + mean_fp + mean_fn),
NA_real_
)
)The first table averages the scenario-level summaries across the full
shipped grid, including the n < p stress regime.
overall <- aggregate(
cbind(mean_tpr, mean_fdr, mean_f1, failure_rate, mean_runtime_sec) ~ method,
data = validation_summary,
FUN = mean
)
knitr::kable(overall, digits = 3)| method | mean_tpr | mean_fdr | mean_f1 | failure_rate | mean_runtime_sec |
|---|---|---|---|---|---|
| lasso | 0.856 | 0.655 | 0.484 | 0 | 0.005 |
| lasso_tuned | 0.900 | 0.735 | 0.383 | 0 | 0.069 |
| selectboost | 0.734 | 0.063 | 0.808 | 0 | 3.794 |
Across the full grid, tuned lasso has the highest average
true-positive rate, but it also carries the highest average
false-discovery rate. The current selectboost_quantile()
release is markedly more conservative: it gives up some recall, but in
exchange it sharply lowers the false-discovery rate across the shipped
benchmark grid and yields the best average F1 score.
The high_dim family remains difficult, but it is no
longer a failure mode in the earlier sense of selecting almost
everything. The improved SelectBoost workflow now returns much sparser
and more stable supports than either lasso baseline.
high_dim <- subset(validation_summary, is_high_dim)
high_dim_overall <- aggregate(
cbind(mean_tpr, mean_fdr, mean_f1, failure_rate, mean_support_size) ~ method,
data = high_dim,
FUN = mean
)
knitr::kable(high_dim_overall, digits = 3)| method | mean_tpr | mean_fdr | mean_f1 | failure_rate | mean_support_size |
|---|---|---|---|---|---|
| lasso | 0.708 | 0.804 | 0.300 | 0 | 22.167 |
| lasso_tuned | 0.722 | 0.817 | 0.270 | 0 | 26.250 |
| selectboost | 0.611 | 0.067 | 0.738 | 0 | 3.917 |
The main remaining tradeoff is recall:
selectboost_quantile() is much cleaner than the lasso
baselines in high_dim, but it is still more conservative
and can miss weaker signals. Even so, on the shipped study it achieves
the best mean F1 score in that regime because it avoids the large
false-positive burden of the lasso baselines. This is the main reason
the package is best described as a polished v2 prototype
rather than a finished methodological endpoint.
failure_rows <- subset(validation_summary, failure_rate > 0)
if (nrow(failure_rows)) {
knitr::kable(failure_rows[, c(
"scenario",
"method",
"failure_rate",
"mean_tpr",
"mean_fdr",
"mean_support_size"
)], digits = 3)
} else {
cat("No method failures were recorded in the shipped study.\n")
}
#> No method failures were recorded in the shipped study.From a source checkout, regenerate benchmark artifacts into a temporary directory with:
out_dir <- file.path(tempdir(), "SelectBoost.quantile-validation")
system2(
"Rscript",
c("inst/scripts/run_quantile_benchmark.R", out_dir, "4", "0.55")
)The script loads the local package automatically when run from a
source tree. It writes raw results, aggregated summaries, and a
sessionInfo record to the chosen output directory. If no
output directory is supplied, it defaults to a subdirectory of
tempdir(). In the current source tree, that rerun uses the
screening, stronger lambda, complementary-pairs stability,
neighborhood-cap, and hybrid-support defaults defined in the package
benchmark helper.