Type: Package
Title: Cumulative Link Models with 'CmdStanR'
Version: 0.1.0
Author: Tomotaka Momozaki [aut, cre]
Maintainer: Tomotaka Momozaki <momozaki.stat@gmail.com>
Description: Fits cumulative link models (CLMs) for ordinal categorical data using 'CmdStanR'. Supports various link functions including logit, probit, cloglog, loglog, cauchit, and flexible parametric links such as Generalized Extreme Value (GEV), Asymmetric Exponential Power (AEP), and Symmetric Power. Models are pre-compiled using the 'instantiate' package for fast execution without runtime compilation. Methods are described in Agresti (2010, ISBN:978-0-470-08289-8), Wang and Dey (2011) <doi:10.1007/s10651-010-0154-8>, and Naranjo, Perez, and Martin (2015) <doi:10.1007/s11222-014-9449-1>.
License: MIT + file LICENSE
URL: https://t-momozaki.github.io/clmstan/, https://github.com/t-momozaki/clmstan
BugReports: https://github.com/t-momozaki/clmstan/issues
Depends: R (>= 4.0.0)
Imports: instantiate, posterior, bayesplot, loo, stats
Suggests: testthat (>= 3.0.0), knitr, rmarkdown, ordinal, cmdstanr (>= 0.9.0)
Additional_repositories: https://stan-dev.r-universe.dev
SystemRequirements: CmdStan (https://mc-stan.org/users/interfaces/cmdstan)
Encoding: UTF-8
RoxygenNote: 7.3.3
Config/testthat/edition: 3
VignetteBuilder: knitr
NeedsCompilation: yes
Packaged: 2026-02-07 00:22:17 UTC; t_momozaki
Repository: CRAN
Date/Publication: 2026-02-10 20:20:02 UTC

clmstan: Cumulative Link Models with CmdStanR

Description

Fit cumulative link models (CLMs) for ordinal categorical data using CmdStanR. The package supports various link functions including standard links (logit, probit, cloglog, loglog, cauchit) and flexible parametric links (GEV, AEP, Symmetric Power, Aranda-Ordaz, log-gamma).

Models are pre-compiled using the instantiate package for fast execution without runtime compilation.

Main functions

Methods

Author(s)

Maintainer: Tomotaka Momozaki <momozaki.stat@gmail.com>

References

Flexible Link Functions:

Aranda-Ordaz, F. J. (1981). On two families of transformations to additivity for binary response data. Biometrika, 68(2), 357-363.

Li, D., Wang, X., & Dey, D. K. (2019). Power link functions in an ordinal regression model with Gaussian process priors. Environmetrics, 30(6), e2564.

Prentice, R. L. (1976). A generalization of the probit and logit methods for dose response curves. Biometrics, 32(4), 761-768.

Wang, X., & Dey, D. K. (2011). Generalized extreme value regression for ordinal response data. Environmental and Ecological Statistics, 18(4), 619-634.

Naranjo, L., Pérez, C. J., & Martín, J. (2015). Bayesian analysis of some models that use the asymmetric exponential power distribution. Statistics and Computing, 25(3), 497-514.

See Also

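Useful links:

  • https://t-momozaki.github.io/clmstan/

  • https://github.com/t-momozaki/clmstan

  • Report bugs at https://github.com/t-momozaki/clmstan/issues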


Asymmetric Exponential Power (AEP) link CDF

Description

CDF of the AEP distribution with alpha fixed at 0.5 for identifiability.

Usage

aep_cdf(x, theta1, theta2)

Arguments

x

Numeric vector

theta1

Left tail parameter (theta1 > 0)

theta2

Right tail parameter (theta2 > 0)

Details

For x <= 0: F(x) = 0.5 * (1 - P(1/theta1, u1)) where u1 = (|x| * 2 * Gamma(1 + 1/theta1))^theta1

For x > 0: F(x) = 0.5 + 0.5 * P(1/theta2, u2) where u2 = (x * 2 * Gamma(1 + 1/theta2))^theta2

P(a, x) is the regularized incomplete gamma function (pgamma).

Special case: theta1 = theta2 gives a symmetric distribution. Note: theta1 = theta2 = 2 gives a Gaussian kernel, but the resulting CDF is NOT the probit link because its scale differs from that of the standard normal.

Reference: Naranjo et al. (2015) Statistics and Computing

Value

CDF values
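The piecewise formula in Details can be sketched in base R as follows. This is an illustrative reimplementation under the stated formula, not the package's source; the name aep_cdf_sketch is hypothetical. It also checks the note about theta1 = theta2 = 2: under this formula the result is a normal CDF with a non-unit scale.

```r
# Sketch of the AEP CDF with alpha fixed at 0.5, following the Details formula.
# `aep_cdf_sketch` is a hypothetical name, not the package function.
aep_cdf_sketch <- function(x, theta1, theta2) {
  stopifnot(theta1 > 0, theta2 > 0)
  out <- numeric(length(x))
  neg <- x <= 0
  # Left branch (x <= 0). base::gamma() is called explicitly because
  # clmstan masks gamma() with its prior constructor.
  u1 <- (abs(x[neg]) * 2 * base::gamma(1 + 1 / theta1))^theta1
  out[neg] <- 0.5 * (1 - stats::pgamma(u1, shape = 1 / theta1))
  # Right branch (x > 0). pgamma() with rate 1 is the regularized
  # incomplete gamma function P(a, x).
  u2 <- (x[!neg] * 2 * base::gamma(1 + 1 / theta2))^theta2
  out[!neg] <- 0.5 + 0.5 * stats::pgamma(u2, shape = 1 / theta2)
  out
}

x <- seq(-1, 1, by = 0.25)
p <- aep_cdf_sketch(x, theta1 = 2, theta2 = 2)

# With theta1 = theta2 = 2 the formula reduces to a normal CDF whose
# argument is rescaled by sqrt(2 * pi), so it differs from pnorm(x).
rescaled_normal <- stats::pnorm(x * sqrt(2 * pi))
```

Comparing p with rescaled_normal (for example via all.equal) shows the two agree, while p and pnorm(x) do not.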


Apply custom prior specifications for link parameters

Description

Apply custom prior specifications for link parameters

Usage

apply_custom_link_priors(result, link_prior)

Arguments

result

Current result list

link_prior

List of custom prior specifications

Value

Updated result list with custom priors


Apply user priors to Stan data

Description

Merges user-specified priors with default values and updates the Stan data list.

Usage

apply_priors_to_stan_data(stan_data, prior, threshold, full = FALSE)

Arguments

stan_data

A list of Stan data prepared by prepare_stan_data_*()

prior

A clm_prior object or NULL

threshold

The threshold structure ("flexible", "equidistant", "symmetric")

full

Whether this is a full model (with link parameter estimation)

Value

Updated stan_data list with prior values
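One plausible way to merge user-specified priors with defaults is sketched below. The default values are taken from the clm_prior() argument documentation; the function names and the use of utils::modifyList() are assumptions for illustration and may not match the package internals.

```r
# Hypothetical helper: defaults per threshold structure, using the values
# documented for clm_prior().
default_priors_sketch <- function(threshold = c("flexible", "equidistant", "symmetric")) {
  threshold <- match.arg(threshold)
  specific <- switch(threshold,
    flexible    = list(c_sd = 10),
    equidistant = list(c1_mu = 0, c1_sd = 10, d_alpha = 2, d_beta = 0.5),
    symmetric   = list(cpos_sd = 5)
  )
  c(list(beta_sd = 2.5), specific)
}

# Hypothetical helper: user values override defaults; NULL entries and
# entries that do not apply to the threshold structure are ignored.
merge_priors_sketch <- function(user = NULL, threshold = "flexible") {
  defaults <- default_priors_sketch(threshold)
  if (is.null(user)) return(defaults)
  user <- Filter(Negate(is.null), user)
  user <- user[intersect(names(user), names(defaults))]
  utils::modifyList(defaults, user)
}

merged <- merge_priors_sketch(list(beta_sd = 1, c_sd = NULL), "flexible")
```

Here merged keeps the default c_sd because the user left it as NULL, while beta_sd takes the user value.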


Aranda-Ordaz asymmetric link CDF

Description

F(x; lambda) = 1 - (1 + exp(x))^(-lambda)

Usage

aranda_ordaz_cdf(x, lambda)

Arguments

x

Numeric vector

lambda

Shape parameter (lambda > 0)

Details

Special cases:

Reference: Aranda-Ordaz (1981) Biometrika

Value

CDF values
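The formula in the Description can be evaluated stably as sketched below. The function name is hypothetical and the code only mirrors the stated formula; it is not the package's implementation.

```r
# Sketch of F(x; lambda) = 1 - (1 + exp(x))^(-lambda).
aranda_ordaz_cdf_sketch <- function(x, lambda) {
  stopifnot(lambda > 0)
  # log(1 + exp(x)), rewritten as x + log1p(exp(-x)) for positive x
  # so that large x does not overflow.
  log1pexp <- ifelse(x > 0, x + log1p(exp(-x)), log1p(exp(x)))
  -expm1(-lambda * log1pexp)
}

x <- seq(-4, 4, by = 1)
ao_lambda1 <- aranda_ordaz_cdf_sketch(x, lambda = 1)
```

Under this formula, lambda = 1 gives 1 - 1 / (1 + exp(x)), which is the logistic CDF, so ao_lambda1 can be compared with stats::plogis(x).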


Combine Multiple Prior Specifications

Description

Combines multiple prior() objects into a single prior specification list.

Usage

## S3 method for class 'clm_prior_spec'
c(...)

Arguments

...

prior() objects to combine.

Value

An object of class "clm_prior_list" containing all prior specifications.

Examples

# Combine multiple priors
priors <- c(
  prior(normal(0, 2.5), class = "b"),
  prior(normal(0, 10), class = "Intercept"),
  prior(gamma(2, 0.1), class = "df")
)
print(priors)

Cauchy Distribution for Prior Specification

Description

Creates a Cauchy distribution object for use with prior().

Usage

cauchy(mu = 0, sigma = 1)

Arguments

mu

Location parameter. Default: 0

sigma

Scale parameter. Must be positive. Default: 1

Value

An object of class "clm_dist" representing a Cauchy distribution.

See Also

prior(), normal(), gamma(), student_t()

Examples

# Create a Cauchy prior (weakly informative)
cauchy(0, 2.5)

# Use with prior()
prior(cauchy(0, 2.5), class = "b")

Unified CDF dispatcher for clmstan

Description

Computes the cumulative distribution function (CDF) for any supported link function. This is the R equivalent of unified_F() in Stan.

Usage

clm_cdf(x, link, link_param = list())

Arguments

x

Numeric vector of values

link

Link function name (character)

link_param

List of link parameters (for flexible links)

Value

CDF values F(x)
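A dispatcher of this kind can be sketched with switch(). The version below covers only the standard links plus a Student-t CDF for "tlink" (an assumption about what that link computes); the name clm_cdf_sketch is hypothetical.

```r
# Minimal dispatcher sketch: maps a link name to a CDF evaluated at x.
clm_cdf_sketch <- function(x, link, link_param = list()) {
  switch(link,
    logit   = stats::plogis(x),
    probit  = stats::pnorm(x),
    cloglog = -expm1(-exp(x)),       # 1 - exp(-exp(x))
    loglog  = exp(-exp(-x)),
    cauchit = stats::pcauchy(x),
    tlink   = stats::pt(x, df = link_param$df),  # assumed Student-t CDF
    stop("link '", link, "' is not covered by this sketch")
  )
}

p_logit <- clm_cdf_sketch(0, "logit")
p_t     <- clm_cdf_sketch(0, "tlink", link_param = list(df = 8))
```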


Prior Specification for clmstan

Description

Create prior specifications for cumulative link models in clmstan.

Default priors:

Link parameter priors (when estimated):

Link          Parameter       Default Prior
tlink         df              gamma(2, 0.1)
aranda_ordaz  lambda          gamma(0.5, 0.5)
gev           xi              normal(0, 2)
sp            r               gamma(0.5, 0.5)
log_gamma     lambda          normal(0, 1)
aep           theta1, theta2  gamma(2, 1)

Usage

clm_prior(
  beta_sd = NULL,
  c_sd = NULL,
  c1_mu = NULL,
  c1_sd = NULL,
  d_alpha = NULL,
  d_beta = NULL,
  cpos_sd = NULL,
  df_alpha = NULL,
  df_beta = NULL,
  lambda_ao_alpha = NULL,
  lambda_ao_beta = NULL,
  lambda_lg_mu = NULL,
  lambda_lg_sd = NULL,
  xi_mu = NULL,
  xi_sd = NULL,
  r_alpha = NULL,
  r_beta = NULL,
  theta1_alpha = NULL,
  theta1_beta = NULL,
  theta2_alpha = NULL,
  theta2_beta = NULL
)

Arguments

beta_sd

SD for normal prior on regression coefficients. Default: 2.5 (weakly informative)

c_sd

SD for normal prior on cutpoints (flexible threshold). Default: 10

c1_mu

Mean for normal prior on first cutpoint (equidistant threshold). Default: 0

c1_sd

SD for normal prior on first cutpoint (equidistant threshold). Default: 10

d_alpha

Gamma shape for interval d (equidistant threshold). Default: 2

d_beta

Gamma rate for interval d (equidistant threshold). Default: 0.5

cpos_sd

SD for half-normal prior on positive cutpoints (symmetric threshold). Default: 5

df_alpha

Gamma shape for tlink df. Default: 2

df_beta

Gamma rate for tlink df. Default: 0.1

lambda_ao_alpha

Gamma shape for aranda_ordaz lambda. Default: 0.5

lambda_ao_beta

Gamma rate for aranda_ordaz lambda. Default: 0.5

lambda_lg_mu

Normal mean for log_gamma lambda. Default: 0

lambda_lg_sd

Normal SD for log_gamma lambda. Default: 1

xi_mu

Normal mean for GEV xi. Default: 0

xi_sd

Normal SD for GEV xi. Default: 2

r_alpha

Gamma shape for SP r. Default: 0.5

r_beta

Gamma rate for SP r. Default: 0.5

theta1_alpha

Gamma shape for AEP theta1. Default: 2

theta1_beta

Gamma rate for AEP theta1. Default: 1

theta2_alpha

Gamma shape for AEP theta2. Default: 2

theta2_beta

Gamma rate for AEP theta2. Default: 1

Value

An object of class "clm_prior" containing prior specifications.

Examples

# Create a prior object (does not require Stan)
my_prior <- clm_prior(beta_sd = 2, c_sd = 5)
print(my_prior)

## Not run: 
# Examples below require CmdStan and compiled Stan models
data(wine, package = "ordinal")

# Default priors (no customization needed)
fit <- clm_stan(rating ~ temp, data = wine,
                chains = 2, iter = 500, warmup = 250, refresh = 0)

# Custom prior for regression coefficients
fit2 <- clm_stan(rating ~ temp, data = wine,
                 prior = clm_prior(beta_sd = 1),
                 chains = 2, iter = 500, warmup = 250, refresh = 0)

## End(Not run)

Fit a Cumulative Link Model using CmdStanR

Description

Fit a Cumulative Link Model using CmdStanR

Usage

clm_stan(
  formula,
  data,
  link = "logit",
  base = "logit",
  threshold = "flexible",
  link_param = NULL,
  prior = NULL,
  chains = 4,
  iter = 2000,
  warmup = NULL,
  ...
)

Arguments

formula

A formula specifying the model (response ~ predictors)

data

A data frame containing the variables in the formula

link

Link function. One of "logit" (default), "probit", "cloglog", "loglog", "cauchit", "tlink", "gev", "aep", "sp", "aranda_ordaz", "log_gamma"

base

Base distribution for SP link. One of "logit" (default), "probit", "cloglog", "loglog", "cauchit", "tlink". Ignored for other link functions.

threshold

Threshold structure. One of "flexible" (default), "equidistant", "symmetric"

link_param

A list of link parameters. For flexible links, values can be:

  • Numeric: Use as fixed value (e.g., list(df = 8))

  • "estimate": Estimate the parameter with Bayesian inference

prior

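Prior specification. Can be either a distribution-based specification created by prior() (several can be combined with c()) or a clm_prior object created by clm_prior(). If NULL (default), the default priors are used.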

chains

Number of MCMC chains (default: 4)

iter

Total iterations per chain (default: 2000)

warmup

Warmup iterations per chain. If NULL (default), uses floor(iter/2)

...

Additional arguments passed to the $sample() method of the CmdStanModel object (cmdstanr)
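The relationship between iter and warmup can be sketched as below. The default floor(iter/2) comes from the argument description; the split into iter_warmup and iter_sampling (the argument names used by cmdstanr's $sample() method) and the rule iter_sampling = iter - warmup are assumptions about how the totals might be translated, and resolve_iterations is a hypothetical name.

```r
# Sketch: resolve the warmup default and split the total into
# warmup and sampling iterations per chain.
resolve_iterations <- function(iter = 2000, warmup = NULL) {
  if (is.null(warmup)) warmup <- floor(iter / 2)
  stopifnot(warmup >= 0, warmup < iter)
  list(iter_warmup = warmup, iter_sampling = iter - warmup)
}

default_split <- resolve_iterations()
short_run     <- resolve_iterations(iter = 500, warmup = 250)
```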

Value

An object of class "clmstan"

Examples

## Not run: 
# Fit a proportional odds model
library(ordinal)
data(wine)
fit <- clm_stan(rating ~ temp + contact, data = wine, link = "logit")
print(fit)

# Fit with t-link (fixed df)
fit_t <- clm_stan(rating ~ temp, data = wine, link = "tlink",
                  link_param = list(df = 8))

# Fit with GEV link (estimate xi)
fit_gev <- clm_stan(rating ~ temp, data = wine, link = "gev",
                    link_param = list(xi = "estimate"))

## End(Not run)

clmstan S3 Class

Description

The clmstan class represents a fitted cumulative link model. It contains the CmdStanR fit object and additional metadata.

Slots

fit

The CmdStanMCMC object from cmdstanr

formula

The model formula

data

The original data frame

link

The link function used

base

The base distribution (for SP link)

threshold

The threshold structure

link_param

Link parameter settings (for flexible links)

full

TRUE if link parameters were estimated (Bayesian inference), FALSE if they were fixed at user-specified values

K

Number of response categories (cached from data)

N

Number of observations (extracted from data for efficiency)

P

Number of predictors (extracted from design matrix)


Complementary log-log CDF

Description

F(x) = 1 - exp(-exp(x))

Usage

cloglog_cdf(x)

Arguments

x

Numeric vector

Details

This is the CDF of the Gumbel (minimum) distribution.

Value

CDF values


Extract coefficients from clmstan objects

Description

Returns posterior point estimates (mean or median) for all model parameters.

Usage

## S3 method for class 'clmstan'
coef(object, type = c("mean", "median"), ...)

Arguments

object

A clmstan object

type

Type of point estimate: "mean" (default) or "median"

...

Additional arguments (ignored)

Value

A named numeric vector with:

Examples

## Not run: 
data(wine, package = "ordinal")
fit <- clm_stan(rating ~ temp, data = wine)
coef(fit)
coef(fit, type = "median")

## End(Not run)


Compute category probabilities for all draws

Description

Compute category probabilities for all draws

Usage

compute_probs_all(draws, X, link, base_link_param, K, full = FALSE)

Arguments

draws

Output from extract_prediction_draws()

X

Design matrix (N x P)

link

Link function name

base_link_param

Fixed link parameters

K

Number of categories

full

Whether link parameters were estimated

Value

Array of S x N x K probabilities


Compute category probabilities for a single draw

Description

Compute category probabilities for a single draw

Usage

compute_probs_single(c_vec, eta_vec, link, link_param, K)

Arguments

c_vec

Vector of K-1 cutpoints

eta_vec

Vector of N linear predictors

link

Link function name

link_param

Link parameters (for this draw)

K

Number of categories

Value

Matrix of N x K probabilities
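For one draw, category probabilities in a cumulative link model are differences of adjacent cumulative probabilities. The sketch below assumes the common parameterization P(Y <= k) = F(c_k - eta); the sign convention and the function name compute_probs_sketch are assumptions, and a plain CDF function stands in for the link and link_param arguments.

```r
# Sketch: N x K category probabilities from K-1 cutpoints and N linear predictors.
compute_probs_sketch <- function(c_vec, eta_vec, cdf = stats::plogis) {
  K <- length(c_vec) + 1
  # Cumulative probabilities P(Y <= k) = F(c_k - eta), one row per observation
  cum <- cdf(outer(-eta_vec, c_vec, "+"))
  # Pad with P(Y <= 0) = 0 and P(Y <= K) = 1, then difference adjacent columns
  cum <- cbind(0, cum, 1)
  probs <- cum[, -1, drop = FALSE] - cum[, -(K + 1), drop = FALSE]
  colnames(probs) <- paste0("P[Y=", seq_len(K), "]")
  probs
}

probs <- compute_probs_sketch(c_vec = c(-1, 0, 1.5), eta_vec = c(-0.5, 0, 2))
```

Each row of probs sums to 1 because the cumulative probabilities telescope.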


Compute relative effective sample size for log-likelihood

Description

Compute relative effective sample size for log-likelihood

Usage

compute_r_eff(object, log_lik, cores = 1)

Arguments

object

A clmstan object

log_lik

Log-likelihood matrix (S x N)

cores

Number of cores for parallel computation

Value

A vector of relative effective sample sizes (length N)


Convert Prior Specification to Internal Format

Description

Internal function to convert distribution-based prior specifications to the internal format used by apply_priors_to_stan_data().

Usage

convert_prior_spec_to_legacy(prior)

Arguments

prior

A clm_prior_spec, clm_prior_list, or clm_prior object

Value

A list with clm_prior class containing prior parameter values


MCMC Diagnostics for clmstan objects

Description

Provides a summary of MCMC convergence diagnostics including HMC-specific diagnostics (divergences, treedepth, E-BFMI) and general convergence measures (Rhat, ESS).

Usage

diagnostics(object, ...)

## S3 method for class 'clmstan'
diagnostics(
  object,
  detail = FALSE,
  rhat_threshold = 1.01,
  ess_threshold = 400,
  ...
)

Arguments

object

A clmstan object

...

Additional arguments (ignored)

detail

Logical. If TRUE, show full parameter-level diagnostics table. If FALSE (default), show only summary and any problematic parameters.

rhat_threshold

Threshold for flagging high Rhat values. Default 1.01.

ess_threshold

Threshold for flagging low ESS values. Default 400.
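The two thresholds can be applied to a parameter summary table as sketched below. The column names rhat and ess_bulk follow the default output of posterior::summarise_draws(); which ESS measure the package actually checks is an assumption, the function name is hypothetical, and the numbers are made up for illustration.

```r
# Sketch: return the rows of a summary table that breach either threshold.
flag_parameters_sketch <- function(summ, rhat_threshold = 1.01, ess_threshold = 400) {
  bad_rhat <- !is.na(summ$rhat) & summ$rhat > rhat_threshold
  bad_ess  <- !is.na(summ$ess_bulk) & summ$ess_bulk < ess_threshold
  summ[bad_rhat | bad_ess, , drop = FALSE]
}

# Illustrative values only
summ <- data.frame(
  variable = c("beta[1]", "c[1]", "c[2]"),
  rhat     = c(1.002, 1.030, 1.001),
  ess_bulk = c(1800, 950, 210)
)
flagged <- flag_parameters_sketch(summ)
```

In this toy table, c[1] is flagged for its Rhat and c[2] for its low ESS.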

Details

The function checks for the following issues:

Value

Invisibly returns a list containing:

Examples

## Not run: 
data(wine, package = "ordinal")
fit <- clm_stan(rating ~ temp, data = wine)
diagnostics(fit)
diagnostics(fit, detail = TRUE)

## End(Not run)


Extract ACF values from clmstan object

Description

Computes autocorrelation function (ACF) values for MCMC chains and returns them in a tidy data frame format.

Usage

extract_acf(object, pars = NULL, lags = 20, ...)

Arguments

object

A clmstan object

pars

Character vector of parameter names. If NULL (default), uses beta, c_transformed (except first), and beta0.

lags

Maximum number of lags to compute. Default is 20.

...

Additional arguments (ignored)

Details

The ACF measures how correlated each draw is with previous draws in the same chain. High autocorrelation at many lags indicates slow mixing and the need for more samples or reparameterization.

Ideally, ACF should drop to near zero within a few lags. Persistent high autocorrelation suggests the sampler is exploring the posterior slowly.
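The per-chain computation described above can be sketched with stats::acf(). The input layout (an iterations x chains matrix for a single parameter) and the function name are assumptions; the simulated chains are only there to contrast slow and fast mixing.

```r
# Sketch: tidy ACF values for each chain of one parameter.
acf_by_chain_sketch <- function(draws, lags = 20) {
  # draws: iterations x chains matrix
  out <- lapply(seq_len(ncol(draws)), function(ch) {
    a <- stats::acf(draws[, ch], lag.max = lags, plot = FALSE)
    data.frame(chain = ch, lag = as.vector(a$lag), acf = as.vector(a$acf))
  })
  do.call(rbind, out)
}

set.seed(1)
draws <- cbind(
  as.numeric(stats::arima.sim(list(ar = 0.8), n = 500)),  # strongly autocorrelated
  stats::rnorm(500)                                       # independent draws
)
acf_df <- acf_by_chain_sketch(draws, lags = 10)
```

The first simulated chain keeps a high ACF over several lags, while the second drops to near zero after lag 0.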

Value

A data frame with columns that include parameter, chain, lag, and acf (the columns used in the example).

Examples

## Not run: 
data(wine, package = "ordinal")
fit <- clm_stan(rating ~ temp, data = wine)
acf_df <- extract_acf(fit)
head(acf_df)

# Plot ACF for specific parameters
library(ggplot2)
acf_df |>
  dplyr::filter(parameter == "beta[1]") |>
  ggplot(aes(x = lag, y = acf, color = factor(chain))) +
  geom_line() +
  geom_hline(yintercept = 0, linetype = "dashed")

## End(Not run)


Extract link parameter draws for full model

Description

Extract link parameter draws for full model

Usage

extract_link_param_draws(object)

Arguments

object

A clmstan object

Value

List of link parameter vectors


Extract posterior summary for link parameters

Description

Extract posterior summary for link parameters

Usage

extract_link_params_summary(object, probs = c(0.025, 0.5, 0.975))

Arguments

object

A clmstan object

probs

Quantile probabilities

Value

A data frame with posterior summaries, or NULL if no estimated params


Extract log-likelihood matrix from clmstan object

Description

Extract log-likelihood matrix from clmstan object

Usage

extract_log_lik(object)

Arguments

object

A clmstan object

Value

An S x N matrix of log-likelihood values, where S is the number of posterior samples and N is the number of observations.
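As an illustration of how an S x N log-likelihood matrix is typically consumed, the sketch below computes the log pointwise predictive density (lppd) with a log-mean-exp over draws. This is a simple in-sample summary, not the PSIS-LOO estimate; the function names are hypothetical and the matrix is simulated.

```r
# Numerically stable log(mean(exp(v)))
log_mean_exp <- function(v) {
  m <- max(v)
  m + log(mean(exp(v - m)))
}

# Sketch: lppd = sum over observations of log(mean over draws of the likelihood)
lppd_sketch <- function(log_lik) sum(apply(log_lik, 2, log_mean_exp))

set.seed(2)
# Simulated S = 40 draws x N = 5 observations of log-likelihood values
log_lik <- matrix(log(stats::runif(40 * 5, 0.1, 0.9)), nrow = 40, ncol = 5)
lppd <- lppd_sketch(log_lik)
```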


Extract posterior summary for specified parameters

Description

Extract posterior summary for specified parameters

Usage

extract_posterior_summary(fit, pars, probs = c(0.025, 0.5, 0.975))

Arguments

fit

A CmdStanMCMC object

pars

Character vector of parameter names

probs

Quantile probabilities

Value

A data frame with posterior summaries


Extract posterior draws needed for prediction

Description

Extract posterior draws needed for prediction

Usage

extract_prediction_draws(object, ndraws = NULL)

Arguments

object

A clmstan object

ndraws

Number of draws to extract

Value

List with c (cutpoints), beta (coefficients), link_params (if full model)


Extract y_rep from Stan fit

Description

Extract y_rep from Stan fit

Usage

extract_y_rep(object, ndraws = NULL)

Arguments

object

A clmstan object

ndraws

Number of draws to extract

Value

S x N integer matrix


Fitted values for clmstan objects

Description

Returns expected category probabilities for each observation. This is equivalent to predict(object, type = "probs", summary = TRUE).

Usage

## S3 method for class 'clmstan'
fitted(
  object,
  newdata = NULL,
  summary = TRUE,
  robust = FALSE,
  probs = c(0.025, 0.975),
  ndraws = NULL,
  ...
)

Arguments

object

A clmstan object returned by clm_stan().

newdata

Optional data frame for prediction. If NULL (default), predictions are made for the original training data.

summary

Logical. If TRUE (default), return summary statistics (mean, SD, quantiles). If FALSE, return raw posterior draws.

robust

Logical. If TRUE, use median instead of mean for point estimates. Default is FALSE.

probs

Numeric vector of probabilities for quantiles. Default is c(0.025, 0.975) for 95% credible intervals.

ndraws

Number of posterior draws to use. If NULL (default), all available draws are used.

...

Additional arguments (currently ignored).

Value

If summary = TRUE (default): A data frame with N rows and columns for each category probability (P[Y=1], P[Y=2], etc.). If summary = FALSE: An S x N x K array of probability draws.
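The relationship between the two return shapes can be sketched as follows: an S x N x K array of probability draws is reduced over the draw dimension to an N x K table. The function name and the simulated draws are hypothetical, and the real method also reports SDs and quantiles, which this sketch omits.

```r
# Sketch: collapse S x N x K probability draws to per-observation point estimates.
summarise_probs_sketch <- function(p_draws, robust = FALSE) {
  center <- if (robust) stats::median else mean
  est <- apply(p_draws, c(2, 3), center)   # N x K
  colnames(est) <- paste0("P[Y=", seq_len(dim(p_draws)[3]), "]")
  as.data.frame(est)
}

set.seed(3)
S <- 50; N <- 4; K <- 3
raw <- array(stats::rexp(S * N * K), dim = c(S, N, K))
# Normalise over the category dimension so each (draw, observation) sums to 1
p_draws <- raw / as.vector(apply(raw, c(1, 2), sum))
fit_df <- summarise_probs_sketch(p_draws)
```

With the mean as the point estimate, each row of fit_df sums to 1; with robust = TRUE (medians) that is not guaranteed.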

See Also

predict.clmstan(), posterior_predict.clmstan()


Flat (Improper Uniform) Prior Distribution

Description

Creates a flat (improper uniform) distribution object for use with prior(). A flat prior assigns equal probability density to all values, which is improper (does not integrate to 1) but can be used when the likelihood provides sufficient information for identification.

Usage

flat()

Value

An object of class "clm_dist" representing a flat distribution.

Note

Flat priors are supported for:

Using flat priors may lead to improper posteriors if the likelihood does not provide sufficient information. For thresholds with ordered constraints, Stan's internal transformation provides implicit regularization.

See Also

prior(), normal(), student_t(), cauchy()

Examples

# Create a flat prior for regression coefficients
prior(flat(), class = "b")

# Flat prior for thresholds (flexible)
prior(flat(), class = "Intercept")

Format a distribution object as a string

Description

Format a distribution object as a string

Usage

format_dist(x)

Arguments

x

A clm_dist object

Value

A character string representation


Gamma Distribution for Prior Specification

Description

Creates a gamma distribution object for use with prior().

Usage

gamma(alpha, beta)

Arguments

alpha

Shape parameter of the gamma distribution. Must be positive.

beta

Rate parameter of the gamma distribution. Must be positive.

Value

An object of class "clm_dist" representing a gamma distribution.

Note

This function masks base::gamma(). To use the base gamma function, use base::gamma() explicitly.
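The effect of the masking can be illustrated without loading the package. The gamma() defined below is a toy stand-in for a prior constructor (its class name toy_dist is hypothetical); the point is that base::gamma() remains available through its namespace.

```r
# Toy stand-in for a prior constructor that masks base::gamma()
gamma <- function(alpha, beta) {
  stopifnot(alpha > 0, beta > 0)
  structure(list(family = "gamma", alpha = alpha, beta = beta), class = "toy_dist")
}

prior_obj  <- gamma(2, 0.1)    # a distribution object, not a number
gamma_five <- base::gamma(5)   # the mathematical gamma function: 4! = 24
```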

See Also

prior(), normal(), student_t(), cauchy()

Examples

# Create a gamma prior
gamma(2, 0.1)

# Use with prior() for degrees of freedom
prior(gamma(2, 0.1), class = "df")

Get base (fixed) link parameters from object

Description

Get base (fixed) link parameters from object

Usage

get_base_link_params(object)

Arguments

object

A clmstan object

Value

List of fixed link parameters


Get draws for plotting with bayesplot

Description

Returns draws for default parameters (excluding log_lik, y_rep, eta, raw c).

Usage

get_clmstan_draws(object, pars = NULL)

Arguments

object

A clmstan object

pars

Character vector of parameter names (NULL for default)

Value

A draws_array suitable for bayesplot


Get coefficient names from design matrix

Description

Get coefficient names from design matrix

Usage

get_coef_names(object)

Arguments

object

A clmstan object

Value

A character vector of coefficient names


Get default prior for a link parameter

Description

Get default prior for a link parameter

Usage

get_default_link_prior(link, param)

Arguments

link

A character string specifying the link function

param

A character string specifying the parameter name

Value

A list with prior specification (family, args)


Get default priors for a threshold structure

Description

Get default priors for a threshold structure

Usage

get_default_priors(threshold = "flexible")

Arguments

threshold

A character string specifying the threshold structure ("flexible", "equidistant", or "symmetric")

Value

A list with default prior values


Get default priors for threshold parameters

Description

Get default priors for threshold parameters

Usage

get_default_threshold_prior(threshold)

Arguments

threshold

A character string specifying the threshold structure

Details

Default priors:

Value

A list with default prior specifications


Get link parameters for a specific draw

Description

Get link parameters for a specific draw

Usage

get_link_param_for_draw(link_param_draws, s, base_param)

Arguments

link_param_draws

List of link parameter vectors

s

Draw index

base_param

Base (fixed) parameters

Value

Link parameters for draw s


Get required parameters for a link function

Description

Get required parameters for a link function

Usage

get_link_params(link)

Arguments

link

A character string specifying the link function

Value

A character vector of required parameter names


Description

Converts a link function name to the integer code used in Stan.

Usage

get_link_type(link)

Arguments

link

A character string specifying the link function

Details

Link type mapping:

Value

An integer (1-11) representing the link type


Get Stan model name based on threshold and full model flag

Description

Get Stan model name based on threshold and full model flag

Usage

get_model_name(threshold, full = FALSE)

Arguments

threshold

A character string specifying the threshold structure

full

Logical indicating whether link parameters are estimated

Value

A character string with the Stan model name


Get base type number for SP link

Description

Converts a base distribution name to the integer code used in Stan for SP link.

Usage

get_sp_base_type(base)

Arguments

base

A character string specifying the base distribution

Details

Base type mapping (symmetric distributions only, per Li et al. 2019):

Value

An integer (1-6) representing the base type


Generate threshold names in ordinal::clm style

Description

Creates threshold names like "1|2", "2|3", etc. based on factor levels.

Usage

get_threshold_names(object)

Arguments

object

A clmstan object

Value

A character vector of threshold names
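The naming scheme can be sketched directly from the factor levels. The helper below takes the response factor rather than a clmstan object, and its name is hypothetical.

```r
# Sketch: build "1|2", "2|3", ... from adjacent factor levels.
threshold_names_sketch <- function(y) {
  lev <- levels(y)
  paste(lev[-length(lev)], lev[-1], sep = "|")
}

rating <- factor(c(1, 3, 2, 2), levels = 1:3, ordered = TRUE)
thr <- threshold_names_sketch(rating)
```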


Get threshold parameter information

Description

Get threshold parameter information

Usage

get_threshold_params(threshold)

Arguments

threshold

A character string specifying the threshold structure

Details

Threshold parameters:

Value

A list with parameter names and their Stan types


Generalized Extreme Value (GEV) link CDF

Description

F(x; xi) = exp(-(1 + xi * x)^(-1/xi))

Usage

gev_cdf(x, xi)

Arguments

x

Numeric vector

xi

Shape parameter

Details

Special cases:

Reference: Wang & Dey (2011)

Value

CDF values
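The formula in the Description can be sketched as below, with two additions that are assumptions rather than documented behaviour: the xi -> 0 limit is handled explicitly (it equals exp(-exp(-x)), the log-log CDF), and values outside the support, where 1 + xi * x <= 0, are set to 0 or 1. The function name is hypothetical.

```r
# Sketch of F(x; xi) = exp(-(1 + xi * x)^(-1/xi)) with boundary handling.
gev_cdf_sketch <- function(x, xi) {
  if (abs(xi) < 1e-10) return(exp(-exp(-x)))   # xi -> 0 limit
  z <- 1 + xi * x
  inside <- z > 0
  # Outside the support: below the lower bound (xi > 0) the CDF is 0,
  # above the upper bound (xi < 0) it is 1.
  out <- if (xi > 0) rep(0, length(x)) else rep(1, length(x))
  out[inside] <- exp(-z[inside]^(-1 / xi))
  out
}

x <- seq(-2, 2, by = 0.5)
gev_limit <- gev_cdf_sketch(x, xi = 0)
gev_small <- gev_cdf_sketch(x, xi = 1e-6)
```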


Check if object is clmstan

Description

Check if object is clmstan

Usage

is.clmstan(x)

Arguments

x

An object to test

Value

TRUE if x is a clmstan object


Check if a link function requires parameters

Description

Check if a link function requires parameters

Usage

is_flexible_link(link)

Arguments

link

A character string specifying the link function

Value

TRUE if the link requires additional parameters


Description

clmstan supports the following link functions for cumulative link models:

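Standard links (no additional parameters): logit, probit, cloglog, loglog, cauchit

Flexible links (with additional parameters): tlink (df), aranda_ordaz (lambda), gev (xi), sp (r), log_gamma (lambda), aep (theta1, theta2)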

Link Parameter Specification

Flexible link parameters can be either fixed or estimated (inferred).

Fixed parameters: Specify a numeric value

clm_stan(y ~ x, link = "tlink", link_param = list(df = 8))
clm_stan(y ~ x, link = "gev", link_param = list(xi = 0))  # equals loglog
clm_stan(y ~ x, link = "aep", link_param = list(theta1 = 2, theta2 = 2))  # symmetric

Estimated parameters: Use "estimate" (with default prior)

clm_stan(y ~ x, link = "tlink", link_param = list(df = "estimate"))
clm_stan(y ~ x, link = "gev", link_param = list(xi = "estimate"))

Custom priors: Combine "estimate" with prior argument

clm_stan(y ~ x, link = "gev",
         link_param = list(xi = "estimate"),
         prior = prior(normal(0, 0.3), class = "xi"))

Default Priors for Link Parameters

When using "estimate", the following default priors are used:

Link          Parameter  Default Prior    Notes
tlink         df         gamma(2, 0.1)    Mode at 10, allows heavy tails
aranda_ordaz  lambda     gamma(0.5, 0.5)  Prior mean 1 (logit)
gev           xi         normal(0, 2)     Weakly informative, Wang & Dey (2011)
sp            r          gamma(0.5, 0.5)  Prior mean 1 (base distribution)
log_gamma     lambda     normal(0, 1)     Centered at 0 (probit)
aep           theta1     gamma(2, 1)      Mode at 1, symmetric at theta1 = theta2
aep           theta2     gamma(2, 1)      Mode at 1, symmetric at theta1 = theta2
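The Notes column can be checked against the closed-form mean and mode of a gamma(shape, rate) distribution. The sketch below uses only base R; the helper name and row labels are hypothetical.

```r
# Mean and mode of gamma(alpha, beta) with beta as a rate.
# The mode (alpha - 1) / beta exists only for alpha >= 1.
gamma_summary <- function(alpha, beta) {
  c(mean = alpha / beta,
    mode = if (alpha >= 1) (alpha - 1) / beta else NA_real_)
}

defaults <- rbind(
  tlink_df            = gamma_summary(2, 0.1),
  aranda_ordaz_lambda = gamma_summary(0.5, 0.5),
  sp_r                = gamma_summary(0.5, 0.5),
  aep_theta           = gamma_summary(2, 1)
)
```

For gamma(0.5, 0.5) the shape is below 1, so there is no interior mode; the value 1 in the table corresponds to the prior mean.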

SP Link Details (Li et al., 2019)

The Symmetric Power link uses a symmetric base distribution F_0, specified via the base argument. Supported bases:

Note: Li et al. (2019) define F_0 as a CDF "whose corresponding PDF is symmetric about 0".


Log-gamma link CDF

Description

Based on the log-gamma distribution:

Usage

loggamma_cdf(x, lambda)

Arguments

x

Numeric vector

lambda

Shape parameter

Details

Reference: Prentice (1976) Biometrics

Value

CDF values


Log-log CDF

Description

F(x) = exp(-exp(-x))

Usage

loglog_cdf(x)

Arguments

x

Numeric vector

Details

This is the CDF of the Gumbel (maximum) distribution. It is the reflection of cloglog: loglog(x) = 1 - cloglog(-x)

Value

CDF values
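The reflection identity in Details can be verified numerically with the two CDF formulas given in this manual. The function names below are hypothetical stand-ins for loglog_cdf() and cloglog_cdf().

```r
cloglog_cdf_sketch <- function(x) -expm1(-exp(x))   # 1 - exp(-exp(x))
loglog_cdf_sketch  <- function(x) exp(-exp(-x))

x <- seq(-3, 3, by = 0.5)
# loglog(x) = 1 - cloglog(-x)
reflection_holds <- isTRUE(all.equal(loglog_cdf_sketch(x),
                                     1 - cloglog_cdf_sketch(-x)))
```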


Leave-One-Out Cross-Validation for clmstan objects

Description

Computes approximate leave-one-out cross-validation (LOO-CV) for a fitted cumulative link model using Pareto smoothed importance sampling (PSIS).

Usage

## S3 method for class 'clmstan'
loo(x, ..., r_eff = NULL, cores = getOption("mc.cores", 1), save_psis = FALSE)

Arguments

x

A clmstan object returned by clm_stan.

...

Additional arguments passed to loo.

r_eff

A vector of relative effective sample sizes for each observation, or NULL (default) to compute them automatically using relative_eff. Set to NA to skip r_eff computation (faster but diagnostics may be over-optimistic).

cores

The number of cores to use for parallel computation. Defaults to getOption("mc.cores", 1).

save_psis

If TRUE, the PSIS object is saved in the returned object. This is required for some downstream functions like E_loo(). Default is FALSE.

Details

The function extracts the log-likelihood matrix (log_lik) computed in the generated quantities block of the Stan model and passes it to loo.

Pareto k diagnostics: Observations with high Pareto k values (k > 0.7) indicate potential problems with the LOO approximation for those observations. Use plot() on the returned object to visualize the Pareto k values.

Model comparison: Use loo_compare to compare multiple models. Models with higher elpd_loo are preferred.

Value

An object of class c("psis_loo", "loo") containing:

See Also

waic.clmstan for WAIC computation, loo for details on the LOO algorithm, loo_compare for model comparison.

Examples

## Not run: 
data(wine, package = "ordinal")
fit <- clm_stan(rating ~ temp, data = wine)
loo_result <- loo(fit)
print(loo_result)
plot(loo_result)

# Compare two models
fit1 <- clm_stan(rating ~ temp, data = wine, link = "logit")
fit2 <- clm_stan(rating ~ temp, data = wine, link = "probit")
loo::loo_compare(loo(fit1), loo(fit2))

## End(Not run)


Create design matrix

Description

Creates the design matrix for predictors (without intercept column).

Usage

make_design_matrix(formula, data)

Arguments

formula

A formula object or terms (RHS only)

data

A data frame

Value

A design matrix (N x P)
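A design matrix without an intercept column can be built with stats::model.matrix() as sketched below. Dropping the "(Intercept)" column after construction (rather than using a "- 1" formula) keeps treatment contrasts for factors; whether the package does it this way is an assumption, and the function name and toy data are hypothetical.

```r
# Sketch: N x P design matrix for the predictors, intercept column removed.
make_design_matrix_sketch <- function(formula, data) {
  rhs <- stats::delete.response(stats::terms(formula, data = data))
  X <- stats::model.matrix(rhs, data = data)
  X[, colnames(X) != "(Intercept)", drop = FALSE]
}

# Toy data for illustration
dat <- data.frame(
  rating  = factor(c(1, 2, 3, 2), ordered = TRUE),
  temp    = factor(c("cold", "warm", "cold", "warm")),
  contact = factor(c("no", "no", "yes", "yes"))
)
X <- make_design_matrix_sketch(rating ~ temp + contact, dat)
```

The intercept is omitted because in a cumulative link model the cutpoints play that role.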


Map Parameter Class and Distribution to Legacy Parameter Names

Description

Map Parameter Class and Distribution to Legacy Parameter Names

Usage

map_class_to_params(class, dist)

Arguments

class

The parameter class

dist

A clm_dist object

Value

A named list of legacy parameter values


Check if any link parameter requires estimation

Description

Check if any link parameter requires estimation

Usage

needs_full_model(link_param)

Arguments

link_param

A list of link parameters

Value

TRUE if any parameter is set to "estimate"
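The check described above amounts to scanning the list for the string "estimate". A minimal sketch, with a hypothetical function name:

```r
# Sketch: TRUE if any link parameter is set to "estimate".
needs_full_model_sketch <- function(link_param) {
  if (length(link_param) == 0) return(FALSE)
  any(vapply(link_param, function(p) identical(p, "estimate"), logical(1)))
}

fixed_only <- needs_full_model_sketch(list(df = 8))
mixed      <- needs_full_model_sketch(list(theta1 = 2, theta2 = "estimate"))
```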


Create a clmstan object (internal constructor)

Description

This is an internal constructor called by clm_stan(). Users should not call this function directly.

Usage

new_clmstan(
  fit,
  formula,
  data,
  link,
  base,
  threshold,
  link_param = NULL,
  full = FALSE,
  K,
  N,
  P
)

Arguments

fit

A CmdStanMCMC object

formula

The model formula

data

The original data

link

The link function

base

The base distribution (for SP link)

threshold

The threshold structure

link_param

Link parameter settings

full

Whether full model was used

K

Number of categories

N

Number of observations

P

Number of predictors

Value

An object of class "clmstan"


Normal Distribution for Prior Specification

Description

Creates a normal distribution object for use with prior().

Usage

normal(mu = 0, sigma = 1)

Arguments

mu

Mean of the normal distribution. Default: 0

sigma

Standard deviation of the normal distribution. Must be positive. Default: 1

Value

An object of class "clm_dist" representing a normal distribution.

See Also

prior(), gamma(), student_t(), cauchy()

Examples

# Create a normal prior
normal(0, 2.5)

# Use with prior()
prior(normal(0, 2.5), class = "b")

Parse formula for CLM

Description

Parse formula for CLM

Usage

parse_clm_formula(formula, data)

Arguments

formula

A formula object

data

A data frame

Value

A list with response and predictor_formula


Plot method for clmstan objects

Description

Produces diagnostic plots using the bayesplot package.

Usage

## S3 method for class 'clmstan'
plot(
  x,
  type = c("trace", "dens", "hist", "areas", "intervals", "acf"),
  pars = NULL,
  ...
)

Arguments

x

A clmstan object

type

Type of plot: "trace" (default), "dens", "hist", "areas", "intervals", or "acf" (autocorrelation).

pars

Character vector of parameter names to plot. If NULL, plots beta, c_transformed (except first), and beta0.

...

Additional arguments passed to bayesplot functions. For "acf" type, you can use lags to control the number of lags.

Value

A ggplot object

Examples

## Not run: 
data(wine, package = "ordinal")
fit <- clm_stan(rating ~ temp, data = wine)
plot(fit)                    # trace plots
plot(fit, type = "dens")     # density plots
plot(fit, type = "intervals") # credible intervals
plot(fit, type = "acf")      # autocorrelation plots
plot(fit, pars = "beta")     # only beta parameters

## End(Not run)


Posterior predictive distribution for clmstan objects

Description

Draws from the posterior predictive distribution. For each posterior sample, a predicted category is sampled from the categorical distribution with the predicted probabilities.

Usage

posterior_predict.clmstan(object, newdata = NULL, ndraws = NULL, ...)

Arguments

object

A clmstan object returned by clm_stan().

newdata

Optional data frame for prediction. If NULL (default), predictions are made for the original training data.

ndraws

Number of posterior draws to use. If NULL (default), all available draws are used.

...

Additional arguments (currently ignored).

Value

An integer matrix of dimension S x N containing predicted categories (1 to K), where S is the number of posterior draws and N is the number of observations.

See Also

predict.clmstan(), fitted.clmstan()
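
The sampling step described above can be sketched in base R. The S x N x K probability array here is simulated, and `draw_categories` is a hypothetical name; the package's own implementation may differ.

```r
set.seed(1)
S <- 4; N <- 3; K <- 3

# Simulated stand-in for predicted probabilities: probs[s, n, ] sums to 1
raw <- array(runif(S * N * K), dim = c(S, N, K))
probs <- raw / as.vector(apply(raw, c(1, 2), sum))

# Hypothetical helper: one categorical draw per posterior draw and observation
draw_categories <- function(probs) {
  S <- dim(probs)[1]; N <- dim(probs)[2]; K <- dim(probs)[3]
  y_rep <- matrix(NA_integer_, nrow = S, ncol = N)
  for (s in seq_len(S)) {
    for (n in seq_len(N)) {
      y_rep[s, n] <- sample.int(K, size = 1L, prob = probs[s, n, ])
    }
  }
  y_rep
}

y_rep <- draw_categories(probs)
dim(y_rep)  # S rows (draws) by N columns (observations)
```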


Predict method for clmstan objects

Description

Generates predictions from a fitted cumulative link model.

Usage

## S3 method for class 'clmstan'
predict(
  object,
  newdata = NULL,
  type = c("class", "probs"),
  summary = TRUE,
  robust = FALSE,
  probs = c(0.025, 0.975),
  ndraws = NULL,
  ...
)

Arguments

object

A clmstan object returned by clm_stan().

newdata

Optional data frame for prediction. If NULL (default), predictions are made for the original training data.

type

Type of prediction:

  • "class": Predicted category (most likely class)

  • "probs": Predicted probabilities for each category

summary

Logical. If TRUE (default), return summary statistics (mean, SD, quantiles). If FALSE, return raw posterior draws.

robust

Logical. If TRUE, use median instead of mean for point estimates. Default is FALSE.

probs

Numeric vector of probabilities for quantiles. Default is c(0.025, 0.975) for 95% credible intervals.

ndraws

Number of posterior draws to use. If NULL (default), all available draws are used.

...

Additional arguments (currently ignored).

Value

Depending on type and summary:

See Also

fitted.clmstan() for expected probabilities, posterior_predict.clmstan() for posterior predictive samples.
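
As a rough illustration of the two prediction types, the sketch below computes category probabilities for a single parameter draw and then picks the most likely class. It assumes the common parameterization P(Y <= k) = F(c_k - eta) with a logit link; `clm_probs` and the numeric values are hypothetical and not taken from the package.

```r
# Hypothetical helper: category probabilities for one draw of the parameters
clm_probs <- function(eta, cutpoints, cdf = stats::plogis) {
  stopifnot(!is.unsorted(cutpoints))
  K <- length(cutpoints) + 1L
  # N x (K - 1) matrix of cumulative probabilities F(c_k - eta)
  cum <- matrix(cdf(outer(-eta, cutpoints, "+")), nrow = length(eta))
  # Pad with P(Y <= 0) = 0 and P(Y <= K) = 1
  cum <- cbind(0, cum, 1)
  # Category probabilities are successive differences
  cum[, -1, drop = FALSE] - cum[, -(K + 1L), drop = FALSE]
}

eta <- c(-1, 0, 2)       # invented linear predictors for three observations
cutpoints <- c(-1, 1)    # invented thresholds, so K = 3 categories
p <- clm_probs(eta, cutpoints)

# "probs"-style output: one row per observation, one column per category
round(p, 3)
# "class"-style output: index of the most likely category
pred_class <- max.col(p, ties.method = "first")
```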


Description

Fills in default values for link parameters and validates them.

Usage

prepare_link_params(link, link_param = NULL)

Arguments

link

Link function name

link_param

A list of user-specified link parameters

Value

A list with all link parameters (df, lambda, xi, r, base_type, theta1, theta2)


Description

Determines which parameters to estimate vs. fix, and sets up priors.

Usage

prepare_link_params_full(link, link_param = NULL, link_prior = NULL)

Arguments

link

Link function name

link_param

A list of user-specified link parameters

link_prior

A list of custom prior specifications

Value

A list with estimation flags, fixed values, and prior hyperparameters


Prepare design matrix for prediction

Description

Prepare design matrix for prediction

Usage

prepare_prediction_matrix(object, newdata)

Arguments

object

A clmstan object

newdata

Data frame for prediction

Value

Design matrix (N x P)
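
A minimal sketch of building an N x P design matrix for new data follows. It assumes the intercept column is dropped because the thresholds play that role, and that factor levels in the new data match those used in training; both are assumptions, and the data are invented.

```r
# Invented training and prediction data
train <- data.frame(
  temp = factor(c("cold", "warm", "cold")),
  x = c(0.1, 0.5, 0.9)
)
newdata <- data.frame(
  temp = factor("warm", levels = levels(train$temp)),  # reuse training levels
  x = 0.3
)

# Build the model matrix, then drop the intercept column
X_new <- stats::model.matrix(~ temp + x, data = newdata)
X_new <- X_new[, colnames(X_new) != "(Intercept)", drop = FALSE]
X_new
```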


Prepare data for Stan model

Description

Prepare data for Stan model

Usage

prepare_stan_data(
  formula,
  data,
  link = "logit",
  link_param = NULL,
  prior_beta_sd = 2.5,
  prior_c_sd = 10
)

Arguments

formula

A formula specifying the model

data

A data frame containing the variables

link

Link function name

link_param

A list of link parameters (for flexible links)

prior_beta_sd

Prior SD for regression coefficients (default: 2.5)

prior_c_sd

Prior SD for cutpoints (default: 10)

Value

A list suitable for passing to CmdStan
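
The kind of list this function returns might look like the sketch below. The element names (N, K, P, y, X and the prior SD entries) are assumptions chosen for illustration; the names expected by the package's Stan programs may differ.

```r
# Invented toy data with an ordered response
d <- data.frame(
  rating = factor(c("low", "mid", "high", "mid"),
                  levels = c("low", "mid", "high"), ordered = TRUE),
  temp = c(0, 1, 1, 0)
)

# Predictors without an intercept column (assumed to be absorbed by cutpoints)
X <- stats::model.matrix(~ temp, data = d)[, -1, drop = FALSE]

# Hypothetical data list; element names are illustrative only
stan_data <- list(
  N = nrow(d),               # number of observations
  K = nlevels(d$rating),     # number of ordinal categories
  P = ncol(X),               # number of predictors
  y = as.integer(d$rating),  # response coded 1..K
  X = X,
  prior_beta_sd = 2.5,       # default from the Usage section
  prior_c_sd = 10            # default from the Usage section
)
str(stan_data)
```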


Dispatch to appropriate data preparation function

Description

Routes to the correct prepare function based on threshold structure and whether link parameters are being estimated.

Usage

prepare_stan_data_dispatch(
  formula,
  data,
  link,
  base,
  link_param,
  threshold,
  full,
  prior
)

Arguments

formula

A formula specifying the model

data

A data frame containing the variables

link

Link function name

base

Base distribution for SP link

link_param

A list of link parameters

threshold

Threshold structure

full

Whether to use full model (estimate link parameters)

prior

A clm_prior object or NULL for default priors

Value

A list suitable for passing to CmdStan


Prepare data for equidistant threshold Stan model

Description

Creates a Stan data list for cumulative link models with equidistant (equally spaced) thresholds: c_k = c_1 + (k-1) * d

Usage

prepare_stan_data_equidistant(
  formula,
  data,
  link = "logit",
  link_param = NULL,
  prior_beta_sd = 2.5,
  prior_c1_mu = 0,
  prior_c1_sd = 10,
  prior_d_alpha = 2,
  prior_d_beta = 0.5
)

Arguments

formula

A formula specifying the model

data

A data frame containing the variables

link

Link function name

link_param

A list of link parameters (for flexible links)

prior_beta_sd

Prior SD for regression coefficients (default: 2.5)

prior_c1_mu

Prior mean for first threshold c1 (default: 0)

prior_c1_sd

Prior SD for first threshold c1 (default: 10)

prior_d_alpha

Gamma prior shape for interval d (default: 2)

prior_d_beta

Gamma prior rate for interval d (default: 0.5)

Value

A list suitable for passing to CmdStan (clm_equidistant.stan)
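
The equidistant structure c_k = c_1 + (k-1) * d can be checked with a few lines of base R. `equidistant_thresholds` is a hypothetical helper written for this illustration, not a package function.

```r
# Hypothetical helper: K categories give K - 1 equally spaced thresholds
equidistant_thresholds <- function(c1, d, K) {
  stopifnot(d > 0, K >= 2)
  c1 + (seq_len(K - 1) - 1) * d
}

thr <- equidistant_thresholds(c1 = -1.5, d = 1, K = 5)
thr        # -1.5 -0.5  0.5  1.5
diff(thr)  # every gap equals d

# The default Gamma(shape = 2, rate = 0.5) prior on d has mean shape / rate
prior_d_mean <- 2 / 0.5
```

Only two threshold parameters (c_1 and d) are estimated regardless of K, which is why the priors above are specified for c1 and d rather than for each cutpoint.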


Prepare data for full Stan model with link parameter inference

Description

Prepare data for full Stan model with link parameter inference

Usage

prepare_stan_data_full(
  formula,
  data,
  link = "logit",
  link_param = NULL,
  prior_beta_sd = 2.5,
  prior_c_sd = 10,
  link_prior = NULL
)

Arguments

formula

A formula specifying the model

data

A data frame containing the variables

link

Link function name

link_param

A list of link parameters. Values can be:

  • numeric: Use as fixed value

  • "estimate": Estimate the parameter with default prior

prior_beta_sd

Prior SD for regression coefficients (default: 2.5)

prior_c_sd

Prior SD for cutpoints (default: 10)

link_prior

A list of custom prior specifications for link parameters

Value

A list suitable for passing to CmdStan (clm_full.stan)
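
The distinction between fixed and estimated link parameters can be sketched as follows. `split_link_params` is a hypothetical helper and the example values are invented; it only illustrates how a list mixing numeric values and "estimate" could be separated.

```r
# Hypothetical helper: separate parameters to estimate from fixed values
split_link_params <- function(link_param) {
  to_estimate <- vapply(
    link_param,
    function(v) identical(v, "estimate"),
    logical(1)
  )
  list(
    estimate = names(link_param)[to_estimate],  # names flagged "estimate"
    fixed = link_param[!to_estimate]            # numeric values kept as given
  )
}

# Invented example: estimate xi, keep df fixed at 8
parts <- split_link_params(list(xi = "estimate", df = 8))
parts$estimate
parts$fixed
```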


Prepare data for symmetric threshold Stan model

Description

Creates a Stan data list for cumulative link models with symmetric thresholds centered at 0: c[k] = -c[K-k]

Usage

prepare_stan_data_symmetric(
  formula,
  data,
  link = "logit",
  link_param = NULL,
  prior_beta_sd = 2.5,
  prior_cpos_sd = 5
)

Arguments

formula

A formula specifying the model

data

A data frame containing the variables

link

Link function name

link_param

A list of link parameters (for flexible links)

prior_beta_sd

Prior SD for regression coefficients (default: 2.5)

prior_cpos_sd

Prior SD for positive thresholds (default: 5)

Details

Examples:

Value

A list suitable for passing to CmdStan (clm_symmetric.stan)
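
The constraint c[k] = -c[K-k] can be illustrated by expanding a vector of positive thresholds into the full set. This is a sketch under assumptions: `symmetric_thresholds` is a hypothetical helper, and it assumes that an odd number of thresholds implies a middle threshold fixed at 0 (which follows from c[k] = -c[k] when k = K - k).

```r
# Hypothetical helper: build all K - 1 thresholds from the positive ones
symmetric_thresholds <- function(cpos, K) {
  n_thr <- K - 1L
  stopifnot(length(cpos) == n_thr %/% 2L, all(cpos > 0), !is.unsorted(cpos))
  # With an odd number of thresholds the middle one must be 0
  middle <- if (n_thr %% 2L == 1L) 0 else numeric(0)
  c(-rev(cpos), middle, cpos)
}

thr5 <- symmetric_thresholds(c(0.5, 1.5), K = 5)  # -1.5 -0.5  0.5  1.5
thr4 <- symmetric_thresholds(1, K = 4)            # -1  0  1
```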


Print method for clm_dist objects

Description

Print method for clm_dist objects

Usage

## S3 method for class 'clm_dist'
print(x, ...)

Arguments

x

A clm_dist object

...

Additional arguments (ignored)

Value

Invisibly returns the input clm_dist object.


Print method for clm_prior objects

Description

Print method for clm_prior objects

Usage

## S3 method for class 'clm_prior'
print(x, ...)

Arguments

x

A clm_prior object

...

Additional arguments (ignored)

Value

Invisibly returns the input clm_prior object.


Print method for clm_prior_list objects

Description

Print method for clm_prior_list objects

Usage

## S3 method for class 'clm_prior_list'
print(x, ...)

Arguments

x

A clm_prior_list object

...

Additional arguments (ignored)

Value

Invisibly returns the input clm_prior_list object.


Print method for clm_prior_spec objects

Description

Print method for clm_prior_spec objects

Usage

## S3 method for class 'clm_prior_spec'
print(x, ...)

Arguments

x

A clm_prior_spec object

...

Additional arguments (ignored)

Value

Invisibly returns the input clm_prior_spec object.


Print method for clmstan objects

Description

Print method for clmstan objects

Usage

## S3 method for class 'clmstan'
print(x, ...)

Arguments

x

A clmstan object

...

Additional arguments (ignored)

Value

Invisibly returns x


Print method for summary.clmstan objects

Description

Print method for summary.clmstan objects

Usage

## S3 method for class 'summary.clmstan'
print(x, ...)

Arguments

x

A summary.clmstan object

...

Additional arguments (ignored)

Value

Invisibly returns x


Specify Prior Distributions

Description

Specify prior distributions for model parameters using distribution functions.

Usage

prior(prior, class = "b", coef = "")

Arguments

prior

A distribution object created by normal(), gamma(), student_t(), or cauchy().

class

The parameter class. Valid classes are:

  • "b": Regression coefficients (beta)

  • "Intercept": Cutpoints/thresholds (flexible)

  • "c1": First cutpoint (equidistant)

  • "d": Threshold interval (equidistant)

  • "cpos": Positive cutpoints (symmetric)

  • "df": Degrees of freedom (tlink)

  • "lambda_ao": Lambda parameter (aranda_ordaz)

  • "lambda_lg": Lambda parameter (log_gamma)

  • "xi": Xi parameter (gev)

  • "r": R parameter (sp)

  • "theta1", "theta2": Theta parameters (aep)

coef

Optional coefficient name (for future extension).

Value

An object of class "clm_prior_spec" representing the prior specification.

See Also

normal(), gamma(), student_t(), cauchy(), clm_prior()

Examples

# Specify a normal prior for regression coefficients
prior(normal(0, 2.5), class = "b")

# Specify a gamma prior for degrees of freedom
prior(gamma(2, 0.1), class = "df")

# Combine multiple priors
c(
  prior(normal(0, 2.5), class = "b"),
  prior(normal(0, 10), class = "Intercept")
)

Symmetric Power (SP) link CDF

Description

The SP link applies a power transformation, controlled by the power parameter r, to a base CDF F_0. The exact form is given in Details.

Usage

sp_cdf(x, r, base, df = 8)

Arguments

x

Numeric vector

r

Power parameter (r > 0)

base

Base distribution name

df

Degrees of freedom (for tlink base)

Details

For r <= 1: F_sp(x) = F_0(x/r)^r

For r > 1: F_sp(x) = 1 - F_0(-r*x)^(1/r)

Special case: r = 1 gives the base distribution F_0.

Reference: Li et al. (2019) Environmetrics

Value

CDF values
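
The two-branch definition in Details can be written directly in base R. The sketch below is not the package's sp_cdf(): it takes the base CDF as a function (here the logistic CDF, an assumption) rather than a distribution name, and `sp_cdf_sketch` is a hypothetical name.

```r
# Hypothetical re-implementation of the formula in Details
sp_cdf_sketch <- function(x, r, base_cdf = stats::plogis) {
  stopifnot(is.numeric(r), length(r) == 1L, r > 0)
  if (r <= 1) {
    base_cdf(x / r)^r             # F_sp(x) = F_0(x/r)^r
  } else {
    1 - base_cdf(-r * x)^(1 / r)  # F_sp(x) = 1 - F_0(-r*x)^(1/r)
  }
}

x <- c(-2, 0, 1.5)

# r = 1 recovers the base distribution
all.equal(sp_cdf_sketch(x, r = 1), stats::plogis(x))

# For a symmetric base CDF, r and 1/r give mirror-image links
all.equal(sp_cdf_sketch(x, r = 2), 1 - sp_cdf_sketch(-x, r = 0.5))
```

The second check shows why the family is called symmetric: the link with parameter r reflected about zero equals the link with parameter 1/r, so r < 1 and r > 1 skew in opposite directions.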


Student-t Distribution for Prior Specification

Description

Creates a Student-t distribution object for use with prior().

Usage

student_t(df = 3, mu = 0, sigma = 1)

Arguments

df

Degrees of freedom. Must be positive. Default: 3

mu

Location parameter. Default: 0

sigma

Scale parameter. Must be positive. Default: 1

Value

An object of class "clm_dist" representing a Student-t distribution.

See Also

prior(), normal(), gamma(), cauchy()

Examples

# Create a Student-t prior with heavy tails
student_t(3, 0, 2.5)

# Use with prior()
prior(student_t(3, 0, 2.5), class = "b")

Summarize class prediction draws

Description

Summarize class prediction draws

Usage

summarize_class_draws(class_draws, robust = FALSE, probs = c(0.025, 0.975))

Arguments

class_draws

S x N matrix of predicted classes

robust

Use median instead of mean

probs

Quantile probabilities

Value

Data frame with summary statistics
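
A possible shape for such a summary is sketched below: one row per observation with a point estimate, SD, and quantiles computed over the S draws. The column names and the helper `summarize_classes` are assumptions for illustration; the package's output format may differ.

```r
# Hypothetical helper: summarize an S x N matrix of predicted classes
summarize_classes <- function(class_draws, robust = FALSE,
                              probs = c(0.025, 0.975)) {
  center <- if (robust) stats::median else mean
  out <- data.frame(
    estimate = apply(class_draws, 2, center),
    sd = apply(class_draws, 2, stats::sd)
  )
  # One row of quantiles per observation (column of class_draws)
  q <- do.call(rbind, lapply(seq_len(ncol(class_draws)), function(n) {
    stats::quantile(class_draws[, n], probs = probs, names = FALSE)
  }))
  colnames(q) <- paste0("q", 100 * probs)
  cbind(out, q)
}

# Invented draws: 4 posterior draws for 2 observations
class_draws <- matrix(c(1, 1, 2, 2,
                        3, 3, 3, 2), nrow = 4)
s_mean <- summarize_classes(class_draws)
s_median <- summarize_classes(class_draws, robust = TRUE)
```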


Summarize probability draws

Description

Summarize probability draws

Usage

summarize_probs_draws(probs_array, robust = FALSE)

Arguments

probs_array

S x N x K array of probability draws

robust

Use median instead of mean

Value

Data frame with mean (or median, if robust = TRUE) probabilities per category
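
Averaging an S x N x K array over its first dimension can be done with apply(). The sketch below uses invented numbers, and `summarize_probs` and its column names are hypothetical.

```r
# Hypothetical helper: collapse draws into one probability per
# observation and category
summarize_probs <- function(probs_array, robust = FALSE) {
  center <- if (robust) stats::median else mean
  est <- apply(probs_array, c(2, 3), center)  # N x K matrix
  colnames(est) <- paste0("P", seq_len(ncol(est)))
  as.data.frame(est)
}

# Invented array: S = 2 draws, N = 2 observations, K = 2 categories
probs_array <- array(
  c(0.2, 0.4, 0.5, 0.7,   # category 1: [s, n] filled column-wise
    0.8, 0.6, 0.5, 0.3),  # category 2
  dim = c(2, 2, 2)
)
prob_summary <- summarize_probs(probs_array)
prob_summary
```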


Summary method for clmstan objects

Description

Summary method for clmstan objects

Usage

## S3 method for class 'clmstan'
summary(object, probs = c(0.025, 0.5, 0.975), digits = 3, ...)

Arguments

object

A clmstan object

probs

Quantile probabilities for credible intervals

digits

Number of significant digits for display

...

Additional arguments (ignored)

Value

An object of class "summary.clmstan" containing:


Get supported link functions

Description

Get supported link functions

Usage

supported_links(type = c("all", "standard", "flexible"))

Arguments

type

Character string specifying which links to return:

  • "all" (default): All supported link functions

  • "standard": Standard links without additional parameters

  • "flexible": Flexible links with additional parameters

Value

A character vector of supported link function names

Examples

supported_links()
supported_links("standard")
supported_links("flexible")

Get supported threshold structures

Description

Get supported threshold structures

Usage

supported_thresholds()

Value

A character vector of supported threshold structure names

Examples

supported_thresholds()

Validate full model parameters

Description

Checks that estimation flags make sense for the specified link.

Usage

validate_full_params(link, params)

Arguments

link

Link function name

params

Full model parameters list


Check if a link function is valid

Description

Check if a link function is valid

Usage

validate_link(link)

Arguments

link

A character string specifying the link function

Value

TRUE if valid, otherwise throws an error
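
The return-TRUE-or-error contract can be sketched as below. The vector of link names is assembled from names mentioned elsewhere in this manual and is an assumption; the authoritative list is whatever supported_links() returns. `check_link` is a hypothetical helper.

```r
# Assumed set of link names, collected from this manual for illustration
known_links <- c("logit", "probit", "cloglog", "loglog", "cauchit",
                 "tlink", "aranda_ordaz", "gev", "sp", "log_gamma", "aep")

# Hypothetical helper: TRUE for a known link, error otherwise
check_link <- function(link, choices = known_links) {
  if (!is.character(link) || length(link) != 1L || !(link %in% choices)) {
    stop("Unsupported link: ", paste(format(link), collapse = ", "),
         call. = FALSE)
  }
  TRUE
}

ok <- check_link("logit")
bad <- try(check_link("not_a_link"), silent = TRUE)
inherits(bad, "try-error")
```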


Validate link parameters

Description

Validate link parameters

Usage

validate_link_params(link, params)

Arguments

link

Link function name

params

A list of link parameters


Validate prediction inputs

Description

Validate prediction inputs

Usage

validate_prediction_input(object, newdata, ndraws)

Arguments

object

A clmstan object

newdata

New data frame (or NULL)

ndraws

Number of draws to use


Validate prior specification for a model

Description

Checks that the prior specification is compatible with the model settings.

Usage

validate_prior(prior, threshold, link, link_param)

Arguments

prior

A clm_prior object or NULL

threshold

The threshold structure

link

The link function

link_param

The link parameters

Value

TRUE if valid, otherwise throws a warning


Validate that the distribution is compatible with the parameter class

Description

Validate that the distribution is compatible with the parameter class

Usage

validate_prior_class_dist(class, dist)

Arguments

class

The parameter class

dist

A clm_dist object


Validate prior values

Description

Validate prior values

Usage

validate_prior_values(prior)

Arguments

prior

A list of prior values


Validate threshold specification

Description

Validate threshold specification

Usage

validate_threshold(threshold, K = NULL)

Arguments

threshold

A character string specifying the threshold structure

K

Number of ordinal categories (optional)

Details

Validation rules:

Value

TRUE if valid, otherwise throws an error


Widely Applicable Information Criterion for clmstan objects

Description

Computes the Widely Applicable Information Criterion (WAIC) for a fitted cumulative link model.

Usage

## S3 method for class 'clmstan'
waic(x, ...)

Arguments

x

A clmstan object returned by clm_stan().

...

Additional arguments (currently ignored).

Details

WAIC is asymptotically equivalent to leave-one-out cross-validation (LOO-CV) and can serve as an alternative to it. However, LOO-CV with Pareto smoothed importance sampling (PSIS) is generally preferred because:

For most purposes, loo.clmstan is recommended over WAIC.

Value

An object of class c("waic", "loo") containing:

See Also

loo.clmstan for LOO-CV (recommended), waic for details on WAIC computation, loo_compare for model comparison.

Examples

## Not run: 
fit <- clm_stan(rating ~ temp, data = wine)
waic_result <- waic(fit)
print(waic_result)

## End(Not run)
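
For intuition, the standard WAIC formula can be computed by hand from an S x N matrix of pointwise log-likelihood draws. This sketch is not the package's code path, which returns a loo-style object; `waic_sketch` is a hypothetical helper and the log-likelihood values are simulated.

```r
# Hypothetical helper: WAIC from an S x N pointwise log-likelihood matrix
waic_sketch <- function(log_lik) {
  # Numerically stable log of the mean of exp(v)
  log_mean_exp <- function(v) {
    m <- max(v)
    m + log(sum(exp(v - m))) - log(length(v))
  }
  lppd <- apply(log_lik, 2, log_mean_exp)  # log pointwise predictive density
  p_waic <- apply(log_lik, 2, stats::var)  # effective number of parameters
  elpd_waic <- sum(lppd - p_waic)
  c(elpd_waic = elpd_waic, p_waic = sum(p_waic), waic = -2 * elpd_waic)
}

set.seed(42)
# Simulated log-likelihood draws: 100 posterior draws, 5 observations
log_lik <- matrix(rnorm(500, mean = -1, sd = 0.1), nrow = 100, ncol = 5)
waic_sketch(log_lik)
```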