This vignette contains answers to some frequently asked questions
about the package bpnreg
for analyzing Bayesian projected
normal circular regression and mixed-effects models. Answers are given
and illustrated with the Motor
and Maps
datasets.
To obtain more information about the Motor
and
Maps
datasets the following code can be used:
A circular regression model for the Motor
data can be
fit using the bpnr
function as follows:
Note that categorical variables should be of class
factor
in order to be handled correctly by bpnreg.
A circular mixed-effects model for the Maps
data can be
fit using the bpnme
function as follows:
Note that categorical variables should be of class
factor
in order to be handled correctly by bpnreg.
An interaction effect between the Cond
and
AvAmp
variables can be included in the regression model in
the following two ways:
The input data should be formatted as a standard R
data.frame
or dplyr
tbl()
.
In case of missing values in the input data an error will be returned. In case of mixed effects models subgroups do not need to be of the same size, e.g. in case of repeated measures data not all individuals need to have been observed at each measurement occasion.
The dependent variable should be a circular variable measured on a scale from 0 to 2\(\pi\) radians or -\(\pi\) to \(\pi\) radians. An warning message if returned in case the dependent variable contains values outside these ranges.
The package does not currently contain an option to include a user-specified priors. Priors for regression coefficients and fixed-effect coefficients used in the current version of the package are uninformative normal distributions, \(N(0, 10000)\).
No. Unlike packages for mixed-effects models such as
lme4
and nlme
the mixed-effects model in
bpnreg
may only contain one nesting variable/grouping
factor.
A seed can be specified in the bpnr
and
bpnme
functions using the seed
option as
follows:
The coef_circ()
function can be used to extract model
summaries for both categorical and continuous predictors. The units in
which the results are displayed can be chosen to be either radians or
degrees.
E.g. circular coefficients for the circular regression model above can be obtained as follows:
coef_circ()
function:To obtain circular coefficients in degrees for the continuous
variable AvAmp
in the circular regression model above are
obtained as:
#> mean mode sd LB HPD UB HPD
#> AvAmp ax 81.60680954 74.107655480 115.262007 -1.541343e+02 276.10492950
#> AvAmp ac 37.29736965 -25.096914448 72.470701 -4.541639e+01 145.44390534
#> AvAmp bc -1.67909812 -0.612950343 152.053102 -2.614163e+01 17.27149184
#> AvAmp AS 0.08110837 0.001181389 1.532724 -2.631771e-01 0.31289926
#> AvAmp SAM 0.05039833 0.005540038 1.478331 3.711198e-05 0.06059786
#> AvAmp SSDO 0.25235872 1.756779390 1.963975 -2.636442e+00 2.85796861
The output returns summary statistics for the posterior distributions
for several parameters of the circular regression line of the
AvAmp
variable.
These parameters are interpreted as follows:
ax
= the location of the inflection point of the
regression curve on the axis of the predictor.ac
= the location of the inflection point of the
regression curve on the axis of the circular outcome.bc
= the slope of the tangent line at the inflection
point. An increase of 1 unit of the predictor at the inflection point
leads to a bc
change in the circular outcome.AS
= the average slopes of the circular regression. An
increase of 1 unit of the predictor leads to a AS
change in
the circular outcome on average.SAM
= the circular regression slopes at the mean.An
increase of 1 unit of the predictor leads to a SAM
change
in the circular outcome at the average predictor value.SSDO
= the signed shortest distance to the origin.A more detailed explanation of the above parameters is given in Cremers, Mulder & Klugkist (2018).
To obtain circular coefficients in degrees for the categorical
variable Cond
we use:
#> $Means
#> mean mode sd LB UB
#> (Intercept) 47.60691 47.46081 12.47107 23.71734 70.70371
#> Condsemi.imp 15.93874 15.29835 22.01006 -28.42677 57.42232
#> Condimp 28.31828 37.56589 25.50625 -26.30714 77.40392
#> Condsemi.impCondimp -62.81396 -71.20733 59.58785 -170.56951 85.77068
#>
#> $Differences
#> mean mode sd LB UB
#> Condsemi.imp 31.74145 38.90969 26.89524 -19.95152 86.87231
#> Condimp 19.46819 22.77203 30.56586 -42.88731 80.42033
#> Condsemi.impCondimp 115.96819 141.80603 57.61487 -56.90659 203.30989
The output returns summary statistics for the posterior distributions of the circular means for all categories and combination of categories of the categorical variables in the model, as well as differences between these means.
By using the fit()
function on a bpnr
or
bpnme
object 5 different fit statistics together with the
(effective) number of parameters they are based on can be obtained.
fit(fit)
#> Statistic Parameters
#> lppd -57.1423 8.000000
#> DIC 130.1116 8.024107
#> DIC.alt 132.1696 9.053110
#> WAIC1 129.8904 7.802881
#> WAIC2 131.6491 8.682262
All five fit statistics are computed as in Gelman et.al. (2014). The
lppd
is an estimate of the expected log predictive density,
the DIC
is the Deviance Information Criterion, the
DIC_alt
is a version of the DIC that uses a slightly
different definition of the effective number of parameters, the
WAIC1
and WAIC2
are the two versions of the
Watanabe-Akaike or Widely Available Information Criterion presented in
Gelman et.al. (2014).
Raw posterior estimates are stored in the following objects:
a.x
= posterior samples for the the locations of the
inflection point of the regression curve on the axis of the
predictor.a.c
= posterior samples for the the locations of the
inflection point of the regression curve on the axis of the circular
outcome.b.c
= posterior samples for the slopes of the tangent
line at the inflection point.AS
= posterior samples for the average slopes of the
circular regression.SAM
= posterior samples for the circular regression
slopes at the mean.SSDO
= posterior samples for the signed shortest
distance to the origin.circ.diff
= posterior samples for the circular
differences between intercept and other categories of categorical
variables.beta1
= posterior samples for the fixed effects
coefficients for the first component.beta2
= posterior samples for the fixed effects
coefficients for the second component.In circular mixed-effects models the following additional parameters can be obtained:
b1
= posterior samples for the random effects
coefficients for the first component.b2
= posterior samples for the random effects
coefficients for the second component.circular.ri
= posterior samples for the circular random
intercepts for each individual.omega1
= posterior samples for the random effect
variances of the first component.omega2
= posterior samples for the random effect
variances of the first component.cRS
= posterior samples for the circular random slope
variance.cRI
= posterior samples of the mean resultant length of
the circular random intercept, a measure of concentration.E.g. to obtain the first six posterior samples for beta1
and a.x
we use the following code:
head(fit$beta1)
#> (Intercept) Condsemi.imp Condimp AvAmp
#> [1,] 0.3454734 0.5982656 0.4816842 0.0140926098
#> [2,] 0.9452224 0.5676390 -0.7671751 -0.0067083928
#> [3,] 0.7600679 0.2982861 -0.1715743 -0.0060381475
#> [4,] 1.3186083 -0.3477482 -0.5248282 0.0003941233
#> [5,] 1.3624594 -0.4803601 -0.8509008 -0.0059952280
#> [6,] 0.9594075 0.3900853 -0.3079756 -0.0073404834
In circular mixed-effects models the following parameters contain the individual random effects and random effect variances:
b1
= posterior samples for the random effects
coefficients for the first component.b2
= posterior samples for the random effects
coefficients for the second component.circular.ri
= posterior samples for the circular random
intercepts for each individual.omega1
= posterior samples for the random effect
variances of the first component.omega2
= posterior samples for the random effect
variances of the first component.cRS
= posterior samples for the circular random slope
variance (for bc
).cRI
= posterior samples of the mean resultant length of
the circular random intercept, a measure of concentration.E.g. if we want to obtain the posterior mean for the mean resultant length of the circular random intercept we use the following code:
This estimate is very close to 1, meaning there is almost no variation in the individual circular random intercepts. We can check this by plotting the posterior means of the individual circular random intercepts (in degrees):
apply(fitme$circular.ri, 1, mean_circ)*180/pi
#> [1] -13.18457 -13.20623 -13.39053 -13.28647 -13.28007 -13.29426 -13.19401
#> [8] -13.27823 -13.11418 -13.29843 -13.34283 -13.38288 -13.28338 -13.29185
#> [15] -13.18482 -13.24974 -13.20127 -13.16453 -13.47421 -13.15180
Indeed the circular random intercepts of the 20 individuals in the
Maps
data lie very close together.
An explanation of the computation of random intercept and slope variances can be found in the supplementary material of Cremers, Pennings, Mainhard & Klugkist (2021).
Because projected normal models are heterogeneous models, i.e. they simultaneously model mean and variance, we can in addition to investigating effects on the circular mean also investigate effects on the circular variance.
Kendall (1974) gives the following formula for computation of the circular variance in a projected normal model:
\[1 - \hat{\rho} = 1 - \sqrt{\pi\xi/2}\exp{-\xi}[I_0(\xi) + I_1(\xi)]\]
where \(\xi = ||\boldsymbol{\mu}||^2\) and \(I_\nu()\) is the modified Bessel function of the first kind and order \(\nu\). For the effect of a variable \(x\) on the variance, \(\boldsymbol{\mu} = (\beta_0^I + \beta_1^Ix,\beta_0^{II} + \beta_1^{II}x)\), where \(\beta_0^I\), \(\beta_1^I\), \(\beta_0^{II}\) and \(\beta_1^{II}\) are the intercepts and slopes of the first and second linear component respectively.
Automated summaries for effects on the circular variance are however
not yet implemented in bpnreg
and will need to be computed
explicitly using the raw posterior samples. E.g. the effect of the trial
type on the circular variance can be computed as follows:
a1 <- fitme$beta1[,"(Intercept)"]
a2 <- fitme$beta2[,"(Intercept)"]
b1 <- fitme$beta1[,"Trial.type1"]
b2 <- fitme$beta2[,"Trial.type1"]
zeta_standard <- sqrt((a1)^2 + (a2 + b2)^2)^2/4
var_standard <- 1 - sqrt((pi * zeta_standard)/2) * exp(-zeta_standard) *
(besselI(zeta_standard, 0) + besselI(zeta_standard, 1))
zeta_probe <- sqrt((a1 + b1)^2 + (a2 + b2)^2)^2/4
var_probe <- 1 - sqrt((pi * zeta_probe)/2) * exp(-zeta_probe) *
(besselI(zeta_probe, 0) + besselI(zeta_probe, 1))
standard <- c(mode_est(var_standard),
mean(var_standard),
sd(var_standard),
hpd_est(var_standard))
probe <- c(mode_est(var_probe),
mean(var_probe),
sd(var_probe),
hpd_est(var_probe))
results <- rbind(standard, probe)
colnames(results) <- c("mode", "mean", "sd", "HPD LB", "HPD UB")
rownames(results) <- c("standard", "probe")
The posterior estimates of the circular variance for the standard and probe trials are:
results
#> mode mean sd HPD LB HPD UB
#> standard 0.08664707 0.10494372 0.02334300 0.05901261 0.1471963
#> probe 0.06332233 0.07409204 0.01829064 0.04351343 0.1090560
From these results we conclude that the circular variance for the standard and probe trials is not significantly different, the 95% Highest Posterior Density (HPD) intervals overlap.
Cremers, J. , Mulder, K.T. & Klugkist, I. (2018). Circular Interpretation of Regression Coefficients. British Journal of Mathematical and Statistical Psychology, 71(1), 75-95.
Cremers, J., Pennings, H.J.M., Mainhard, T. & Klugkist, I. (2021). Circular Modelling of Circumplex Measurements for Interpersonal Behavior. Assessment, 28(2), 585-600.
Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A. & Rubin, D. (2014). Bayesian Data Analysis, 3rd ed.
Kendall, D.G. (1974). Pole-seeking Brownian motion and bird navigation. Journal of the Royal Statistical Society. Series B, 37, 97–133.