PMM2 for Time Series: AR, MA, ARMA, and ARIMA Models

Serhii Zabolotnii

2025-11-04

Introduction

Time series models are fundamental tools in economics, finance, engineering, and many other fields. Classical estimation methods such as Conditional Sum of Squares (CSS) and Maximum Likelihood (ML) assume Gaussian innovations. However, real-world time series often exhibit:

  • skewed innovations (e.g., gamma- or chi-squared-distributed shocks)
  • heavy tails (e.g., Student-t innovations)
  • outlier contamination (mixture distributions)

The PMM2 method extends to time series models, providing more efficient parameter estimates when innovations deviate from normality. This vignette demonstrates PMM2 applications to:

  1. Autoregressive (AR) models
  2. Moving Average (MA) models
  3. ARMA models
  4. ARIMA models with differencing

Setup

library(EstemPMM)
set.seed(42)

Part 1: Autoregressive (AR) Models

AR(1) Model with Skewed Innovations

We’ll simulate an AR(1) process with gamma-distributed innovations to demonstrate PMM2’s robustness.

# Model parameters
n <- 300
phi <- 0.6  # AR coefficient
intercept <- 5

# Generate skewed innovations using gamma distribution
# (Gamma(shape=2) has skewness = sqrt(2) ≈ 1.41)
innovations <- rgamma(n, shape = 2, rate = 2) - 1  # Center around 0

# Generate AR(1) series
x <- numeric(n)
x[1] <- intercept + innovations[1]
for(i in 2:n) {
  x[i] <- intercept + phi * (x[i-1] - intercept) + innovations[i]
}

# Plot the time series
plot(x, type = "l", main = "AR(1) Process with Skewed Innovations",
     ylab = "Value", xlab = "Time", col = "steelblue", lwd = 1.5)
abline(h = intercept, col = "red", lty = 2)

Estimate AR(1) Model: PMM2 vs CSS

# Fit using PMM2
fit_pmm2_ar1 <- ar_pmm2(x, order = 1, method = "pmm2", include.mean = TRUE)

# Fit using CSS (classical method)
fit_css_ar1 <- ar_pmm2(x, order = 1, method = "css", include.mean = TRUE)

# Compare coefficients
cat("True AR coefficient:", phi, "\n\n")
#> True AR coefficient: 0.6

cat("CSS Estimate:\n")
#> CSS Estimate:
print(coef(fit_css_ar1))
#> intercept       ar1 
#> 4.8947563 0.5923856

cat("\nPMM2 Estimate:\n")
#> 
#> PMM2 Estimate:
print(coef(fit_pmm2_ar1))
#> intercept       ar1 
#> 4.8947563 0.5772581

# Summary with diagnostics
summary(fit_pmm2_ar1)
#> PMM2 time series estimation results
#> Model type: AR(1)
#> Call:
#> ts_pmm2(x = x, order = order, model_type = "ar", method = method, 
#>     max_iter = max_iter, tol = tol, include.mean = include.mean, 
#>     initial = initial, na.action = na.action, regularize = regularize, 
#>     reg_lambda = reg_lambda, verbose = verbose)
#> 
#> Coefficients:
#> AR:       ar1 
#> 0.5772581 
#> Intercept:  4.894756 
#> 
#> Central moments of initial residuals:
#>   m2 = 0.4505474 
#>   m3 = 0.3340687 
#>   m4 = 0.8558866 
#> 
#> Theoretical characteristics of PMM2 (S = 2):
#>   c3 = 1.104651 
#>   c4 = 1.216337 
#>   g  = 0.6206076  (expected ratio Var[PMM2]/Var[OLS])
#> 
#> Algorithm information:
#>   Convergence status: TRUE 
#>   Iterations: 3

Observation: in any single realization either method may land closer to the truth (here CSS happens to be nearer 0.6); PMM2's advantage is a smaller estimator variance across repeated samples when innovations are non-Gaussian.
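
A small replication experiment makes the variance claim concrete. This is a sketch (slow to run; the settings mirror the simulation above), not package functionality:

# Repeat the AR(1) simulation and compare the spread of the two estimators
reps <- 200
est <- replicate(reps, {
  e <- rgamma(300, shape = 2, rate = 2) - 1
  z <- numeric(300)
  z[1] <- 5 + e[1]
  for (i in 2:300) z[i] <- 5 + 0.6 * (z[i - 1] - 5) + e[i]
  c(css  = unname(coef(ar_pmm2(z, order = 1, method = "css",  include.mean = TRUE))["ar1"]),
    pmm2 = unname(coef(ar_pmm2(z, order = 1, method = "pmm2", include.mean = TRUE))["ar1"]))
})
apply(est, 1, var)  # the pmm2 row is expected to be the smaller of the two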


AR(2) Model

For higher-order AR models, PMM2 maintains its efficiency advantages.

# AR(2) model: x_t = 0.5*x_{t-1} - 0.3*x_{t-2} + e_t
n <- 400
phi <- c(0.5, -0.3)

# Generate t-distributed innovations (heavy tails)
innovations <- rt(n, df = 4)

# Generate AR(2) series
x <- arima.sim(n = n, model = list(ar = phi), innov = innovations)

# Fit models
fit_pmm2_ar2 <- ar_pmm2(x, order = 2, method = "pmm2", include.mean = FALSE)
fit_css_ar2 <- ar_pmm2(x, order = 2, method = "css", include.mean = FALSE)

# Compare
comparison_ar2 <- data.frame(
  Parameter = c("phi1", "phi2"),
  True = phi,
  CSS = coef(fit_css_ar2),
  PMM2 = coef(fit_pmm2_ar2)
)

print(comparison_ar2, row.names = FALSE)
#>  Parameter True        CSS       PMM2
#>       phi1  0.5  0.4771510  0.4854599
#>       phi2 -0.3 -0.3013664 -0.2482538

Part 2: Moving Average (MA) Models

MA models are challenging because innovations are not directly observable. PMM2 uses CSS residuals as starting estimates, then refines parameters.
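
To see why, note that the innovations can only be recovered recursively. A minimal sketch for MA(1) (hypothetical helper; assumes the pre-sample innovation is zero):

recover_innovations <- function(x, theta) {
  e <- numeric(length(x))
  e[1] <- x[1]                       # assumes e_0 = 0
  for (t in 2:length(x)) {
    e[t] <- x[t] - theta * e[t - 1]  # invert x_t = e_t + theta * e_{t-1}
  }
  e
}

CSS minimizes the sum of squares of exactly this kind of recursively computed residual, which is why its residuals make natural starting values for PMM2.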

MA(1) Model

# MA(1) model: x_t = e_t + theta*e_{t-1}
n <- 300
theta <- 0.7

# Generate chi-squared innovations (right-skewed)
innovations <- rchisq(n, df = 3) - 3

# Generate MA(1) series
x <- numeric(n)
x[1] <- innovations[1]
for(i in 2:n) {
  x[i] <- innovations[i] + theta * innovations[i-1]
}

# Fit MA(1) models
fit_pmm2_ma1 <- ma_pmm2(x, order = 1, method = "pmm2", include.mean = FALSE)
fit_css_ma1 <- ma_pmm2(x, order = 1, method = "css", include.mean = FALSE)

# Compare estimates
cat("True MA coefficient:", theta, "\n\n")
#> True MA coefficient: 0.7

cat("CSS Estimate:\n")
#> CSS Estimate:
print(coef(fit_css_ma1))
#>       ma1 
#> 0.7012471

cat("\nPMM2 Estimate:\n")
#> 
#> PMM2 Estimate:
print(coef(fit_pmm2_ma1))
#>     ma1 
#> 0.63738

# Check convergence
cat("\nPMM2 Convergence:", fit_pmm2_ma1@convergence, "\n")
#> 
#> PMM2 Convergence: TRUE
cat("Iterations:", fit_pmm2_ma1@iterations, "\n")
#> Iterations: 4

MA(2) Model

# MA(2) model
n <- 400
theta <- c(0.6, 0.3)

# Mixed distribution: 80% normal + 20% contamination
innovations <- ifelse(runif(n) < 0.8,
                      rnorm(n),
                      rnorm(n, mean = 0, sd = 3))

# Generate MA(2) series
x <- arima.sim(n = n, model = list(ma = theta), innov = innovations)

# Fit models
fit_pmm2_ma2 <- ma_pmm2(x, order = 2, method = "pmm2", include.mean = FALSE)
fit_css_ma2 <- ma_pmm2(x, order = 2, method = "css", include.mean = FALSE)

# Comparison
comparison_ma2 <- data.frame(
  Parameter = c("theta1", "theta2"),
  True = theta,
  CSS = coef(fit_css_ma2),
  PMM2 = coef(fit_pmm2_ma2)
)

print(comparison_ma2, row.names = FALSE)
#>  Parameter True       CSS      PMM2
#>     theta1  0.6 0.5813574 0.5655106
#>     theta2  0.3 0.2955922 0.3126379

Monte Carlo Evidence: For MA(2) models with contaminated (mixture) errors, PMM2 achieves 26-32% variance reduction compared to CSS (see roadmap documentation).


Part 3: ARMA Models

ARMA models combine AR and MA components, providing flexible modeling for stationary series.
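
An ARMA(1,1) process is stationary when the AR characteristic root lies outside the unit circle and invertible when the MA root does. A quick base-R check for the parameters used below:

phi <- 0.5; theta <- 0.4
Mod(polyroot(c(1, -phi)))   # AR root: 2.0 > 1, so stationary
Mod(polyroot(c(1, theta)))  # MA root: 2.5 > 1, so invertible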

ARMA(1,1) Model

# ARMA(1,1) model
n <- 500
phi <- 0.5    # AR coefficient
theta <- 0.4  # MA coefficient

# Generate exponentially distributed innovations (right-skewed)
innovations <- rexp(n, rate = 1) - 1

# Generate ARMA(1,1) series
x <- arima.sim(n = n,
               model = list(ar = phi, ma = theta),
               innov = innovations)

# Fit models
fit_pmm2_arma <- arma_pmm2(x, order = c(1, 1), method = "pmm2", include.mean = FALSE)
fit_css_arma <- arma_pmm2(x, order = c(1, 1), method = "css", include.mean = FALSE)

# Extract coefficients
cat("True parameters: AR =", phi, ", MA =", theta, "\n\n")
#> True parameters: AR = 0.5 , MA = 0.4

cat("CSS Estimates:\n")
#> CSS Estimates:
print(coef(fit_css_arma))
#>       ar1       ma1 
#> 0.4562385 0.4457135

cat("\nPMM2 Estimates:\n")
#> 
#> PMM2 Estimates:
print(coef(fit_pmm2_arma))
#>       ar1       ma1 
#> 0.4581265 0.4642835

# Summary
summary(fit_pmm2_arma)
#> PMM2 time series estimation results
#> Model type: ARMA(1,1)
#> Call:
#> ts_pmm2(x = x, order = order, model_type = "arma", method = method, 
#>     max_iter = max_iter, tol = tol, include.mean = include.mean, 
#>     initial = initial, na.action = na.action, regularize = regularize, 
#>     reg_lambda = reg_lambda, verbose = verbose)
#> 
#> Coefficients:
#> AR:       ar1 
#> 0.4581265 
#> MA:       ma1 
#> 0.4642835 
#> 
#> Central moments of initial residuals:
#>   m2 = 1.133364 
#>   m3 = 2.59006 
#>   m4 = 11.39505 
#> 
#> Theoretical characteristics of PMM2 (S = 2):
#>   c3 = 2.146622 
#>   c4 = 5.871092 
#>   g  = 0.4145685  (expected ratio Var[PMM2]/Var[OLS])
#> 
#> Algorithm information:
#>   Convergence status: TRUE 
#>   Iterations: 3

Diagnostic Plots for ARMA Models

# Plot residuals and ACF
plot(fit_pmm2_arma)
title("ARMA(1,1) Model Diagnostics")

The diagnostic plots help verify:

  1. Residuals vs Time: Should show no patterns
  2. ACF of Residuals: Should be within confidence bands (white noise)
  3. Q-Q Plot: Assess the residual distribution (these checks can also be run by hand, as sketched below)
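
A manual version of the same diagnostics (a sketch; it assumes residuals() extracts the PMM2 residuals from the fitted object):

res <- residuals(fit_pmm2_arma)
plot(res, type = "h", main = "Residuals vs Time")  # should show no pattern
acf(res, main = "ACF of Residuals")                # should stay within the bands
qqnorm(res); qqline(res, col = "red")              # distributional check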

Part 4: ARIMA Models with Differencing

For non-stationary series, ARIMA models incorporate differencing to achieve stationarity.

ARIMA(1,1,1) Model

# Generate non-stationary series requiring differencing
n <- 400
phi <- 0.4
theta <- 0.3

# Skewed innovations
innovations <- rgamma(n, shape = 3, rate = 3) - 1

# Generate integrated series: ARIMA(1,1,1)
# First generate stationary ARMA(1,1)
x_stationary <- arima.sim(n = n,
                          model = list(ar = phi, ma = theta),
                          innov = innovations)

# Integrate once (cumulative sum)
x <- cumsum(x_stationary)

# Plot the non-stationary series
plot(x, type = "l", main = "ARIMA(1,1,1) Process",
     ylab = "Value", xlab = "Time", col = "darkgreen", lwd = 1.5)

Fitting ARIMA Models

# Fit ARIMA(1,1,1) using PMM2
fit_pmm2_arima <- arima_pmm2(x, order = c(1, 1, 1), method = "pmm2", include.mean = FALSE)

# Fit using classical CSS method
fit_css_arima <- arima_pmm2(x, order = c(1, 1, 1), method = "css", include.mean = FALSE)

# Compare coefficients
cat("True parameters: AR =", phi, ", MA =", theta, "\n\n")
#> True parameters: AR = 0.4 , MA = 0.3

cat("CSS Estimates:\n")
#> CSS Estimates:
print(coef(fit_css_arima))
#>       ar1       ma1 
#> 0.4355905 0.2550518

cat("\nPMM2 Estimates:\n")
#> 
#> PMM2 Estimates:
print(coef(fit_pmm2_arima))
#>       ar1       ma1 
#> 0.3892825 0.3213544

# Summary
summary(fit_pmm2_arima)
#> PMM2 time series estimation results
#> Model type: ARIMA(1,1,1)
#> Call:
#> ts_pmm2(x = x, order = order, model_type = "arima", method = method, 
#>     max_iter = max_iter, tol = tol, include.mean = include.mean, 
#>     initial = initial, na.action = na.action, regularize = regularize, 
#>     reg_lambda = reg_lambda, verbose = verbose)
#> 
#> Coefficients:
#> AR:       ar1 
#> 0.3892825 
#> MA:       ma1 
#> 0.3213544 
#> 
#> Central moments of initial residuals:
#>   m2 = 0.3635943 
#>   m3 = 0.222833 
#>   m4 = 0.5000721 
#> 
#> Theoretical characteristics of PMM2 (S = 2):
#>   c3 = 1.016375 
#>   c4 = 0.7826694 
#>   g  = 0.6287674  (expected ratio Var[PMM2]/Var[OLS])
#> 
#> Algorithm information:
#>   Convergence status: TRUE 
#>   Iterations: 3

Monte Carlo Evidence: ARIMA(1,1,1) models show 24-52% variance reduction with PMM2 vs CSS when innovations are non-Gaussian.


Part 5: Bootstrap Inference for Time Series

Bootstrap methods provide robust confidence intervals for time series parameters.

# Perform bootstrap inference for the ARMA(1,1) model
boot_results <- ts_pmm2_inference(fit_pmm2_arma,
                                  B = 500,
                                  seed = 123,
                                  method = "block",
                                  block_length = 20,
                                  parallel = FALSE)

# Display summary with confidence intervals
summary(boot_results)
#>     Estimate        Std.Error          t.value         p.value         
#>  Min.   :0.4581   Min.   :0.04639   Min.   :7.206   Min.   :0.000e+00  
#>  1st Qu.:0.4597   1st Qu.:0.05090   1st Qu.:7.873   1st Qu.:1.443e-13  
#>  Median :0.4612   Median :0.05541   Median :8.541   Median :2.887e-13  
#>  Mean   :0.4612   Mean   :0.05541   Mean   :8.541   Mean   :2.887e-13  
#>  3rd Qu.:0.4627   3rd Qu.:0.05992   3rd Qu.:9.209   3rd Qu.:4.330e-13  
#>  Max.   :0.4643   Max.   :0.06443   Max.   :9.877   Max.   :5.773e-13  
#>     conf.low        conf.high     
#>  Min.   :0.2537   Min.   :0.4923  
#>  1st Qu.:0.2693   1st Qu.:0.4950  
#>  Median :0.2850   Median :0.4978  
#>  Mean   :0.2850   Mean   :0.4978  
#>  3rd Qu.:0.3006   3rd Qu.:0.5005  
#>  Max.   :0.3162   Max.   :0.5033

The bootstrap results provide:

  • point estimates and block-bootstrap standard errors for each coefficient
  • t-values and p-values for significance testing
  • percentile confidence intervals (conf.low, conf.high)

Part 6: Comparing Methods with Monte Carlo

The package provides tools for systematic comparison through Monte Carlo simulation.

# Define model specifications
specs <- list(
  list(
    model = "ma",
    order = 1,
    theta = 0.7,
    label = "MA(1)",
    innovations = list(type = "gamma", shape = 2)
  ),
  list(
    model = "arma",
    order = c(1, 1),
    theta = list(ar = 0.5, ma = 0.4),
    label = "ARMA(1,1)",
    innovations = list(type = "student_t", df = 4)
  )
)

# Run Monte Carlo comparison
mc_results <- pmm2_monte_carlo_compare(
  model_specs = specs,
  methods = c("css", "pmm2"),
  n = 300,
  n_sim = 1000,
  seed = 456,
  progress = FALSE
)

# Display results
print(mc_results$summary)
print(mc_results$gain)

Interpretation:

  • mc_results$summary reports, for each model specification, the sampling behavior (e.g., bias and variance) of the CSS and PMM2 estimators across the 1000 replications
  • mc_results$gain summarizes PMM2's relative efficiency versus CSS; larger gains are expected for the more strongly non-Gaussian innovation settings


Part 7: Prediction with Time Series Models

PMM2-fitted models support forecasting:

# Forecast 10 steps ahead for the ARMA(1,1) model
forecast_horizon <- 10

pred_raw <- predict(fit_pmm2_arma, n.ahead = forecast_horizon)
predictions <- if (is.list(pred_raw)) as.numeric(pred_raw$pred) else as.numeric(pred_raw)

# Display forecasts
cat("Forecasts for next", forecast_horizon, "periods:\n")
#> Forecasts for next 10 periods:
print(predictions)
#>  [1] -0.6577242285 -0.3013208876 -0.1380430785 -0.0632411901 -0.0289724640
#>  [6] -0.0132730530 -0.0060807371 -0.0027857467 -0.0012762243 -0.0005846722
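
Note the geometric decay toward the (zero) process mean: for an ARMA(1,1), each forecast from the second step onward is the previous one scaled by the AR coefficient, which a quick check confirms:

predictions[3] / predictions[2]  # ≈ 0.458, the fitted ar1 coefficient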

# Plot original series with forecasts
n_plot <- 100
plot_data <- tail(x, n_plot)
plot(seq_along(plot_data), plot_data, type = "l",
     xlim = c(1, n_plot + forecast_horizon),
     ylim = range(c(plot_data, predictions)),
     main = "ARMA(1,1) Forecasts",
     xlab = "Time", ylab = "Value",
     col = "steelblue", lwd = 1.5)

# Add forecasts
lines(n_plot + seq_len(forecast_horizon), predictions,
      col = "red", lwd = 2, lty = 2)

legend("topleft",
       legend = c("Observed", "Forecast"),
       col = c("steelblue", "red"),
       lty = c(1, 2), lwd = c(1.5, 2))


Practical Guidelines

When to Use PMM2 for Time Series

Recommended when:

  • residual diagnostics show clear skewness and/or excess kurtosis
  • outlier contamination of an otherwise well-specified model is suspected
  • the series is long enough (roughly n ≥ 100-200) for stable third- and fourth-moment estimates

Stick with classical methods when:

  • innovations appear approximately Gaussian, so PMM2 offers little or no gain (g ≈ 1)
  • the sample is very short, making higher-order moment estimates unreliable


Model Selection Strategy

  1. Visualize the time series and check for stationarity
  2. Fit classical model (CSS/ML) as baseline
  3. Examine residuals for non-normality:
    • Skewness: pmm_skewness(residuals)
    • Kurtosis: pmm_kurtosis(residuals)
    • Visual: Q-Q plots, histograms
  4. If residuals are non-Gaussian, refit with PMM2 (see the sketch after this list)
  5. Compare standard errors using bootstrap
  6. Validate with out-of-sample forecasting
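
A condensed sketch of steps 2-4 for a stationary series x (it assumes residuals() extracts residuals from the fitted object):

fit_base <- arma_pmm2(x, order = c(1, 1), method = "css", include.mean = FALSE)
res <- residuals(fit_base)
pmm_skewness(res)  # far from 0 suggests skewed innovations
pmm_kurtosis(res)  # large values suggest heavy tails
# If either is pronounced, refit with PMM2:
fit_alt <- arma_pmm2(x, order = c(1, 1), method = "pmm2", include.mean = FALSE)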

Computational Considerations

PMM2 is iterative and may take longer than classical methods:

  • each fit starts from a CSS solution and refines it iteratively; the examples above converged in 3-4 iterations
  • bootstrap inference multiplies this cost by the number of replications B

For large datasets (n > 5000), consider:

  • supplying starting values from a previous fit via the initial argument
  • reducing the number of bootstrap replications B, or setting parallel = TRUE in ts_pmm2_inference()
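
A quick way to gauge the overhead is to time a single fit of each kind (a sketch; timings depend on hardware):

system.time(arima_pmm2(x, order = c(1, 1, 1), method = "css",  include.mean = FALSE))
system.time(arima_pmm2(x, order = c(1, 1, 1), method = "pmm2", include.mean = FALSE))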


Summary Statistics: Efficiency Gains

Based on extensive Monte Carlo studies documented in the package:

Model Type     Innovation Distribution   Variance Reduction   Sample Size
AR(1)          Gamma (skew = 1.4)        15-20%               200-1000
MA(1)          Chi-squared (df = 3)      15-23%               200-1000
MA(2)          Mixed contamination       26-32%               200-1000
ARMA(1,1)      Student-t (df = 4)        30-45%               200-1000
ARIMA(1,1,1)   Exponential (shifted)     24-52%               300-1000

Key Insight: The more non-Gaussian the innovations, the larger the efficiency gain from PMM2.
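
The g values reported by summary() quantify this: they are consistent with the ratio g = 1 - c3^2 / (2 + c4), where c3 and c4 are the skewness and excess-kurtosis coefficients of the residuals. Reproducing the ARMA(1,1) figure from Part 3:

m2 <- 1.133364; m3 <- 2.59006; m4 <- 11.39505  # moments from summary(fit_pmm2_arma)
c3 <- m3 / m2^1.5    # skewness coefficient:  2.146622
c4 <- m4 / m2^2 - 3  # excess kurtosis:       5.871092
1 - c3^2 / (2 + c4)  # 0.4145685, matching the reported g

The closer g is to 1, the smaller the benefit; strongly skewed or heavy-tailed residuals drive g well below 1.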


Advanced Topics

Custom Innovation Distributions

You can use pmm2_monte_carlo_compare() with custom distributions:

# Custom innovation generator
my_innovations <- function(n) {
  # Mixture: 90% Gaussian + 10% extreme outliers
  base <- rnorm(n)
  outliers <- sample(c(-5, 5), n, replace = TRUE)  # ±5 with equal probability
  ifelse(runif(n) < 0.9, base, outliers)
}

# Use in Monte Carlo
specs <- list(
  list(
    model = "ar",
    order = 1,
    theta = 0.6,
    innovations = list(generator = my_innovations)
  )
)

mc_results <- pmm2_monte_carlo_compare(specs, methods = c("css", "pmm2"),
                                       n = 300, n_sim = 500)

Conclusion

The PMM2 method for time series provides:

  1. Robust estimation for AR, MA, ARMA, and ARIMA models
  2. Variance reductions of roughly 15-50% when innovations deviate from normality (see the table above)
  3. Seamless integration with standard R workflows
  4. Comprehensive diagnostics and bootstrap inference

For practitioners working with real-world time series exhibiting non-Gaussian behavior, PMM2 offers a principled approach to achieving more precise parameter estimates without sacrificing interpretability.


Next Steps


References

  1. Zabolotnii, S., Warsza, Z.L., Tkachenko, O. (2018). “Polynomial Estimation of Linear Regression Parameters for the Asymmetric PDF of Errors”. Springer, AUTOMATION 2018.

  2. Package functions: ?ar_pmm2, ?ma_pmm2, ?arma_pmm2, ?arima_pmm2, ?ts_pmm2_inference

  3. Monte Carlo tools: ?pmm2_monte_carlo_compare

  4. GitHub: https://github.com/SZabolotnii/EstemPMM