Introduction to PMM2: Polynomial Maximization Method

Serhii Zabolotnii

2025-11-04

The Problem: When OLS Falls Short

Ordinary Least Squares (OLS) is the workhorse of regression analysis, providing Best Linear Unbiased Estimators (BLUE) under the Gauss-Markov assumptions. However, when error distributions deviate from normality—particularly when they exhibit skewness—OLS estimates, while still unbiased, lose statistical efficiency: standard errors are larger than necessary, confidence intervals are wider, and hypothesis tests lose power.

Practical Implications

Skewed, non-normal errors arise routinely in practice: income and cost data, reaction times, environmental concentrations, and insurance claim amounts are all typically right-skewed.

In these contexts, relying solely on OLS can lead to suboptimal inference.


The Solution: Polynomial Maximization Method (PMM2)

The Polynomial Maximization Method of order 2 (PMM2) is a moment-based estimation technique that leverages higher-order moments of residuals to achieve:

  1. Lower variance estimates compared to OLS when errors are non-Gaussian
  2. Robustness to skewness and kurtosis in error distributions
  3. Asymptotic efficiency gains validated through Monte Carlo studies

Theoretical Foundation

PMM2 works by constructing a system of polynomial equations based on central moments:

\[ Z_j(\beta) = \sum_{i=1}^{n} x_{ij} \left[ A \cdot \hat{y}_i^2 + B \cdot \hat{y}_i + C \right] = 0 \]

where \(x_{ij}\) is the value of the \(j\)-th regressor for observation \(i\), and the coefficients \(A\), \(B\), and \(C\) are functions of the central moments of the residuals (the \(m_2\), \(m_3\), and \(m_4\) reported in the model summary).

The algorithm iteratively solves this system using a gradient-based approach, starting from OLS estimates.
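
To make the structure concrete, here is a minimal schematic of these estimating equations in R. It assumes \(\hat{y}_i\) denotes the current residual \(y_i - x_i^{\top}\beta\) and uses placeholder coefficients A, B, and C; the package derives the actual coefficients from the residual moments, so this sketch only illustrates the form of \(Z_j(\beta)\), not the implemented update.

# Schematic of the PMM2 estimating equations Z_j(beta) = 0.
# A, B, C stand in for the moment-based coefficients that the package
# computes internally from the residual moments (m2, m3, m4).
Z <- function(beta, X, y, A, B, C) {
  r <- as.vector(y - X %*% beta)       # current residuals
  colSums(X * (A * r^2 + B * r + C))   # one equation per regressor j
}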

Reference: Zabolotnii et al. (2018) - “Polynomial Estimation of Linear Regression Parameters for the Asymmetric PDF of Errors” (Springer, AUTOMATION 2018)


Getting Started with EstemPMM

Let’s explore PMM2 with practical examples using the EstemPMM package.

library(EstemPMM)
set.seed(123)

Example 1: Simple Linear Regression with Skewed Errors

We’ll simulate data where errors follow a skewed distribution (using \(\chi^2\)) to demonstrate PMM2’s advantages.

Data Generation

# Sample size
n <- 200

# Generate predictor
x <- rnorm(n, mean = 5, sd = 2)

# True parameters
beta_0 <- 10
beta_1 <- 2.5

# Generate highly skewed errors using chi-squared distribution
# (chi-square with df=3 has skewness ~ 1.63)
errors <- rchisq(n, df = 3) - 3  # Center to have mean ~ 0

# Response variable
y <- beta_0 + beta_1 * x + errors

# Create data frame
dat <- data.frame(y = y, x = x)
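
As a quick sanity check on the design: the theoretical skewness of a chi-squared variable with df degrees of freedom is \(\sqrt{8/\mathrm{df}}\), and centering does not change it, so the simulated errors should have skewness near 1.63.

# Theoretical skewness of chi-squared(df = 3); centering leaves it unchanged
sqrt(8 / 3)

# Empirical skewness of the simulated errors
pmm_skewness(errors)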

Visualize Error Distribution

# Fit OLS to get residuals for visualization
fit_ols_temp <- lm(y ~ x, data = dat)
residuals_ols <- residuals(fit_ols_temp)

# Plot histogram of residuals
par(mfrow = c(1, 2))
hist(residuals_ols, breaks = 30, main = "Residual Distribution",
     xlab = "Residuals", col = "lightblue", border = "white")
qqnorm(residuals_ols, main = "Q-Q Plot")
qqline(residuals_ols, col = "red", lwd = 2)

par(mfrow = c(1, 1))

# Calculate skewness and kurtosis
cat(sprintf("Skewness: %.3f\n", pmm_skewness(residuals_ols)))
#> Skewness: 1.865
cat(sprintf("Excess Kurtosis: %.3f\n", pmm_kurtosis(residuals_ols) - 3))
#> Excess Kurtosis: 1.710

The histogram and Q-Q plot clearly show right skewness, violating the normality assumption that underpins OLS optimality.
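
If a formal check is preferred, a normality test on the OLS residuals (here the Shapiro-Wilk test from base R, which is not part of the package workflow) will typically reject normality for data generated this way.

# Formal normality test of the OLS residuals (base R)
shapiro.test(residuals_ols)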


Model Fitting: PMM2 vs OLS

# Fit PMM2 model
fit_pmm2 <- lm_pmm2(y ~ x, data = dat, verbose = FALSE)

# Fit OLS model for comparison
fit_ols <- lm(y ~ x, data = dat)

# Display coefficients
cat("=== OLS Estimates ===\n")
#> === OLS Estimates ===
print(coef(fit_ols))
#> (Intercept)           x 
#>   10.188412    2.423676

cat("\n=== PMM2 Estimates ===\n")
#> 
#> === PMM2 Estimates ===
print(coef(fit_pmm2))
#> [1] 9.980738 2.465210

cat("\n=== True Values ===\n")
#> 
#> === True Values ===
cat("Intercept:", beta_0, "\n")
#> Intercept: 10
cat("Slope:    ", beta_1, "\n")
#> Slope:     2.5
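
Because the true parameters are known in this simulation, a simple check is the absolute deviation of each estimate from the truth. This is a single draw, so it illustrates rather than proves the efficiency claim.

# Absolute deviation of each estimate from the true parameters
true_beta <- c(beta_0, beta_1)
abs(coef(fit_ols) - true_beta)
abs(coef(fit_pmm2) - true_beta)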

Bootstrap Inference for Variance Comparison

To rigorously compare estimator efficiency, we use bootstrap inference:

# Perform bootstrap inference
boot_results <- pmm2_inference(fit_pmm2, formula = y ~ x, data = dat,
                               B = 500)

# Display summary with confidence intervals
summary(boot_results)
#>     Estimate       Std.Error          t.value         p.value     conf.low    
#>  Min.   :2.465   Min.   :0.06404   Min.   :27.25   Min.   :0   Min.   :2.348  
#>  1st Qu.:4.344   1st Qu.:0.13959   1st Qu.:30.06   1st Qu.:0   1st Qu.:4.072  
#>  Median :6.223   Median :0.21514   Median :32.87   Median :0   Median :5.796  
#>  Mean   :6.223   Mean   :0.21514   Mean   :32.87   Mean   :0   Mean   :5.796  
#>  3rd Qu.:8.102   3rd Qu.:0.29069   3rd Qu.:35.68   3rd Qu.:0   3rd Qu.:7.520  
#>  Max.   :9.981   Max.   :0.36624   Max.   :38.49   Max.   :0   Max.   :9.244  
#>    conf.high     
#>  Min.   : 2.597  
#>  1st Qu.: 4.615  
#>  Median : 6.634  
#>  Mean   : 6.634  
#>  3rd Qu.: 8.652  
#>  Max.   :10.670

The bootstrap results provide, for each coefficient, an estimate, a standard error, a t-value, a p-value, and a percentile confidence interval (the conf.low and conf.high columns above).

Key Observation: PMM2 standard errors are typically 10-30% smaller than OLS when errors are skewed, indicating higher precision.
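
To quantify the gain in this particular sample, the bootstrap standard errors can be compared with the classical OLS ones. The extraction below assumes boot_results exposes a Std.Error column, as the printed summary above suggests; adjust the accessor if the object is structured differently.

# Classical OLS standard errors
se_ols <- summary(fit_ols)$coefficients[, "Std. Error"]

# PMM2 bootstrap standard errors (assumed accessor; see note above)
se_pmm2 <- boot_results$Std.Error

# Ratios below 1 indicate smaller PMM2 standard errors
se_pmm2 / se_ols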


Example 2: Multiple Regression

PMM2 extends naturally to multiple predictors.

# Generate multiple predictors
n <- 250
x1 <- rnorm(n, mean = 0, sd = 1)
x2 <- runif(n, min = -2, max = 2)
x3 <- rexp(n, rate = 0.5)

# True model
beta <- c(5, 1.2, -0.8, 2.0)  # Intercept, x1, x2, x3
errors <- rt(n, df = 4)  # t-distribution with heavy tails

y <- beta[1] + beta[2]*x1 + beta[3]*x2 + beta[4]*x3 + errors

# Create data frame
dat_multi <- data.frame(y = y, x1 = x1, x2 = x2, x3 = x3)

# Fit models
fit_pmm2_multi <- lm_pmm2(y ~ x1 + x2 + x3, data = dat_multi, verbose = FALSE)
fit_ols_multi <- lm(y ~ x1 + x2 + x3, data = dat_multi)

# Compare coefficients
comparison <- data.frame(
  Parameter = c("Intercept", "x1", "x2", "x3"),
  True = beta,
  OLS = coef(fit_ols_multi),
  PMM2 = coef(fit_pmm2_multi)
)

print(comparison, row.names = FALSE)
#>  Parameter True        OLS      PMM2
#>  Intercept  5.0  4.9484846  4.939106
#>         x1  1.2  1.2531710  1.249675
#>         x2 -0.8 -0.8080797 -0.792286
#>         x3  2.0  2.0101585  2.014457
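
Both estimators track the true coefficients closely in this draw. A compact summary of accuracy against the known truth is the root-mean-square deviation of each coefficient vector.

# Root-mean-square deviation from the true coefficients (single replication)
sqrt(mean((coef(fit_ols_multi) - beta)^2))
sqrt(mean((coef(fit_pmm2_multi) - beta)^2))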

Example 3: Polynomial Regression

PMM2 handles polynomial terms seamlessly.

# Generate data with quadratic relationship
n <- 200
x <- seq(-3, 3, length.out = n)
errors <- rchisq(n, df = 5) - 5

# True quadratic model: y = 2 + 3x - 0.5x^2
y <- 2 + 3*x - 0.5*x^2 + errors

dat_poly <- data.frame(y = y, x = x)

# Fit polynomial model
fit_pmm2_poly <- lm_pmm2(y ~ x + I(x^2), data = dat_poly, verbose = FALSE)

# Summary
summary(fit_pmm2_poly, formula = y ~ x + I(x^2), data = dat_poly)
#> PMM2 estimation results
#> Call:
#> lm_pmm2(formula = y ~ x + I(x^2), data = dat_poly, verbose = FALSE)
#> 
#> Coefficients:
#> [1]  2.0897269  2.8725555 -0.4779162
#> 
#> Central moments of initial residuals:
#>   m2 = 10.05743 
#>   m3 = 37.45091 
#>   m4 = 489.7792 
#> 
#> Theoretical characteristics of PMM2 (S = 2):
#>   c3 = 1.174172 
#>   c4 = 1.842015 
#>   g  = 0.6411572  (expected ratio Var[PMM2]/Var[OLS])
#> 
#> Algorithm information:
#>   Convergence status: TRUE 
#>   Iterations: 3 
#> 
#> Approximate statistical inference via bootstrap (B=100):
#>     Estimate  Std.Error   t.value      p.value   conf.low  conf.high
#> 1  2.0897269 0.28555149  7.318214 2.513545e-13  1.5799690  2.6122483
#> 2  2.8725555 0.09797060 29.320586 0.000000e+00  2.6892140  3.0583990
#> 3 -0.4779162 0.06130577 -7.795616 6.439294e-15 -0.5863997 -0.3492109
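
The reported ratio g appears to follow \(g = 1 - c_3^2 / (2 + c_4)\); if so, g approaches 1 (no gain over OLS) as the skewness coefficient c3 goes to zero. Recomputing from the values printed above reproduces the reported figure.

# Recompute the efficiency ratio from the c3 and c4 values printed above
c3 <- 1.174172
c4 <- 1.842015
1 - c3^2 / (2 + c4)   # ~0.641, matching the reported g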

Diagnostic Plots

PMM2 provides standard diagnostic methods:

# Generate diagnostic plots
plot(fit_pmm2_multi)
title("PMM2 Diagnostics")

The diagnostic plots include:

  1. Residuals vs Fitted: Check for non-linearity and heteroscedasticity
  2. Q-Q Plot: Assess normality of residuals
  3. Scale-Location: Check homoscedasticity assumption
  4. Residuals vs Leverage: Identify influential observations

Comparing PMM2 with OLS: Monte Carlo Evidence

Based on simulation studies documented in the package:

Model          Sample Size   MSE(PMM2) / MSE(OLS)   Variance Reduction
MA(1)          200 / 1000    0.85 / 0.77            15-23%
MA(2)          200 / 1000    0.68 / 0.74            26-32%
ARMA(1,1)      200 / 1000    0.70 / 0.55            30-45%
ARIMA(1,1,1)   300 / 1000    0.76 / 0.48            24-52%

All ratios are below 1.0, confirming that PMM2 achieves a lower mean squared error than the classical estimators (CSS for the time-series models, OLS for regression) when errors deviate from normality.
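
The table covers the package's time-series estimators; the same kind of comparison can be run for the regression setting of Example 1 with a few lines of simulation. The sketch below is illustrative only, and the resulting ratio varies with the seed and the number of replications.

# Small Monte Carlo comparison for the Example 1 design (chi-squared errors)
R <- 200
est <- replicate(R, {
  x <- rnorm(200, mean = 5, sd = 2)
  y <- 10 + 2.5 * x + (rchisq(200, df = 3) - 3)
  d <- data.frame(y = y, x = x)
  c(ols  = unname(coef(lm(y ~ x, data = d))[2]),
    pmm2 = unname(coef(lm_pmm2(y ~ x, data = d, verbose = FALSE))[2]))
})

# Empirical MSE ratio for the slope; values below 1 favor PMM2
mse <- rowMeans((est - 2.5)^2)
mse["pmm2"] / mse["ols"]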


When to Use PMM2

Use PMM2 when:

  1. Residual diagnostics show clear skewness (and possibly excess kurtosis) in the errors
  2. The sample is large enough for third- and fourth-order moments to be estimated reliably
  3. Estimation precision (smaller standard errors) is a priority

Stick with OLS when:

  1. Residuals are approximately normal, so the expected efficiency gain is negligible
  2. The sample is very small, making higher-order moment estimates unstable


Practical Recommendations

  1. Always fit OLS first as a baseline comparison
  2. Check residual diagnostics to assess non-normality
  3. Use bootstrap inference (pmm2_inference()) for robust confidence intervals
  4. Compare standard errors between OLS and PMM2 to quantify efficiency gains
  5. Validate results with out-of-sample prediction when possible

Next Steps

To learn more about EstemPMM, see the package documentation (?lm_pmm2, ?pmm2_inference, ?compare_with_ols) and the GitHub repository listed in the references below.


Summary

The PMM2 method provides a powerful alternative to OLS when error distributions deviate from normality. By leveraging higher-order moments of the residuals, it achieves lower-variance estimates for non-Gaussian errors, robustness to skewness and kurtosis, and efficiency gains validated in Monte Carlo studies.

The EstemPMM package makes PMM2 accessible through an intuitive interface that mirrors standard R modeling functions like lm().


References

  1. Zabolotnii, S., Warsza, Z.L., Tkachenko, O. (2018). “Polynomial Estimation of Linear Regression Parameters for the Asymmetric PDF of Errors”. In: Automation 2018. Advances in Intelligent Systems and Computing, vol 743. Springer, Cham. https://doi.org/10.1007/978-3-319-77179-3_75

  2. Package documentation: ?lm_pmm2, ?pmm2_inference, ?compare_with_ols

  3. GitHub repository: https://github.com/SZabolotnii/EstemPMM