Heteroscedasticity


Heteroscedasticity is a common issue encountered in the analysis of time series data. It refers to the phenomenon where the variability of a series is not constant over time. This can significantly impact the predictive modeling and statistical inference of time series data, as most standard models assume homoscedasticity (constant variance).

Understanding Heteroscedasticity

To understand heteroscedasticity, let's consider a simple example: financial market data. It's well known that market volatility tends to increase during periods of uncertainty and decrease in more stable times. This changing volatility is a classic example of heteroscedasticity.

Mathematically, if we have a time series $y_t$ with error term $\varepsilon_t$, heteroscedasticity implies that the variance of the error term changes over time:

$$\mathrm{Var}(\varepsilon_t) = \sigma_t^2 \neq \sigma^2$$

Here, $\sigma^2$ represents constant variance, which is not applicable in the case of heteroscedasticity.
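To make the definition concrete, here is a minimal sketch that simulates an error series whose standard deviation grows with time and compares the sample variance of its two halves. The function name, the linear growth rule, and all parameter values are illustrative assumptions, not part of any particular model.

```python
import random
import statistics

def simulate_heteroscedastic_errors(n, base_sigma=1.0, growth=0.01, seed=0):
    """Draw zero-mean errors whose standard deviation grows linearly with time.

    The growth rule sigma_t = base_sigma * (1 + growth * t) is a hypothetical
    choice made only for illustration.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, base_sigma * (1.0 + growth * t)) for t in range(n)]

errors = simulate_heteroscedastic_errors(n=1000)

# Under homoscedasticity the two halves would have similar variance
early_var = statistics.pvariance(errors[:500])
late_var = statistics.pvariance(errors[500:])

print(f"variance, first half:  {early_var:.2f}")
print(f"variance, second half: {late_var:.2f}")
```

Because the spread is designed to widen over time, the second half should show a clearly larger variance than the first.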

Detecting Heteroscedasticity

There are several methods to detect heteroscedasticity in time series data:

Visual Inspection

One can plot the residuals of a model against the predicted values or time to visually inspect any pattern that indicates changing variance.

import matplotlib.pyplot as plt
import numpy as np

# Assuming residuals and predictions are stored in residuals and predictions variables
plt.scatter(predictions, residuals)
plt.title('Residuals vs Predictions')
plt.xlabel('Predictions')
plt.ylabel('Residuals')
plt.show()

# Or plotting residuals over time
plt.plot(residuals)
plt.title('Residuals over Time')
plt.xlabel('Time')
plt.ylabel('Residuals')
plt.show()

Statistical Tests

There are also statistical tests for detecting heteroscedasticity:

  • Breusch-Pagan Test
  • White Test

These tests formulate a null hypothesis that assumes homoscedasticity (constant variance) and test against it.

from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.regression.linear_model import OLS
import statsmodels.api as sm

# Assuming X (independent variables) and y (dependent variable) are defined
model = OLS(y, sm.add_constant(X)).fit()
test_statistic, p_value, _, _ = het_breuschpagan(model.resid, model.model.exog)

print(f"Breusch-Pagan Test P-value: {p_value}")
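To show what such a test computes, the following is a rough hand-rolled sketch of the studentized (Koenker) form of the Breusch-Pagan statistic for a single regressor: regress the squared residuals on the regressor and use n times the R-squared of that auxiliary regression as the test statistic. The helper names and the simulated data are assumptions made for illustration; for real work the library routine is the safer choice.

```python
import math
import random

def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def r_squared(x, y):
    """R-squared of the least-squares line fitted to (x, y)."""
    a, b = ols_line(x, y)
    my = sum(y) / len(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def breusch_pagan_single(x, y):
    """Studentized Breusch-Pagan LM statistic and p-value for one regressor."""
    a, b = ols_line(x, y)
    sq_resid = [(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    # Auxiliary regression: squared residuals on the regressor
    lm = max(len(x) * r_squared(x, sq_resid), 0.0)
    # Chi-square survival function with 1 degree of freedom
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value

# Simulated data (hypothetical): error spread proportional to x vs. constant
rng = random.Random(42)
x = [rng.uniform(1.0, 10.0) for _ in range(500)]
y_hetero = [2.0 + 0.5 * xi + rng.gauss(0.0, 0.3 * xi) for xi in x]
y_homo = [2.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

lm_hetero, p_hetero = breusch_pagan_single(x, y_hetero)
lm_homo, p_homo = breusch_pagan_single(x, y_homo)

print(f"heteroscedastic data: LM = {lm_hetero:.2f}, p = {p_hetero:.3g}")
print(f"homoscedastic data:   LM = {lm_homo:.2f}, p = {p_homo:.3g}")
```

A small p-value is evidence against the null hypothesis of constant variance; a large one means the test found no evidence of heteroscedasticity, not that it is absent.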

Addressing Heteroscedasticity

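When heteroscedasticity is present, OLS coefficient estimates remain unbiased but are no longer efficient, and the usual standard errors are biased, which can make hypothesis tests and confidence intervals misleading. Here are some ways to address it: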

Transformation

Applying transformations to stabilize the variance can be effective. Common transformations include:

  • Log transformation: $y_t' = \log(y_t)$
  • Square root transformation: $y_t' = \sqrt{y_t}$
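As a small illustration of why a log transformation can help, the sketch below builds a series with multiplicative noise around an exponentially growing level, then compares how the spread of the deviations changes between the first and second half, on the raw scale and on the log scale. The level, noise size, and the use of the known true level to compute deviations are all simplifying assumptions.

```python
import math
import random
import statistics

rng = random.Random(1)
n = 400

# Hypothetical series: exponential level with multiplicative noise
level = [math.exp(0.005 * t) for t in range(n)]
series = [lv * math.exp(rng.gauss(0.0, 0.1)) for lv in level]

# Deviations from the (known, for this sketch) level on each scale
raw_dev = [y - lv for y, lv in zip(series, level)]
log_dev = [math.log(y) - math.log(lv) for y, lv in zip(series, level)]

def spread_ratio(values):
    """Standard deviation of the second half divided by that of the first."""
    half = len(values) // 2
    return statistics.pstdev(values[half:]) / statistics.pstdev(values[:half])

raw_ratio = spread_ratio(raw_dev)
log_ratio = spread_ratio(log_dev)

print(f"raw scale spread ratio: {raw_ratio:.2f}")
print(f"log scale spread ratio: {log_ratio:.2f}")
```

A ratio near 1 indicates a roughly stable spread. The log transformation works here because the noise is multiplicative; with other noise structures a different transformation, or none, may be appropriate.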

Weighted Least Squares (WLS)

WLS is an extension of ordinary least squares (OLS) that assigns weights inversely proportional to the variance of errors. It's particularly useful when dealing with heteroscedastic data.

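# Weights must vary per observation (a constant weight reproduces OLS), so
# estimate the variance first, here from a regression of log squared residuals on X
exog = sm.add_constant(X)
log_var = sm.OLS(np.log(residuals**2), exog).fit().fittedvalues
weights = 1 / np.exp(log_var)
wls_model = sm.WLS(y, exog, weights=weights).fit()

print(wls_model.summary())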
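The weighting idea can also be seen in closed form. Below is a minimal sketch of weighted least squares for a single regressor, where each observation is weighted by the inverse of its error variance. The data are simulated, and the per-observation standard deviations are assumed to be known, which is rarely true in practice.

```python
import random

def weighted_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x with per-observation weights w."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data: true line y = 2 + 0.5*x, error std proportional to x
rng = random.Random(7)
x = [rng.uniform(1.0, 10.0) for _ in range(300)]
sigma = [0.3 * xi for xi in x]  # assumed known for this sketch
y = [2.0 + 0.5 * xi + rng.gauss(0.0, s) for xi, s in zip(x, sigma)]

# Equal weights reproduce ordinary least squares
ols_intercept, ols_slope = weighted_line(x, y, [1.0] * len(x))
# Inverse-variance weights down-weight the noisier observations
wls_intercept, wls_slope = weighted_line(x, y, [1.0 / s ** 2 for s in sigma])

print(f"OLS: intercept = {ols_intercept:.3f}, slope = {ols_slope:.3f}")
print(f"WLS: intercept = {wls_intercept:.3f}, slope = {wls_slope:.3f}")
```

Both fits target the same line; the gain from weighting is lower sampling variability of the estimates, which shows up across repeated samples rather than in any single fit.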

GARCH Models for Financial Time Series

For financial time series with volatility clustering, Generalized Autoregressive Conditional Heteroskedasticity (GARCH) models can be very effective:

from arch import arch_model

# Assuming returns contains financial returns data
garch_model = arch_model(returns, vol='Garch', p=1, q=1)
res = garch_model.fit()

print(res.summary())
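For intuition about what a GARCH(1,1) model describes, here is a minimal simulation sketch of its conditional variance recursion, sigma^2_t = omega + alpha * eps^2_{t-1} + beta * sigma^2_{t-1}. The function name and the parameter values are illustrative assumptions, not estimates from any data set.

```python
import math
import random

def simulate_garch11(n, omega=0.05, alpha=0.1, beta=0.85, seed=0):
    """Simulate returns and conditional variances from a GARCH(1,1) process.

    Parameter values are hypothetical; alpha + beta < 1 keeps the
    unconditional variance finite.
    """
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    returns, variances = [], []
    for _ in range(n):
        shock = math.sqrt(var) * rng.gauss(0.0, 1.0)
        returns.append(shock)
        variances.append(var)
        # Next period's variance reacts to the latest squared shock (alpha)
        # and carries over part of the current variance (beta)
        var = omega + alpha * shock ** 2 + beta * var
    return returns, variances

sim_returns, sim_variances = simulate_garch11(n=1000)

print(f"lowest conditional variance:  {min(sim_variances):.3f}")
print(f"highest conditional variance: {max(sim_variances):.3f}")
```

A large shock raises the next period's variance, which then decays gradually; this feedback is what produces volatility clustering. In the `arch_model` call above, `p` sets the number of lagged squared-shock terms and `q` the number of lagged variance terms.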