Overview
The AR1 model (Autoregressive of order 1) is a fundamental time series model where each observation depends on the previous observation plus Gaussian noise:
- Stationary process - the correlation between observations decays exponentially with lag
- Single correlation parameter - \(\rho\) controls how strongly adjacent values are related
- Mean-reverting - values tend to return toward the mean over time
AR1 is ideal for modeling time series where correlation decays with distance, such as economic indicators, climate data, or biological processes with memory.
Mathematical Formulation
For \(n\) time points, the AR1 model for the Gaussian vector \(\mathbf{x} = (x_1, \ldots, x_n)\) is defined as:
\[x_1 \sim \mathcal{N}(0, (\tau(1-\rho^2))^{-1})\]
\[x_i = \rho \, x_{i-1} + \varepsilon_i, \quad \varepsilon_i \sim \mathcal{N}(0, \tau^{-1}), \quad i = 2, \ldots, n\]
where:
- \(\tau\) is the precision of the innovations (noise terms)
- \(\rho\) is the lag-1 autocorrelation with \(|\rho| < 1\) for stationarity
- \(\kappa = \tau(1-\rho^2)\) is the marginal precision
Correlation Structure
The correlation between observations at lag \(k\) is:
\[\text{Cor}(x_i, x_{i+k}) = \rho^{|k|}\]
This exponential decay is the defining characteristic of AR1:
- \(\rho = 0.9\): High persistence, slow decay
- \(\rho = 0.5\): Moderate persistence
- \(\rho \to 0\): Approaches IID (no temporal correlation)
- \(\rho < 0\): Alternating pattern (negative autocorrelation)
Marginal Variance
The marginal variance of each \(x_i\) is \(\text{Var}(x_i) = 1/\kappa = 1/(\tau(1-\rho^2))\). As \(|\rho| \to 1\), marginal variance increases (the process becomes more variable).
When to Use AR1
The AR1 model is appropriate when you have ordered data where:
Economic Time Series
Stock returns, GDP growth, inflation rates - processes where shocks persist but decay over time.
Environmental Monitoring
Temperature anomalies, pollution levels, water quality - measurements with temporal inertia.
Panel Data with Autocorrelation
Repeated measurements where the current state depends on the previous state (e.g., health outcomes over visits).
Residual Correlation
Capturing temporal correlation in residuals after accounting for fixed effects and trends.
Use AR1 when you believe correlation decays exponentially with lag and the process is stationary. Use RW1 when you expect a non-stationary trend where adjacent differences (not levels) are independent.
Visual Understanding: How AR1 Works
AR1 creates correlated random effects where the correlation between time points decays exponentially with lag. The parameter \(\rho\) controls how strongly adjacent values are related.
AR1 Realizations
Each curve below is a single realization of an AR1 process. Notice how values tend to persist around recent levels before gradually reverting toward zero.
Effect of ρ on Persistence
The correlation parameter \(\rho\) controls how quickly values "forget" their past. Higher ρ = more persistent memory.
Autocorrelation Function: The Defining Feature
The correlation between observations at lag \(k\) is exactly \(\rho^k\). This exponential decay is what distinguishes AR1 from other models.
AR1 vs RW1: Side-by-Side Comparison
Compare the stationary AR1 (mean-reverting) with the non-stationary RW1 (trending). AR1 stays bounded while RW1 can drift indefinitely.
Correlation Structure: AR1 vs IID
The correlation matrix shows how observations at different time points are related. AR1 has off-diagonal correlation that decays with distance.
IID: Only diagonal is correlated
(each time point independent)
AR1 (ρ=0.7): Exponential decay
(correlation = ρ|lag|)
- AR1: correlation decays exponentially
- RW1: correlation increases with position
- AR1 is stationary; RW1 is not
When Data Looks Like...
| Pattern in Data | Suggested Model | Reason |
|---|---|---|
| Oscillates around a mean | AR1 | Mean-reverting behavior is characteristic of AR1 |
| Persistent shocks that decay | AR1 (high ρ) | Memory of past values fades over time |
| Trending without bound | RW1 | Non-stationary trends need random walk |
| Alternating high/low pattern | AR1 (negative ρ) | Negative autocorrelation creates oscillations |
| No temporal pattern | IID | Independence assumption is appropriate |
Specification in pyINLA
Allowed Parameters
The AR1 random-effect spec dict accepts the following keys. The source of truth is model_defs/latent/ar1.py's ALLOWED_KEYS set, mirrored 1:1 here.
| Parameter | Required | Description |
|---|---|---|
| Core (model-specific) | ||
model |
Yes | Must be 'ar1'. |
id |
Yes | Column name containing time indices (integers 1, 2, 3, ...). |
hyper |
No | Dict with 'prec', 'rho', and/or 'mean' keys for the three hyperparameters \(\theta_1, \theta_2, \theta_3\). |
n |
No | Number of time points (auto-detected if omitted). |
constr |
No | Sum-to-zero constraint. Default: False (AR1 is stationary, no constraint needed for identifiability). |
cyclic |
No | Circular AR(1): wraps \(t = n\) back to \(t = 1\). Default: False. |
values |
No | Custom location grid (list of integers): allows prediction at unobserved time points. |
extraconstr |
No | Extra linear constraints: {'A': matrix, 'e': vector} where \(\mathbf{A}\mathbf{x} = \mathbf{e}\). |
diagonal |
No | Extra constant added to the precision-matrix diagonal for numerical stability. Default: 0.0. |
| Cross-model common keys | ||
weights |
No | Per-observation weighting vector applied to the random-effect contribution. Length must equal the number of observations. |
Per-Parameter Runnable Examples
Click any parameter to expand a minimal runnable snippet. All snippets assume the shared setup block below (\(n = 30\) time points, \(\text{nobs} = 80\) observations on an integer time index 1..30).
# shared setup for every snippet below
import numpy as np
import pandas as pd
from pyinla import pyinla
rng = np.random.default_rng(42)
n = 30 # number of ordered time points
nobs = 80 # observations
df = pd.DataFrame({
'y': rng.standard_normal(nobs),
'x': rng.standard_normal(nobs),
'time': rng.integers(1, n + 1, size=nobs),
})
id + model required pair (minimal call)
# Bare ar1: 'id' is the column holding the ordered time index.
# 'n' is auto-detected from max(id); constr=False by default (ar1 is stationary).
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar1'}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
hyper override the three theta priors (prec, rho, mean)
# Defaults:
# prec: prior='loggamma', param=[1, 0.00005], initial=4, fixed=False
# rho : prior='normal', param=[0, 0.15], initial=2, fixed=False (rho~0.76)
# mean: prior='normal', param=[0, 1], initial=0, fixed=True
# Below: PC prior on precision, informative normal on logit(rho), mean kept fixed.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar1',
'hyper': {
'prec': {'prior': 'pc.prec', 'param': [1, 0.01],
'initial': 4, 'fixed': False},
'rho': {'prior': 'normal', 'param': [2, 0.5],
'initial': 2, 'fixed': False}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
n + values extend the position grid for prediction at unobserved time points
# Set 'n' larger than max(id) AND supply 'values' to pin the explicit
# time grid. Latent positions 1..40 are estimated even though only
# positions 1..30 appear in the data.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar1',
'n': n + 10,
'values': list(range(1, n + 11))}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
constr opt-in sum-to-zero constraint (default False)
# Unlike rw1/rw2, ar1's default is constr=False because the model is
# stationary and does not have a null direction in the level. You can
# still opt in if you want the field to integrate to zero (e.g. to
# strictly separate the ar1 effect from the intercept).
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar1',
'constr': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
cyclic circular AR(1) (\(t = n\) wraps back to \(t = 1\))
# When cyclic=True the latent field wraps: x_n is a neighbor of x_1.
# Useful for seasonal patterns (e.g. month-of-year, hour-of-day).
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar1',
'cyclic': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
values explicit location grid
# Pin the full ordered set of time points explicitly. The list may extend
# beyond the observed range to enable prediction at new time points.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar1',
'values': list(range(1, n + 11))}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
extraconstr additional linear constraints \(\mathbf{A}\mathbf{x} = \mathbf{e}\)
# The expected number of columns in A depends on how the latent grid is anchored:
#
# (A) Auto-detected grid (no 'values', no 'n'):
# n_eff = len(unique(time))
# A has shape (k, n_eff) # ar1 default constr=False, no augmentation
# (B) Anchored grid (supply BOTH 'n' AND 'values'):
# n_eff = n (length of the values list)
# A has shape (k, n) # full field dimension
# Robust against sparse data; recommended for prediction-style usage.
#
# Below: case (B) pins the first and last time points to zero. With n=30,
# A has shape (2, 30); 'e' has length 2.
A = np.zeros((2, n))
A[0, 0] = 1.0
A[1, n - 1] = 1.0
e = np.array([0.0, 0.0])
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar1',
'n': n,
'values': list(range(1, n + 1)),
'extraconstr': {'A': A, 'e': e}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
diagonal extra constant on the precision diagonal
# Small numerical-stability jitter. Rarely needed for ar1 with a proper prior.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar1',
'diagonal': 1e-4}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
weights per-observation weighting
# Per-observation weight vector applied to the ar1 contribution.
# Length must equal the number of observations.
w = rng.uniform(0.5, 1.5, size=nobs)
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar1',
'weights': w}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
Hyperparameters
The AR1 model has three hyperparameters in the engine registry, though only the first two are estimated by default:
- Marginal precision \(\kappa = \tau(1-\rho^2)\), parameterized as \(\theta_1 = \log(\kappa)\). Estimated by default.
- Lag-1 correlation \(\rho\), parameterized as \(\theta_2 = \log\left(\frac{1+\rho}{1-\rho}\right)\) (the "logit" transform; equivalent to Fisher's z). Estimated by default.
- Mean of the AR1 process, parameterized on the identity scale as \(\theta_3\). Experimental; fixed to zero by default (see the Notes section).
Default Configuration
| Hyperparameter | Internal Name | Default Prior | Default Param | Initial |
|---|---|---|---|---|
| Marginal precision (\(\kappa\)) | prec |
loggamma |
[1, 0.00005] |
4 |
| Lag-1 correlation (\(\rho\)) | rho |
normal |
[0, 0.15] |
2 |
Internal Transformations
INLA works with transformed parameters:
- Precision: \(\theta_1 = \log(\kappa)\), so \(\kappa = \exp(\theta_1)\)
- Correlation: \(\theta_2 = \log((1+\rho)/(1-\rho))\), so \(\rho = (e^{\theta_2} - 1)/(e^{\theta_2} + 1)\)
Initial value \(\theta_2 = 2\) corresponds to \(\rho \approx 0.76\).
Customizing Hyperparameters
Override defaults using the hyper key with semantic names:
'random': [
{
'id': 'time',
'model': 'ar1',
'hyper': {
'prec': {
'prior': 'pc.prec',
'param': [1, 0.01], # P(sigma > 1) = 0.01
'initial': 4
},
'rho': {
'prior': 'normal',
'param': [0, 0.15], # Prior on logit(rho)
'initial': 2 # Starts at rho ~ 0.76
}
}
}
]
Available Priors
Precision Prior
The default loggamma prior places a gamma distribution on the marginal precision \(\kappa\):
\[\kappa \sim \text{Gamma}(a, b)\]
# Default loggamma prior
'prec': {'prior': 'loggamma', 'param': [1, 0.00005]}
# PC prior for precision
'prec': {'prior': 'pc.prec', 'param': [1, 0.01]}
Correlation Prior
The default prior for \(\rho\) is a normal distribution on the logit-transformed scale:
\[\theta_2 = \log\left(\frac{1+\rho}{1-\rho}\right) \sim \mathcal{N}(\mu, \sigma^{-2})\]
With default param=[0, 0.15], the prior is \(\mathcal{N}(0, 1/0.15)\), which is fairly diffuse but centered at \(\rho = 0\).
# Default: diffuse prior centered at rho = 0
'rho': {'prior': 'normal', 'param': [0, 0.15]}
# More informative: expect positive correlation
'rho': {'prior': 'normal', 'param': [2, 0.5]} # Centers at rho ~ 0.76
Fixed Values
To fix hyperparameters at known values:
import math
# Fix rho at 0.8
rho_fixed = 0.8
theta2_fixed = math.log((1 + rho_fixed) / (1 - rho_fixed))
'hyper': {
'rho': {
'initial': theta2_fixed, # ~2.197
'fixed': True
}
}
Interpreting Results
Hyperparameter Summary
After fitting, access the hyperparameter estimates via result.summary_hyperpar:
print(result.summary_hyperpar)
# mean sd 0.025quant 0.5quant 0.975quant mode
# Precision for time 23.45 5.67 14.23 22.89 36.12 21.34
# Rho for time 0.82 0.05 0.71 0.82 0.90 0.83
Interpretation:
- Precision: Higher values = lower marginal variance of the AR1 process
- Rho: The estimated lag-1 correlation; values close to 1 indicate high persistence
Random Effect Estimates
The posterior estimates for each time point are in result.summary_random:
# AR1 effect estimates
ar1_effect = result.summary_random['time']
print(ar1_effect)
# Plot the AR1 process
import matplotlib.pyplot as plt
plt.fill_between(ar1_effect['ID'], ar1_effect['0.025quant'], ar1_effect['0.975quant'], alpha=0.3)
plt.plot(ar1_effect['ID'], ar1_effect['mean'], 'b-')
plt.xlabel('Time')
plt.ylabel('AR1 Effect')
Notes
- Stationarity: The constraint \(\rho \in (-1, 1)\) is enforced automatically through the logit transformation \(\theta_2 = \log\!\big(\tfrac{1+\rho}{1-\rho}\big)\), which maps \((-1, 1)\) bijectively to the real line. (Often called the "logit" transform; mathematically it is equivalent to \(2\,\tanh^{-1}(\rho)\), i.e. Fisher's z.)
- Marginal vs Innovation Precision: pyINLA reports the marginal precision \(\kappa = \tau(1-\rho^2)\), not the innovation precision \(\tau\).
- Mean Parameter: The third hyperparameter (the AR1 mean, \(\theta_3\)) is labeled experimental. It is
fixed=Trueby default withinitial = 0, so the process has zero mean unless you explicitly unlock it viahyper['mean']['fixed'] = False. Treat estimates of this parameter as experimental output. - Index Handling: Time indices should be consecutive integers. Gaps are handled (the engine uses
sort(unique(idx))) but may affect efficiency; if you want a fixed-size grid that includes unobserved positions, supplyvalues=list(range(1, n+1))explicitly.
Identifiability
Unlike RW1, AR1 is a proper (stationary) model and does not require a sum-to-zero constraint. However, if you include an intercept, the mean of the AR1 process is absorbed into the intercept.