AR1 Random Effects

Overview

The AR1 model (Autoregressive of order 1) is a fundamental time series model where each observation depends on the previous observation plus Gaussian noise:

Stationary process - the correlation between observations decays exponentially with lag
Single correlation parameter - \(\rho\) controls how strongly adjacent values are related
Mean-reverting - values tend to return toward the mean over time

AR1 is ideal for modeling time series where correlation decays with distance, such as economic indicators, climate data, or biological processes with memory.

Mathematical Formulation

For \(n\) time points, the AR1 model for the Gaussian vector \(\mathbf{x} = (x_1, \ldots, x_n)\) is defined as:

\[x_1 \sim \mathcal{N}(0, (\tau(1-\rho^2))^{-1})\]

\[x_i = \rho \, x_{i-1} + \varepsilon_i, \quad \varepsilon_i \sim \mathcal{N}(0, \tau^{-1}), \quad i = 2, \ldots, n\]

where:

\(\tau\) is the precision of the innovations (noise terms)
\(\rho\) is the lag-1 autocorrelation with \(|\rho| < 1\) for stationarity
\(\kappa = \tau(1-\rho^2)\) is the marginal precision

Correlation Structure

The correlation between observations at lag \(k\) is:

\[\text{Cor}(x_i, x_{i+k}) = \rho^{|k|}\]

This exponential decay is the defining characteristic of AR1:

\(\rho = 0.9\): High persistence, slow decay
\(\rho = 0.5\): Moderate persistence
\(\rho \to 0\): Approaches IID (no temporal correlation)
\(\rho < 0\): Alternating pattern (negative autocorrelation)

Marginal Variance

The marginal variance of each \(x_i\) is \(\text{Var}(x_i) = 1/\kappa = 1/(\tau(1-\rho^2))\). As \(|\rho| \to 1\), marginal variance increases (the process becomes more variable).

When to Use AR1

The AR1 model is appropriate when you have ordered data where:

Economic Time Series

Stock returns, GDP growth, inflation rates - processes where shocks persist but decay over time.

Environmental Monitoring

Temperature anomalies, pollution levels, water quality - measurements with temporal inertia.

Panel Data with Autocorrelation

Repeated measurements where the current state depends on the previous state (e.g., health outcomes over visits).

Residual Correlation

Capturing temporal correlation in residuals after accounting for fixed effects and trends.

AR1 vs RW1

Use AR1 when you believe correlation decays exponentially with lag and the process is stationary. Use RW1 when you expect a non-stationary trend where adjacent differences (not levels) are independent.

Visual Understanding: How AR1 Works

AR1 creates correlated random effects where the correlation between time points decays exponentially with lag. The parameter \(\rho\) controls how strongly adjacent values are related.

AR1 Realizations

Each curve below is a single realization of an AR1 process. Notice how values tend to persist around recent levels before gradually reverting toward zero.

Effect of ρ on Persistence

The correlation parameter \(\rho\) controls how quickly values "forget" their past. Higher ρ = more persistent memory.

Correlation (ρ): 0.7

Autocorrelation Function: The Defining Feature

The correlation between observations at lag \(k\) is exactly \(\rho^k\). This exponential decay is what distinguishes AR1 from other models.

AR1 vs RW1: Side-by-Side Comparison

Compare the stationary AR1 (mean-reverting) with the non-stationary RW1 (trending). AR1 stays bounded while RW1 can drift indefinitely.

Correlation Structure: AR1 vs IID

The correlation matrix shows how observations at different time points are related. AR1 has off-diagonal correlation that decays with distance.

IID: Only diagonal is correlated
(each time point independent)

AR1 (ρ=0.7): Exponential decay
(correlation = ρ^|lag|)

Key difference from RW1:

AR1: correlation decays exponentially
RW1: correlation increases with position
AR1 is stationary; RW1 is not

When Data Looks Like...

Pattern in Data	Suggested Model	Reason
Oscillates around a mean	AR1	Mean-reverting behavior is characteristic of AR1
Persistent shocks that decay	AR1 (high ρ)	Memory of past values fades over time
Trending without bound	RW1	Non-stationary trends need random walk
Alternating high/low pattern	AR1 (negative ρ)	Negative autocorrelation creates oscillations
No temporal pattern	IID	Independence assumption is appropriate

Specification in pyINLA

Allowed Parameters

The AR1 random-effect spec dict accepts the following keys. The source of truth is model_defs/latent/ar1.py's ALLOWED_KEYS set, mirrored 1:1 here.

Parameter	Required	Description
Core (model-specific)
`model`	Yes	Must be `'ar1'`.
`id`	Yes	Column name containing time indices (integers 1, 2, 3, ...).
`hyper`	No	Dict with `'prec'`, `'rho'`, and/or `'mean'` keys for the three hyperparameters \(\theta_1, \theta_2, \theta_3\).
`n`	No	Number of time points (auto-detected if omitted).
`constr`	No	Sum-to-zero constraint. Default: `False` (AR1 is stationary, no constraint needed for identifiability).
`cyclic`	No	Circular AR(1): wraps \(t = n\) back to \(t = 1\). Default: `False`.
`values`	No	Custom location grid (list of integers): allows prediction at unobserved time points.
`extraconstr`	No	Extra linear constraints: `{'A': matrix, 'e': vector}` where \(\mathbf{A}\mathbf{x} = \mathbf{e}\).
`diagonal`	No	Extra constant added to the precision-matrix diagonal for numerical stability. Default: `0.0`.
Cross-model common keys
`weights`	No	Per-observation weighting vector applied to the random-effect contribution. Length must equal the number of observations.

Per-Parameter Runnable Examples

Click any parameter to expand a minimal runnable snippet. All snippets assume the shared setup block below (\(n = 30\) time points, \(\text{nobs} = 80\) observations on an integer time index 1..30).

# shared setup for every snippet below
import numpy as np
import pandas as pd
from pyinla import pyinla

rng = np.random.default_rng(42)
n    = 30           # number of ordered time points
nobs = 80           # observations
df = pd.DataFrame({
    'y':    rng.standard_normal(nobs),
    'x':    rng.standard_normal(nobs),
    'time': rng.integers(1, n + 1, size=nobs),
})

id + model required pair (minimal call)

# Bare ar1: 'id' is the column holding the ordered time index.
# 'n' is auto-detected from max(id); constr=False by default (ar1 is stationary).
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'time', 'model': 'ar1'}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

hyper override the three theta priors (prec, rho, mean)

# Defaults:
#   prec: prior='loggamma', param=[1, 0.00005], initial=4, fixed=False
#   rho : prior='normal',   param=[0, 0.15],    initial=2, fixed=False (rho~0.76)
#   mean: prior='normal',   param=[0, 1],       initial=0, fixed=True
# Below: PC prior on precision, informative normal on logit(rho), mean kept fixed.
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'time', 'model': 'ar1',
                 'hyper': {
                     'prec': {'prior': 'pc.prec', 'param': [1, 0.01],
                              'initial': 4, 'fixed': False},
                     'rho':  {'prior': 'normal', 'param': [2, 0.5],
                              'initial': 2, 'fixed': False}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

n + values extend the position grid for prediction at unobserved time points

# Set 'n' larger than max(id) AND supply 'values' to pin the explicit
# time grid. Latent positions 1..40 are estimated even though only
# positions 1..30 appear in the data.
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'time', 'model': 'ar1',
                 'n': n + 10,
                 'values': list(range(1, n + 11))}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

constr opt-in sum-to-zero constraint (default False)

# Unlike rw1/rw2, ar1's default is constr=False because the model is
# stationary and does not have a null direction in the level. You can
# still opt in if you want the field to integrate to zero (e.g. to
# strictly separate the ar1 effect from the intercept).
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'time', 'model': 'ar1',
                 'constr': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

cyclic circular AR(1) (\(t = n\) wraps back to \(t = 1\))

# When cyclic=True the latent field wraps: x_n is a neighbor of x_1.
# Useful for seasonal patterns (e.g. month-of-year, hour-of-day).
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'time', 'model': 'ar1',
                 'cyclic': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

values explicit location grid

# Pin the full ordered set of time points explicitly. The list may extend
# beyond the observed range to enable prediction at new time points.
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'time', 'model': 'ar1',
                 'values': list(range(1, n + 11))}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

extraconstr additional linear constraints \(\mathbf{A}\mathbf{x} = \mathbf{e}\)

# The expected number of columns in A depends on how the latent grid is anchored:
#
# (A) Auto-detected grid (no 'values', no 'n'):
#       n_eff = len(unique(time))
#       A has shape (k, n_eff)        # ar1 default constr=False, no augmentation
# (B) Anchored grid (supply BOTH 'n' AND 'values'):
#       n_eff = n   (length of the values list)
#       A has shape (k, n)            # full field dimension
#     Robust against sparse data; recommended for prediction-style usage.
#
# Below: case (B) pins the first and last time points to zero. With n=30,
# A has shape (2, 30); 'e' has length 2.
A = np.zeros((2, n))
A[0, 0]     = 1.0
A[1, n - 1] = 1.0
e = np.array([0.0, 0.0])
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'time', 'model': 'ar1',
                 'n':           n,
                 'values':      list(range(1, n + 1)),
                 'extraconstr': {'A': A, 'e': e}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

diagonal extra constant on the precision diagonal

# Small numerical-stability jitter. Rarely needed for ar1 with a proper prior.
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'time', 'model': 'ar1',
                 'diagonal': 1e-4}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

weights per-observation weighting

# Per-observation weight vector applied to the ar1 contribution.
# Length must equal the number of observations.
w = rng.uniform(0.5, 1.5, size=nobs)
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'time', 'model': 'ar1',
                 'weights': w}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

Hyperparameters

The AR1 model has three hyperparameters in the engine registry, though only the first two are estimated by default:

Marginal precision \(\kappa = \tau(1-\rho^2)\), parameterized as \(\theta_1 = \log(\kappa)\). Estimated by default.
Lag-1 correlation \(\rho\), parameterized as \(\theta_2 = \log\left(\frac{1+\rho}{1-\rho}\right)\) (the "logit" transform; equivalent to Fisher's z). Estimated by default.
Mean of the AR1 process, parameterized on the identity scale as \(\theta_3\). Experimental; fixed to zero by default (see the Notes section).

Default Configuration

Hyperparameter	Internal Name	Default Prior	Default Param	Initial
Marginal precision (\(\kappa\))	`prec`	`loggamma`	`[1, 0.00005]`	`4`
Lag-1 correlation (\(\rho\))	`rho`	`normal`	`[0, 0.15]`	`2`

Internal Transformations

INLA works with transformed parameters:

Precision: \(\theta_1 = \log(\kappa)\), so \(\kappa = \exp(\theta_1)\)
Correlation: \(\theta_2 = \log((1+\rho)/(1-\rho))\), so \(\rho = (e^{\theta_2} - 1)/(e^{\theta_2} + 1)\)

Initial value \(\theta_2 = 2\) corresponds to \(\rho \approx 0.76\).

Customizing Hyperparameters

Override defaults using the hyper key with semantic names:

'random': [
    {
        'id': 'time',
        'model': 'ar1',
        'hyper': {
            'prec': {
                'prior': 'pc.prec',
                'param': [1, 0.01],     # P(sigma > 1) = 0.01
                'initial': 4
            },
            'rho': {
                'prior': 'normal',
                'param': [0, 0.15],     # Prior on logit(rho)
                'initial': 2            # Starts at rho ~ 0.76
            }
        }
    }
]

Available Priors

Precision Prior

The default loggamma prior places a gamma distribution on the marginal precision \(\kappa\):

\[\kappa \sim \text{Gamma}(a, b)\]

# Default loggamma prior
'prec': {'prior': 'loggamma', 'param': [1, 0.00005]}

# PC prior for precision
'prec': {'prior': 'pc.prec', 'param': [1, 0.01]}

Correlation Prior

The default prior for \(\rho\) is a normal distribution on the logit-transformed scale:

\[\theta_2 = \log\left(\frac{1+\rho}{1-\rho}\right) \sim \mathcal{N}(\mu, \sigma^{-2})\]

With default param=[0, 0.15], the prior is \(\mathcal{N}(0, 1/0.15)\), which is fairly diffuse but centered at \(\rho = 0\).

# Default: diffuse prior centered at rho = 0
'rho': {'prior': 'normal', 'param': [0, 0.15]}

# More informative: expect positive correlation
'rho': {'prior': 'normal', 'param': [2, 0.5]}  # Centers at rho ~ 0.76

Fixed Values

To fix hyperparameters at known values:

import math

# Fix rho at 0.8
rho_fixed = 0.8
theta2_fixed = math.log((1 + rho_fixed) / (1 - rho_fixed))

'hyper': {
    'rho': {
        'initial': theta2_fixed,  # ~2.197
        'fixed': True
    }
}

Interpreting Results

Hyperparameter Summary

After fitting, access the hyperparameter estimates via result.summary_hyperpar:

print(result.summary_hyperpar)
#                               mean       sd  0.025quant  0.5quant  0.975quant     mode
# Precision for time          23.45    5.67      14.23     22.89      36.12    21.34
# Rho for time                 0.82    0.05       0.71      0.82       0.90     0.83

Interpretation:

Precision: Higher values = lower marginal variance of the AR1 process
Rho: The estimated lag-1 correlation; values close to 1 indicate high persistence

Random Effect Estimates

The posterior estimates for each time point are in result.summary_random:

# AR1 effect estimates
ar1_effect = result.summary_random['time']
print(ar1_effect)

# Plot the AR1 process
import matplotlib.pyplot as plt
plt.fill_between(ar1_effect['ID'], ar1_effect['0.025quant'], ar1_effect['0.975quant'], alpha=0.3)
plt.plot(ar1_effect['ID'], ar1_effect['mean'], 'b-')
plt.xlabel('Time')
plt.ylabel('AR1 Effect')

Notes

Stationarity: The constraint \(\rho \in (-1, 1)\) is enforced automatically through the logit transformation \(\theta_2 = \log\!\big(\tfrac{1+\rho}{1-\rho}\big)\), which maps \((-1, 1)\) bijectively to the real line. (Often called the "logit" transform; mathematically it is equivalent to \(2\,\tanh^{-1}(\rho)\), i.e. Fisher's z.)
Marginal vs Innovation Precision: pyINLA reports the marginal precision \(\kappa = \tau(1-\rho^2)\), not the innovation precision \(\tau\).
Mean Parameter: The third hyperparameter (the AR1 mean, \(\theta_3\)) is labeled experimental. It is fixed=True by default with initial = 0, so the process has zero mean unless you explicitly unlock it via hyper['mean']['fixed'] = False. Treat estimates of this parameter as experimental output.
Index Handling: Time indices should be consecutive integers. Gaps are handled (the engine uses sort(unique(idx))) but may affect efficiency; if you want a fixed-size grid that includes unobserved positions, supply values=list(range(1, n+1)) explicitly.

Identifiability

Unlike RW1, AR1 is a proper (stationary) model and does not require a sum-to-zero constraint. However, if you include an intercept, the mean of the AR1 process is absorbed into the intercept.

Related Models

AR(p) - Autoregressive order p (higher-order dependence, PACF parameterization)
RW1 - Random Walk of order 1 (non-stationary, first differences are IID)
RW2 - Random Walk of order 2 (smoother trends)
IID - Independent random effects (no temporal structure)