Overview
The AR(p) model (Autoregressive of order p) extends the AR1 model by allowing each observation to depend on the previous \(p\) observations:
- Higher-order dependence - captures more complex correlation structures than AR1
- PACF parameterization - uses partial autocorrelation coefficients \(\psi_1, \ldots, \psi_p\) for stationarity
- Flexible order - supports orders from 1 to 10
- Stationary by construction - the PACF parameterization guarantees stationarity
AR(p) is appropriate when the dependence structure is more complex than AR1's simple exponential decay: quasi-cyclic / pseudo-oscillatory behavior driven by complex roots of the AR polynomial, or processes with longer memory than a single lag can capture. For genuinely periodic patterns (calendar-anchored monthly, weekly, or quarterly cycles), use the seasonal model instead, which fixes the period rather than estimating it from the data.
Mathematical Formulation
For \(n\) time points, the AR(p) model for the Gaussian vector \(\mathbf{x} = (x_1, \ldots, x_n)\) is defined as:
\[x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + \varepsilon_t, \quad t = p+1, \ldots, n\]
where the innovation process \(\{\varepsilon_t\}\) has constant variance. The joint distribution for \(\{x_1, \ldots, x_p\}\) is set to the stationary distribution of the process, so there are no boundary issues.
Marginal vs Innovation Precision. The hyperparameter \(\theta_1 = \log(\tau)\) parameterizes the marginal precision of \(x_t\) (i.e. the precision of the stationary distribution), not the innovation precision. This is the same convention used by AR(1). See the equivalent note on the AR1 page.
PACF Parameterization
The AR(p) model has awkward constraints on the \(\phi\)-parameters for stationarity. INLA uses the partial autocorrelation function (PACF) parameterization \(\{\psi_k, k=1, \ldots, p\}\) where \(|\psi_k| < 1\) for all \(k\), along with the marginal precision \(\tau\).
PACF to AR Coefficients
For \(p=1\): \(\psi_1 = \phi_1\)
For \(p=2\): \(\psi_1 = \phi_1/(1-\phi_2)\) and \(\psi_2 = \phi_2\)
The Durbin-Levinson recursions convert between PACF and AR coefficients.
Correlation Structure
Unlike AR1's simple exponential decay, AR(p) can produce:
- Oscillatory patterns - when some \(\phi_k < 0\)
- Slower decay - higher-order terms extend the memory
- Complex autocorrelation - the ACF depends on all \(p\) parameters
When to Use AR(p)
The AR(p) model is appropriate when you have ordered data where:
Complex Time Series
Data with autocorrelation that doesn't decay exponentially - requires multiple lags to capture the dependence.
Quasi-Periodic Patterns
Oscillatory behavior in residuals, cyclic patterns, or data with characteristic frequencies.
Model Selection
When you want to estimate the appropriate AR order from the data rather than assuming AR1.
Extended Memory
Processes where shocks persist longer than what AR1 can capture, but shorter than random walks.
Use AR1 for simple exponentially-decaying correlation. Use AR(p) when the ACF shows significant values beyond lag 1, or when the partial ACF suggests multiple significant lags. Start with low \(p\) and increase if needed.
Specification in pyINLA
Allowed Parameters
| Parameter | Required | Description |
|---|---|---|
id |
Yes | Column name containing time indices (integers 1, 2, 3, ...) |
model |
Yes | Must be 'ar' |
order |
Yes | The autoregressive order \(p\) (integer from 1 to 10) |
hyper |
No | Dict with 'prec' and/or 'pacf1', 'pacf2', ... for hyperparameter configuration |
values |
No | Custom location grid (list of integers) - allows prediction at unobserved time points |
constr |
No | Sum-to-zero constraint (default: False) |
diagonal |
No | Add small value to diagonal for numerical stability |
extraconstr |
No | Extra linear constraints (dict with 'A' matrix and 'e' vector) |
weights |
No | Per-observation weighting vector applied to the random-effect contribution. Length must equal the number of observations. |
Per-Parameter Runnable Examples
Click any parameter to expand a minimal runnable snippet. All snippets assume the shared setup block below (\(n = 30\) time points, \(\text{nobs} = 80\) observations on an integer time index 1..30).
# shared setup for every snippet below
import numpy as np
import pandas as pd
from pyinla import pyinla
rng = np.random.default_rng(42)
n = 30 # number of ordered time points
nobs = 80 # observations
df = pd.DataFrame({
'y': rng.standard_normal(nobs),
'x': rng.standard_normal(nobs),
'time': rng.integers(1, n + 1, size=nobs),
})
id + model + order required trio (minimal call)
# Bare ar: 'id' is the column holding the ordered time index,
# 'order' is the AR(p) order (required; integer in 1..10).
# With order=1 the model collapses to AR(1) (one prec + one pacf).
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
order choose the AR(p) order (1..10)
# order = p activates p PACF coefficients (pacf1..pacf_p) plus the
# marginal log-precision. Below are three common choices.
#
# order=1: shortest memory (equivalent to AR(1)).
formula1 = {'response': 'y', 'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 1}]}
# order=3: three PACF coefficients (richer dependence).
formula3 = {'response': 'y', 'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 3}]}
# order=5: five-lag memory; useful for oscillatory or seasonal residuals.
formula5 = {'response': 'y', 'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 5}]}
result1 = pyinla(formula=formula1, family='gaussian', data=df)
hyper override the precision and/or PACF priors (prec, pacf1, pacf2, ...)
# Defaults (for every active theta):
# prec : prior='pc.prec', param=[3, 0.01], initial=4, fixed=False
# pacf1: prior='pc.cor0', param=[0.5, 0.5], initial=1, fixed=False
# pacf2: prior='pc.cor0', param=[0.5, 0.4], initial=0, fixed=False
# pacf_k (k=3..10) shrink with decreasing alpha (smaller -> tighter to 0).
# Below: tighter PC prior on precision; pin pacf2 at 0 to test a parsimonious
# AR(3) model that retains only lag-1 and lag-3 dependence.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 3,
'hyper': {
'prec': {'prior': 'pc.prec', 'param': [1, 0.01]},
'pacf1': {'initial': 0.5, 'fixed': False},
'pacf2': {'initial': 0.0, 'fixed': True},
'pacf3': {'initial': -0.1, 'fixed': False}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
values explicit location grid (also for extension / prediction)
# Pin the full ordered set of time points explicitly. The list may extend
# beyond the observed range so the latent ar field is estimated at new
# time points that have no observations.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'values': list(range(1, n + 11))}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
constr opt-in sum-to-zero constraint (default False)
# Like ar1, ar(p) defaults to constr=False because the model is stationary.
# Setting constr=True pins the field to sum to zero, useful for strictly
# separating the ar effect from an intercept. Drop the '1' from 'fixed'
# when you opt in (otherwise the level becomes redundant).
formula = {
'response': 'y',
'fixed': ['0', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'constr': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
extraconstr additional linear constraints \(\mathbf{A}\mathbf{x} = \mathbf{e}\)
# ar's default is constr=False (no rank-1 augmentation), so A has the full
# latent-field dimension. With 'values' = 1..n the latent length is n.
# Below: pin the first and last time points to zero. A has shape (2, n);
# 'e' has length 2.
A = np.zeros((2, n))
A[0, 0] = 1.0
A[1, n - 1] = 1.0
e = np.array([0.0, 0.0])
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'values': list(range(1, n + 1)),
'extraconstr': {'A': A, 'e': e}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
diagonal extra constant on the precision diagonal
# Small numerical-stability jitter. Rarely needed for ar with a proper prior.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'diagonal': 1e-4}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
weights per-observation weighting
# Per-observation weight vector applied to the ar contribution.
# Length must equal the number of observations.
w = rng.uniform(0.5, 1.5, size=nobs)
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'weights': w}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
Prior catalog (one snippet per family; 1:1 with the AR(p) parity scenarios)
Each snippet below sets a single non-default prior for either the marginal-precision hyperparameter (prec, \(\theta_1\)) or a PACF hyperparameter (pacf1, pacf2, …, pacf_p, \(\theta_2, \ldots, \theta_{p+1}\)). Configurations marked engine-limited are intentionally omitted (the engine's optimizer or table machinery fails on those for AR-type models; see the rw1/ar1 audit notes).
Prior on prec (marginal log-precision)
prec · loggamma classical gamma prior on \(\tau\)
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {'prec': {'prior': 'loggamma',
'param': [1, 0.00005]}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
prec · flat improper flat on \(\theta_1\)
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {'prec': {'prior': 'flat', 'param': []}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
prec · logtnormal half-normal on \(\tau\) (log-scale)
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {'prec': {'prior': 'logtnormal',
'param': [0.0, 1.0]}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
prec · laplace double-exponential on \(\theta_1\)
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {'prec': {'prior': 'laplace',
'param': [0.0, 1.0]}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
prec · pc.mgamma PC prior, modified-gamma variant
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {'prec': {'prior': 'pc.mgamma',
'param': [1.0]}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
prec · jeffreystdf Jeffreys-type reference prior
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {'prec': {'prior': 'jeffreystdf',
'param': []}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
prec · expression: … custom log-density (R expression)
# A custom log-prior is supplied as an R expression string. The body is
# evaluated by the C engine; `theta` is the internal-scale variable.
expr = "expression: log_dens = -0.5 * theta * theta; log_dens;"
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {'prec': {'prior': expr, 'param': []}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
prec · table: … custom tabulated log-density
# Inline tabulated prior: alternating (theta, log_density) pairs.
# The C engine requires >= 4 grid points.
table = "table: -6 -18 -2 -2 2 -2 6 -18"
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {'prec': {'prior': table, 'param': []}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
Prior on pacf_k (PACF parameters)
pacf_k · pc.cor0 custom tighter shrinkage toward zero
# Default pc.cor0 has param=[0.5, decreasing_alpha]. Below: tighter prior.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {
'pacf1': {'prior': 'pc.cor0', 'param': [0.3, 0.1]},
'pacf2': {'prior': 'pc.cor0', 'param': [0.3, 0.05]}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
pacf_k · pc.cor1 chain PC prior shrinking toward \(\psi = 1\)
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {
'pacf1': {'prior': 'pc.cor1', 'param': [0.9, 0.9]},
'pacf2': {'prior': 'pc.cor1', 'param': [0.9, 0.9]}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
pacf_k · normal custom normal prior on \(\theta_{k+1}\) (Fisher-z scale)
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {
'pacf1': {'prior': 'normal', 'param': [0.5, 1.0]},
'pacf2': {'prior': 'normal', 'param': [0.0, 2.0]}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
pacf_k · gaussian alias (gaussian is an alias for normal)
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {
'pacf1': {'prior': 'gaussian', 'param': [0.0, 0.5]},
'pacf2': {'prior': 'gaussian', 'param': [0.0, 0.5]}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
pacf_k · flat improper flat on \(\theta_{k+1}\)
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {
'pacf1': {'prior': 'flat', 'param': []},
'pacf2': {'prior': 'flat', 'param': []}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
pacf_k · expression: … custom log-density
expr = "expression: log_dens = -0.5 * theta * theta; log_dens;"
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {
'pacf1': {'prior': expr, 'param': []},
'pacf2': {'prior': expr, 'param': []}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
pacf_k · table: … custom tabulated log-density
table = "table: -6 -18 -2 -2 2 -2 6 -18"
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {
'pacf1': {'prior': table, 'param': []},
'pacf2': {'prior': table, 'param': []}}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
pacf1 · mvnorm joint multivariate normal on all PACF coefficients
# Engine convention: the multivariate normal prior on the
# full PACF vector is specified via the pacf1 slot only. param = c(mu, Q)
# where mu has length p (means) and Q is the p x p precision matrix
# flattened column-major. For order=2: mu=(0,0), Q=diag(0.15, 0.15).
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'ar', 'order': 2,
'hyper': {'pacf1': {'prior': 'mvnorm',
'param': [0, 0, # mu = (0, 0)
0.15, 0,
0, 0.15]}}}], # Q = diag(0.15)
}
result = pyinla(formula=formula, family='gaussian', data=df)
Hyperparameters
The AR(p) model has \(p+1\) hyperparameters:
- Marginal precision \(\tau\), parameterized as \(\theta_1 = \log(\tau)\)
- PACF parameters \(\psi_1, \ldots, \psi_p\), parameterized as \(\theta_{k+1} = \log\left(\frac{1+\psi_k}{1-\psi_k}\right)\) for \(k = 1, \ldots, p\)
Default Configuration
| Hyperparameter | Semantic Name | Default Prior | Default Param | Initial |
|---|---|---|---|---|
| Marginal precision (\(\tau\)) | prec |
pc.prec |
[3, 0.01] |
4 |
| PACF1 (\(\psi_1\)) | pacf1 |
pc.cor0 |
[0.5, 0.5] |
1 |
| PACF2 (\(\psi_2\)) | pacf2 |
pc.cor0 |
[0.5, 0.4] |
0 |
| PACF3 (\(\psi_3\)) | pacf3 |
pc.cor0 |
[0.5, 0.3] |
0 |
| PACF4 (\(\psi_4\)) | pacf4 |
pc.cor0 |
[0.5, 0.2] |
0 |
| PACF5-10 (\(\psi_5\)-\(\psi_{10}\)) | pacf5-pacf10 |
pc.cor0 |
[0.5, 0.1] |
0 |
Internal Transformations
INLA works with transformed parameters:
- Precision: \(\theta_1 = \log(\tau)\), so \(\tau = \exp(\theta_1)\)
- PACF: \(\theta_{k+1} = \log((1+\psi_k)/(1-\psi_k))\), so \(\psi_k = 2\exp(\theta_{k+1})/(1+\exp(\theta_{k+1})) - 1\)
The logit transformation ensures \(|\psi_k| < 1\) for stationarity.
Customizing Hyperparameters
Override defaults using the hyper key with semantic names:
'random': [
{
'id': 'time',
'model': 'ar',
'order': 3,
'hyper': {
'prec': {
'prior': 'pc.prec',
'param': [1, 0.01]
},
'pacf1': {
'initial': 0.5, # Start near rho ~ 0.46
'fixed': False
},
'pacf2': {
'initial': -0.1,
'fixed': False
},
'pacf3': {
'initial': 0,
'fixed': False
}
}
}
]
Available Priors
Precision Prior
The default pc.prec prior is a Penalised Complexity prior on the marginal precision:
# Default PC prior
'prec': {'prior': 'pc.prec', 'param': [3, 0.01]} # P(sigma > 3) = 0.01
# Alternative loggamma prior
'prec': {'prior': 'loggamma', 'param': [1, 0.01]}
PACF Priors
The default pc.cor0 prior is the PC prior for a correlation with \(\rho = 0\) as the base model. It is parameterized by the pair \((u, \alpha)\) where
\[P(|\psi_k| > u) = \alpha, \qquad 0 \le u < 1,\ \ 0 < \alpha < 1.\]
So param=[0.5, 0.5] means "assign 50% probability to \(|\psi_k| > 0.5\)" (a fairly diffuse prior). Tightening the prior toward zero is done by lowering \(\alpha\): e.g. [0.5, 0.1] means \(P(|\psi_k| > 0.5) = 0.1\). The internal density is \(\pi(\psi) = \lambda\,e^{-\lambda\mu(\psi)} J(\psi)\) with \(\mu(\psi) = \sqrt{-\log(1-\psi^2)}\) and \(\lambda = -\log(\alpha)/\mu(u)\); pyinla passes \((u, \alpha)\) through to the engine, which sets up the density internally.
# Default PC prior for PACF (shrinks toward 0)
'pacf1': {'prior': 'pc.cor0', 'param': [0.5, 0.5]}
# Multivariate normal prior for all PACF parameters
# Specified via pacf1 with mean vector and precision matrix
'pacf1': {'prior': 'mvnorm', 'param': [0, 0, 0.15, 0, 0, 0.15]} # p=2 example
Fixed Values
To fix PACF parameters at known values:
# Fix PACF values at known AR coefficients
'hyper': {
'prec': {'prior': 'loggamma', 'param': [1, 0.01]},
'pacf1': {'initial': 0.7, 'fixed': True},
'pacf2': {'initial': -0.2, 'fixed': True}
}
Interpreting Results
Hyperparameter Summary
After fitting, access the hyperparameter estimates via result.summary_hyperpar. Running the shared-setup AR(2) snippet above produces something like:
print(result.summary_hyperpar)
# mean sd 0.025quant 0.5quant 0.975quant mode
# Precision for the Gaussian observations 1.9444 0.3166 1.3829 1.9235 2.6257 1.8898
# Precision for time 13.4108 5.2414 6.0938 12.4475 26.3851 10.7265
# PACF1 for time 0.1579 0.1735 -0.1694 0.1552 0.5039 0.1237
# PACF2 for time -0.0124 0.0989 -0.2124 -0.0104 0.1757 -0.0008
Interpretation:
- Precision for time: The marginal precision \(\tau\) of the AR(p) process
- PACF1, PACF2, ...: The partial autocorrelation coefficients \(\psi_1, \psi_2, \ldots\)
These values are illustrative output from one run on the shared-setup dataset (which is iid noise, so the PACF estimates concentrate near zero). The columns are mean, sd, three posterior quantiles, and posterior mode. Not bit-reproducible: the AR(p) optimizer in the C engine produces small numerical variations across runs even with the same data and seed (\(\sim\)1–10% drift on the precision-style estimates is typical), so your numbers will differ slightly from these. Posterior shapes and PACF magnitudes (concentrating near zero for iid data here) will be similar. Numbers will differ further for your own data.
Converting PACF to AR Coefficients
To convert PACF estimates to traditional AR coefficients \(\phi\), use the Durbin-Levinson algorithm:
def pacf_to_phi(pacf):
"""Convert PACF to AR coefficients using Durbin-Levinson."""
p = len(pacf)
phi = [[0] * (i+1) for i in range(p)]
phi[0][0] = pacf[0]
for k in range(1, p):
phi[k][k] = pacf[k]
for j in range(k):
phi[k][j] = phi[k-1][j] - pacf[k] * phi[k-1][k-1-j]
return phi[-1]
# Example: convert PACF means from summary_hyperpar to AR coefficients.
# Using illustrative values for a process with stronger lag-1 dependence
# (the shared-setup data is iid, so its real PACF values are near zero).
pacf_est = [0.62, -0.11]
phi_est = pacf_to_phi(pacf_est)
print(f"Estimated AR coefficients: phi = {phi_est}")
# Estimated AR coefficients: phi = [0.6882, -0.11]
Notes
- Order limit: Currently, the order \(p\) is limited to 10. Contact the developers if this is insufficient.
- Stationarity: The PACF parameterization ensures stationarity automatically. No explicit constraints needed.
- Fixed PACF caveat: If some \(\psi_k\) parameters are fixed and \(k < p\), the marginal log-likelihood may be incorrect (uses joint prior, not conditional).
- MVN prior: For joint multivariate normal priors on all PACF parameters, specify via the
pacf1hyper key withmvnormprior. - Conversion functions: pyINLA does not ship a built-in PACF↔φ helper. Implement the conversion manually in Python as shown above.
Model Selection
When choosing the order \(p\), examine the partial autocorrelation function of your data. The PACF should "cut off" at lag \(p\) for an AR(p) process. Over-specifying \(p\) adds unnecessary parameters; under-specifying misses important dependence.