Overview
The RW1 model (Random Walk of order 1) is a fundamental model for capturing smooth temporal or spatial variation. It assumes that:
- First differences are independent - \(x_{i+1} - x_i \sim N(0, 1/\tau)\)
- Adjacent values are correlated - nearby time points or locations share information
- Intrinsically Gaussian - forms a Gaussian Markov Random Field (GMRF)
RW1 is ideal for modeling trends and piecewise linear patterns in ordered data, such as time series, age effects, or spatial transects.
Mathematical Formulation
For \(n\) ordered positions, the RW1 model specifies:
\[x_{i+1} - x_i \stackrel{\text{iid}}{\sim} \mathcal{N}(0, \tau^{-1}), \quad i = 1, \ldots, n-1\]
This can be written equivalently as:
\[x_{i+1} \mid x_i \sim \mathcal{N}(x_i, \tau^{-1})\]
where:
- \(x_i\) is the random effect at position \(i\)
- \(\tau\) is the precision of the increments (higher = smoother)
- \(\sigma = 1/\sqrt{\tau}\) is the standard deviation of the increments
The precision matrix for the full vector \(\mathbf{x} = (x_1, \ldots, x_n)^\top\) is:
\[\mathbf{Q} = \tau \begin{pmatrix} 1 & -1 & & \\ -1 & 2 & -1 & \\ & \ddots & \ddots & \ddots \\ & & -1 & 2 & -1 \\ & & & -1 & 1 \end{pmatrix}\]
This tridiagonal structure makes RW1 computationally efficient, even for large \(n\).
Identifiability Note
RW1 is an improper prior - the precision matrix is rank-deficient (rank \(n-1\)). By default, INLA adds a sum-to-zero constraint (constr=True) to make it identifiable. If you disable this constraint, you should remove the intercept from your model.
When to Use RW1
The RW1 model is appropriate when you have ordered data where:
Time Series Trends
Capture step-wise temporal evolution where the level persists between time points but can shift. Good for trend extraction in epidemiological surveillance or economic data.
Age-Period Effects
Model how effects change with age, time period, or cohort. Adjacent ages/periods are expected to have similar effects.
Spatial Transects
One-dimensional spatial variation along a gradient (e.g., elevation, depth, distance from source). Adjacent positions share information.
Step-like Covariate Effects
Effects of an ordered categorical covariate where neighboring categories should be similar but the function need not be smooth. For genuinely smooth function estimation of a continuous covariate, prefer RW2.
Use RW1 when you expect locally constant behavior: a random walk whose level can shift between adjacent positions (penalizes first differences). Use RW2 when you expect locally linear behavior: smoother curves where the slope changes gradually (penalizes second differences).
Visual Understanding: How RW1 Works
RW1 creates a correlated random walk by linking adjacent values: each step is a Gaussian increment whose variance is controlled by the precision \(\tau\). Higher precision means smaller increments and a tighter trajectory; lower precision allows the walk to wander further between adjacent positions.
Random Walk Realizations
Each curve below is a single realization of an RW1 process. At high precision the trajectory stays tight; at low precision it can wander, with occasional large excursions.
Effect of Precision on Smoothness
The precision \(\tau\) controls how much the curve can wiggle. Higher precision = smoother trends.
First Differences: The Building Block
RW1 is defined by its first differences \(\Delta x_i = x_{i+1} - x_i\). These differences are IID normal - the model's single hyperparameter controls their variance.
RW1 vs IID: Side-by-Side Comparison
Compare the correlated RW1 (each step depends on the previous) with IID (every value drawn independently). The same number of parameters produces very different patterns.
Correlation Structure: Why RW1 has Local Dependence
The key difference between IID and RW1 is their correlation structure. In RW1, nearby positions share information - earlier positions "remember" their influence on later ones.
IID: Only diagonal is correlated
(each group independent)
RW1: Correlation decays with distance
(nearby positions similar)
- Diagonal = 1 (self-correlation)
- Darker = higher correlation
- RW1: correlation spreads from diagonal
Linear vs Cyclic: Topology of the Neighbor Graph
Setting 'cyclic': True changes the neighbor graph from a line to a circle: position n becomes a neighbor of position 1. This adds one extra increment, x1 − xn, to the prior, so values at the start and end of the index are tied together. Same single hyperparameter τ, same intrinsic structure, just one extra link in the graph.
Linear (default): each position has at most two neighbors. Positions 1 and 10 are endpoints with only one neighbor each.
Precision τ on each of the 9 increments xi+1 − xi.
Cyclic (cyclic=True): position 10 wraps to position 1. The pink dashed edge is the extra link.
Precision τ on all 10 increments, including x1 − x10.
A sample path of 50 points makes the difference visible: the linear walk drifts (start and end fall at different heights), while the cyclic walk is constrained to come back to its starting value.
Linear sample path (start ≠ end)
End point (red) sits well above / below the start (green). The walk is free to drift.
Cyclic sample path (start = end)
End point lands at the same height as the start (both green). The extra wrap-edge increment x1 − xn in the prior pulls them together.
cyclic=True when the index wraps naturally:
- hour-of-day (0 and 23 are adjacent)
- day-of-week (Sunday and Monday)
- month-of-year (December and January)
- angular / directional variables (0° and 360°)
constr=True (sum-to-zero), which is the pyINLA default for this family.
When Data Looks Like...
| Pattern in Data | Suggested Model | Reason |
|---|---|---|
| Step-wise trend over time | RW1 | Adjacent time points share information; level can persist or shift |
| Sudden level shifts | RW1 (low precision) | RW1 allows jumps while maintaining local structure |
| Smooth curves (no kinks) | RW2 | RW2 penalizes changes in slope, producing locally linear behavior |
| Random scatter by group | IID | No temporal/spatial structure to exploit |
| Seasonal patterns | RW1 + Seasonal | Combine trend with periodic component |
Specification in pyINLA
Allowed Parameters
The RW1 random-effect spec dict accepts the following keys. The source of truth is model_defs/latent/rw1.py's ALLOWED_KEYS set, mirrored 1:1 here.
| Parameter | Required | Description |
|---|---|---|
| Core (model-specific) | ||
model |
Yes | Must be 'rw1'. |
id |
Yes | Column name containing time/position indices (integers). |
hyper |
No | List with one dict: [{prior, param, initial, fixed}] for the log-precision \(\theta\). |
n |
No | Number of positions (auto-detected if omitted). |
constr |
No | Sum-to-zero constraint. Default: True. |
cyclic |
No | Circular/periodic random walk. Default: False. |
scale.model |
No | Scale precision matrix to have generalized variance 1. Default: False. |
values |
No | Custom location grid (list of integers); allows prediction at unobserved locations. |
extraconstr |
No | Extra linear constraints: {'A': matrix, 'e': vector} where \(\mathbf{A}\mathbf{x} = \mathbf{e}\). |
diagonal |
No | Extra constant added to the precision-matrix diagonal for numerical stability. Default: 0.0. |
| Cross-model common keys | ||
weights |
No | Per-observation weighting vector applied to the random-effect contribution. Length must equal the number of observations. |
Per-Parameter Runnable Examples
Click any parameter to expand a minimal runnable snippet. All snippets assume the shared setup block below (\(n = 30\) positions, \(\text{nobs} = 80\) observations on an integer time index 1..30).
# shared setup for every snippet below
import numpy as np
import pandas as pd
from pyinla import pyinla
rng = np.random.default_rng(42)
n = 30 # number of ordered positions
nobs = 80 # observations
df = pd.DataFrame({
'y': rng.standard_normal(nobs),
'x': rng.standard_normal(nobs),
'time': rng.integers(1, n + 1, size=nobs),
})
id + model required pair (minimal call)
# Bare rw1: 'id' is the column holding the ordered position index.
# 'n' is auto-detected from max(id); constr=True is on by default.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'rw1'}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
hyper override the loggamma prior on log-precision
# Default: prior='loggamma', param=[1, 0.00005], initial=4, fixed=False.
# Below: a more informative loggamma and a different starting value.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'rw1',
'hyper': [{'prior': 'loggamma',
'param': [1, 0.01],
'initial': 2,
'fixed': False}]}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
n + values extend the position grid for prediction at unobserved locations
# Set 'n' larger than max(id) AND supply 'values' to pin the explicit
# location grid. Latent positions 1..40 are estimated even though only
# positions 1..30 appear in the data.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'rw1',
'n': n + 10,
'values': list(range(1, n + 11))}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
constr disable the sum-to-zero constraint (also drop the intercept)
# Default is constr=True. Disabling it leaves the latent field's level
# unidentifiable, so the intercept ('1') MUST be removed from 'fixed'.
formula = {
'response': 'y',
'fixed': ['0', 'x'], # no intercept
'random': [{'id': 'time', 'model': 'rw1',
'constr': False}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
cyclic circular random walk (position n wraps to position 1)
# When cyclic=True the latent field wraps: x_n is a neighbor of x_1.
# The sum-to-zero constraint is removed automatically by the engine.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'rw1',
'cyclic': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
scale.model scale Q to generalized variance 1
# Rescales the precision so the latent field's generalized variance
# is 1 regardless of n. Useful for making the prior on the
# log-precision comparable across rw1 components of different sizes.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'rw1',
'scale.model': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
values explicit location grid
# Pin the full ordered set of positions explicitly. The list may extend
# beyond the observed range to enable prediction at new positions.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'rw1',
'values': list(range(1, n + 11))}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
extraconstr additional linear constraints \(\mathbf{A}\mathbf{x} = \mathbf{e}\)
# The expected number of columns in A depends on how the latent grid is anchored:
#
# (A) Auto-detected grid (no 'values', no 'n'):
# n_eff = len(unique(time))
# A has shape (k, n_eff - 1) # sum-to-zero augmentation reduces dim by 1
# This is data-dependent: missing positions in 'time' shrink n_eff.
#
# (B) Anchored grid (supply BOTH 'n' AND 'values'):
# n_eff = n (length of the values list)
# A has shape (k, n) # full field dimension
# Robust against sparse data; recommended for prediction-style usage.
#
# Below: case (B) pins the first and last positions to zero. With n=30,
# A has shape (2, 30); 'e' has length 2.
A = np.zeros((2, n))
A[0, 0] = 1.0
A[1, n - 1] = 1.0
e = np.array([0.0, 0.0])
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'rw1',
'n': n,
'values': list(range(1, n + 1)),
'extraconstr': {'A': A, 'e': e}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
diagonal extra constant on the precision diagonal
# Small numerical-stability jitter. Rarely needed for rw1 with a proper prior.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'rw1',
'diagonal': 1e-4}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
weights per-observation weighting
# Per-observation weight vector applied to the rw1 contribution.
# Length must equal the number of observations.
w = rng.uniform(0.5, 1.5, size=nobs)
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'rw1',
'weights': w}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
Index Handling
The id column should contain integer values representing ordered positions. pyINLA maps these to consecutive indices internally:
| Input | Behavior |
|---|---|
[1, 2, 3, 4, 5] |
Standard consecutive indices - works directly |
[2000, 2001, 2002, ...] |
Year values - mapped to positions 1, 2, 3, ... |
[1, 1, 2, 2, 3, 3] |
Repeated indices - multiple observations per time point |
[1, NA, 3, NA, 5] |
NA/NaN values - observation excluded from this random effect |
Key points:
- Indices define the order - the RW1 structure depends on which positions are adjacent
- Multiple observations can share the same index (e.g., multiple measurements at the same time point)
- Gaps in indices are handled automatically (positions are mapped consecutively)
Missing values (NaN)
pyINLA accepts np.nan / NA in two places without error:
- NaN in the
timecolumn: that observation contributes nothing to the rw1 component. The covariates file records a-1sentinel for those rows. The remaining observations drive the latent field as usual. - NaN in the
weightsvector: also accepted. The affected rows contribute no weighted rw1 effect.
Both NaN positions and NaN weights interact cleanly with the other rw1 keys (constr, cyclic, explicit values). See the nan_in_idx_* and nan_in_weights cases in the rw1 parity suite for the verified combinations.
Hyperparameters
The RW1 model has one hyperparameter: the precision \(\tau\) of the first differences. In pyINLA, we work with the log-transformed precision \(\theta = \log(\tau)\).
Default Configuration
| Parameter | Default Value | Description |
|---|---|---|
prior |
'loggamma' |
Log-gamma prior on precision (gamma prior on \(\tau\)) |
param |
[1, 0.00005] |
Shape and rate parameters for the gamma prior |
initial |
4 |
Starting value for \(\theta = \log(\tau)\); corresponds to \(\tau \approx 54.6\) |
fixed |
False |
Whether to estimate (False) or fix (True) the precision |
Customizing Hyperparameters
Override defaults using the hyper key:
'random': [
{
'id': 'time',
'model': 'rw1',
'hyper': [{
'prior': 'loggamma',
'param': [1, 0.01], # More informative prior
'initial': 2, # Different starting point
'fixed': False
}]
}
]
Available Priors
Log-Gamma Prior (Default)
The default loggamma prior places a gamma distribution on the precision \(\tau\):
\[\tau \sim \text{Gamma}(a, b)\]
With default param=[1, 0.00005], this is a weakly informative prior allowing both rough and smooth curves.
# Default loggamma prior
'hyper': [{'prior': 'loggamma', 'param': [1, 0.00005]}]
# More informative: expect moderate smoothing
'hyper': [{'prior': 'loggamma', 'param': [1, 0.01]}]
PC Prior (Penalised Complexity)
The pc.prec prior penalizes deviation from the base model (constant effect). Parameterized via the marginal standard deviation:
\[P(\sigma > u) = \alpha\]
# PC prior: P(sigma > 1) = 0.01
# "I believe the increment SD is unlikely to exceed 1"
'hyper': [{'prior': 'pc.prec', 'param': [1, 0.01]}]
# PC prior: P(sigma > 0.5) = 0.05
# Expect a smooth, slowly-varying trend
'hyper': [{'prior': 'pc.prec', 'param': [0.5, 0.05]}]
Fixed Precision
To fix the precision at a known value:
import math
# Fix precision at tau = 100 (very smooth)
'hyper': [{
'initial': math.log(100), # log(tau) = 4.605
'fixed': True
}]
Sum-to-Zero Constraint
By default, RW1 includes a sum-to-zero constraint (constr=True) for identifiability with an intercept. You can disable this:
# Disable constraint (remove intercept from model)
model = {
'response': 'y',
'fixed': ['0', 'x'], # No intercept
'random': [
{
'id': 'time',
'model': 'rw1',
'constr': False
}
]
}
When to Disable Constraint
Only set constr=False if you remove the intercept from your model. Otherwise, the model is not identifiable and results may be unreliable.
Interpreting Results
Hyperparameter Summary
After fitting, access the precision estimate via result.summary_hyperpar:
print(result.summary_hyperpar)
# mean sd 0.025quant 0.5quant 0.975quant mode
# Precision for time 45.23 12.34 24.56 43.21 74.89 41.02
Higher precision = smoother trend (smaller increments). To convert to the posterior of the standard deviation of increments, transform the precision marginal with pyinla.tmarginal. Plugging 1/sqrt(mean(tau)) in directly is biased: by Jensen's inequality the convex map \(\tau \mapsto 1/\sqrt{\tau}\) gives \(1/\sqrt{E[\tau]} \le E[1/\sqrt{\tau}]\), so the naive expression underestimates \(E[\sigma]\).
tau_marg = result.marginals_hyperpar['Precision for time']
sigma_marg = pyinla.tmarginal(lambda tau: 1 / np.sqrt(tau), tau_marg)
sigma_summary = pyinla.zmarginal(sigma_marg) # mean, sd, quantiles of the increment SD
print(f"Posterior mean of increment SD: {sigma_summary['mean']:.4f}")
For a quick "central value" of \(\sigma\) only, the median is exact under any monotone transform (no marginal transform needed):
tau_median = result.summary_hyperpar['0.5quant'].values[0]
sigma_median = 1 / np.sqrt(tau_median)
print(f"Posterior median of increment SD: {sigma_median:.4f}")
Random Effect Estimates
The posterior estimates for each time point are in result.summary_random:
# Temporal trend estimates
trend = result.summary_random['time']
print(trend)
# ID mean sd 0.025quant 0.5quant 0.975quant mode
# 0 1 -0.234567 0.089123 -0.409258 -0.234567 -0.059876 -0.234567
# 1 2 -0.123456 0.081234 -0.282345 -0.123456 0.035433 -0.123456
# ...
# Plot the trend
import matplotlib.pyplot as plt
plt.fill_between(trend['ID'], trend['0.025quant'], trend['0.975quant'], alpha=0.3)
plt.plot(trend['ID'], trend['mean'], 'b-')
plt.xlabel('Time')
plt.ylabel('RW1 Effect')