Overview
The Seasonal model captures periodic patterns with a fixed period \(m\) (e.g., 12 for monthly data with yearly cycles). It assumes that:
- Seasonal sums are independent - \(x_i + x_{i+1} + \cdots + x_{i+m-1} \sim N(0, 1/\tau)\)
- Pattern repeats - effects cycle with period \(m\)
- Intrinsically Gaussian - forms a Gaussian Markov Random Field (GMRF)
The seasonal model is ideal for modeling repeating patterns in time series where the same point in each cycle tends to have a similar effect (e.g., July has high travel demand year after year). The pattern is not pinned to be exactly periodic: the running-sum penalty allows the seasonal effect to drift slowly over time, with the rate of drift controlled by the precision \(\tau\) (higher \(\tau\) = more regular, lower \(\tau\) = more flexible).
Mathematical Formulation
For \(n\) time points with periodicity \(m\), the seasonal model specifies that consecutive sums of \(m\) observations are independent Gaussian:
\[x_i + x_{i+1} + \cdots + x_{i+m-1} \stackrel{\text{iid}}{\sim} \mathcal{N}(0, \tau^{-1}), \quad i = 1, \ldots, n-m+1\]
The density for \(\mathbf{x} = (x_1, \ldots, x_n)^\top\) is derived from these \(n-m+1\) constraints:
\[\pi(\mathbf{x}|\tau) \propto \tau^{\frac{n-m+1}{2}} \exp\left\{-\frac{\tau}{2}\sum_{i=1}^{n-m+1} \left(\sum_{j=0}^{m-1} x_{i+j}\right)^2\right\}\]
This can be written in matrix form as:
\[\pi(\mathbf{x}|\tau) \propto \tau^{\frac{n-m+1}{2}} \exp\left\{-\frac{1}{2}\mathbf{x}^\top \mathbf{Q} \mathbf{x}\right\}\]
where:
- \(\mathbf{Q} = \tau \mathbf{R}\) is the precision matrix
- \(\mathbf{R}\) is the structure matrix reflecting the seasonal neighborhood structure
- \(\tau\) is the precision of the seasonal sums (higher = more regular pattern)
Rank Deficiency
The seasonal model is intrinsic with rank deficiency \(m-1\). This means \(m-1\) linear constraints are needed for identifiability. In practice, this is handled automatically when you include fixed effects or an intercept in your model.
When to Use Seasonal
The seasonal model is appropriate when you have time series data with clear periodic patterns:
Monthly Data (period 12)
Annual cycles in sales, tourism, energy demand, or weather-related phenomena. Each month has its characteristic effect that repeats yearly.
Quarterly Data (period 4)
Business cycles, earnings reports, or economic indicators with quarterly patterns. Q4 retail surge, Q1 slowdown, etc.
Weekly Data (period 7)
Day-of-week effects in traffic, retail, or web activity. Weekends vs weekdays patterns.
Hourly Data (period 24)
Daily cycles in electricity demand, network traffic, or human activity. Morning peaks, afternoon lulls, evening patterns.
Use Seasonal when you expect a repeating pattern with fixed period (e.g., July is always high). Use AR1 when you expect general temporal correlation without a specific period (each time point similar to the previous one).
Specification in pyINLA
Allowed Parameters
| Parameter | Required | Description |
|---|---|---|
id |
Yes | Column name containing time indices (integers 1, 2, 3, ...) |
model |
Yes | Must be 'seasonal' |
season.length |
Yes | Period of the seasonal pattern (e.g., 12 for monthly, 4 for quarterly) |
hyper |
No | List with one dict: [{prior, param, initial, fixed}] for precision |
scale.model |
No | Scale precision matrix to have generalized variance 1 (default: False) |
constr |
No | Sum-to-zero constraint (default: False) |
values |
No | Custom location grid - allows prediction at future time points |
n |
No | Number of time points (auto-detected if omitted) |
extraconstr |
No | Extra linear constraints: {'A': matrix, 'e': vector} where A @ x = e |
diagonal |
No | Extra constant on the precision diagonal (numerical stability; default 0.0) |
weights |
No | Per-observation weighting vector applied to the random-effect contribution. Length must equal the number of observations. |
Common Period Values
| Data Frequency | season.length |
Pattern Captured |
|---|---|---|
| Monthly | 12 |
Yearly cycle (Jan-Dec pattern) |
| Quarterly | 4 |
Yearly cycle (Q1-Q4 pattern) |
| Weekly (daily data) | 7 |
Weekly cycle (Mon-Sun pattern) |
| Daily (hourly data) | 24 |
Daily cycle (hour 0-23 pattern) |
| Half-hourly electricity | 48 |
Daily cycle (48 half-hour periods) |
Per-Parameter Runnable Examples
Click any parameter to expand a minimal runnable snippet. All snippets assume the shared setup block below (\(n = 36\) ordered time points, \(\text{nobs} = 80\) observations).
# shared setup for every snippet below
import numpy as np
import pandas as pd
from pyinla import pyinla
rng = np.random.default_rng(42)
n = 36 # number of ordered time points (3 full periods of length 12)
nobs = 80 # observations
df = pd.DataFrame({
'y': rng.standard_normal(nobs),
'x': rng.standard_normal(nobs),
'time': rng.integers(1, n + 1, size=nobs),
})
id + model + season.length required trio (minimal call)
# 'id' is the column holding the ordered time index. 'season.length' is the
# period m of the cycle (here 12 for a yearly pattern on monthly data).
# 'n' is auto-detected from max(id); constr=False by default.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'seasonal', 'season.length': 12}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
season.length pick the period m (integer ≥ 2)
# Common choices: 4 (quarterly cycle), 6, 12 (monthly with yearly cycle),
# 7 (weekly on daily data), 24 (daily on hourly data), 48 (half-hourly).
formula_q = {'response': 'y', 'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'seasonal', 'season.length': 4}]}
formula_y = {'response': 'y', 'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'seasonal', 'season.length': 12}]}
result = pyinla(formula=formula_q, family='gaussian', data=df)
hyper override the precision prior
# Default: prior='loggamma', param=[1, 0.00005], initial=4, fixed=False.
# Below: PC prior on the precision with stronger shrinkage toward zero variance.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'seasonal', 'season.length': 12,
'hyper': [{'prior': 'pc.prec', 'param': [1, 0.01]}]}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
n + values extend the position grid for prediction at unobserved time points
# Set 'n' larger than max(id) AND supply 'values' to pin the explicit grid.
# Latent positions 1..48 are estimated even though only 1..36 appear in data.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'seasonal', 'season.length': 12,
'n': n + 12,
'values': list(range(1, n + 13))}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
constr opt-in sum-to-zero constraint (default False)
# Seasonal already penalises non-zero sums over each length-m window, so a
# global sum-to-zero is off by default. Set constr=True to also pin the
# overall level; drop the '1' from 'fixed' for identifiability.
formula = {
'response': 'y',
'fixed': ['0', 'x'],
'random': [{'id': 'time', 'model': 'seasonal', 'season.length': 12,
'constr': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
scale.model generalized-variance scaling (recommended with PC priors)
# Scales Q so the generalized variance is 1; makes the prior's
# magnitude interpretation independent of n.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'seasonal', 'season.length': 12,
'scale.model': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
values explicit location grid
# Pin the full ordered set of time points explicitly. The list may extend
# beyond the observed range so the latent field is estimated at unobserved
# positions (useful for forecasting).
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'seasonal', 'season.length': 12,
'values': list(range(1, n + 1))}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
extraconstr additional linear constraints \(\mathbf{A}\mathbf{x} = \mathbf{e}\)
# Seasonal's default is constr=False (no rank-1 augmentation), so A has
# the full latent-field dimension. With 'values' = 1..n the latent length is n.
# Below: pin the first and last time points to zero. A has shape (2, n).
A = np.zeros((2, n))
A[0, 0] = 1.0
A[1, n - 1] = 1.0
e = np.array([0.0, 0.0])
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'seasonal', 'season.length': 12,
'values': list(range(1, n + 1)),
'extraconstr': {'A': A, 'e': e}}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
diagonal extra constant on the precision diagonal
# Small numerical-stability jitter. Rarely needed with a proper prior.
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'seasonal', 'season.length': 12,
'diagonal': 1e-4}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
weights per-observation weighting
# Per-observation weight vector applied to the seasonal contribution.
# Length must equal the number of observations.
w = rng.uniform(0.5, 1.5, size=nobs)
formula = {
'response': 'y',
'fixed': ['1', 'x'],
'random': [{'id': 'time', 'model': 'seasonal', 'season.length': 12,
'weights': w}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
Hyperparameters
The seasonal model has one hyperparameter: the precision \(\tau\) of the seasonal sums. In pyINLA, we work with the log-transformed precision \(\theta = \log(\tau)\).
Default Configuration
| Parameter | Default Value | Description |
|---|---|---|
prior |
'loggamma' |
Log-gamma prior on precision (gamma prior on \(\tau\)) |
param |
[1, 0.00005] |
Shape and rate parameters for the gamma prior |
initial |
4 |
Starting value for \(\theta = \log(\tau)\); corresponds to \(\tau \approx 54.6\) |
fixed |
False |
Whether to estimate (False) or fix (True) the precision |
Customizing Hyperparameters
Override defaults using the hyper key:
'random': [
{
'id': 'time',
'model': 'seasonal',
'season.length': 12,
'hyper': [{
'prior': 'loggamma',
'param': [1, 0.01], # More informative prior
'initial': 2, # Different starting point
'fixed': False
}]
}
]
Available Priors
Log-Gamma Prior (Default)
The default loggamma prior places a gamma distribution on the precision \(\tau\):
\[\tau \sim \text{Gamma}(a, b)\]
With default param=[1, 0.00005], this is a weakly informative prior allowing both strong and weak seasonal patterns.
# Default loggamma prior
'hyper': [{'prior': 'loggamma', 'param': [1, 0.00005]}]
# More informative: expect regular seasonal pattern
'hyper': [{'prior': 'loggamma', 'param': [1, 0.01]}]
PC Prior (Penalised Complexity)
The pc.prec prior penalizes deviation from the base model (no seasonal pattern). Parameterized via the marginal standard deviation:
\[P(\sigma > u) = \alpha\]
# PC prior: P(sigma > 1) = 0.01
# "I believe the seasonal SD is unlikely to exceed 1"
'hyper': [{'prior': 'pc.prec', 'param': [1, 0.01]}]
# PC prior: P(sigma > 0.5) = 0.05
# Expect a mild seasonal pattern
'hyper': [{'prior': 'pc.prec', 'param': [0.5, 0.05]}]
Fixed Precision
To fix the precision at a known value:
import math
# Fix precision at tau = 100 (very regular seasonal pattern)
'hyper': [{
'initial': math.log(100), # log(tau) = 4.605
'fixed': True
}]
Interpreting Results
Hyperparameter Summary
After fitting, access the hyperparameter estimates via result.summary_hyperpar. Running the shared-setup seasonal snippet above produces something like:
print(result.summary_hyperpar)
# mean sd 0.025quant 0.5quant 0.975quant mode
# Precision for the Gaussian observations 1.8525 0.3164 1.2983 1.8292 2.5399 1.7889
# Precision for time 22033.0 24269.5 1451.5 14414.6 86464.0 3948.2
Higher Precision for time = more regular seasonal pattern (deviations from the period-m running-sum constraint are smaller). The shared-setup data here is iid noise with no real seasonal signal, so the seasonal precision is large with a wide posterior; on data that actually has a periodic pattern, the precision concentrates around a smaller, more informative value.
Random Effect Estimates
The posterior estimates for each time point live in result.summary_random, keyed by the spec's id (here 'time'). The dict value is a DataFrame whose latent-position column is always named 'ID' (engine convention, regardless of the user's id name):
# Seasonal effect estimates
seasonal = result.summary_random['time']
# Extract first year's pattern (repeats for subsequent years)
first_cycle = seasonal[seasonal['ID'] <= 12]
print(first_cycle[['ID', 'mean', '0.025quant', '0.975quant']])
# Plot the seasonal pattern
import matplotlib.pyplot as plt
plt.fill_between(first_cycle['ID'], first_cycle['0.025quant'],
first_cycle['0.975quant'], alpha=0.3)
plt.plot(first_cycle['ID'], first_cycle['mean'], 'b-o')
plt.xlabel('Month')
plt.ylabel('Seasonal Effect')
plt.title('Estimated Seasonal Pattern')
Important Notes
- Rank deficiency: The seasonal model is intrinsic with rank deficiency \(m-1\). When combined with fixed effects (e.g., year dummies), identifiability is typically not an issue.
- Log transformation: For multiplicative seasonality (where seasonal amplitude scales with the level), apply log transformation to the response. The seasonal effects then represent multiplicative factors via \(\exp(\text{effect})\).
- Combining with trend: Seasonal models work well combined with RW1/RW2 for trend extraction, allowing decomposition into trend + seasonal + residual components.
- Index requirements: The
idcolumn should contain consecutive integers (1, 2, 3, ...). The model assumes observations are equally spaced in time.
Related Models
- AR1 - Autoregressive order 1 (general temporal correlation, no fixed period)
- AR(p) - Higher-order autoregressive (complex temporal dependence)
- RW1 - Random Walk of order 1 (smooth trends, often combined with seasonal)
- RW2 - Random Walk of order 2 (smoother trends)
- IID - Independent random effects (no temporal structure)
See It In Action
For a complete worked example using the seasonal model, see the AirPassengers Seasonal Analysis application, which demonstrates seasonal decomposition of monthly airline passenger data.