Overview
The IID model (Independent and Identically Distributed) is the simplest and most commonly used random effect in hierarchical modeling. It assumes that random effects are:
- Independent - no correlation between different groups
- Identically distributed - all groups share the same variance
- Normally distributed - each effect follows a Gaussian distribution
This makes IID ideal for capturing exchangeable group-level heterogeneity - situations where groups differ from each other, but there's no inherent ordering or spatial relationship between them.
Mathematical Formulation
For \(n\) levels of the index variable, the IID model specifies:
\[u_i \stackrel{\text{iid}}{\sim} \mathcal{N}(0, \tau^{-1}), \quad i = 1, \ldots, n\]
where:
- \(u_i\) is the random effect for level \(i\)
- \(\tau\) is the precision (inverse variance) of the random effects
- \(\sigma = 1/\sqrt{\tau}\) is the standard deviation
Terminology note: throughout this page, level means one value of the index variable (one realization of \(u_i\)). The separate spec key group refers to an additional axis of replication and is unrelated.
The precision matrix for the full vector \(\mathbf{u} = (u_1, \ldots, u_n)^\top\) is simply:
\[\mathbf{Q} = \tau \, \mathbf{I}_n\]
where \(\mathbf{I}_n\) is the \(n \times n\) identity matrix.
When to Use IID
The IID model is appropriate when you have grouped or clustered data where:
Hierarchical Data
Students nested in schools, patients in hospitals, employees in companies. Each group has its own baseline that deviates from the overall mean.
Repeated Measures
Multiple observations per subject. The random effect captures subject-specific deviations from the population mean response.
Overdispersion
Count or binary data with more variability than the likelihood predicts. Adding an observation-level IID effect absorbs excess variation.
Unstructured Spatial Effects
Regional variation without a clear spatial pattern. Often combined with structured spatial models (BYM2) to capture residual heterogeneity.
Visual Understanding: IID vs Structured Effects
Understanding when to use IID becomes clearer when you see how it differs from structured random effects like RW1. The key difference is in the correlation structure.
IID: Independent Random Effects
Each level \(u_i\) is drawn independently from the same zero-mean Gaussian \(\mathcal{N}(0, \tau^{-1})\). The values are exchangeable: there is no inherent ordering, and the realized effect at one level carries no information about any other.
IID: Only diagonal
(levels are independent)
RW1: Correlated Random Effects
Adjacent positions are correlated - nearby values tend to be similar. This creates smooth trends rather than independent jumps.
RW1 (sum-to-zero IGMRF)
blue = +, amber = -
Decision Guide
| Use IID when... | Use RW1/AR1 when... |
|---|---|
| Groups have no natural ordering (schools, patients, regions) | Data is ordered in time or space (years, age groups, locations) |
| Group labels are arbitrary (could be reshuffled) | Adjacent positions should have similar values |
| You want to model overdispersion | You want to model smooth trends or gradients |
| Effects are exchangeable (no structure) | There's inherent structure in the index variable |
Specification in pyINLA
Allowed Parameters
The IID random-effect spec dict accepts the following keys. The source of truth is model_defs/latent/iid.py's ALLOWED_KEYS set, mirrored 1:1 here.
| Parameter | Required | Description |
|---|---|---|
| Core | ||
id |
Yes | Column name containing level indices (integers). |
model |
Yes | Must be 'iid'. |
hyper |
No | List with one dict: [{prior, param, initial, fixed}] for the precision hyperparameter. |
n |
No | Number of levels (auto-detected if omitted). |
constr |
No | Sum-to-zero constraint (default: False). |
initial |
No | Shorthand for hyper[0]['initial']. Sets the starting value for the precision hyperparameter without nesting a hyper block. |
fixed |
No | Shorthand for hyper[0]['fixed']. Pins the precision hyperparameter at initial when True. |
diagonal |
No | Extra constant added to the precision diagonal for numerical stability (default: 0.0). |
| Structure / observation-level | ||
weights |
No | Per-observation weighting vector applied to the random-effect contribution. |
id_names |
No | Custom display labels for each level (e.g. for summary_random output). |
values |
No | Explicit list of allowed covariate values; useful when not all levels appear in the data. |
| Replication & grouping | ||
nrep |
No | Number of replications; usually inferred from replicate. |
replicate |
No | Per-observation replication index vector. Use with nrep for cross-classified effects. |
group |
No | Per-observation group index vector for grouped IID. Use together with ngroup and control.group. |
ngroup |
No | Number of groups in the grouped IID layout. Companion to group. |
control.group |
No | Dict controlling between-group correlation, e.g. {'model': 'iid'} or {'model': 'ar1'}. |
| Constraints | ||
extraconstr |
No |
Extra linear constraints, as a dict {'A': <matrix>, 'e': <vector>}, applied in addition to constr.
Each row of A is one constraint enforcing
\(A[i, :] \cdot u = e[i]\) on the n-level latent field.
A shape is (k, n) for k constraints, e is shape (k,).
Accepted types for A: numpy.ndarray (1D treated as one row, 2D as k rows), Python list / list-of-lists, or any scipy.sparse matrix.
e is optional: if omitted it defaults to np.zeros(k) (auto-inferred from A).
|
| Copy (feature-key form) | ||
copy |
No | Name of an earlier random-effect to copy. Internally rewrites the spec to the copy model with of referencing the source. |
| Diagnostics | ||
vb.correct |
No | Variational-Bayes correction set. Pass True to include this component, or a list of node indices for fine-grained control. |
debug |
No | Enable local debug output for this random effect (boolean). |
Any key not listed above raises a clear safety error with the
allowed-keys list before the engine runs. Per-observation scaling
is done via weights, not scale (the named
scale= argument is not a supported IID feature).
Per-Parameter Runnable Examples
Click any parameter to expand a minimal runnable snippet. All snippets assume import numpy as np; import pandas as pd; import math; from pyinla import pyinla and a small dataframe df with columns y, x, group_id (5 levels).
# shared setup for every snippet below
import numpy as np
import pandas as pd
import math
from pyinla import pyinla
rng = np.random.default_rng(42)
n = 60
df = pd.DataFrame({
'y': rng.standard_normal(n),
'x': rng.standard_normal(n),
'group_id': rng.integers(1, 6, size=n),
})
id column name with the level index
formula = {
'response': 'y',
'random': [{'id': 'group_id', 'model': 'iid'}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
model must be 'iid'
# 'model': 'iid' is the marker that selects this family (required).
formula = {
'response': 'y',
'random': [{'id': 'group_id', 'model': 'iid'}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
hyper prior + starting value for the precision
formula = {
'response': 'y',
'random': [{
'id': 'group_id', 'model': 'iid',
'hyper': [{
'prior': 'pc.prec',
'param': [1, 0.01],
'initial': 2,
'fixed': False,
}],
}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
n explicit number of levels (must match the declared level set)
# 'n' must equal the count of unique levels declared by the data or by
# 'values'. The engine refuses to run when they mismatch. To reserve
# room for unobserved levels, declare them with 'values' AND set 'n'
# to that length.
formula = {
'response': 'y',
'random': [{'id': 'group_id', 'model': 'iid',
'values': list(range(1, 11)), 'n': 10}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
constr sum-to-zero constraint
formula = {
'response': 'y',
'random': [{'id': 'group_id', 'model': 'iid', 'constr': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
initial shorthand for hyper[0]['initial']
# start the optimizer at log(tau) = 2 (tau ~ 7.4)
formula = {
'response': 'y',
'random': [{'id': 'group_id', 'model': 'iid', 'initial': 2}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
fixed pin the precision at initial
# fix tau = 10 (no estimation of the precision hyperparameter)
formula = {
'response': 'y',
'random': [{'id': 'group_id', 'model': 'iid',
'initial': math.log(10), 'fixed': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
diagonal extra constant on the precision diagonal
# small jitter for numerical stability
formula = {
'response': 'y',
'random': [{'id': 'group_id', 'model': 'iid', 'diagonal': 1e-4}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
weights per-observation weighting vector
weights[i]:
Concrete example with 6 rows, 3 groups, weights = [1, 2, 1, 2, 1, 2]:
| i | group_id | weights[i] | contribution to ηi |
|---|---|---|---|
| 1 | 1 | 1.0 | + 1.0 · u1 |
| 2 | 1 | 2.0 | + 2.0 · u1 (same u1, twice the pull) |
| 3 | 2 | 1.0 | + 1.0 · u2 |
| 4 | 2 | 2.0 | + 2.0 · u2 |
| 5 | 3 | 1.0 | + 1.0 · u3 |
| 6 | 3 | 2.0 | + 2.0 · u3 |
Three good reasons to set it:
- Exposure-style scaling on the RE itself: same group, per-row exposure (e.g. population, time at risk) multiplies the group's effect.
- Sneak a linear term into an f() block: set
id = 1(one level),weights = x_i. Then u1 plays the role of a slope and ηi += xi · u1. - Custom design contrasts: a row picks up a fractional or signed share of a group's u (positive or negative weights are allowed).
w = rng.uniform(0.5, 1.5, size=n)
formula = {
'response': 'y',
'random': [{'id': 'group_id', 'model': 'iid', 'weights': w}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
id_names display labels for levels
formula = {
'response': 'y',
'random': [{'id': 'group_id', 'model': 'iid',
'id_names': ['A', 'B', 'C', 'D', 'E']}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
values explicit set of allowed level values
# declare levels 1..5 even if some are absent from this fit
formula = {
'response': 'y',
'random': [{'id': 'group_id', 'model': 'iid',
'values': [1, 2, 3, 4, 5]}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
nrep + replicate independent replications of the iid effect
df['rep_id'] = rng.integers(1, 4, size=n)
formula = {
'response': 'y',
'random': [{'id': 'group_id', 'model': 'iid',
'nrep': 3, 'replicate': df['rep_id'].values}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
group + ngroup + control.group extra correlated axis
df['week'] = rng.integers(1, 5, size=n)
formula = {
'response': 'y',
'random': [{
'id': 'group_id', 'model': 'iid',
'group': df['week'].values,
'ngroup': 4,
'control.group': {'model': 'ar1'},
}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
extraconstr additional linear constraint A u = e
# enforce sum of all 5 random effects equals zero
formula = {
'response': 'y',
'random': [{
'id': 'group_id', 'model': 'iid',
'extraconstr': {'A': np.ones((1, 5)),
'e': np.zeros(1)},
}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
copy copy an earlier component
df['group2_id'] = rng.integers(1, 6, size=n)
formula = {
'response': 'y',
'random': [
{'id': 'group_id', 'model': 'iid'},
{'id': 'group2_id', 'model': 'iid', 'copy': 'group_id'},
],
}
result = pyinla(formula=formula, family='gaussian', data=df)
vb.correct variational-Bayes correction
# enable VB correction on this component (or pass a list of node indices)
formula = {
'response': 'y',
'random': [{'id': 'group_id', 'model': 'iid', 'vb.correct': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
debug extra debug output for this component
formula = {
'response': 'y',
'random': [{'id': 'group_id', 'model': 'iid', 'debug': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
Level Identifier Handling
The id column should contain integer values identifying which level of the random effect each observation belongs to. pyINLA handles indices flexibly:
| Input | Behavior |
|---|---|
[1, 2, 3, 1, 2, 3] |
Standard consecutive indices - works directly |
[5, 10, 15, 5, 10] |
Non-consecutive integers - automatically mapped to positions |
[1, np.nan, 3, np.nan, 5] |
np.nan in the index column: that row contributes nothing to this random effect |
[0, 1, 2, 0, 1] |
Zero-based indices - handled correctly |
Key points:
- Indices are mapped to consecutive positions internally via
sort(unique(values)) np.nanin the index column means "this row has no contribution from this random effect" (useful for mixed designs)- String/factor columns must be converted to integers first (e.g.
pd.factorize)
# These all work:
df['idx'] = [1, 2, 3, 1, 2, 3] # Standard
df['idx'] = [10, 20, 30, 10, 20, 30] # Non-consecutive (mapped to 0,1,2)
df['idx'] = [1, np.nan, 2, np.nan, 1, 2] # np.nan drops the RE contribution for that row
# Convert string groups to integers
df['group_idx'] = pd.factorize(df['group_name'])[0] + 1
Two related but distinct things:
- Predicting an observation: set the response
ytonp.nanon the rows you want predicted. The posterior predictive is then returned for those rows. - Dropping the random effect for a row: set the index to
np.nan. That row gets zero contribution from this random effect (its linear predictor is built from the other terms only).
For an unseen group, you have two choices: add it as a new index level (the RE is drawn from its prior, with honest uncertainty), or set the index to np.nan (predict at the population mean, no group-level effect).
NaN in the weights vector
When weights is supplied, individual entries may also be np.nan / NA. pyINLA accepts these without error: the engine treats those rows as having no weighted contribution from this random effect. There is no auto-imputation or row-dropping side effect.
This behaviour is covered by the nan_in_weights case in the iid sweep.
Hyperparameters
The IID model has one hyperparameter: the precision \(\tau\) of the random effects. In pyINLA, we work with the log-transformed precision \(\theta = \log(\tau)\) for numerical stability.
Default Configuration
| Parameter | Default Value | Description |
|---|---|---|
prior |
'loggamma' |
Log-gamma prior on precision (gamma prior on \(\tau\)) |
param |
[1, 0.00005] |
Shape and rate parameters for the gamma prior |
initial |
4 |
Starting value for \(\theta = \log(\tau)\); corresponds to \(\tau \approx 54.6\) |
fixed |
False |
Whether to estimate (False) or fix (True) the precision |
Customizing Hyperparameters
Override defaults using the hyper key in your random effect specification:
'random': [
{
'id': 'group_id',
'model': 'iid',
'hyper': [{
'prior': 'loggamma',
'param': [1, 0.01], # More informative prior
'initial': 2, # Different starting point
'fixed': False
}]
}
]
Available Priors
Log-Gamma Prior (Default)
The default loggamma prior places a gamma distribution on the precision \(\tau\):
\[\tau \sim \text{Gamma}(a, b)\]
With default param=[1, 0.00005], this is a very weakly informative prior that allows \(\tau\) to range over several orders of magnitude.
# Default loggamma prior
'hyper': [{'prior': 'loggamma', 'param': [1, 0.00005]}]
# More informative: expect precision around 10-100
'hyper': [{'prior': 'loggamma', 'param': [1, 0.01]}]
PC Prior (Penalised Complexity)
The pc.prec prior is a principled choice that penalizes deviation from a base model (no random effect). It's parameterized in terms of the standard deviation \(\sigma = 1/\sqrt{\tau}\):
\[P(\sigma > u) = \alpha\]
With param=[u, alpha], you specify that the probability of \(\sigma\) exceeding \(u\) is \(\alpha\).
# PC prior: P(sigma > 1) = 0.01
# "I believe standard deviation is unlikely to exceed 1"
'hyper': [{'prior': 'pc.prec', 'param': [1, 0.01]}]
# PC prior: P(sigma > 0.5) = 0.05
# More restrictive: effects should be small
'hyper': [{'prior': 'pc.prec', 'param': [0.5, 0.05]}]
Other Supported Priors
Beyond loggamma and pc.prec, the precision hyperparameter (internally \(\theta = \log \tau\)) accepts several more priors. Pick by what you want to place the prior on: the log-precision \(\theta\), the precision \(\tau\), or the standard deviation \(\sigma = 1/\sqrt{\tau}\).
| Prior | Placed on | param |
Use when |
|---|---|---|---|
normal (alias: gaussian) |
\(\theta = \log\tau\) | [mean, precision] |
You want a Gaussian on the log-precision (e.g. centre at a known order of magnitude). |
logtnormal (alias: logtgaussian) |
\(\tau\) | [mean, precision] |
Truncated log-normal on \(\tau\); keeps the precision bounded away from zero. |
flat |
\(\theta\) | (none) | Improper uniform on the log-precision. Use with care; can produce improper posteriors. |
logiflat |
\(\tau\) | (none) | Improper flat prior on \(1/\tau\). Numerically stable in practice. |
logflat (use with caution) |
\(\tau\) | (none) | Improper flat prior on \(\tau\) directly. Often unstable: with weak data the optimizer can drive \(\theta \to -33\) so \(\tau = e^\theta \to \infty\) and the engine aborts on a singular matrix. Prefer pc.prec or loggamma for weakly-informative precision priors. |
expression:<R code> |
\(\theta\) | (none) | Custom log-density expression, evaluated as a function of theta (the internal log-precision scale). The expression must return \(\log \pi(\theta)\), in R syntax. |
table: x1 ... xn | l1 ... ln |
\(\theta\) | (none) | Tabulated log-density on a grid: xi are \(\theta\) values, li are the corresponding \(\log \pi(\theta_i)\) values. The | separator is optional (any non-numeric token is ignored). Linear interpolation between knots. |
# Gaussian on log-precision, centred at 4 with precision 1
'hyper': [{'prior': 'normal', 'param': [4, 1]}]
# Improper flat prior on log-precision
'hyper': [{'prior': 'flat'}]
# Custom log-density as an R expression (here: standard normal on theta)
'hyper': [{'prior': 'expression: -theta * theta / 2'}]
# Tabulated log-density on a grid: theta values | log pi(theta) values
'hyper': [{'prior': 'table: -3 -1 1 3 | -4.5 -1.1 -1.1 -4.5'}]
Unknown prior names are rejected up front with a suggestion of near matches. The full registry is in pyinla/api/priors.py.
Fixed Precision
To fix the precision at a known value (no estimation), set fixed=True and specify the initial value on the log scale:
import math
# Fix precision at tau = 10 (variance = 0.1)
'hyper': [{
'initial': math.log(10), # log(tau) = 2.303
'fixed': True
}]
Interpreting Results
Hyperparameter Summary
After fitting, access the precision estimate via result.summary_hyperpar:
print(result.summary_hyperpar)
# mean sd 0.025quant 0.5quant 0.975quant mode
# Precision for z 6.995828 1.913305 3.986632 6.748838 11.421926 6.294978
To convert precision to standard deviation:
tau_mean = result.summary_hyperpar['mean'].values[0]
sigma_mean = 1 / np.sqrt(tau_mean)
print(f"Estimated SD of random effects: {sigma_mean:.3f}")
Random Effect Estimates
The posterior estimates for each group's random effect are available in result.summary_random:
# Random effects for each group
print(result.summary_random['group_id'])
# ID mean sd 0.025quant 0.5quant 0.975quant mode
# 0 1 0.234567 0.089123 0.059876 0.234567 0.409258 0.234567
# 1 2 -0.123456 0.091234 -0.302345 -0.123456 0.055433 -0.123456
# ...