IID Random Effects

Overview

The IID model (Independent and Identically Distributed) is the simplest and most commonly used random effect in hierarchical modeling. It assumes that random effects are:

Independent - no correlation between different groups
Identically distributed - all groups share the same variance
Normally distributed - each effect follows a Gaussian distribution

This makes IID ideal for capturing exchangeable group-level heterogeneity - situations where groups differ from each other, but there's no inherent ordering or spatial relationship between them.

Mathematical Formulation

For \(n\) levels of the index variable, the IID model specifies:

\[u_i \stackrel{\text{iid}}{\sim} \mathcal{N}(0, \tau^{-1}), \quad i = 1, \ldots, n\]

where:

\(u_i\) is the random effect for level \(i\)
\(\tau\) is the precision (inverse variance) of the random effects
\(\sigma = 1/\sqrt{\tau}\) is the standard deviation

Terminology note: throughout this page, level means one value of the index variable (one realization of \(u_i\)). The separate spec key group refers to an additional axis of replication and is unrelated.

The precision matrix for the full vector \(\mathbf{u} = (u_1, \ldots, u_n)^\top\) is simply:

\[\mathbf{Q} = \tau \, \mathbf{I}_n\]

where \(\mathbf{I}_n\) is the \(n \times n\) identity matrix.

When to Use IID

The IID model is appropriate when you have grouped or clustered data where:

Hierarchical Data

Students nested in schools, patients in hospitals, employees in companies. Each group has its own baseline that deviates from the overall mean.

Repeated Measures

Multiple observations per subject. The random effect captures subject-specific deviations from the population mean response.

Overdispersion

Count or binary data with more variability than the likelihood predicts. Adding an observation-level IID effect absorbs excess variation.

Unstructured Spatial Effects

Regional variation without a clear spatial pattern. Often combined with structured spatial models (BYM2) to capture residual heterogeneity.

Visual Understanding: IID vs Structured Effects

Understanding when to use IID becomes clearer when you see how it differs from structured random effects like RW1. The key difference is in the correlation structure.

IID: Independent Random Effects

Each level \(u_i\) is drawn independently from the same zero-mean Gaussian \(\mathcal{N}(0, \tau^{-1})\). The values are exchangeable: there is no inherent ordering, and the realized effect at one level carries no information about any other.

IID: Only diagonal
(levels are independent)

RW1: Correlated Random Effects

Adjacent positions are correlated - nearby values tend to be similar. This creates smooth trends rather than independent jumps.

RW1 (sum-to-zero IGMRF)
blue = +, amber = -

Decision Guide

Use IID when...	Use RW1/AR1 when...
Groups have no natural ordering (schools, patients, regions)	Data is ordered in time or space (years, age groups, locations)
Group labels are arbitrary (could be reshuffled)	Adjacent positions should have similar values
You want to model overdispersion	You want to model smooth trends or gradients
Effects are exchangeable (no structure)	There's inherent structure in the index variable

Specification in pyINLA

Allowed Parameters

The IID random-effect spec dict accepts the following keys. The source of truth is model_defs/latent/iid.py's ALLOWED_KEYS set, mirrored 1:1 here.

Parameter	Required	Description
Core
`id`	Yes	Column name containing level indices (integers).
`model`	Yes	Must be `'iid'`.
`hyper`	No	List with one dict: `[{prior, param, initial, fixed}]` for the precision hyperparameter.
`n`	No	Number of levels (auto-detected if omitted).
`constr`	No	Sum-to-zero constraint (default: `False`).
`initial`	No	Shorthand for `hyper[0]['initial']`. Sets the starting value for the precision hyperparameter without nesting a `hyper` block.
`fixed`	No	Shorthand for `hyper[0]['fixed']`. Pins the precision hyperparameter at `initial` when `True`.
`diagonal`	No	Extra constant added to the precision diagonal for numerical stability (default: `0.0`).
Structure / observation-level
`weights`	No	Per-observation weighting vector applied to the random-effect contribution.
`id_names`	No	Custom display labels for each level (e.g. for `summary_random` output).
`values`	No	Explicit list of allowed covariate values; useful when not all levels appear in the data.
Replication & grouping
`nrep`	No	Number of replications; usually inferred from `replicate`.
`replicate`	No	Per-observation replication index vector. Use with `nrep` for cross-classified effects.
`group`	No	Per-observation group index vector for grouped IID. Use together with `ngroup` and `control.group`.
`ngroup`	No	Number of groups in the grouped IID layout. Companion to `group`.
`control.group`	No	Dict controlling between-group correlation, e.g. `{'model': 'iid'}` or `{'model': 'ar1'}`.
Constraints
`extraconstr`	No	Extra linear constraints, as a dict `{'A': <matrix>, 'e': <vector>}`, applied in addition to `constr`. Each row of `A` is one constraint enforcing \(A[i, :] \cdot u = e[i]\) on the n-level latent field. `A` shape is `(k, n)` for `k` constraints, `e` is shape `(k,)`. Accepted types for `A`: `numpy.ndarray` (1D treated as one row, 2D as `k` rows), Python list / list-of-lists, or any `scipy.sparse` matrix. `e` is optional: if omitted it defaults to `np.zeros(k)` (auto-inferred from `A`).
Copy (feature-key form)
`copy`	No	Name of an earlier random-effect to copy. Internally rewrites the spec to the `copy` model with `of` referencing the source.
Diagnostics
`vb.correct`	No	Variational-Bayes correction set. Pass `True` to include this component, or a list of node indices for fine-grained control.
`debug`	No	Enable local debug output for this random effect (boolean).

Any key not listed above raises a clear safety error with the allowed-keys list before the engine runs. Per-observation scaling is done via weights, not scale (the named scale= argument is not a supported IID feature).

Per-Parameter Runnable Examples

Click any parameter to expand a minimal runnable snippet. All snippets assume import numpy as np; import pandas as pd; import math; from pyinla import pyinla and a small dataframe df with columns y, x, group_id (5 levels).

# shared setup for every snippet below
import numpy as np
import pandas as pd
import math
from pyinla import pyinla

rng = np.random.default_rng(42)
n = 60
df = pd.DataFrame({
    'y': rng.standard_normal(n),
    'x': rng.standard_normal(n),
    'group_id': rng.integers(1, 6, size=n),
})

id column name with the level index

formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid'}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

model must be 'iid'

# 'model': 'iid' is the marker that selects this family (required).
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid'}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

hyper prior + starting value for the precision

formula = {
    'response': 'y',
    'random': [{
        'id': 'group_id', 'model': 'iid',
        'hyper': [{
            'prior': 'pc.prec',
            'param': [1, 0.01],
            'initial': 2,
            'fixed': False,
        }],
    }],
}
result = pyinla(formula=formula, family='gaussian', data=df)

n explicit number of levels (must match the declared level set)

# 'n' must equal the count of unique levels declared by the data or by
# 'values'. The engine refuses to run when they mismatch. To reserve
# room for unobserved levels, declare them with 'values' AND set 'n'
# to that length.
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid',
                 'values': list(range(1, 11)), 'n': 10}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

constr sum-to-zero constraint

formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid', 'constr': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

initial shorthand for hyper[0]['initial']

# start the optimizer at log(tau) = 2  (tau ~ 7.4)
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid', 'initial': 2}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

fixed pin the precision at initial

# fix tau = 10  (no estimation of the precision hyperparameter)
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid',
                 'initial': math.log(10), 'fixed': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

diagonal extra constant on the precision diagonal

# small jitter for numerical stability
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid', 'diagonal': 1e-4}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

weights per-observation weighting vector

What it actually does. The random-effect contribution at observation i is scaled by weights[i]:

η_i += w_i · u_id(i) (default w_i = 1)

The latent vector u is unchanged: still one value per level, still iid Gaussian. Two rows in the same group can receive different weighted contributions. Not a likelihood weight: this is a design-matrix scaling on the random-effect contribution, not how much each y counts in the log-likelihood.

Concrete example with 6 rows, 3 groups, weights = [1, 2, 1, 2, 1, 2]:

i	group_id	weights[i]	contribution to η_i
1	1	1.0	+ 1.0 · u₁
2	1	2.0	+ 2.0 · u₁ (same u₁, twice the pull)
3	2	1.0	+ 1.0 · u₂
4	2	2.0	+ 2.0 · u₂
5	3	1.0	+ 1.0 · u₃
6	3	2.0	+ 2.0 · u₃

Three good reasons to set it:

Exposure-style scaling on the RE itself: same group, per-row exposure (e.g. population, time at risk) multiplies the group's effect.
Sneak a linear term into an f() block: set id = 1 (one level), weights = x_i. Then u₁ plays the role of a slope and η_i += x_i · u₁.
Custom design contrasts: a row picks up a fractional or signed share of a group's u (positive or negative weights are allowed).

w = rng.uniform(0.5, 1.5, size=n)
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid', 'weights': w}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

id_names display labels for levels

formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid',
                 'id_names': ['A', 'B', 'C', 'D', 'E']}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

values explicit set of allowed level values

# declare levels 1..5 even if some are absent from this fit
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid',
                 'values': [1, 2, 3, 4, 5]}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

nrep + replicate independent replications of the iid effect

df['rep_id'] = rng.integers(1, 4, size=n)
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid',
                 'nrep': 3, 'replicate': df['rep_id'].values}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

group + ngroup + control.group extra correlated axis

df['week'] = rng.integers(1, 5, size=n)
formula = {
    'response': 'y',
    'random': [{
        'id': 'group_id', 'model': 'iid',
        'group': df['week'].values,
        'ngroup': 4,
        'control.group': {'model': 'ar1'},
    }],
}
result = pyinla(formula=formula, family='gaussian', data=df)

extraconstr additional linear constraint A u = e

# enforce sum of all 5 random effects equals zero
formula = {
    'response': 'y',
    'random': [{
        'id': 'group_id', 'model': 'iid',
        'extraconstr': {'A': np.ones((1, 5)),
                         'e': np.zeros(1)},
    }],
}
result = pyinla(formula=formula, family='gaussian', data=df)

copy copy an earlier component

df['group2_id'] = rng.integers(1, 6, size=n)
formula = {
    'response': 'y',
    'random': [
        {'id': 'group_id',  'model': 'iid'},
        {'id': 'group2_id', 'model': 'iid', 'copy': 'group_id'},
    ],
}
result = pyinla(formula=formula, family='gaussian', data=df)

vb.correct variational-Bayes correction

# enable VB correction on this component  (or pass a list of node indices)
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid', 'vb.correct': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

debug extra debug output for this component

formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid', 'debug': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

Level Identifier Handling

The id column should contain integer values identifying which level of the random effect each observation belongs to. pyINLA handles indices flexibly:

Input	Behavior
`[1, 2, 3, 1, 2, 3]`	Standard consecutive indices - works directly
`[5, 10, 15, 5, 10]`	Non-consecutive integers - automatically mapped to positions
`[1, np.nan, 3, np.nan, 5]`	`np.nan` in the index column: that row contributes nothing to this random effect
`[0, 1, 2, 0, 1]`	Zero-based indices - handled correctly

Key points:

Indices are mapped to consecutive positions internally via sort(unique(values))
np.nan in the index column means "this row has no contribution from this random effect" (useful for mixed designs)
String/factor columns must be converted to integers first (e.g. pd.factorize)

# These all work:
df['idx'] = [1, 2, 3, 1, 2, 3]       # Standard
df['idx'] = [10, 20, 30, 10, 20, 30]   # Non-consecutive (mapped to 0,1,2)
df['idx'] = [1, np.nan, 2, np.nan, 1, 2]  # np.nan drops the RE contribution for that row

# Convert string groups to integers
df['group_idx'] = pd.factorize(df['group_name'])[0] + 1

Prediction and missing indices

Two related but distinct things:

Predicting an observation: set the response y to np.nan on the rows you want predicted. The posterior predictive is then returned for those rows.
Dropping the random effect for a row: set the index to np.nan. That row gets zero contribution from this random effect (its linear predictor is built from the other terms only).

For an unseen group, you have two choices: add it as a new index level (the RE is drawn from its prior, with honest uncertainty), or set the index to np.nan (predict at the population mean, no group-level effect).

NaN in the weights vector

When weights is supplied, individual entries may also be np.nan / NA. pyINLA accepts these without error: the engine treats those rows as having no weighted contribution from this random effect. There is no auto-imputation or row-dropping side effect.

This behaviour is covered by the nan_in_weights case in the iid sweep.

Hyperparameters

The IID model has one hyperparameter: the precision \(\tau\) of the random effects. In pyINLA, we work with the log-transformed precision \(\theta = \log(\tau)\) for numerical stability.

Default Configuration

Parameter	Default Value	Description
`prior`	`'loggamma'`	Log-gamma prior on precision (gamma prior on \(\tau\))
`param`	`[1, 0.00005]`	Shape and rate parameters for the gamma prior
`initial`	`4`	Starting value for \(\theta = \log(\tau)\); corresponds to \(\tau \approx 54.6\)
`fixed`	`False`	Whether to estimate (False) or fix (True) the precision

Customizing Hyperparameters

Override defaults using the hyper key in your random effect specification:

'random': [
    {
        'id': 'group_id',
        'model': 'iid',
        'hyper': [{
            'prior': 'loggamma',
            'param': [1, 0.01],      # More informative prior
            'initial': 2,             # Different starting point
            'fixed': False
        }]
    }
]

Available Priors

Log-Gamma Prior (Default)

The default loggamma prior places a gamma distribution on the precision \(\tau\):

\[\tau \sim \text{Gamma}(a, b)\]

With default param=[1, 0.00005], this is a very weakly informative prior that allows \(\tau\) to range over several orders of magnitude.

# Default loggamma prior
'hyper': [{'prior': 'loggamma', 'param': [1, 0.00005]}]

# More informative: expect precision around 10-100
'hyper': [{'prior': 'loggamma', 'param': [1, 0.01]}]

PC Prior (Penalised Complexity)

The pc.prec prior is a principled choice that penalizes deviation from a base model (no random effect). It's parameterized in terms of the standard deviation \(\sigma = 1/\sqrt{\tau}\):

\[P(\sigma > u) = \alpha\]

With param=[u, alpha], you specify that the probability of \(\sigma\) exceeding \(u\) is \(\alpha\).

# PC prior: P(sigma > 1) = 0.01
# "I believe standard deviation is unlikely to exceed 1"
'hyper': [{'prior': 'pc.prec', 'param': [1, 0.01]}]

# PC prior: P(sigma > 0.5) = 0.05
# More restrictive: effects should be small
'hyper': [{'prior': 'pc.prec', 'param': [0.5, 0.05]}]

Other Supported Priors

Beyond loggamma and pc.prec, the precision hyperparameter (internally \(\theta = \log \tau\)) accepts several more priors. Pick by what you want to place the prior on: the log-precision \(\theta\), the precision \(\tau\), or the standard deviation \(\sigma = 1/\sqrt{\tau}\).

Prior	Placed on	`param`	Use when
`normal` (alias: `gaussian`)	\(\theta = \log\tau\)	`[mean, precision]`	You want a Gaussian on the log-precision (e.g. centre at a known order of magnitude).
`logtnormal` (alias: `logtgaussian`)	\(\tau\)	`[mean, precision]`	Truncated log-normal on \(\tau\); keeps the precision bounded away from zero.
`flat`	\(\theta\)	(none)	Improper uniform on the log-precision. Use with care; can produce improper posteriors.
`logiflat`	\(\tau\)	(none)	Improper flat prior on \(1/\tau\). Numerically stable in practice.
`logflat` (use with caution)	\(\tau\)	(none)	Improper flat prior on \(\tau\) directly. Often unstable: with weak data the optimizer can drive \(\theta \to -33\) so \(\tau = e^\theta \to \infty\) and the engine aborts on a singular matrix. Prefer `pc.prec` or `loggamma` for weakly-informative precision priors.
`expression:<R code>`	\(\theta\)	(none)	Custom log-density expression, evaluated as a function of `theta` (the internal log-precision scale). The expression must return \(\log \pi(\theta)\), in R syntax.
`table: x1 ... xn \| l1 ... ln`	\(\theta\)	(none)	Tabulated log-density on a grid: `xi` are \(\theta\) values, `li` are the corresponding \(\log \pi(\theta_i)\) values. The `\|` separator is optional (any non-numeric token is ignored). Linear interpolation between knots.

# Gaussian on log-precision, centred at 4 with precision 1
'hyper': [{'prior': 'normal', 'param': [4, 1]}]

# Improper flat prior on log-precision
'hyper': [{'prior': 'flat'}]

# Custom log-density as an R expression (here: standard normal on theta)
'hyper': [{'prior': 'expression: -theta * theta / 2'}]

# Tabulated log-density on a grid: theta values | log pi(theta) values
'hyper': [{'prior': 'table: -3 -1 1 3 | -4.5 -1.1 -1.1 -4.5'}]

Unknown prior names are rejected up front with a suggestion of near matches. The full registry is in pyinla/api/priors.py.

Fixed Precision

To fix the precision at a known value (no estimation), set fixed=True and specify the initial value on the log scale:

import math

# Fix precision at tau = 10 (variance = 0.1)
'hyper': [{
    'initial': math.log(10),  # log(tau) = 2.303
    'fixed': True
}]

Interpreting Results

Hyperparameter Summary

After fitting, access the precision estimate via result.summary_hyperpar:

print(result.summary_hyperpar)
#                      mean        sd  0.025quant  0.5quant  0.975quant     mode
# Precision for z  6.995828  1.913305    3.986632  6.748838   11.421926  6.294978

To convert precision to standard deviation:

tau_mean = result.summary_hyperpar['mean'].values[0]
sigma_mean = 1 / np.sqrt(tau_mean)
print(f"Estimated SD of random effects: {sigma_mean:.3f}")

Random Effect Estimates

The posterior estimates for each group's random effect are available in result.summary_random:

# Random effects for each group
print(result.summary_random['group_id'])
#    ID      mean        sd  0.025quant  0.5quant  0.975quant      mode
# 0   1  0.234567  0.089123    0.059876  0.234567    0.409258  0.234567
# 1   2 -0.123456  0.091234   -0.302345 -0.123456    0.055433 -0.123456
# ...

Worked Examples

NC SIDS Disease Mapping with IID Random Effects