Random Effects

IID Model

Independent and Identically Distributed random effects for modeling exchangeable group-level variation, overdispersion, and simple hierarchical structures.

← Back to Random Effects

Overview

The IID model (Independent and Identically Distributed) is the simplest and most commonly used random effect in hierarchical modeling. It assumes that random effects are:

  • Independent - no correlation between different groups
  • Identically distributed - all groups share the same variance
  • Normally distributed - each effect follows a Gaussian distribution

This makes IID ideal for capturing exchangeable group-level heterogeneity - situations where groups differ from each other, but there's no inherent ordering or spatial relationship between them.

Mathematical Formulation

For \(n\) levels of the index variable, the IID model specifies:

\[u_i \stackrel{\text{iid}}{\sim} \mathcal{N}(0, \tau^{-1}), \quad i = 1, \ldots, n\]

where:

  • \(u_i\) is the random effect for level \(i\)
  • \(\tau\) is the precision (inverse variance) of the random effects
  • \(\sigma = 1/\sqrt{\tau}\) is the standard deviation

Terminology note: throughout this page, level means one value of the index variable (one realization of \(u_i\)). The separate spec key group refers to an additional axis of replication and is unrelated.

The precision matrix for the full vector \(\mathbf{u} = (u_1, \ldots, u_n)^\top\) is simply:

\[\mathbf{Q} = \tau \, \mathbf{I}_n\]

where \(\mathbf{I}_n\) is the \(n \times n\) identity matrix.

When to Use IID

The IID model is appropriate when you have grouped or clustered data where:

Hierarchical Data

Students nested in schools, patients in hospitals, employees in companies. Each group has its own baseline that deviates from the overall mean.

Repeated Measures

Multiple observations per subject. The random effect captures subject-specific deviations from the population mean response.

Overdispersion

Count or binary data with more variability than the likelihood predicts. Adding an observation-level IID effect absorbs excess variation.

Unstructured Spatial Effects

Regional variation without a clear spatial pattern. Often combined with structured spatial models (BYM2) to capture residual heterogeneity.

Visual Understanding: IID vs Structured Effects

Understanding when to use IID becomes clearer when you see how it differs from structured random effects like RW1. The key difference is in the correlation structure.

IID: Independent Random Effects

Each level \(u_i\) is drawn independently from the same zero-mean Gaussian \(\mathcal{N}(0, \tau^{-1})\). The values are exchangeable: there is no inherent ordering, and the realized effect at one level carries no information about any other.

IID: Only diagonal
(levels are independent)

RW1: Correlated Random Effects

Adjacent positions are correlated - nearby values tend to be similar. This creates smooth trends rather than independent jumps.

RW1 (sum-to-zero IGMRF)
blue = +, amber = -

Decision Guide

Use IID when... Use RW1/AR1 when...
Groups have no natural ordering (schools, patients, regions) Data is ordered in time or space (years, age groups, locations)
Group labels are arbitrary (could be reshuffled) Adjacent positions should have similar values
You want to model overdispersion You want to model smooth trends or gradients
Effects are exchangeable (no structure) There's inherent structure in the index variable

Specification in pyINLA

Allowed Parameters

The IID random-effect spec dict accepts the following keys. The source of truth is model_defs/latent/iid.py's ALLOWED_KEYS set, mirrored 1:1 here.

Parameter Required Description
Core
id Yes Column name containing level indices (integers).
model Yes Must be 'iid'.
hyper No List with one dict: [{prior, param, initial, fixed}] for the precision hyperparameter.
n No Number of levels (auto-detected if omitted).
constr No Sum-to-zero constraint (default: False).
initial No Shorthand for hyper[0]['initial']. Sets the starting value for the precision hyperparameter without nesting a hyper block.
fixed No Shorthand for hyper[0]['fixed']. Pins the precision hyperparameter at initial when True.
diagonal No Extra constant added to the precision diagonal for numerical stability (default: 0.0).
Structure / observation-level
weights No Per-observation weighting vector applied to the random-effect contribution.
id_names No Custom display labels for each level (e.g. for summary_random output).
values No Explicit list of allowed covariate values; useful when not all levels appear in the data.
Replication & grouping
nrep No Number of replications; usually inferred from replicate.
replicate No Per-observation replication index vector. Use with nrep for cross-classified effects.
group No Per-observation group index vector for grouped IID. Use together with ngroup and control.group.
ngroup No Number of groups in the grouped IID layout. Companion to group.
control.group No Dict controlling between-group correlation, e.g. {'model': 'iid'} or {'model': 'ar1'}.
Constraints
extraconstr No Extra linear constraints, as a dict {'A': <matrix>, 'e': <vector>}, applied in addition to constr. Each row of A is one constraint enforcing \(A[i, :] \cdot u = e[i]\) on the n-level latent field. A shape is (k, n) for k constraints, e is shape (k,). Accepted types for A: numpy.ndarray (1D treated as one row, 2D as k rows), Python list / list-of-lists, or any scipy.sparse matrix. e is optional: if omitted it defaults to np.zeros(k) (auto-inferred from A).
Copy (feature-key form)
copy No Name of an earlier random-effect to copy. Internally rewrites the spec to the copy model with of referencing the source.
Diagnostics
vb.correct No Variational-Bayes correction set. Pass True to include this component, or a list of node indices for fine-grained control.
debug No Enable local debug output for this random effect (boolean).

Any key not listed above raises a clear safety error with the allowed-keys list before the engine runs. Per-observation scaling is done via weights, not scale (the named scale= argument is not a supported IID feature).

Per-Parameter Runnable Examples

Click any parameter to expand a minimal runnable snippet. All snippets assume import numpy as np; import pandas as pd; import math; from pyinla import pyinla and a small dataframe df with columns y, x, group_id (5 levels).

# shared setup for every snippet below
import numpy as np
import pandas as pd
import math
from pyinla import pyinla

rng = np.random.default_rng(42)
n = 60
df = pd.DataFrame({
    'y': rng.standard_normal(n),
    'x': rng.standard_normal(n),
    'group_id': rng.integers(1, 6, size=n),
})
id  column name with the level index
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid'}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
model  must be 'iid'
# 'model': 'iid' is the marker that selects this family (required).
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid'}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
hyper  prior + starting value for the precision
formula = {
    'response': 'y',
    'random': [{
        'id': 'group_id', 'model': 'iid',
        'hyper': [{
            'prior': 'pc.prec',
            'param': [1, 0.01],
            'initial': 2,
            'fixed': False,
        }],
    }],
}
result = pyinla(formula=formula, family='gaussian', data=df)
n  explicit number of levels (must match the declared level set)
# 'n' must equal the count of unique levels declared by the data or by
# 'values'. The engine refuses to run when they mismatch. To reserve
# room for unobserved levels, declare them with 'values' AND set 'n'
# to that length.
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid',
                 'values': list(range(1, 11)), 'n': 10}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
constr  sum-to-zero constraint
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid', 'constr': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
initial  shorthand for hyper[0]['initial']
# start the optimizer at log(tau) = 2  (tau ~ 7.4)
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid', 'initial': 2}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
fixed  pin the precision at initial
# fix tau = 10  (no estimation of the precision hyperparameter)
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid',
                 'initial': math.log(10), 'fixed': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
diagonal  extra constant on the precision diagonal
# small jitter for numerical stability
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid', 'diagonal': 1e-4}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
weights  per-observation weighting vector
What it actually does. The random-effect contribution at observation i is scaled by weights[i]:
ηi  +=  wi · uid(i)   (default wi = 1)
The latent vector u is unchanged: still one value per level, still iid Gaussian. Two rows in the same group can receive different weighted contributions. Not a likelihood weight: this is a design-matrix scaling on the random-effect contribution, not how much each y counts in the log-likelihood.

Concrete example with 6 rows, 3 groups, weights = [1, 2, 1, 2, 1, 2]:

i group_id weights[i] contribution to ηi
111.0+ 1.0 · u1
212.0+ 2.0 · u1   (same u1, twice the pull)
321.0+ 1.0 · u2
422.0+ 2.0 · u2
531.0+ 1.0 · u3
632.0+ 2.0 · u3

Three good reasons to set it:

  • Exposure-style scaling on the RE itself: same group, per-row exposure (e.g. population, time at risk) multiplies the group's effect.
  • Sneak a linear term into an f() block: set id = 1 (one level), weights = x_i. Then u1 plays the role of a slope and ηi += xi · u1.
  • Custom design contrasts: a row picks up a fractional or signed share of a group's u (positive or negative weights are allowed).
w = rng.uniform(0.5, 1.5, size=n)
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid', 'weights': w}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
id_names  display labels for levels
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid',
                 'id_names': ['A', 'B', 'C', 'D', 'E']}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
values  explicit set of allowed level values
# declare levels 1..5 even if some are absent from this fit
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid',
                 'values': [1, 2, 3, 4, 5]}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
nrep + replicate  independent replications of the iid effect
df['rep_id'] = rng.integers(1, 4, size=n)
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid',
                 'nrep': 3, 'replicate': df['rep_id'].values}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
group + ngroup + control.group  extra correlated axis
df['week'] = rng.integers(1, 5, size=n)
formula = {
    'response': 'y',
    'random': [{
        'id': 'group_id', 'model': 'iid',
        'group': df['week'].values,
        'ngroup': 4,
        'control.group': {'model': 'ar1'},
    }],
}
result = pyinla(formula=formula, family='gaussian', data=df)
extraconstr  additional linear constraint A u = e
# enforce sum of all 5 random effects equals zero
formula = {
    'response': 'y',
    'random': [{
        'id': 'group_id', 'model': 'iid',
        'extraconstr': {'A': np.ones((1, 5)),
                         'e': np.zeros(1)},
    }],
}
result = pyinla(formula=formula, family='gaussian', data=df)
copy  copy an earlier component
df['group2_id'] = rng.integers(1, 6, size=n)
formula = {
    'response': 'y',
    'random': [
        {'id': 'group_id',  'model': 'iid'},
        {'id': 'group2_id', 'model': 'iid', 'copy': 'group_id'},
    ],
}
result = pyinla(formula=formula, family='gaussian', data=df)
vb.correct  variational-Bayes correction
# enable VB correction on this component  (or pass a list of node indices)
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid', 'vb.correct': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)
debug  extra debug output for this component
formula = {
    'response': 'y',
    'random': [{'id': 'group_id', 'model': 'iid', 'debug': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

Level Identifier Handling

The id column should contain integer values identifying which level of the random effect each observation belongs to. pyINLA handles indices flexibly:

Input Behavior
[1, 2, 3, 1, 2, 3] Standard consecutive indices - works directly
[5, 10, 15, 5, 10] Non-consecutive integers - automatically mapped to positions
[1, np.nan, 3, np.nan, 5] np.nan in the index column: that row contributes nothing to this random effect
[0, 1, 2, 0, 1] Zero-based indices - handled correctly

Key points:

  • Indices are mapped to consecutive positions internally via sort(unique(values))
  • np.nan in the index column means "this row has no contribution from this random effect" (useful for mixed designs)
  • String/factor columns must be converted to integers first (e.g. pd.factorize)
# These all work:
df['idx'] = [1, 2, 3, 1, 2, 3]       # Standard
df['idx'] = [10, 20, 30, 10, 20, 30]   # Non-consecutive (mapped to 0,1,2)
df['idx'] = [1, np.nan, 2, np.nan, 1, 2]  # np.nan drops the RE contribution for that row

# Convert string groups to integers
df['group_idx'] = pd.factorize(df['group_name'])[0] + 1
Prediction and missing indices

Two related but distinct things:

  • Predicting an observation: set the response y to np.nan on the rows you want predicted. The posterior predictive is then returned for those rows.
  • Dropping the random effect for a row: set the index to np.nan. That row gets zero contribution from this random effect (its linear predictor is built from the other terms only).

For an unseen group, you have two choices: add it as a new index level (the RE is drawn from its prior, with honest uncertainty), or set the index to np.nan (predict at the population mean, no group-level effect).

NaN in the weights vector

When weights is supplied, individual entries may also be np.nan / NA. pyINLA accepts these without error: the engine treats those rows as having no weighted contribution from this random effect. There is no auto-imputation or row-dropping side effect.

This behaviour is covered by the nan_in_weights case in the iid sweep.

Hyperparameters

The IID model has one hyperparameter: the precision \(\tau\) of the random effects. In pyINLA, we work with the log-transformed precision \(\theta = \log(\tau)\) for numerical stability.

Default Configuration

Parameter Default Value Description
prior 'loggamma' Log-gamma prior on precision (gamma prior on \(\tau\))
param [1, 0.00005] Shape and rate parameters for the gamma prior
initial 4 Starting value for \(\theta = \log(\tau)\); corresponds to \(\tau \approx 54.6\)
fixed False Whether to estimate (False) or fix (True) the precision

Customizing Hyperparameters

Override defaults using the hyper key in your random effect specification:

'random': [
    {
        'id': 'group_id',
        'model': 'iid',
        'hyper': [{
            'prior': 'loggamma',
            'param': [1, 0.01],      # More informative prior
            'initial': 2,             # Different starting point
            'fixed': False
        }]
    }
]

Available Priors

Log-Gamma Prior (Default)

The default loggamma prior places a gamma distribution on the precision \(\tau\):

\[\tau \sim \text{Gamma}(a, b)\]

With default param=[1, 0.00005], this is a very weakly informative prior that allows \(\tau\) to range over several orders of magnitude.

# Default loggamma prior
'hyper': [{'prior': 'loggamma', 'param': [1, 0.00005]}]

# More informative: expect precision around 10-100
'hyper': [{'prior': 'loggamma', 'param': [1, 0.01]}]

PC Prior (Penalised Complexity)

The pc.prec prior is a principled choice that penalizes deviation from a base model (no random effect). It's parameterized in terms of the standard deviation \(\sigma = 1/\sqrt{\tau}\):

\[P(\sigma > u) = \alpha\]

With param=[u, alpha], you specify that the probability of \(\sigma\) exceeding \(u\) is \(\alpha\).

# PC prior: P(sigma > 1) = 0.01
# "I believe standard deviation is unlikely to exceed 1"
'hyper': [{'prior': 'pc.prec', 'param': [1, 0.01]}]

# PC prior: P(sigma > 0.5) = 0.05
# More restrictive: effects should be small
'hyper': [{'prior': 'pc.prec', 'param': [0.5, 0.05]}]

Other Supported Priors

Beyond loggamma and pc.prec, the precision hyperparameter (internally \(\theta = \log \tau\)) accepts several more priors. Pick by what you want to place the prior on: the log-precision \(\theta\), the precision \(\tau\), or the standard deviation \(\sigma = 1/\sqrt{\tau}\).

Prior Placed on param Use when
normal (alias: gaussian) \(\theta = \log\tau\) [mean, precision] You want a Gaussian on the log-precision (e.g. centre at a known order of magnitude).
logtnormal (alias: logtgaussian) \(\tau\) [mean, precision] Truncated log-normal on \(\tau\); keeps the precision bounded away from zero.
flat \(\theta\) (none) Improper uniform on the log-precision. Use with care; can produce improper posteriors.
logiflat \(\tau\) (none) Improper flat prior on \(1/\tau\). Numerically stable in practice.
logflat (use with caution) \(\tau\) (none) Improper flat prior on \(\tau\) directly. Often unstable: with weak data the optimizer can drive \(\theta \to -33\) so \(\tau = e^\theta \to \infty\) and the engine aborts on a singular matrix. Prefer pc.prec or loggamma for weakly-informative precision priors.
expression:<R code> \(\theta\) (none) Custom log-density expression, evaluated as a function of theta (the internal log-precision scale). The expression must return \(\log \pi(\theta)\), in R syntax.
table: x1 ... xn | l1 ... ln \(\theta\) (none) Tabulated log-density on a grid: xi are \(\theta\) values, li are the corresponding \(\log \pi(\theta_i)\) values. The | separator is optional (any non-numeric token is ignored). Linear interpolation between knots.
# Gaussian on log-precision, centred at 4 with precision 1
'hyper': [{'prior': 'normal', 'param': [4, 1]}]

# Improper flat prior on log-precision
'hyper': [{'prior': 'flat'}]

# Custom log-density as an R expression (here: standard normal on theta)
'hyper': [{'prior': 'expression: -theta * theta / 2'}]

# Tabulated log-density on a grid: theta values | log pi(theta) values
'hyper': [{'prior': 'table: -3 -1 1 3 | -4.5 -1.1 -1.1 -4.5'}]

Unknown prior names are rejected up front with a suggestion of near matches. The full registry is in pyinla/api/priors.py.

Fixed Precision

To fix the precision at a known value (no estimation), set fixed=True and specify the initial value on the log scale:

import math

# Fix precision at tau = 10 (variance = 0.1)
'hyper': [{
    'initial': math.log(10),  # log(tau) = 2.303
    'fixed': True
}]

Interpreting Results

Hyperparameter Summary

After fitting, access the precision estimate via result.summary_hyperpar:

print(result.summary_hyperpar)
#                      mean        sd  0.025quant  0.5quant  0.975quant     mode
# Precision for z  6.995828  1.913305    3.986632  6.748838   11.421926  6.294978

To convert precision to standard deviation:

tau_mean = result.summary_hyperpar['mean'].values[0]
sigma_mean = 1 / np.sqrt(tau_mean)
print(f"Estimated SD of random effects: {sigma_mean:.3f}")

Random Effect Estimates

The posterior estimates for each group's random effect are available in result.summary_random:

# Random effects for each group
print(result.summary_random['group_id'])
#    ID      mean        sd  0.025quant  0.5quant  0.975quant      mode
# 0   1  0.234567  0.089123    0.059876  0.234567    0.409258  0.234567
# 1   2 -0.123456  0.091234   -0.302345 -0.123456    0.055433 -0.123456
# ...