BYM2 Model - pyINLA

Overview

The BYM2 model is a modern reparameterization of the classic Besag-York-Mollie (BYM) convolution model, introduced by Riebler et al. (2016). It combines:

Structured spatial effect - an ICAR (Besag) component capturing spatial autocorrelation
Unstructured effect - an IID component for region-specific heterogeneity

The key innovation is the interpretable parameterization: instead of separate variance parameters for each component, BYM2 uses:

\(\tau\) - overall precision (controls total variance)
\(\phi \in (0,1)\) - mixing parameter (proportion of variance from spatial structure)

This makes prior specification intuitive: \(\phi = 1\) means purely spatial (Besag), \(\phi = 0\) means purely unstructured (IID).

Mathematical Formulation

For \(n\) regions, the BYM2 model defines the random effect as:

\[u_i = \frac{1}{\sqrt{\tau}}\left(\sqrt{\phi} \, s_i + \sqrt{1-\phi} \, v_i\right)\]

where:

\(s_i\) is the scaled spatial (ICAR) component with \(\text{Var}(s_i) \approx 1\)
\(v_i\) is the IID (unstructured) component with \(v_i \sim \mathcal{N}(0, 1)\)
\(\tau\) is the overall precision (inverse of total marginal variance)
\(\phi \in (0,1)\) is the mixing parameter controlling the spatial/IID balance

Interpretation of \(\phi\)

Value of \(\phi\)	Interpretation
\(\phi \to 1\)	Purely spatial - variation explained by neighborhood structure
\(\phi = 0.5\)	Equal contribution from spatial and unstructured components
\(\phi \to 0\)	Purely unstructured - independent regional variation

Marginal Variance

Thanks to the scaling, the marginal variance is approximately:

\[\text{Var}(u_i) \approx \frac{1}{\tau}\]

This makes prior specification for \(\tau\) intuitive - it directly controls the overall variance regardless of the mixing \(\phi\).

When to Use BYM2

BYM2 is the recommended model for disease mapping and areal spatial analysis. Use it when:

Disease Mapping

The standard choice for mapping disease rates. Captures both spatial clustering and region-specific anomalies.

Uncertainty in Spatial Structure

When you don't know if variation is spatially structured or random, let the data decide via the mixing parameter φ ∈ [0, 1] (φ = 1 means fully spatial, φ = 0 means fully IID).

Interpretable Priors

When you want to specify priors on total variance and spatial proportion separately, with clear interpretation.

Ecological/Environmental Studies

Species distribution, pollution levels, or any phenomenon with both neighbor-correlated spatial variation and region-specific noise.

BYM2 vs Besag + IID

The classic BYM approach uses separate Besag and IID effects. BYM2 is preferred because:

Hyperparameters are interpretable (total variance + mixing proportion)
PC priors can be specified meaningfully
Avoids confounding between spatial and IID variance parameters
Better posterior identifiability and more stable posterior geometry

Specification in pyINLA

Allowed Parameters

Parameter	Required	Default	Description
`id`	Yes	-	Column name holding area / region indices (integers 1..n)
`model`	Yes	-	Must be `'bym2'`
`graph`	Yes	-	Neighborhood structure: dense / sparse adjacency matrix, an `InlaGraph`, or a path to a text-format `.graph` file
`hyper`	No	PC priors	List of two dicts: `[{prec}, {phi}]` overriding the priors on \(\theta_1 = \log\tau\) and \(\theta_2 = \mathrm{logit}\,\phi\)
`n`	No	auto	Number of areas (auto-detected from the graph if omitted)
`constr`	No	`True`	Sum-to-zero constraint on the structured component (applied per connected component when `adjust.for.con.comp=True`)
`scale.model`	No	`False`	Scale R so the geometric mean of the marginal variances on the structured part is 1. Recommended for the BYM2 marginal-precision interpretation; the literature pairs it with `pc.prec`
`adjust.for.con.comp`	No	`True`	When the graph has more than one connected component, treat each component independently (per-component sum-to-zero, rank-deficiency = number of components of size \(\ge 2\))
`values`	No	-	Explicit list of allowed area-id values
`rankdef`	No	auto	Explicit rank-deficiency override (auto-derived from connected components otherwise)
`extraconstr`	No	-	Extra linear constraints: `{'A': matrix, 'e': vector}` where `A @ x = e`
`diagonal`	No	`0.0`	Extra constant added to the precision diagonal (numerical stability)

Graph Formats

The required graph argument accepts four equivalent forms (0- vs 1-indexed text files are auto-detected and remapped internally):

Format	Description
`scipy.sparse matrix`	Sparse adjacency, CSR or CSC; most efficient for large graphs
`numpy.ndarray`	Dense symmetric adjacency of shape (n, n)
`str` or `pathlib.Path`	Path to a text-format `.graph` file: line 1 is `n`, then one line per node `i k nb_1 … nb_k`
`InlaGraph`	Internal graph object built by pyinla's `read.graph` helper

Per-Parameter Runnable Examples

Click any parameter to expand a minimal runnable snippet. All snippets assume the shared setup block below (8 connected areas on a 4×2 grid; nobs = 80 observations). Each snippet corresponds 1:1 to a parity scenario in scenarios/latent_bym2/.

# shared setup for every snippet below
import numpy as np
import pandas as pd
import scipy.sparse as sp
from pyinla import pyinla

rng = np.random.default_rng(42)
n    = 8            # 4x2 grid
nobs = 80

# Connected 4x2 grid (used by most snippets)
A = np.zeros((n, n))
for i, j in [(1,2),(2,3),(3,4),(5,6),(6,7),(7,8),
              (1,5),(2,6),(3,7),(4,8)]:
    A[i-1, j-1] = A[j-1, i-1] = 1
graph = sp.csr_matrix(A)

# Two-component 4+4 graph (used by the adjust.for.con.comp snippet)
A2 = np.zeros((n, n))
for i, j in [(1,2),(2,3),(3,4),(5,6),(6,7),(7,8)]:
    A2[i-1, j-1] = A2[j-1, i-1] = 1
graph_2cc = sp.csr_matrix(A2)

df = pd.DataFrame({
    'y':    rng.standard_normal(nobs),
    'x':    rng.standard_normal(nobs),
    'area': rng.integers(1, n + 1, size=nobs),
})

id + model + graph required trio (minimal call)

# Defaults: constr=True, adjust.for.con.comp=True, scale.model=False.
# Default priors: pc.prec(1, 0.01) on prec, pc(0.5, 0.5) on phi.
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

graph alternative formats (sparse, dense, file path, InlaGraph)

# pyinla accepts the adjacency in four forms (all produce the same Model.ini):
#   (1) scipy.sparse matrix (used in the shared setup above)
#   (2) dense numpy adjacency
#   (3) path to a text-format graph file (line 1 n, then "i k nb_1 ... nb_k")
#   (4) InlaGraph object returned by pyinla.read_graph.inla_read_graph
from pyinla.read_graph import inla_read_graph

graph_dense  = A                                           # (2)
graph_file   = '/path/to/regions.graph'                     # (3)
graph_object = inla_read_graph(str('/path/to/regions.graph'))# (4)

formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph_dense}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

hyper override the prec and / or phi priors

# hyper is a length-2 list: [prec dict, phi dict]. An empty dict {} keeps the
# default for that slot, so {} + closed-form phi prior pins phi without
# triggering the tabulated pc-phi prior (libm-dependent bytes).
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{'prior': 'loggamma', 'param': [1, 0.01]},
                            {'prior': 'normal',   'param': [0.0, 1.0]}]}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

n explicit graph size

# Auto-detected from the graph by default; pass it explicitly to assert size.
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'n': 8}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

constr turn off the sum-to-zero constraint (default True)

# Drop the intercept since the constraint normally absorbs the level.
formula = {
    'response': 'y',
    'fixed': ['0', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'constr': False,
                 'hyper': [{}, {'prior': 'normal', 'param': [0.0, 1.0]}]}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

scale.model geometric-mean variance scaling (recommended with PC priors)

# The default is False; the BYM2 literature recommends True so the
# marginal-precision interpretation is invariant to the graph structure.
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'scale.model': True}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

adjust.for.con.comp per-component sum-to-zero (default True)

# With multi-component graphs: True emits one sum-to-zero per component;
# False uses a single global sum-to-zero. Only meaningful when the graph
# has more than one connected component.
# Pin phi to a closed-form prior on multi-component graphs to avoid the
# pc-phi table-generator edge case on disconnected adjacencies.
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph_2cc,
                 'adjust.for.con.comp': False,
                 'hyper': [{}, {'prior': 'normal', 'param': [0.0, 1.0]}]}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

values explicit area-id grid

formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'values': list(range(1, n + 1))}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

extraconstr additional linear constraints \(\mathbf{A}\mathbf{x} = \mathbf{e}\)

# bym2's latent is augmented to length 2n (top half = total field b,
# bottom half = structured u). The extraconstr A matrix must have
# ncol == 2n. Below: pin the first and last entries of the structured
# component (rows n..2n-1).
A_extra = np.zeros((2, 2 * n))
A_extra[0, n + 0]     = 1.0
A_extra[1, 2 * n - 1] = 1.0
e = np.array([0.0, 0.0])
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'values':      list(range(1, n + 1)),
                 'extraconstr': {'A': A_extra, 'e': e},
                 'hyper': [{}, {'prior': 'normal', 'param': [0.0, 1.0]}]}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

diagonal extra constant on the precision diagonal

formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'diagonal': 1e-4}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

rankdef explicit rank-deficiency override

# pyinla derives the rank deficiency from the graph by default: it equals
# the number of connected components when adjust.for.con.comp=True (one
# null direction per component on the structured part), or 1 otherwise.
# Override only when you know the true rank deficiency (e.g. with a
# non-standard structure matrix where the auto-count would be wrong).
formula = {
    'response': 'y',
    'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'rankdef': 1}],
}
result = pyinla(formula=formula, family='gaussian', data=df)

Hyperparameters

BYM2 has two hyperparameters:

theta1: Precision (\(\tau\))

Controls the overall variance of the random effect. Higher precision means smaller variance.

Internal representation: \(\theta_1 = \log(\tau)\)
Interpretation: \(\text{Var}(u_i) \approx 1/\tau\), so \(\sigma \approx 1/\sqrt{\tau}\)

Parameter	Default Value	Description
`prior`	`'pc.prec'`	PC prior on precision
`param`	`[1, 0.01]`	P(\(\sigma > 1\)) = 0.01
`initial`	`4`	Starting value for \(\log(\tau)\); corresponds to \(\tau \approx 54.6\)
`fixed`	`False`	Whether to estimate (False) or fix (True)

theta2: Mixing (\(\phi\))

Controls the proportion of variance attributed to spatial structure vs unstructured variation.

Internal representation: \(\theta_2 = \text{logit}(\phi) = \log(\phi/(1-\phi))\)
Interpretation: \(\phi\) is the proportion of variance from the spatial component

Parameter	Default Value	Description
`prior`	`'pc'`	PC prior (adaptive based on graph structure)
`param`	`[0.5, 0.5]`	P(\(\phi < u\)) = \(\alpha\); here P(\(\phi < 0.5\)) = 0.5
`initial`	`-3`	Starting value for logit(\(\phi\)); corresponds to \(\phi \approx 0.047\)
`fixed`	`False`	Whether to estimate (False) or fix (True)

PC Prior for \(\phi\) is Adaptive

The PC prior for the mixing parameter depends on the graph structure. pyINLA automatically generates the appropriate prior table based on your adjacency graph. This ensures proper penalization towards the base model (\(\phi = 0\), pure IID).

Customizing Hyperparameters

Override defaults using the hyper key. pyinla expects a list of two dicts in [prec, phi] order; an empty {} in either slot keeps the default for that slot.

'random': [
    {
        'id': 'area',
        'model': 'bym2',
        'graph': graph,
        'hyper': [
            {'prior': 'pc.prec', 'param': [0.5, 0.01], 'initial': 4},  # prec: P(sigma > 0.5) = 0.01
            {'prior': 'pc',      'param': [0.5, 2.0/3.0], 'initial': 0},  # phi:  P(phi < 0.5) = 2/3
        ],
    }
]

Fixing Hyperparameters

import math

# Fix tau = 5 and phi = 0.8 (mostly spatial). theta scale: log(tau) and logit(phi).
phi_fixed = 0.8
logit_phi = math.log(phi_fixed / (1 - phi_fixed))

'hyper': [
    {'initial': math.log(5), 'fixed': True},
    {'initial': logit_phi,    'fixed': True},
]

Available Priors

BYM2's hyper is a length-2 list: the first dict configures the prior on the log marginal precision \(\theta_1 = \log\tau\), the second configures the prior on the logit mixing parameter \(\theta_2 = \mathrm{logit}\,\phi\). An empty {} in either slot keeps that slot's default.

Prior catalog (one snippet per family; 1:1 with the bym2 parity scenarios)

Prior on \(\theta_1 = \log\tau\) (slot 0, key `prec`)

prec · pc.prec PC prior on \(\sigma = 1/\sqrt{\tau}\) (default)

formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{'prior': 'pc.prec', 'param': [1, 0.01]},
                            {'prior': 'normal',  'param': [0.0, 1.0]}]}],
}

prec · loggamma gamma on \(\tau\)

formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{'prior': 'loggamma', 'param': [1, 0.00005]},
                            {'prior': 'normal',   'param': [0.0, 1.0]}]}],
}

prec · fixed pin \(\tau\) at a known value

import math
formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{'initial': math.log(10), 'fixed': True},
                            {'prior': 'normal', 'param': [0.0, 1.0]}]}],
}

prec · normal / gaussian normal on \(\theta_1\)

formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{'prior': 'normal', 'param': [0.0, 1.0]},
                            {'prior': 'normal', 'param': [0.0, 1.0]}]}],
}

prec · laplace double-exponential on \(\theta_1\)

formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{'prior': 'laplace', 'param': [0.0, 1.0]},
                            {'prior': 'normal',  'param': [0.0, 1.0]}]}],
}

prec · pc.mgamma PC prior, modified-gamma variant

formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{'prior': 'pc.mgamma', 'param': [1.0]},
                            {'prior': 'normal',    'param': [0.0, 1.0]}]}],
}

prec · jeffreystdf Jeffreys-type reference prior

formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{'prior': 'jeffreystdf', 'param': []},
                            {'prior': 'normal',      'param': [0.0, 1.0]}]}],
}

prec · expression: … custom log-density (R expression)

# A custom log-prior supplied as an R expression string. The body is
# evaluated by the C engine; `theta` is the internal-scale variable.
expr = "expression: log_dens = -0.5 * theta * theta; log_dens;"
formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{'prior': expr,     'param': []},
                            {'prior': 'normal', 'param': [0.0, 1.0]}]}],
}

prec · table: … custom tabulated log-density

# Inline tabulated prior: alternating (theta, log_density) pairs.
# The C engine requires at least 4 grid points.
table_prior = "table: -6 -18 -2 -2 2 -2 6 -18"
formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{'prior': table_prior, 'param': []},
                            {'prior': 'normal',      'param': [0.0, 1.0]}]}],
}

Prior on \(\theta_2 = \mathrm{logit}\,\phi\) (slot 1, key `phi`)

phi · pc graph-dependent PC prior on \(\phi\) (default)

# P(phi < 0.5) = 0.5 - agnostic default.
formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{}, {'prior': 'pc', 'param': [0.5, 0.5]}]}],
}

phi · normal / gaussian normal prior on \(\theta_2\)

formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{}, {'prior': 'normal', 'param': [0.0, 1.0]}]}],
}

phi · loggamma loggamma on \(\theta_2\)

formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{}, {'prior': 'loggamma', 'param': [1, 0.01]}]}],
}

phi · flat improper flat on \(\theta_2\)

formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{}, {'prior': 'flat', 'param': []}]}],
}

phi · logtnormal truncated log-normal on \(\theta_2\)

formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{}, {'prior': 'logtnormal', 'param': [0.0, 1.0]}]}],
}

phi · expression: … custom log-density (R expression)

expr = "expression: log_dens = -0.5 * theta * theta; log_dens;"
formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{}, {'prior': expr, 'param': []}]}],
}

phi · table: … custom tabulated log-density

table_prior = "table: -6 -18 -2 -2 2 -2 6 -18"
formula = {
    'response': 'y', 'fixed': ['1', 'x'],
    'random': [{'id': 'area', 'model': 'bym2', 'graph': graph,
                 'hyper': [{}, {'prior': table_prior, 'param': []}]}],
}

Interpreting Results

Hyperparameter Summary

The hyperparameter summary shows both internal (theta) and user-scale parameters:

# View hyperparameters
print(result.summary_hyperpar)

# Typical output:
#                                          mean       sd   0.025quant   0.5quant   0.975quant
# Precision for region                   12.34     3.21         7.12      11.89        20.45
# Phi for region                          0.78     0.12         0.51       0.80         0.94

What Does Phi Tell You?

\(\phi > 0.8\): Strong spatial structure - neighboring regions are similar
\(\phi \approx 0.5\): Mixed - both spatial and local factors matter equally
\(\phi < 0.2\): Weak spatial structure - region-specific factors dominate

Random Effects: the latent has length 2n

BYM2 is an augmented latent in the engine: the random-effect vector returned in summary_random['area'] stacks the mixed total field on top of the bare structured part, total length \(2n\):

\[\begin{pmatrix} \mathbf{b} = \tfrac{1}{\sqrt{\tau}}\big(\sqrt{1-\phi}\,\mathbf{v} + \sqrt{\phi}\,\mathbf{u}\big) \\ \mathbf{u} \end{pmatrix}\]

The first \(n\) rows are the total BYM2 effect; the next \(n\) rows are the unscaled structured component \(\mathbf{u}\) (so you can inspect the spatial structure on its own).

# The id-keyed DataFrame holds 2n rows (mixed total in 0..n-1, structured
# u in n..2n-1).
eff = result.summary_random['area']
total      = eff.iloc[:n]      # BYM2 total effect b (length n)
structured = eff.iloc[n:]      # structured component u alone (length n)

print(total.head())          # ID, mean, sd, 0.025quant, 0.5quant, 0.975quant, mode, kld
print(structured.head())

Implementation Details

The structured component's behavior depends on the combination of scale.model and adjust.for.con.comp options. Let \(\mathbf{Q} = \tau \mathbf{R}\) be the precision matrix of the underlying Besag part.

scale.model=False, adjust.for.con.comp=False

The constraint constr=True is interpreted as a single sum-to-zero constraint over the whole graph. Singletons get an improper uniform distribution on \((-\infty, \infty)\) before the constraint.

scale.model=True, adjust.for.con.comp=False

The constraint constr=True applies over the whole graph. The structure matrix \(\mathbf{R}\) (excluding singletons) is scaled so the geometric mean of the marginal variances is 1, and \(\mathbf{R}\) is modified so that singletons have a standard Normal distribution.

scale.model=False, adjust.for.con.comp=True

The constraint constr=True is interpreted as one sum-to-zero constraint per connected component of size \(\geq 2\). Singletons get an improper uniform distribution on \((-\infty, \infty)\).

scale.model=True, adjust.for.con.comp=True (recommended for BYM2)

The constraint constr=True applies separately to each connected component of size \(\geq 2\). \(\mathbf{R}\) is scaled so the geometric mean of the marginal variances in each component is 1, and singletons get a standard Normal distribution. This is the Riebler-Sorbye 2016 setup.

Technical Notes

The latent emitted to the engine has dimension \(2n\) (top half = total BYM2 field \(\mathbf{b}\), bottom half = structured \(\mathbf{u}\)). Slice summary_random['area'] accordingly.
The term \(\tfrac{1}{2}\log(|\mathbf{R}|^{*})\) of the normalization constant is not computed; add it to the log marginal likelihood estimate if needed, where \(\mathbf{R}\) is the precision matrix of the standardised Besag part.
For graphs with disconnected components the per-component sum-to-zero behavior is documented in the 4 Implementation Details cases above; the default is adjust.for.con.comp=True.

Reference

Riebler, A., Sorbye, S.H., Simpson, D., and Rue, H. (2016). "An intuitive Bayesian spatial model for disease mapping that accounts for scaling." Statistical Methods in Medical Research, 25(4), 1145-1165.

This paper introduces the BYM2 parameterization and explains the advantages over the classic BYM model.

BibTeX

@article{riebler2016intuitive,
  title={An intuitive Bayesian spatial model for disease mapping that accounts for scaling},
  author={Riebler, Andrea and S{\o}rbye, Sigrunn H and Simpson, Daniel and Rue, H{\aa}vard},
  journal={Statistical methods in medical research},
  volume={25},
  number={4},
  pages={1145--1165},
  year={2016},
  publisher={SAGE Publications Sage UK: London, England}
}

Related Models

Besag (ICAR) - Pure spatial model (equivalent to BYM2 with \(\phi = 1\))
IID - Pure unstructured model (equivalent to BYM2 with \(\phi = 0\))
Generic0 - Custom precision structure specification