Advanced

Posterior Sampling

Draw samples from the joint posterior distribution for inference beyond marginals: useful for derived quantities, predictions, and Monte Carlo integration.

Why Sample from the Posterior?

While INLA provides accurate marginal posteriors through numerical integration, sometimes you need samples from the joint posterior distribution. Common use cases include:

Derived quantities: functions of multiple parameters (e.g., \(\beta_1 \times \beta_2\))
Posterior predictions: propagating uncertainty through nonlinear transformations
Model comparison: computing posterior predictive checks
Correlation analysis: understanding dependencies between parameters

How pyINLA Generates Samples

pyINLA computes deterministic approximations to the posterior first, then generates samples from those approximations. This is fast, but means samples are approximate: they come from the Gaussian (or skew-corrected) approximation to the true posterior.

Sampling Functions

pyINLA provides four sampling functions:

pyinla.qsample()

Sample from a Gaussian Markov Random Field (GMRF) given a precision matrix Q. The fundamental building block for all other sampling.

pyinla.posterior_sample()

Generate samples from the full joint posterior of hyperparameters and latent field, using stored configurations from model fitting.

pyinla.posterior_sample_eval()

Evaluate a function on posterior samples. Extract specific elements or compute derived quantities.

pyinla.hyperpar_sample()

Sample only the hyperparameters from their joint posterior, using eigendecomposition of the covariance matrix.

qsample: GMRF Sampling

pyinla.qsample() draws samples from a Gaussian Markov Random Field (GMRF) defined by a precision matrix \(\mathbf{Q}\):

\mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu}, \mathbf{Q}^{-1})

How GMRF Sampling Works

The INLA binary uses a Cholesky factorization approach:

Compute Cholesky factor: \(\mathbf{Q} = \mathbf{L}\mathbf{L}^T\)
Generate standard normal: \(\mathbf{z} \sim \mathcal{N}(0, \mathbf{I})\)
Solve: \(\mathbf{L}^T \mathbf{x} = \mathbf{z}\) via back-substitution
Add mean: \(\mathbf{x} \leftarrow \mathbf{x} + \boldsymbol{\mu}\)

This exploits the sparse structure of Q for efficient sampling.

Parameters

Parameter	Type	Description
`n`	int	Number of samples to generate
`Q`	ndarray, sparse, or str	Precision matrix (positive definite). Accepts a dense ndarray, any scipy.sparse format, or a path to a binary fmesher file.
`mu`	ndarray, optional	Mean vector (default: zero)
`b`	ndarray, optional	Linear term in the log-density \(-\tfrac{1}{2}(x-\mu)^\top Q (x-\mu) + b^\top x\). Shifts the posterior mean by \(Q^{-1} b\) (canonical / information form).
`constr`	dict, optional	Linear constraints: `{'A': matrix, 'e': vector}` for \(\mathbf{A}\mathbf{x} = \mathbf{e}\)
`sample`	ndarray, optional	If provided, compute the log-density of these samples instead of generating new ones.
`compute_mean`	bool, optional	If True, also return the (constrained) posterior mean.
`reordering`	str	Cholesky reordering algorithm: `'auto'` (default), `'amd'`, `'metis'`, `'band'`, `'identity'`.
`seed`	int, optional	Random seed for reproducibility. Note: bit-exact reproducibility requires also pinning `num_threads` (e.g. `num_threads='1:1'`); the seed alone is not enough if the thread count varies across runs.
`logdens`	bool	If True, also return log-densities
`selection`	array, optional	Indices of elements to return (1-based)

Per-Parameter Runnable Examples

Click any parameter to expand a minimal runnable snippet. All snippets share the setup below; each chunk redefines Q as appropriate.

# shared setup for every qsample snippet below
import numpy as np
from scipy import sparse
import pyinla

n + Q required minimal call

# Diagonal precision matrix (independent components):
# precision = 5 means variance = 0.2 per component.
Q = np.eye(5) * 5
samples = pyinla.qsample(n=1000, Q=Q)
print(f"Shape:    {samples.shape}")        # (5, 1000)
print(f"Variance: {np.var(samples, axis=1)}")  # ~0.2

mu non-zero mean vector

Q  = np.eye(5) * 10
mu = np.array([1, 2, 3, 4, 5], dtype=float)
samples = pyinla.qsample(n=100, Q=Q, mu=mu, seed=42)
print(samples.mean(axis=1).round(2))  # empirical means ~ mu

constr linear constraint A x = e

Q = np.eye(5) * 10

# --- Single constraint: A is 1 x 5, e is length 1 ---
# Sum-to-zero: x[0] + x[1] + x[2] + x[3] + x[4] = 0
constr = {'A': np.ones((1, 5)), 'e': np.array([0.0])}
samples = pyinla.qsample(n=100, Q=Q, constr=constr, seed=42)
print(np.sum(samples, axis=0)[:5].round(6))  # all ~0

# --- Multiple constraints: A is k x 5, e is length k ---
# Row 1: sum-to-zero;  Row 2: x[0] - x[4] = 2
A = np.array([
    [1.0,  1.0,  1.0,  1.0,  1.0],
    [1.0,  0.0,  0.0,  0.0, -1.0],
])
e = np.array([0.0, 2.0])
samples = pyinla.qsample(n=100, Q=Q, constr={'A': A, 'e': e}, seed=42)
print("sum:    ", samples.sum(axis=0)[:5].round(6))   # all ~0
print("x0 - x4:", (samples[0] - samples[4])[:5].round(6))  # all ~2

seed reproducibility (pair with num_threads for bit-exact)

Q = np.eye(3) * 4

# seed alone: reproducible up to floating-point thread-scheduling noise.
a = pyinla.qsample(n=10, Q=Q, seed=7)
b = pyinla.qsample(n=10, Q=Q, seed=7)
print("byte-equal (seed only):    ", np.array_equal(a, b))

# Add num_threads='1:1' to make it byte-exact across runs.
a2 = pyinla.qsample(n=10, Q=Q, seed=7, num_threads="1:1")
b2 = pyinla.qsample(n=10, Q=Q, seed=7, num_threads="1:1")
print("byte-equal (seed + 1:1):   ", np.array_equal(a2, b2))  # True

logdens also return log-densities (returns a dict)

# Sparse tridiagonal Q: GMRF Cholesky exploits the sparsity.
n        = 100
diag     = np.ones(n) * 4
off_diag = -np.ones(n - 1)
Q = sparse.diags([off_diag, diag, off_diag], [-1, 0, 1], format='csc')

result = pyinla.qsample(n=50, Q=Q, seed=77, logdens=True)
print(list(result.keys()))               # ['sample', 'logdens', 'mean']
print(result['sample'].shape, result['logdens'].shape)

selection return only a subset of components (1-based)

Q = np.eye(5) * 4
# Only return components 1, 3, 5 (1-based indices).
samples = pyinla.qsample(n=10, Q=Q, seed=1,
                            selection=np.array([1, 3, 5]))
print(samples.shape)  # (3, 10)

b canonical / information form (linear term)

# Log-density is -1/2 (x - mu)^T Q (x - mu) + b^T x.
# With mu = 0, the posterior mean equals Q^-1 b.
Q = np.eye(5) * 4
b = np.array([1.0, -1.0, 0.5, -0.5, 0.0])
samples = pyinla.qsample(n=200, Q=Q, b=b, seed=1)
print("empirical mean:", samples.mean(axis=1).round(3))
print("expected Q^-1 b:", np.linalg.solve(Q, b).round(3))

compute_mean also return the (constrained) posterior mean

# compute_mean=True returns a dict with 'sample', 'logdens', 'mean'.
# With a sum-to-zero constraint, the mean is mu projected onto sum=0.
Q      = np.eye(5) * 10
mu     = np.array([1, 2, 3, 4, 5], dtype=float)
constr = {'A': np.ones((1, 5)), 'e': np.array([0.0])}
out = pyinla.qsample(n=50, Q=Q, mu=mu, constr=constr,
                       seed=1, compute_mean=True)
print(list(out.keys()))            # ['sample', 'logdens', 'mean']
print(out['mean'].round(2))         # [-2, -1, 0, 1, 2] (sums to 0)

sample recompute log-density for given samples (no new draws)

# Pass an existing matrix of samples to evaluate their log-density
# under N(mu, Q^-1). Useful for importance sampling and model comparison.
Q = np.eye(5) * 4
x = np.random.default_rng(0).standard_normal((5, 10))
out = pyinla.qsample(n=10, Q=Q, sample=x, logdens=True)
print(out['logdens'].shape)  # (10,)
print(out['logdens'].round(3))

reordering pick the Cholesky reordering algorithm

# 'auto' picks the best heuristic; 'band' is optimal for band-structured Q;
# 'amd' / 'metis' minimise fill-in for general sparsity. Same result, different speed.
n = 50
Q = (np.eye(n) * 4
     + np.diag(-np.ones(n - 1), 1)
     + np.diag(-np.ones(n - 1), -1))
s_auto = pyinla.qsample(n=10, Q=Q, seed=1, reordering='auto')
s_band = pyinla.qsample(n=10, Q=Q, seed=1, reordering='band')
print(s_auto.shape, s_band.shape)  # both (50, 10)

Q in different formats

The same call accepts dense ndarrays, any scipy.sparse format, or a path to a binary fmesher file on disk. All three produce identical samples for the same Q.

Q as scipy csr_matrix (any sparse format works)

Q = sparse.eye(5).tocsr() * 5.0  # csc, csr, dia, coo all OK
samples = pyinla.qsample(n=20, Q=Q, seed=1)
print(samples.shape)  # (5, 20)

Q via sparse.diags(...) (band / tridiagonal builders)

# Build a tridiagonal Q without materialising it densely.
n        = 100
diag     = np.ones(n) * 4
off_diag = -np.ones(n - 1)
Q = sparse.diags([off_diag, diag, off_diag], [-1, 0, 1], format='csc')
samples = pyinla.qsample(n=50, Q=Q, seed=1)
print(samples.shape)  # (100, 50)

Q as a file path (binary fmesher format on disk)

# Useful when Q is large and pre-built (e.g. SPDE precision exported once,
# then re-used across many sampling calls without re-serialising).
import tempfile, os
from pyinla.fmesher_io import write_fmesher_file

Q   = sparse.eye(5).tocsc() * 5.0
tmp = tempfile.NamedTemporaryFile(delete=False, suffix='.dat')
tmp.close()
write_fmesher_file(Q, tmp.name)

samples = pyinla.qsample(n=20, Q=tmp.name, seed=1)
print(samples.shape)  # (5, 20)
os.unlink(tmp.name)

posterior_sample: Full Posterior Sampling

pyinla.posterior_sample() generates samples from the joint posterior of hyperparameters \(\boldsymbol{\theta}\) and latent field \(\mathbf{x}\):

(\boldsymbol{\theta}, \mathbf{x}) \sim \pi(\boldsymbol{\theta}, \mathbf{x} \mid \mathbf{y})

Requirement: You must fit the model with control={'compute': {'config': True}} to enable posterior sampling. This stores the configurations needed for sampling.

The Algorithm

The sampling algorithm follows these steps:

Sample configurations weighted by log-posterior
Each configuration \(k\) has a weight \(\propto \exp(\log \pi(\boldsymbol{\theta}^{(k)} \mid \mathbf{y}))\). Configurations with higher posterior probability are sampled more frequently.
For each sampled config, call qsample
Sample the latent field from the GMRF: \(\mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu}_k, \mathbf{Q}_k^{-1})\) where \(\mathbf{Q}_k\) is the precision matrix for config \(k\).
Compute the linear predictor
The linear predictor is computed as: \(\boldsymbol{\eta} = \mathbf{A} \mathbf{x} + \text{offsets}\) where \(\mathbf{A}\) is the design matrix mapping latent field to observations.
Apply skewness correction
If the posterior is skewed, apply a correction using the skew-normal distribution to improve the Gaussian approximation.
Transform hyperparameters
Convert from internal scale (e.g., log-precision) to user scale (e.g., precision), computing the Jacobian for proper density transformation.

What is the Log-Posterior?

Log-Posterior Computation

For each hyperparameter configuration \(\boldsymbol{\theta}_k\), INLA computes:

\log \pi(\boldsymbol{\theta}_k \mid \mathbf{y}) \propto \log \tilde{\pi}(\mathbf{y} \mid \boldsymbol{\theta}_k) + \log \pi(\boldsymbol{\theta}_k)

The marginal likelihood is approximated using the Laplace approximation:

\log \tilde{\pi}(\mathbf{y} \mid \boldsymbol{\theta}) = \log \pi(\mathbf{y}, \mathbf{x}^*) - \frac{1}{2}\log|\mathbf{H}| + \frac{n}{2}\log(2\pi)

where \(\mathbf{x}^*\) is the mode and \(\mathbf{H}\) is the Hessian (precision matrix) at the mode.

Parameters

Parameter	Type	Description
`n`	int	Number of samples to generate
`result`	PyINLAResult	Fitted model (with config=True)
`selection`	dict, optional	Select specific latent elements, e.g., `{'(Intercept)': 1, 'x': 1}`
`seed`	int, optional	Random seed for reproducibility. Note: bit-exact reproducibility requires also pinning `num_threads` (e.g. `num_threads='1:1'`); the seed alone is not enough if the thread count varies across runs.
`intern`	bool	If True, return hyperparameters on internal scale (default: False)
`use_improved_mean`	bool	Use VB-corrected mean (default: True)
`skew_corr`	bool	Apply skewness correction (default: True)

Per-Parameter Runnable Examples

Click any parameter to expand a minimal runnable snippet. All snippets share the setup block below: a small fitted model with config=True. The same res is reused by the posterior_sample_eval and hyperpar_sample sections.

# shared setup for posterior_sample / posterior_sample_eval / hyperpar_sample
import numpy as np
import pandas as pd
import pyinla
from pyinla import pyinla as pyinla_fit

rng  = np.random.default_rng(42)
nobs = 100
x1   = rng.standard_normal(nobs)
x2   = rng.standard_normal(nobs)
y    = 1.5 + 2.0*x1 - 1.0*x2 + 0.5*rng.standard_normal(nobs)
df   = pd.DataFrame({'y': y, 'x1': x1, 'x2': x2})

# config=True stores the configurations needed for posterior sampling.
res = pyinla_fit(
    model={'response': 'y', 'fixed': ['1', 'x1', 'x2']},
    family='gaussian', data=df,
    control={'compute': {'config': True}},
)

n + result required minimal call

samples = pyinla.posterior_sample(n=50, result=res, seed=42)
print(f"len(samples) = {len(samples)}")
print(f"keys of one sample: {list(samples[0].keys())}")
# Each entry has 'hyperpar', 'latent', 'logdens'.

selection subset of latent elements (dict form)

# Each key names a latent element; value is the count to keep.
samples = pyinla.posterior_sample(
    n=20, result=res, seed=42,
    selection={'(Intercept)': 1, 'x1': 1, 'x2': 1},
)
# With selection, sample['latent'] is a (k, 1) ndarray of selected values.
print(samples[0]['latent'].shape)         # (3, 1)
print(samples[0]['latent'].ravel())

seed reproducibility

a = pyinla.posterior_sample(n=10, result=res, seed=99)
b = pyinla.posterior_sample(n=10, result=res, seed=99)
key = list(a[0]['hyperpar'].keys())[0]
print("same seed -> same first hyperpar:",
      a[0]['hyperpar'][key] == b[0]['hyperpar'][key])

intern user scale vs internal scale

s_user   = pyinla.posterior_sample(n=20, result=res, seed=1, intern=False)
s_intern = pyinla.posterior_sample(n=20, result=res, seed=1, intern=True)
print(list(s_user[0]['hyperpar'].keys()))    # ['Precision for the Gaussian observations']
print(list(s_intern[0]['hyperpar'].keys()))  # ['Log precision for the Gaussian observations']

use_improved_mean VB-corrected mean (default True)

# Default uses the VB-corrected posterior mean of the latent field as
# the qsample mu. Setting False uses the Laplace mode instead.
samples_default = pyinla.posterior_sample(n=20, result=res, seed=1)
samples_mode    = pyinla.posterior_sample(n=20, result=res, seed=1,
                                            use_improved_mean=False)
print(f"both succeeded: {len(samples_default)}, {len(samples_mode)}")

skew_corr skew-normal correction (default True)

# skew_corr=False returns pure Gaussian draws (no skewness correction).
s_on  = pyinla.posterior_sample(n=20, result=res, seed=1, skew_corr=True)
s_off = pyinla.posterior_sample(n=20, result=res, seed=1, skew_corr=False)
print(f"both succeeded: {len(s_on)}, {len(s_off)}")

posterior_sample_eval: Evaluate Functions on Samples

pyinla.posterior_sample_eval() applies a function to each posterior sample to compute derived quantities:

g(\boldsymbol{\theta}, \mathbf{x})^{(i)} = f(\text{sample}_i)

Parameters

Parameter	Type	Description
`fun`	str or callable	Variable name to extract, or function taking keyword arguments
`samples`	list	Output from `posterior_sample()`
`return_matrix`	bool	Return as matrix (default: True)

Per-Parameter Runnable Examples

Reuses the same fitted res from the posterior_sample section. Build a sample list once, then evaluate per-parameter.

# Build the sample list once; reuse across the chunks below.
samples = pyinla.posterior_sample(n=200, result=res, seed=7)

fun = string extract a single named variable

intercepts = pyinla.posterior_sample_eval("(Intercept)", samples)
print(f"shape: {intercepts.shape}")        # (1, 200)
print(f"mean:  {intercepts.mean():.4f}")
print(f"sd:    {intercepts.std():.4f}")

fun = callable derived quantity across multiple variables

# A callable receives one sample at a time via **kwargs keyed by latent name.
def product_func(**kwargs):
    intercept = kwargs.get('(Intercept)', 0)
    slope     = kwargs.get('x1', 0)
    return intercept * slope

prod = pyinla.posterior_sample_eval(product_func, samples)
print(f"E[Intercept * x1]: {prod.mean():.4f}")
print(f"95% CI: [{np.percentile(prod, 2.5):.4f},",
      f"{np.percentile(prod, 97.5):.4f}]")

return_matrix = False return a list instead of a matrix

# Default return_matrix=True stacks per-sample outputs into one ndarray.
# Set False to receive a plain Python list of per-sample evaluations.
out = pyinla.posterior_sample_eval("x1", samples, return_matrix=False)
print(f"type: {type(out).__name__}, len: {len(out)}")

hyperpar_sample: Hyperparameter Sampling

pyinla.hyperpar_sample() generates samples only from the hyperparameter posterior \(\pi(\boldsymbol{\theta} \mid \mathbf{y})\), without sampling the latent field:

Algorithm

Hyperparameter sampling uses eigendecomposition of the Hessian at the mode:

Get mode \(\boldsymbol{\theta}^*\) and Hessian \(\mathbf{H}\) from INLA
Compute eigendecomposition: \(\mathbf{H}^{-1} = \mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^T\)
Generate: \(\boldsymbol{\theta} = \boldsymbol{\theta}^* + \mathbf{V}\boldsymbol{\Lambda}^{1/2}\mathbf{z}\) where \(\mathbf{z} \sim \mathcal{N}(0, \mathbf{I})\)
Optionally transform from internal to user scale

Parameters

Parameter	Type	Description
`n`	int	Number of samples
`result`	PyINLAResult	Fitted model
`intern`	bool	If True, return on internal scale (default: False)
`improve_marginals`	bool	If True, refine the Gaussian marginal with stored skewness/kurtosis info (default: False)

Per-Parameter Runnable Examples

Reuses the same fitted res from the posterior_sample section.

n + result required minimal call

samples = pyinla.hyperpar_sample(n=500, result=res)
print(f"type: {type(samples).__name__}, shape: {samples.shape}")
# Returns a pandas DataFrame, one row per sample, columns = hyperparameters.

intern user scale vs internal scale

s_user   = pyinla.hyperpar_sample(n=500, result=res, intern=False)
s_intern = pyinla.hyperpar_sample(n=500, result=res, intern=True)
print(f"user-scale mean:   {s_user.mean().values}")    # precision
print(f"intern-scale mean: {s_intern.mean().values}")  # log-precision

improve_marginals refine the Gaussian marginal (default False)

# When True, the sampler refines the Gaussian eigendecomposition with
# the stored skewness/kurtosis information from the marginal posterior.
samples = pyinla.hyperpar_sample(n=500, result=res,
                                  improve_marginals=True)
print(f"shape: {samples.shape}")

Comparing with the INLA Marginal

import matplotlib.pyplot as plt

samples = pyinla.hyperpar_sample(n=5000, result=res)
# marginals_hyperpar entries are DataFrames with columns 'x' (support)
# and 'y' (density).
marg = res.marginals_hyperpar['Precision for the Gaussian observations']

fig, ax = plt.subplots(figsize=(8, 5))
ax.hist(samples.values.flatten(), bins=40, density=True,
        alpha=0.7, label='Samples', edgecolor='black')
ax.plot(marg['x'], marg['y'], 'r-', linewidth=2, label='INLA Marginal')
ax.set_xlabel('Precision'); ax.set_ylabel('Density'); ax.legend()
plt.show()

Skewness Correction

INLA stores skewness information for each latent field element. The sampling algorithm can correct for non-Gaussianity using a skew-normal transformation:

x_{\text{corrected}} = x_{\text{gaussian}} + \sigma \cdot \delta \cdot \left( |z| - \sqrt{\frac{2}{\pi}} \right) / \sqrt{1 - \frac{2\delta^2}{\pi}}

where \(\delta\) is the skew-normal parameter derived from the stored skewness.

Skewness Parameter Conversion

The skewness \(\gamma_1\) is converted to the skew-normal delta parameter:

\delta = \text{sign}(\gamma_1) \cdot \sqrt{\frac{\pi/2 \cdot |\gamma_1|^{2/3}}{|\gamma_1|^{2/3} + ((4-\pi)/2)^{2/3}}}

Skewness is clamped to \([-0.99, 0.99]\) to ensure valid delta values.

Controlling Skewness Correction

import pyinla

# With skewness correction (default)
samples_skew = pyinla.posterior_sample(n=100, result=res, skew_corr=True)

# Without skewness correction (pure Gaussian)
samples_gauss = pyinla.posterior_sample(n=100, result=res, skew_corr=False)

Internal vs User Scale

INLA represents hyperparameters on an internal scale (typically unbounded) during optimization, then transforms to user scale for reporting:

User Scale	Internal Scale	Transformation
Precision (\(\tau > 0\))	Log-precision	\(\tau = \exp(\theta)\)
Variance (\(\sigma^2 > 0\))	Log-variance	\(\sigma^2 = \exp(\theta)\)
Correlation (\(\rho \in (-1,1)\))	Fisher-z	\(\rho = \tanh(\theta)\)
Probability (\(p \in (0,1)\))	Logit	\(p = 1/(1+\exp(-\theta))\)

Reading the column names

When you switch intern=False → True, the column names in hyperpar_sample's DataFrame (or the keys of posterior_sample()'s 'hyperpar' dict) change in a predictable way. The new name tells you which transformation was applied:

User-scale name	Internal-scale name	Transformation
`'Precision for ...'`	`'Log precision for ...'`	`exp`
`'Variance for ...'`	`'Log variance for ...'`	`exp`
`'Rho for ...'`	`'Rho_intern for ...'`	`tanh` (Fisher-z)
`'Probability for ...'`	`'Logit ...'`	`sigmoid` (inverse logit)

Concrete example: an AR1 random effect contributes two hyperparameters. On the user scale they appear as 'Precision for t' and 'Rho for t'; on the internal scale they become 'Log precision for t' and 'Rho_intern for t'. Reading the prefix is how you tell exp apart from tanh in the same model.

Jensen warning when comparing means across scales

For any nonlinear transformation \(f\) (i.e. all four above), the mean does not commute with the transform:

\[\mathbb{E}[f(\theta)] \;\neq\; f(\mathbb{E}[\theta]).\]

For exp (precision / variance), the exact relationship under a Gaussian-approximated internal posterior \(\theta \sim \mathcal{N}(\mu,\sigma^{2})\) is

\[\mathbb{E}[\tau] \;=\; \mathbb{E}[\exp(\theta)] \;=\; \exp\!\left(\mu + \tfrac{\sigma^{2}}{2}\right),\]

so np.exp(s_intern.mean()) is biased low by the \(\sigma^{2}/2\) term and does not equal s_user.mean(). Same shape of pitfall applies to tanh and the sigmoid: applying the transform to the mean is not the mean of the transformed samples.

Always transform element-wise first:

# good: applies exp to each sample, then averages
np.exp(s_intern['Log precision for t']).mean()   # ~= s_user['Precision for t'].mean()

# bad:  applies exp to the mean of log-precisions
np.exp(s_intern['Log precision for t'].mean())   # biased low

Jacobian correction: When transforming from internal to user scale, the log-density is adjusted by the log-Jacobian of the transformation to ensure proper probability densities.