Log-Logistic Regression

A tutorial on Bayesian regression with the Log-Logistic distribution for positive continuous data with heavy tails.

Introduction

The Log-Logistic distribution is a continuous probability distribution for positive values. It is commonly used in survival analysis and reliability engineering as an alternative to the log-normal and Weibull distributions. Key features include:

The Model

For each observation $i = 1, \ldots, n$, the response $y_i > 0$ follows a Log-Logistic distribution. Two parameterization variants are available:

Variant 0 (default):

$$ F_0(y_i) = \frac{1}{1 + \lambda_i y_i^{-\alpha}}, \quad y_i > 0 $$

Variant 1:

$$ F_1(y_i) = \frac{1}{1 + (\lambda_i y_i)^{-\alpha}}, \quad y_i > 0 $$

where:

Hyperparameters

The Log-Logistic has one hyperparameter:

The shape parameter controls the tail behavior:

Dataset

The dataset contains $n = 1000$ observations:

ColumnDescriptionType
yPositive continuous responsefloat
xCovariate (standardized)float

Implementation in pyINLA

Fitting a Log-Logistic regression model with variant 0:

import pandas as pd
from pyinla import pyinla

# Load data (variant 0)
df = pd.read_csv('dataset_loglogistic_v0.csv')

# Define model: y ~ 1 + x
model = {
    'response': 'y',
    'fixed': ['1', 'x']
}

# Specify variant in control
control = {
    'family': {
        'variant': 0
    },
    'compute': {
        'dic': True,
        'cpo': True,
        'mlik': True
    }
}

# Fit the model
result = pyinla(
    model=model,
    family='loglogistic',
    data=df,
    control=control
)

# View results
print(result.summary_fixed)
print(result.summary_hyperpar)

Expected Output

Fixed Effects

ParameterMeanSD2.5%97.5%
(Intercept)1.0400.0610.9211.160
x2.1340.0781.9822.288

Hyperparameters

ParameterMeanSD2.5%97.5%
alpha2.0620.0541.9572.169

The true parameters were $\beta_0 = 1.1$, $\beta_1 = 2.2$, and $\alpha = 2.1$.

Results and Diagnostics

The posterior summaries recover the true parameters well. The fitted median $\hat{\lambda}^{1/\alpha}$ provides a natural summary for the log-logistic distribution.

Observed values versus fitted median with identity line
Observed values $y_i$ versus fitted median $\hat{\lambda}_i^{1/\hat{\alpha}}$.
Residuals versus fitted median
Residuals $y_i - \hat{\lambda}_i^{1/\hat{\alpha}}$ versus fitted median.

To reproduce these figures locally, download the render_loglogistic_plots.py script and run it alongside the CSV dataset.

Using Variant 1

For variant 1 parameterization:

# Load data generated with variant 1
df_v1 = pd.read_csv('dataset_loglogistic_v1.csv')

control_v1 = {
    'family': {
        'variant': 1
    },
    'compute': {
        'dic': True
    }
}

result_v1 = pyinla(
    model=model,
    family='loglogistic',
    data=df_v1,
    control=control_v1
)

print(result_v1.summary_fixed)
print(result_v1.summary_hyperpar)

The key difference between variants affects how the scale relates to $\lambda$:

Data Generation (Reference)

The dataset was simulated with:

$$ \beta_0 = 1.1, \quad \beta_1 = 2.2, \quad \alpha = 2.1 $$
import numpy as np
import pandas as pd

def rloglogistic(n, lam, alpha, variant=0, rng=None):
    """Generate random samples from a log-logistic distribution."""
    if rng is None:
        rng = np.random.default_rng()
    u = rng.uniform(size=n)
    if variant == 0:
        y = (lam / (1.0 / u - 1.0)) ** (1.0 / alpha)
    elif variant == 1:
        y = (1.0 / (1.0 / u - 1.0)) ** (1.0 / alpha) / lam
    return y

rng = np.random.default_rng(2026)
n = 1000
alpha = 2.1

# Generate standardized covariate
x = rng.uniform(-1, 1, size=n)
x = (x - x.mean()) / x.std()

# Linear predictor and lambda
eta = 1.1 + 2.2 * x
lam = np.exp(eta)

# Generate log-logistic response (variant=0)
y = rloglogistic(n, lam=lam, alpha=alpha, variant=0, rng=rng)

df = pd.DataFrame({'y': y, 'x': np.round(x, 6)})
# df.to_csv('dataset_loglogistic_v0.csv', index=False)