Link Functions

What is a Link Function?

In generalized linear models (GLMs), a link function is a mathematical transformation that connects the linear predictor $\eta_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots$ to the expected value of the response variable.

The Core Idea

Your linear predictor $\eta_i$ can take any value from $-\infty$ to $+\infty$. But many response variables have natural constraints:

Counts must be non-negative $(0, 1, 2, \ldots)$
Probabilities must be between 0 and 1
Durations/times must be positive

The link function transforms between these constrained values and the unconstrained linear predictor.

Mathematically, if $\mu_i = \mathbb{E}[y_i]$ is the expected value of the response, then:

$$g(\mu_i) = \eta_i \quad \Leftrightarrow \quad \mu_i = g^{-1}(\eta_i)$$

where $g(\cdot)$ is the link function and $g^{-1}(\cdot)$ is its inverse (also called the "response function").

Why Do We Need Link Functions?

Example 1: Modeling Counts with Poisson

Suppose you're modeling the number of accidents per day. Your linear predictor might be:

$\eta_i = 2.5 - 0.3 \times x_i$ where $x_i$ = safety training hours

Without a link function, $x_i = 10$ would give a predicted mean of $\eta_i = 2.5 - 3 = -0.5$ accidents, which is impossible!

The log link solves this by setting $\log(\mu_i) = \eta_i$, so $\mu_i = e^{\eta_i} = e^{-0.5} \approx 0.61$ accidents. The mean is always positive, as required.
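
The arithmetic in this example can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `poisson_mean` is hypothetical, and the coefficients are the ones from the example above.

```python
import math

def poisson_mean(x, intercept=2.5, slope=-0.3):
    """Return (eta, mu) for the accident example under a log link."""
    eta = intercept + slope * x   # linear predictor, can be negative
    mu = math.exp(eta)            # inverse log link, always positive
    return eta, mu

for hours in (0, 5, 10, 20):
    eta, mu = poisson_mean(hours)
    print(f"x={hours:>2}  eta={eta:+.2f}  mu={mu:.3f}")
```

Even when the linear predictor drops below zero, the mean on the response scale stays strictly positive.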

Example 2: Modeling Probabilities with Binomial

For binary outcomes (success/failure), probabilities must satisfy $0 < p < 1$. The linear predictor could give any value, but the logit link constrains the probability:

$\text{logit}(p_i) = \log\left(\frac{p_i}{1-p_i}\right) = \eta_i \quad \Rightarrow \quad p_i = \frac{e^{\eta_i}}{1 + e^{\eta_i}}$

This ensures probabilities are always between 0 and 1, regardless of the predictor values.
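
A minimal Python sketch of the logit link and its inverse follows. The function names are hypothetical, and the two-branch form of the inverse is just one common way to avoid overflow for large $|\eta|$, not a description of any particular library.

```python
import math

def logit(p):
    """Link: map a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Inverse link: map any real eta back to a probability."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)             # safe: eta < 0, so no overflow
    return z / (1.0 + z)

for eta in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"eta={eta:+.1f}  p={inv_logit(eta):.4f}")
```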

Common Link Functions

Identity Link

$g(\mu) = \mu$ → $\mu = \eta$

No transformation. The mean equals the linear predictor directly.

Used for: Gaussian, Logistic, Student's t, Skew Normal, Log-Normal

Log Link

$g(\mu) = \log(\mu)$ → $\mu = e^{\eta}$

Ensures the mean is always positive. Coefficients have multiplicative interpretation.

Used for: Poisson, Gamma, Negative Binomial, Exponential, Weibull, Log-Logistic

Logit Link

$g(p) = \log\left(\frac{p}{1-p}\right)$ → $p = \frac{e^{\eta}}{1+e^{\eta}}$

Maps probabilities $(0,1)$ to the real line. Coefficients are changes in log-odds; $e^\beta$ is the odds ratio.

Used for: Binomial, Beta, Beta-Binomial, Negative Binomial (type 2)

Probit Link

$g(p) = \Phi^{-1}(p)$ → $p = \Phi(\eta)$

Uses the standard normal CDF. Similar to logit but with lighter tails.

Used for: Binomial, Beta, Beta-Binomial (alternative)
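
The "lighter tails" remark can be checked numerically. The sketch below assumes Python 3.8+ for `statistics.NormalDist`; the helper names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()        # mean 0, standard deviation 1

def probit(p):
    """Link: standard normal quantile function."""
    return _STD_NORMAL.inv_cdf(p)

def inv_probit(eta):
    """Inverse link: standard normal CDF."""
    return _STD_NORMAL.cdf(eta)

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

# Far from zero, the probit probability shrinks much faster than the logit one.
for eta in (-1.0, -2.0, -4.0):
    print(f"eta={eta:+.1f}  probit p={inv_probit(eta):.6f}  logit p={inv_logit(eta):.6f}")
```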

Complementary Log-Log (cloglog)

$g(p) = \log(-\log(1-p))$ → $p = 1 - e^{-e^{\eta}}$

Asymmetric link: as $\eta$ increases, $p$ approaches 1 much faster than it approaches 0 as $\eta$ decreases. Useful when the success probability is expected to behave this way.

Used for: Binomial, Beta, Beta-Binomial (rare events)
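
The asymmetry is easy to see numerically. The following is a small sketch with hypothetical function names, using `math.expm1` and `math.log1p` only to keep precision near the boundaries.

```python
import math

def cloglog(p):
    """Link: log(-log(1 - p))."""
    return math.log(-math.log1p(-p))

def inv_cloglog(eta):
    """Inverse link: 1 - exp(-exp(eta))."""
    return -math.expm1(-math.exp(eta))

# Compare the distance from 1 at +eta with the distance from 0 at -eta.
for eta in (1.0, 2.0, 3.0):
    upper_gap = 1.0 - inv_cloglog(eta)
    lower_gap = inv_cloglog(-eta)
    print(f"eta=±{eta:.0f}  1-p(+eta)={upper_gap:.2e}  p(-eta)={lower_gap:.2e}")
```

Unlike the logit, the curve does not pass through $p = 0.5$ at $\eta = 0$; there it equals $1 - e^{-1}$.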

Negative Log (neglog)

$g(\mu) = -\log(\mu)$ → $\mu = e^{-\eta}$

Used in accelerated failure time (AFT) survival models. Because $\mu = e^{-\eta}$, a positive coefficient shortens survival time and a negative coefficient lengthens it (the opposite sign convention from the log link).

Used for: Weibull, Weibull-surv, Gamma-surv, Exponential-surv, Log-Logistic, Log-Logistic-surv. Note: plain gamma and exponential do not accept neglog (only their -surv variants do), and lognormal/lognormalsurv never accept neglog (they only allow default/identity).
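
The sign flip between the log and neglog links can be illustrated with a short sketch. The function names and the coefficient value are hypothetical; only the two inverse-link formulas come from the text above.

```python
import math

def mean_log(eta):
    """Inverse of the log link: mu = exp(eta)."""
    return math.exp(eta)

def mean_neglog(eta):
    """Inverse of the neglog link: mu = exp(-eta)."""
    return math.exp(-eta)

beta = 0.5  # illustrative coefficient for a one-unit covariate increase

ratio_log = mean_log(beta) / mean_log(0.0)           # greater than 1
ratio_neglog = mean_neglog(beta) / mean_neglog(0.0)  # less than 1

print(f"log link:    mean multiplied by {ratio_log:.3f}")
print(f"neglog link: mean multiplied by {ratio_neglog:.3f}")
```

The two ratios are reciprocals of each other, which is why the same positive coefficient lengthens the mean under log and shortens it under neglog.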

Link Functions by Likelihood Family

Continuous (Real-valued)

Response can be any real number.

Family	Default Link	Available Links	Why This Default?
Gaussian	identity	identity, log, logit, loga, cauchit, logoffset	Response is unbounded; identity needs no transformation. Other links are exposed for specialized cases (e.g., modeling transformed responses)
Logistic	identity	identity	Unbounded response with heavier tails than Gaussian
Student's t	identity	identity	Robust to outliers; models the location parameter
Skew Normal	identity	identity	Asymmetric but still unbounded response

Positive Continuous

Response must be positive $(y > 0)$.

Family	Default Link	Available Links	Why This Default?
Gamma	log	log, quantile (plain `gamma`); log, neglog, quantile (`gammasurv`)	Log ensures a positive mean and is preferred in practice (the canonical GLM link is $1/\mu$, but log is more interpretable). The neglog AFT link is exposed only in the `gammasurv` variant
Exponential	log	log (plain `exponential`); log, neglog (`exponentialsurv`)	Positive rate parameter; neglog (AFT) is exposed only in the `exponentialsurv` variant
Log-Normal	identity	identity	Models $\log(y) \sim N(\mu, \sigma^2)$; $\mu$ is unbounded
Weibull	log	log, neglog, quantile	Log for positive scale; neglog for AFT interpretation
Log-Logistic	log	log, neglog	Positive scale parameter; common in survival analysis

Count Data

Response is a non-negative integer $(y \in \{0, 1, 2, \ldots\})$.

Family	Default Link	Available Links	Why This Default?
Poisson	log	log, logoffset, quantile	Canonical link; ensures positive mean rate. logoffset is the log link with an exposure term added to the linear predictor
Negative Binomial	log	log, logoffset, quantile	Same as Poisson; handles overdispersion

Binary & Probability

The modeled quantity is a probability or proportion in $(0, 1)$; the response itself may be binary, a count of successes, or a proportion.

Family	Default Link	Available Links	Why This Default?
Binomial	logit	logit, probit, cloglog, ccloglog, loglog, loga, cauchit, robit, sn, log, powerlogit	Canonical link; log-odds interpretation
Beta	logit	logit, probit, cloglog, ccloglog, loglog, loga, cauchit	Maps $(0,1)$ to real line; intuitive for proportions
Beta-Binomial	logit	logit, probit, cloglog, ccloglog, loglog, loga, cauchit, robit, sn	Same as binomial; probability parameter in $(0,1)$
Negative Binomial (type 2)	logit	logit, loga, cauchit, probit, cloglog, ccloglog, loglog	This variant parameterizes the model on the success probability $p \in (0,1)$ rather than on the mean count, so probability links apply (logit, probit, etc.). Standard NB regression on the mean uses log; that is the "Negative Binomial" row above

Special Link Functions

Log-Offset (logoffset)

$\log(\mu_i) = \log(E_i) + \eta_i$

Not a separate link function: it is the log link with a known exposure or offset term $E_i$ added to the linear predictor. Useful when modeling rates with varying observation periods or population sizes.

Used for: Poisson and Negative Binomial with known exposure
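
The formula above amounts to scaling a rate by the exposure. This is a minimal sketch with a hypothetical function name and made-up exposure values.

```python
import math

def expected_count(exposure, eta):
    """log(mu) = log(E) + eta, so mu = E * exp(eta)."""
    return math.exp(math.log(exposure) + eta)

eta = math.log(0.002)   # illustrative: a rate of 2 events per 1000 units of exposure

for exposure in (500, 1000, 2000):
    print(f"E={exposure:>4}  expected count={expected_count(exposure, eta):.2f}")
```

Doubling the exposure doubles the expected count while the rate $e^{\eta_i}$ stays the same.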

Quantile Link

Models a specific quantile rather than the mean

Allows regression on quantiles (e.g., median) instead of the expected value. Useful for asymmetric distributions.

Used for: Gamma, Weibull, Poisson, Negative Binomial

Cauchit Link

$g(p) = \tan(\pi(p - 0.5))$

Based on the Cauchy distribution. Has heavier tails than logit or probit, making it robust to extreme observations.

Used for: Binomial, Beta, Beta-Binomial
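
A short sketch of the cauchit pair shows the heavier tails in numbers. The function names are hypothetical, and the logit is included only for comparison.

```python
import math

def cauchit(p):
    """Link: tan(pi * (p - 0.5))."""
    return math.tan(math.pi * (p - 0.5))

def inv_cauchit(eta):
    """Inverse link: 0.5 + arctan(eta) / pi."""
    return 0.5 + math.atan(eta) / math.pi

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

# At the same eta, cauchit keeps noticeably more probability in the tail.
for eta in (-2.0, -5.0, -10.0):
    print(f"eta={eta:+.0f}  cauchit p={inv_cauchit(eta):.4f}  logit p={inv_logit(eta):.6f}")
```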

Robit Link

$g(p) = t_\nu^{-1}(p)$

Uses the Student's t distribution quantile function. Provides a robust alternative to probit with heavier tails.

Used for: Binomial, Beta-Binomial

Skew Normal (sn) Link

$g(p) = \text{SN}^{-1}(p)$

Uses the skew-normal quantile function. Allows asymmetric response curves for probability models.

Used for: Binomial, Beta-Binomial

Log-A (loga) Link

$g(p) = \log\left(\frac{p}{a - p}\right)$ for $p \in (0, a)$

Generalization of logit allowing an upper bound different from 1. Useful for truncated proportions.

Used for: Binomial, Beta
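
Taking the formula above at face value, the inverse follows by solving for $p$. This is a short derivation from that formula only, not a statement about any particular implementation:

$$\eta = \log\left(\frac{p}{a - p}\right) \quad \Rightarrow \quad e^{\eta}(a - p) = p \quad \Rightarrow \quad p = \frac{a\,e^{\eta}}{1 + e^{\eta}}$$

Setting $a = 1$ recovers the inverse logit, and $p$ stays in $(0, a)$ for every real $\eta$.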

Choosing a Link Function

General Guidelines

Start with the default. Default links are well-tested, and where they are canonical they also have convenient statistical properties.
Consider interpretability. Logit gives odds ratios, log gives multiplicative effects.
Match the domain. Ensure the inverse link maps into the valid range of your response.
Check model fit. Compare DIC, WAIC, or CPO across different links if unsure.

When to Use Alternative Links

Situation	Consider	Reason
Extreme probabilities (rare events)	cloglog or loglog	Better behavior near 0 or 1
Outliers in probability models	cauchit or robit	Heavier tails provide robustness
Survival analysis (AFT)	neglog	Coefficient sign flips relative to log: positive coefficients shorten survival time, negative coefficients lengthen it
Asymmetric probability response	sn (skew normal)	Allows asymmetric dose-response curves
Known exposure in count data	logoffset	Properly accounts for varying observation periods

Interpreting Coefficients by Link Function

Each coefficient $\beta$ tells you: when a covariate increases by one unit, how does the response change? The link function determines the scale on which this change is expressed. To convert back to the natural scale of the response, you often exponentiate.

Link	When covariate increases by 1 unit...	exp($\beta$) gives	Example
identity	The mean response changes by $\beta$ (directly)	Not needed	$\beta = 0.5$ → mean response increases by 0.5 units
log	$\log(\text{mean response})$ changes by $\beta$	Rate ratio: the mean response is multiplied by exp($\beta$)	$\beta = 0.3$ → $e^{0.3} \approx 1.35$ → mean response increases by about 35%
logit	The log-odds of the event change by $\beta$	Odds ratio: the odds are multiplied by exp($\beta$)	$\beta = -1.4$ → $e^{-1.4} \approx 0.25$ → odds decrease by about 75%
probit	$\Phi^{-1}(p)$ changes by $\beta$	No simple exp() interpretation	$\beta = 0.5$ → probability increases, but the amount depends on the baseline $p$
cloglog	$\log(-\log(1-p))$ changes by $\beta$	Hazard ratio (in discrete-time survival models)	$\beta = 0.7$ → $e^{0.7} \approx 2.01$ → hazard roughly doubles
neglog	$-\log(\text{survival time})$ changes by $\beta$ (AFT models)	Time ratio (inverse direction)	$\beta = 0.5$ → $e^{-0.5} \approx 0.61$ → survival time multiplied by about 0.61

The General Rule

For logarithmic links (log, logit, cloglog), exponentiating the coefficient gives a multiplicative effect: on the mean for log, on the odds for logit, and on the hazard (in discrete-time survival models) for cloglog:

exp($\beta$) > 1 → increase. Percentage increase = (exp($\beta$) − 1) × 100
exp($\beta$) < 1 → decrease. Percentage decrease = (1 − exp($\beta$)) × 100
exp($\beta$) = 1 (i.e., $\beta = 0$) → no effect
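
The rule above can be sketched as a one-line helper. The function name `percent_change` is hypothetical; the coefficients are the ones used in the interpretation table.

```python
import math

def percent_change(beta):
    """Percentage change in the mean, odds, or hazard per one-unit increase.

    Positive values are increases, negative values are decreases.
    """
    return (math.exp(beta) - 1.0) * 100.0

for link, beta in (("log", 0.3), ("logit", -1.4), ("cloglog", 0.7)):
    print(f"{link:<8} beta={beta:+.1f}  exp(beta)={math.exp(beta):.3f}  change={percent_change(beta):+.1f}%")
```

A single signed percentage covers both cases in the rule, since (exp($\beta$) − 1) × 100 is negative exactly when exp($\beta$) < 1.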

For the identity link (Gaussian), coefficients are directly interpretable without any transformation.

For the probit link, there is no simple exp() interpretation. Use marginal effects, or convert approximately to logit-scale coefficients by multiplying probit coefficients by a factor between 1.6 and $\pi/\sqrt{3} \approx 1.81$ (1.6 is Amemiya's approximation, 1.7 is also commonly used). All three are rough approximations; prefer marginal effects when precision matters.
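
To get a feel for how rough these conversions are, the sketch below compares $\Phi(\eta)$ with the logistic curve evaluated at a rescaled $\eta$ over a grid. It assumes Python 3.8+ for `statistics.NormalDist`; the helper `max_gap` and the grid limits are illustrative choices.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf           # standard normal CDF

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def max_gap(scale, lo=-6.0, hi=6.0, steps=1201):
    """Largest |Phi(x) - inv_logit(scale * x)| over an evenly spaced grid."""
    grid = (lo + (hi - lo) * i / (steps - 1) for i in range(steps))
    return max(abs(_PHI(x) - inv_logit(scale * x)) for x in grid)

for scale in (1.6, 1.7, math.pi / math.sqrt(3)):
    print(f"scale={scale:.3f}  max gap in probability={max_gap(scale):.4f}")
```

None of the factors makes the two curves coincide, which is the reason to prefer marginal effects when precision matters.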

Concrete Examples

Model	Coefficient	Interpretation
Gaussian (identity link) Predicting income	$\beta_{\text{education}} = 3200$	Each additional year of education increases income by 3,200 dollars
Poisson (log link) Counting hospital visits	$\beta_{\text{age}} = 0.02$	$e^{0.02} \approx 1.02$ → each year of age increases visits by about 2%
Binomial (logit link) Modeling vote choice	$\beta_{\text{female}} = -0.18$	$e^{-0.18} \approx 0.84$ → women have about 16% lower odds of voting for the candidate
Poisson (log link) Disease counts with offset	$\beta_{\text{pollution}} = 0.15$	$e^{0.15} \approx 1.16$ → about 16% higher disease rate per unit pollution increase

Quick Reference: All Link Functions

Link	Formula $g(\mu)$	Inverse $\mu = g^{-1}(\eta)$	Domain
identity	$\mu$	$\eta$	$(-\infty, \infty)$
log	$\log(\mu)$	$e^\eta$	$(0, \infty)$
neglog	$-\log(\mu)$	$e^{-\eta}$	$(0, \infty)$
logit	$\log(\mu/(1-\mu))$	$e^\eta/(1+e^\eta)$	$(0, 1)$
probit	$\Phi^{-1}(\mu)$	$\Phi(\eta)$	$(0, 1)$
cloglog	$\log(-\log(1-\mu))$	$1 - e^{-e^\eta}$	$(0, 1)$
ccloglog	$\log(-\log(\mu))$	$e^{-e^\eta}$	$(0, 1)$
loglog	$-\log(-\log(\mu))$	$e^{-e^{-\eta}}$	$(0, 1)$
cauchit	$\tan(\pi(\mu-0.5))$	$0.5 + \arctan(\eta)/\pi$	$(0, 1)$
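
The table can be transcribed into a small lookup and checked for consistency: applying each link to its own inverse should return the original $\eta$. This is a sketch under the assumption of Python 3.8+ (for `statistics.NormalDist`); the `LINKS` dictionary and `max_round_trip_error` are hypothetical names, not part of any library.

```python
import math
from statistics import NormalDist

_N = NormalDist()                 # standard normal, used by probit

# name -> (link g, inverse link g^{-1}), transcribed from the table above
LINKS = {
    "identity": (lambda m: m, lambda e: e),
    "log":      (math.log, math.exp),
    "neglog":   (lambda m: -math.log(m), lambda e: math.exp(-e)),
    "logit":    (lambda m: math.log(m / (1 - m)), lambda e: 1 / (1 + math.exp(-e))),
    "probit":   (_N.inv_cdf, _N.cdf),
    "cloglog":  (lambda m: math.log(-math.log(1 - m)), lambda e: 1 - math.exp(-math.exp(e))),
    "ccloglog": (lambda m: math.log(-math.log(m)), lambda e: math.exp(-math.exp(e))),
    "loglog":   (lambda m: -math.log(-math.log(m)), lambda e: math.exp(-math.exp(-e))),
    "cauchit":  (lambda m: math.tan(math.pi * (m - 0.5)), lambda e: 0.5 + math.atan(e) / math.pi),
}

def max_round_trip_error(name, etas=(-2.0, -0.5, 0.0, 0.5, 2.0)):
    """Largest |g(g^{-1}(eta)) - eta| over a few moderate eta values."""
    g, g_inv = LINKS[name]
    return max(abs(g(g_inv(eta)) - eta) for eta in etas)

for name in LINKS:
    print(f"{name:<9} round-trip error = {max_round_trip_error(name):.2e}")
```

The check uses moderate values of $\eta$ on purpose: for very large $|\eta|$ several inverses round to exactly 0 or 1 in floating point, and the link is then undefined.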