Politics & Public Policy

1988 Election: Multilevel Modeling

Hierarchical logistic regression for modeling voting behavior. Individuals nested within states, with demographic fixed effects and state-level random intercepts. A classic example from Gelman & Hill.

About the Data

We model individual vote choice in the 1988 US Presidential Election (Bush vs. Dukakis) using simulated survey data (following Gelman & Hill). The data has a natural hierarchical structure:

Level 1 (Individuals): 6,003 survey respondents with demographic characteristics
Level 2 (States): 50 states with different political climates (~120 respondents per state)
Outcome: Binary vote choice (1 = Bush, 0 = Dukakis)
Bush vote share: 53.5%

Individual-Level Predictors

Variable	Description	Levels
Gender	Respondent gender	Male (baseline), Female
Race	Racial identification	Non-Black (baseline), Black
Age	Age category	18-29 (baseline), 30-44, 45-64, 65+
Education	Educational attainment	No high school diploma (baseline), High school graduate, Some college, College graduate

Download the data: election_1988.csv

Why Multilevel Modeling?

Voters within the same state share characteristics we can't directly observe: local economic conditions, media markets, campaign intensity, political culture. Ignoring this clustering causes problems:

Problem	Consequence	Multilevel Solution
Non-independence	Standard errors are too small (overconfident conclusions)	Random intercepts account for within-state correlation
Unobserved heterogeneity	Omitted variable bias in state-level effects	Random effects absorb state-specific variation
Small state samples	Unreliable estimates for states with few respondents	Shrinkage "borrows strength" from other states

The Key Insight: Partial Pooling

Multilevel models provide a middle ground between two extremes:

Complete pooling: Ignore states entirely (all states identical) - loses real variation
No pooling: Fit separate models per state - unstable with small samples
Partial pooling (multilevel): State estimates are "shrunk" toward the overall mean, with more shrinkage for states with less data

The Statistical Model

We fit a hierarchical logistic regression with random intercepts:

Level 1: Individual Voting

\text{logit}\!\bigl(P(\text{Bush}_i = 1)\bigr) = \beta_0 + \beta_1 \text{female}_i + \beta_2 \text{black}_i + \sum_{k=2}^{4}\beta^{\text{age}}_{k}\, x^{\text{age}}_{ik} + \sum_{k=2}^{4}\beta^{\text{edu}}_{k}\, x^{\text{edu}}_{ik} + u_{j[i]}

Level 2: State Random Effects

u_j \sim N(0, \sigma^2_{\text{state}})

Each state $j$ gets its own intercept $u_j$ that shifts the baseline probability of voting for Bush up or down. These effects are assumed to come from a common normal distribution.

What the Random Intercept Captures

State political culture (historically Republican vs. Democratic)
Local economic conditions in 1988
Campaign intensity and media coverage
Any other unmeasured state-level factors

pyINLA Implementation

Step 1: Load and Prepare the Data

First we load the simulated survey data and inspect its structure. Each row is one respondent with their vote choice, demographics, and state ID.

import pandas as pd
import numpy as np

# Load data
df = pd.read_csv('election_1988.csv')
print(f"Dataset shape: {df.shape}")
print(f"Bush vote share: {df['bush'].mean():.1%}")
print(f"Respondents per state:")
print(df.groupby('state').size().describe())

Dataset shape: (6003, 7)

Bush vote share: 53.5%

Respondents per state:
count     50.00000
mean     120.06000
std        9.07207
min      102.00000
25%      114.25000
50%      119.00000
75%      124.00000
max      150.00000

Step 2: Define and Fit the Model

pyINLA automatically creates dummy variables for categorical (non-numeric) columns, dropping the first level as the baseline when an intercept is present. We convert age_cat and edu_cat to strings so pyINLA treats them as categorical.

from pyinla import pyinla

# Mark categorical columns (integers would be treated as numeric)
df['age_cat'] = df['age_cat'].astype(str)
df['edu_cat'] = df['edu_cat'].astype(str)

# Ntrials for binary outcome
Ntrials = df['Ntrials'].to_numpy()

# Define multilevel model
model = {
    'response': 'bush',
    'fixed': ['1', 'female', 'black', 'age_cat', 'edu_cat'],
    'random': [
        {'id': 'state', 'model': 'iid'}  # Random intercept for state
    ]
}

# Control options: compute model comparison criteria
control = {
    'compute': {
        'dic': True,
        'waic': True,
        'mlik': True
    }
}

# Fit model
py_result = pyinla(
    model=model,
    family='binomial',
    data=df,
    Ntrials=Ntrials,
    control=control
)

print("=== pyINLA Fixed Effects ===")
print(py_result.summary_fixed.drop(columns='kld').to_string())

print("\n=== pyINLA Hyperparameters (State Precision) ===")
print(py_result.summary_hyperpar.to_string())

print("\n=== pyINLA DIC ===")
print(py_result.dic['dic'])

print("\n=== pyINLA Marginal Likelihood ===")
print(py_result.mlik.to_string())

=== pyINLA Fixed Effects ===
                 mean        sd  0.025quant  0.5quant  0.975quant      mode
(Intercept)  0.374417  0.100632    0.177003  0.374393    0.571967  0.374394
female      -0.177099  0.053916   -0.282825 -0.177098   -0.071379 -0.177098
black       -1.372144  0.103406   -1.574927 -1.372140   -1.169388 -1.372140
age_cat2    -0.268209  0.073312   -0.411967 -0.268209   -0.124450 -0.268209
age_cat3    -0.113667  0.073879   -0.258531 -0.113669    0.031208 -0.113669
age_cat4    -0.234295  0.088410   -0.407659 -0.234296   -0.060930 -0.234296
edu_cat2     0.035432  0.072074   -0.105900  0.035432    0.176762  0.035432
edu_cat3     0.405881  0.080641    0.247759  0.405879    0.564018  0.405879
edu_cat4     0.142852  0.092140   -0.037827  0.142852    0.323526  0.142852

=== pyINLA Hyperparameters (State Precision) ===
                       mean        sd  0.025quant  0.5quant  0.975quant      mode
Precision for state  5.9919  1.490322    3.557808  5.828241    9.364695  5.516869

=== pyINLA DIC ===
7856.855545675052

=== pyINLA Marginal Likelihood ===
log marginal-likelihood (integration)   -4002.291130
log marginal-likelihood (Gaussian)      -4002.684146

Key points:

'random': [{'id': 'state', 'model': 'iid'}] specifies an IID (independent and identically distributed) Gaussian random effect for each state
family='binomial' with Ntrials=1 gives binary logistic regression
dic, waic, and mlik provide model comparison criteria

Step 3: Visualize the Results

We plot the estimated fixed-effect coefficients with 95% credible intervals. Green dots mark significant effects (interval excludes zero).

import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Get coefficients (exclude intercept for cleaner plot)
coef_data = py_result.summary_fixed.drop('(Intercept)').copy()

# Readable labels for each covariate
labels = {
    'female': 'Female', 'black': 'Black',
    'age_cat2': 'Age 30-44', 'age_cat3': 'Age 45-64', 'age_cat4': 'Age 65+',
    'edu_cat2': 'HS Graduate', 'edu_cat3': 'Some College', 'edu_cat4': 'College Grad'
}
coef_data.index = [labels.get(x, x) for x in coef_data.index]
coef_data = coef_data.sort_values('mean', ascending=True)

# Green if significant (CI excludes zero), grey otherwise
y_pos = np.arange(len(coef_data))
colors = ['#059669' if not ((row['0.025quant'] < 0) and (row['0.975quant'] > 0))
          else '#94a3b8' for _, row in coef_data.iterrows()]

fig, ax = plt.subplots(figsize=(10, 6))
ax.hlines(y_pos, coef_data['0.025quant'], coef_data['0.975quant'],
          colors=colors, linewidth=2, alpha=0.7)
ax.scatter(coef_data['mean'], y_pos, c=colors, s=80, zorder=5)
ax.axvline(x=0, color='#e11d48', linestyle='--', linewidth=1.5, alpha=0.7)

ax.set_yticks(y_pos)
ax.set_yticklabels(coef_data.index)
ax.set_xlabel('Coefficient (log-odds scale)')
ax.set_title('1988 Election: Demographic Effects on Bush Vote')
ax.legend(handles=[
    Line2D([0], [0], marker='o', color='w', markerfacecolor='#059669', markersize=10, label='Significant'),
    Line2D([0], [0], marker='o', color='w', markerfacecolor='#94a3b8', markersize=10, label='Not significant')
], loc='lower right')
ax.grid(axis='x', alpha=0.3)
plt.tight_layout()
plt.savefig('election_coefficients.png', dpi=150, bbox_inches='tight')

Demographic effects on Bush vote — Coefficient estimates with 95% credible intervals. Green indicates significant effects (interval excludes zero).

Key Findings

Effect	Coefficient	Odds Ratio	95% CI	Interpretation
Black	-1.372	0.25	(0.21 – 0.31)	Black voters had 75% lower odds of voting Bush (strongest effect)
Some College	0.406	1.50	(1.28 – 1.76)	Some college increases Bush odds by 50%
Age 30–44	-0.268	0.76	(0.66 – 0.88)	30–44 age group had 24% lower odds than 18–29 of voting Bush
Female	-0.177	0.84	(0.75 – 0.93)	Women had 16% lower odds of supporting Bush
Age 65+	-0.234	0.79	(0.67 – 0.94)	65+ age group had 21% lower odds than 18–29
College Grad	0.143	1.15	(0.96 – 1.38)	College graduates slightly more likely (not significant)
High School Grad	0.035	1.04	(0.90 – 1.19)	Minimal difference from baseline (not significant)
Age 45–64	-0.114	0.89	(0.77 – 1.03)	Slightly lower odds than 18–29 (not significant)

Step 4: Compute Odds Ratios

Exponentiate each coefficient to convert from log-odds to odds ratios. An asterisk marks effects whose 95% credible interval excludes 1 (statistically significant).

# Convert log-odds coefficients to odds ratios
fixed = py_result.summary_fixed
for var in ['female', 'black', 'age_cat2', 'age_cat3', 'age_cat4',
            'edu_cat2', 'edu_cat3', 'edu_cat4']:
    coef = fixed.loc[var, 'mean']
    ci_low = fixed.loc[var, '0.025quant']
    ci_high = fixed.loc[var, '0.975quant']

    or_val = np.exp(coef)        # Odds ratio = exp(coefficient)
    or_low = np.exp(ci_low)      # Lower bound of 95% CI
    or_high = np.exp(ci_high)    # Upper bound of 95% CI

    # Significant if CI excludes zero on the log-odds scale
    sig = "*" if not ((ci_low < 0) and (ci_high > 0)) else ""
    label = labels.get(var, var)
    print(f"  {label:15s}: OR = {or_val:.2f} ({or_low:.2f} - {or_high:.2f}) {sig}")

DEMOGRAPHIC EFFECTS (Odds Ratios for Bush vote):
  Female         : OR = 0.84 (0.75 - 0.93) *
  Black          : OR = 0.25 (0.21 - 0.31) *
  Age 30-44      : OR = 0.76 (0.66 - 0.88) *
  Age 45-64      : OR = 0.89 (0.77 - 1.03)
  Age 65+        : OR = 0.79 (0.67 - 0.94) *
  HS Graduate    : OR = 1.04 (0.90 - 1.19)
  Some College   : OR = 1.50 (1.28 - 1.76) *
  College Grad   : OR = 1.15 (0.96 - 1.38)

  * = statistically significant (95% CI excludes 1)

Understanding Odds Ratios

The model estimates coefficients on the log-odds scale. Odds ratios (OR) convert these into an intuitive multiplicative effect by exponentiating: OR = exp(coefficient).

What Are Odds?

Odds is just another way to express probability.

Probability says: out of everyone, what fraction chose Bush?
60 out of 100 voted Bush → probability = 0.60 (60%)

Odds says: how many voted Bush for every one who didn't?
60 voted Bush, 40 didn't → odds = 60 / 40 = 1.5
Meaning: for every 1 person who didn't vote Bush, 1.5 did.

The formula: odds = p / (1 − p)

How to Read Odds Ratios

Odds Ratio	Meaning	Example from Results
OR = 1.0	No effect, same odds as the baseline group
OR > 1.0	Higher odds: (OR − 1) × 100 = % increase	Some College OR = 1.50 → 50% higher odds of voting Bush
OR < 1.0	Lower odds: (1 − OR) × 100 = % decrease	Black OR = 0.25 → 75% lower odds of voting Bush

Key Points

Each odds ratio compares a group to its baseline category (Male, Non-Black, Age 18–29, No high school diploma)
The 95% credible interval tells you about uncertainty. If it includes 1.0, the effect is not statistically significant
Odds ratios are multiplicative, not additive: OR = 0.25 means the odds are multiplied by 0.25 (divided by 4), not reduced by 0.25
Higher or lower odds of what? Always of whatever the response variable codes as 1. Here the response is bush (voted Bush = 1), so OR > 1 means more likely to vote Bush, and OR < 1 means less likely to vote Bush. If the response were democrat instead, the direction would flip.

Step 5: Plot Odds Ratios

Exponentiate the coefficients and their credible intervals to plot on the odds ratio scale. The red dashed line at OR = 1 represents no effect.

# Exponentiate coefficients to get odds ratios
or_data = pd.DataFrame({
    'OR': np.exp(coef_data['mean']),
    'OR_low': np.exp(coef_data['0.025quant']),
    'OR_high': np.exp(coef_data['0.975quant'])
}, index=coef_data.index)
or_data = or_data.sort_values('OR', ascending=True)

# Green if CI excludes 1 (significant on OR scale)
y_pos = np.arange(len(or_data))
colors = ['#059669' if not ((row['OR_low'] < 1) and (row['OR_high'] > 1))
          else '#94a3b8' for _, row in or_data.iterrows()]

fig, ax = plt.subplots(figsize=(10, 6))
ax.hlines(y_pos, or_data['OR_low'], or_data['OR_high'],
          colors=colors, linewidth=2, alpha=0.7)
ax.scatter(or_data['OR'], y_pos, c=colors, s=80, zorder=5)
ax.axvline(x=1, color='#e11d48', linestyle='--', linewidth=1.5, alpha=0.7)

ax.set_yticks(y_pos)
ax.set_yticklabels(or_data.index)
ax.set_xlabel('Odds Ratio (vs reference category)')
ax.set_title('1988 Election: Odds Ratios for Bush Vote')
ax.grid(axis='x', alpha=0.3)
plt.tight_layout()
plt.savefig('election_odds_ratios.png', dpi=150, bbox_inches='tight')

Odds ratios for Bush vote — Odds ratios with 95% credible intervals. Red dashed line at OR=1 (no effect). The race effect (Black) is by far the strongest predictor.

Step 6: State Random Effects (Caterpillar Plot)

The caterpillar plot shows the estimated random intercept for each state, ordered from lowest to highest:

States above zero: More pro-Bush than demographics alone predict
States below zero: More pro-Dukakis than demographics alone predict
Interval width: Wider intervals indicate more uncertainty (often smaller sample sizes)
Shrinkage: Notice most estimates are close to zero, partial pooling in action

# Extract and sort state random effects
state_effects = py_result.summary_random['state'].copy()
state_effects = state_effects.sort_values('mean')

n_states = len(state_effects)
y_pos = np.arange(n_states)

# Blue if CI excludes zero (state is distinguishable from average)
colors = ['#3b82f6' if not ((row['0.025quant'] < 0) and (row['0.975quant'] > 0))
          else '#94a3b8' for _, row in state_effects.iterrows()]

fig, ax = plt.subplots(figsize=(12, 8))
ax.hlines(y_pos, state_effects['0.025quant'], state_effects['0.975quant'],
          colors=colors, linewidth=1.5, alpha=0.6)
ax.scatter(state_effects['mean'], y_pos, c=colors, s=30, zorder=5)
ax.axvline(x=0, color='#e11d48', linestyle='--', linewidth=1.5, alpha=0.7)

ax.set_ylabel('State (ordered by effect)')
ax.set_xlabel('State Random Effect (log-odds)')
ax.set_title('State-Level Variation in Bush Support (Caterpillar Plot)')
ax.set_yticks([])
plt.tight_layout()
plt.savefig('election_state_effects.png', dpi=150, bbox_inches='tight')

State random effects caterpillar plot — Each horizontal line shows the 95% credible interval for a state's random effect. Blue indicates the interval doesn't cross zero (statistically distinguishable from average).

Step 7: State-Level Variance

How much do states vary after accounting for demographics? Convert precision (what INLA estimates) to variance and SD, then plot the distribution of state effects.

from scipy import stats

# Convert precision to variance and SD
prec_mean = py_result.summary_hyperpar['mean'].values[0]
state_var = 1 / prec_mean         # variance = 1 / precision
state_sd = np.sqrt(state_var)     # SD = sqrt(variance)

print(f"State effect precision: {prec_mean:.3f}")
print(f"State effect variance:  {state_var:.3f}")
print(f"State effect SD:        {state_sd:.3f}")

# Histogram of state effects with fitted normal overlay
state_means = py_result.summary_random['state']['mean'].values

fig, ax = plt.subplots(figsize=(8, 5))
ax.hist(state_means, bins=15, density=True, alpha=0.7,
        color='#3b82f6', edgecolor='white')
ax.axvline(0, color='#e11d48', linestyle='--', linewidth=2,
           label='Zero (no state effect)')

# Overlay the fitted N(0, state_sd) distribution
x = np.linspace(state_means.min() - 0.2, state_means.max() + 0.2, 100)
ax.plot(x, stats.norm.pdf(x, 0, state_sd), 'k-', linewidth=2,
        label=f'Fitted N(0, {state_sd:.2f})')

ax.set_xlabel('State Random Effect')
ax.set_ylabel('Density')
ax.set_title('Histogram of posterior mean state effects.\n'
             'The fitted normal distribution shows the estimated between-state variation.',
             fontsize=10, ha='center')
ax.legend()
plt.tight_layout()
plt.savefig('election_state_variance.png', dpi=150, bbox_inches='tight')

Distribution of state effects — Histogram of posterior mean state effects. The fitted normal distribution shows the estimated between-state variation.

State effect precision: 5.992
State effect variance:  0.167
State effect SD:        0.409

True variance used in simulation: 0.2
True SD used in simulation:       0.447

The estimated state precision is 5.99, corresponding to a variance of 0.167 and a standard deviation of 0.409 on the log-odds scale. The true simulation variance was 0.2 (SD = 0.447), so the model recovers this well. In practical terms, a one-SD shift in the state effect translates to roughly a 50% relative change in odds between typical pro-Bush and pro-Dukakis states.

Understanding Random Effects

Random effects can be confusing at first. Here's how to think about them:

Fixed vs. Random Effects

Aspect	Fixed Effects	Random Effects
What they model	Specific categories of interest (e.g., Male vs. Female)	Groups from a larger population (e.g., states as sample from all possible states)
Goal	Estimate each category's specific effect	Estimate the variance across groups + improve individual estimates via shrinkage
Interpretation	"Females have X% lower odds of voting Bush"	"States vary by SD = Y in their baseline Bush support"

Why IID for States?

We use model='iid' because we assume states are exchangeable - knowing one state's effect doesn't tell us about another's (no spatial correlation). For spatial applications, you might use model='besag' or model='bym' instead.

Key Takeaways

Multilevel models handle clustered data: voters within states are not independent
Random intercepts capture group-level variation: some states are more pro-Bush than demographics predict
Shrinkage improves estimates: states with little data borrow strength from the overall pattern
Race was the strongest predictor: Black voters had 75% lower odds of voting Bush (OR = 0.25)
Gender and age matter: women and 30–44-year-olds had significantly lower odds of supporting Bush
pyINLA syntax is simple: add 'random': [{'id': 'state', 'model': 'iid'}]

References

Gelman, A., and Hill, J. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press. (Chapters 11-14)
Rue, H., Martino, S. and Chopin, N. (2009). "Approximate Bayesian inference for latent Gaussian models using integrated nested Laplace approximations." Journal of the Royal Statistical Society: Series B, 71(2), 319-392.
ANES (American National Election Studies) for historical election survey data.