BayesianDoubleML VI example

begin
    using BayesianDoubleML
    using Mooncake # explictly import Mooncake for AD
    using StableRNGs
    using PlutoUI
end

Data generation

Let's generate data as per Section 6 DiTraglia and Liu (2025).

begin
    # Define parameters
    n = 200
    p = 100
    alpha_true = 2.0

    rng = StableRNG(42)

    # Generate data as DataFrame
    df = make_plr_DTL2025(n, p, 2.0; alpha = alpha_true, rng = rng)
end;

Model setup

We then define the BDMLModel, using the hierarchical model (BDML-Hier) from the paper:

As per Section 6 of the paper, the BDML-Hier model "allows different standard deviations in the normal shrinkage priors for \(\delta\) and \(\gamma\) ... with a hierarchical prior that places independent Inverse-Gamma(2, 2) hyper-priors on \(\sigma^2_\delta\) and \(\sigma^2_\gamma\)."

model = BDMLModel(df, :y, :d; model_type = :hier)
BDMLHierarchicalModel (not fitted)
  Observations: 200
  Covariates: 100

Model fitting

And we then fit the model using Automatic Differentiation Variational Inference (ADVI).

In this example, we first try using the SimpleVIMethod with the AutoMooncake AD backend. (Note: AutoMooncake from Mooncake.jl provides extremely fast automatic differentiation, at the cost of a longer compile time.)

fit!(
    model,
    SimpleVIMethod(; ad_backend = AutoMooncake),
    n_iterations = 1_000,
    show_progress = false
);
begin
    summary(model)
    coeftable(model)
end
Bayesian Double ML Coefficient Table
======================================================================
Parameter: α (treatment effect)
Model type: hier
Inference method: VI
Credible interval level: 95.0% (HPD)
Number of posterior samples: 2000

  Parameter     Estimate   Std. Error         MCSE      P-value
  ---------     --------   ----------         ----      -------
  α               1.0050       0.1716       0.0000       0.0000

HPD Credible Intervals:
  α: [0.6716, 1.3453]

Diagnostics:
  Final ELBO: -725.7

Problems more suitable to ADVI

As we see above, for this problem, ADVI is not a good fit for the problem above where \(p\) is large relative to \(n\); ADVI is not as able to reach a good approximation, at least not with this data generation process. The true causal effect is 2.0, but the above model estimated 1.01.

However, ADVI is yields a good approximation in a variety of other real-world scenarios; let's try a case where e.g., n=1000, p = 31.

As a general rule of thumb: in anecdotal testing, ADVI methods are generally reliable on similar problems when \(n >> p\).

begin
    n2 = 1_000
    lower_p = floor(sqrt(n2)) |> Int
    upper_p = floor(n2 / 2) |> Int
    default_p = Int(floor(sqrt(n2)))
    @assert lower_p < upper_p
    @bind p2 Slider(lower_p:10:upper_p, show_value = true, default = default_p)
end
31
begin
    # Generate data as DataFrame
    df2 = make_plr_DTL2025(n2, p2, 2.0; alpha = alpha_true, rng = rng)
end;
model2 = BDMLModel(df2, :y, :d; model_type = :hier)
BDMLHierarchicalModel (not fitted)
  Observations: 1000
  Covariates: 31
fit!(
    model2,
    SimpleVIMethod(; ad_backend = AutoMooncake),
    n_iterations = 1_000,
    show_progress = false
);
begin
    summary(model2)
    coeftable(model2)
end
Bayesian Double ML Coefficient Table
======================================================================
Parameter: α (treatment effect)
Model type: hier
Inference method: VI
Credible interval level: 95.0% (HPD)
Number of posterior samples: 2000

  Parameter     Estimate   Std. Error         MCSE      P-value
  ---------     --------   ----------         ----      -------
  α               1.9534       0.0638       0.0000       0.0000

HPD Credible Intervals:
  α: [1.8234, 2.0702]

Diagnostics:
  Final ELBO: -2151.12