API Reference
Model Types
BayesianDoubleML.AbstractBDMLModel — Type
AbstractBDMLModel
Abstract type for all BDML model specifications.
Model types encapsulate:
- Data (Y, D, X) - already standardized
- Standardization statistics (for transforming back)
- Pre-allocated temporaries (for performance)
- Metadata (n, p, model_type)
- Fitting results and state (is_fitted, result, last_method)
Each concrete subtype represents a different model specification (basic vs hierarchical) and can be fitted with different methods (MCMC, VI unified, VI simple) via multiple dispatch.
After fitting with fit!(), results are stored in the model and can be extracted using coeftable(), extract_alpha(), summary(), etc.
BayesianDoubleML.BDMLBasicModel — Type
BDMLBasicModel <: AbstractBDMLModel
Basic BDML model specification implementing BDML-Basic from DiTraglia & Liu (2025), Algorithm 1.
Fields
- Y::Vector{Float64}: Standardized outcome variable
- D::Vector{Float64}: Standardized treatment variable
- X::Matrix{Float64}: Standardized control variables (covariates)
- stats::StandardizationStats: Statistics for transforming back to original scale
- n::Int: Number of observations
- p::Int: Number of control variables
- result::Union{Nothing, AbstractBDMLResult}: Stores fitting results after fit!()
- is_fitted::Bool: Whether the model has been fitted
- last_method::Union{Nothing, AbstractInferenceMethod}: Method used in last fit
Model Specification (Section 4, Equations 12-13)
Reduced form:
- $Y = X'\delta + U$ (Eq. 12)
- $D = X'\gamma + V$ (Eq. 5)
Priors:
- $\delta \sim N(0, 25\cdot I_p)$ [Outcome reduced form coefficients]
- $\gamma \sim N(0, 25\cdot I_p)$ [Treatment reduced form coefficients]
- $\sigma_U \sim \text{Cauchy}^+(0, 2.5)$ [Outcome error scale]
- $\sigma_V \sim \text{Cauchy}^+(0, 2.5)$ [Treatment error scale]
- $R \sim \text{LKJ}(4)$ [Correlation matrix]
For VI, uses $\text{Beta}(2,2)$ correlation parameterization instead of LKJCholesky.
Causal effect recovery: $\alpha = \rho\cdot\sigma_U / \sigma_V$ (Eq. 15)
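As a minimal illustration (with hypothetical posterior draws, not output from this package), Eq. 15 is applied draw-by-draw to the joint samples of $(\rho, \sigma_U, \sigma_V)$:

```julia
# Hypothetical posterior draws of (ρ, σ_U, σ_V); α is recovered per draw (Eq. 15)
ρ  = [0.5, 0.6, 0.4]
σU = [1.0, 1.2, 0.9]
σV = [0.5, 0.5, 0.6]
α  = ρ .* σU ./ σV   # elementwise: ≈ [1.0, 1.44, 0.6]
```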
This is the BDML-Basic variation with fixed prior variances (Section 6, Table 1).
See also: BDMLHierarchicalModel, BDMLModel
References
- DiTraglia, F.J. & Liu, L. (2025). "Bayesian Double Machine Learning for Causal Inference", arXiv:2508.12688v1, Section 4, Algorithm 1, Section 6.
BayesianDoubleML.BDMLHierarchicalModel — Type
BDMLHierarchicalModel <: AbstractBDMLModel
Hierarchical BDML model specification implementing BDML-Hier from DiTraglia & Liu (2025), Algorithm 1.
Fields
- Y::Vector{Float64}: Standardized outcome variable
- D::Vector{Float64}: Standardized treatment variable
- X::Matrix{Float64}: Standardized control variables (covariates)
- stats::StandardizationStats: Standardization statistics
- n::Int: Number of observations
- p::Int: Number of control variables
- result::Union{Nothing, AbstractBDMLResult}: Stores fitting results after fit!()
- is_fitted::Bool: Whether the model has been fitted
- last_method::Union{Nothing, AbstractInferenceMethod}: Method used in last fit
Model Specification (Section 4, Equations 12-13)
Reduced form:
- $Y = X'\delta + U$ (Eq. 12)
- $D = X'\gamma + V$ (Eq. 5)
Hierarchical priors:
- $\sigma^2_\delta \sim \text{InvGamma}(2, 2)$ [Variance hyperparameter for $\delta$]
- $\sigma^2_\gamma \sim \text{InvGamma}(2, 2)$ [Variance hyperparameter for $\gamma$]
- $\delta \sim N(0, \sigma^2_\delta\cdot I_p)$ [Outcome coefficients with adaptive shrinkage]
- $\gamma \sim N(0, \sigma^2_\gamma\cdot I_p)$ [Treatment coefficients with adaptive shrinkage]
- $\sigma_U \sim \text{Cauchy}^+(0, 2.5)$ [Outcome error scale]
- $\sigma_V \sim \text{Cauchy}^+(0, 2.5)$ [Treatment error scale]
- $R \sim \text{LKJ}(4)$ [Correlation matrix]
For VI, uses $\text{Beta}(2,2)$ correlation parameterization.
Causal effect recovery: $\alpha = \rho\cdot\sigma_U / \sigma_V$ (Eq. 15)
Hierarchical structure: This is equivalent to placing independent $\text{Student-}t(4)$ distributions on each coefficient marginally. The $\text{InvGamma}(2, 2)$ hyperprior provides adaptive shrinkage that learns the appropriate regularization from data (Section 6, Table 1).
This can improve performance when p is large relative to n, and the paper's simulations show BDML-Hier achieves better coverage (0.94) than BDML-Basic (0.91-0.93).
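The scale-mixture equivalence above can be checked by Monte Carlo using only the standard library: a draw from $\text{InvGamma}(2, 2)$ is the reciprocal of a $\text{Gamma}(\text{shape}=2, \text{rate}=2)$ draw, itself a sum of two exponentials. The resulting marginal should match $\text{Student-}t(4)$ (a sketch, not package code; the quantile value 2.776 is the known $t_4$ 97.5% quantile):

```julia
using Random, Statistics
Random.seed!(1)
N = 1_000_000
# σ² ~ InvGamma(2, 2): reciprocal of Gamma(shape=2, rate=2),
# i.e. 1 / (E1 + E2) with Ei ~ Exponential(rate = 2)
σ² = [1 / ((-log(rand()) - log(rand())) / 2) for _ in 1:N]
β = sqrt.(σ²) .* randn(N)      # β | σ² ~ N(0, σ²)
# Marginally β ~ Student-t(4): the 95% quantile of |β| should be ≈ 2.776
quantile(abs.(β), 0.95)
```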
See also: BDMLBasicModel, BDMLModel
References
- DiTraglia, F.J. & Liu, L. (2025). "Bayesian Double Machine Learning for Causal Inference", arXiv:2508.12688v1, Section 4, Algorithm 1, Section 6.
BayesianDoubleML.BDMLModel — Function
BDMLModel(Y, D, X; model_type=:basic)
Factory function to create the appropriate BDML model type for Algorithm 1 from DiTraglia & Liu (2025).
Data is standardized once at model creation, so it is not re-standardized even if the model is fitted multiple times.
Arguments
- Y::Vector{Float64}: Outcome variable (will be standardized)
- D::Vector{Float64}: Treatment variable (will be standardized)
- X::Matrix{Float64}: Control variables (will be standardized)
Keyword Arguments
- model_type::Symbol=:basic: :basic for BDML-Basic (fixed prior variances); :hier for BDML-Hier (adaptive hierarchical priors)
Returns
AbstractBDMLModel: Either BDMLBasicModel or BDMLHierarchicalModel
Algorithm 1 Variations
BDML-Basic (model_type=:basic):
- Places independent $N(0, 25\cdot I)$ priors on $\delta$ and $\gamma$
- Fixed shrinkage across all covariates
- Recommended for interpretability and simplicity
BDML-Hier (model_type=:hier):
- Places hierarchical $\text{InvGamma}(2, 2)$ priors on $\sigma^2_\delta$ and $\sigma^2_\gamma$
- Adaptive shrinkage equivalent to $\text{Student-}t(4)$ on coefficients
- Better coverage in simulations (Table 1: 0.94 vs 0.91-0.93)
- Recommended as default choice
Both variations recover $\alpha$ via Equation 15: $\alpha = \rho\cdot\sigma_U / \sigma_V$
Examples
# Create basic model
model_basic = BDMLModel(Y, D, X; model_type=:basic)
# Create hierarchical model
model_hier = BDMLModel(Y, D, X; model_type=:hier)
# Fit with different methods
fit!(model_basic, MCMCMethod(:nuts))
fit!(model_hier, UnifiedVIMethod())
Performance Notes
Standardization is performed once during model creation. Results are stored in the model after calling fit!().
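The back-transformation implied by standardization can be sketched in a few lines (hypothetical values, not package code): a slope estimated on standardized data maps to original units via $\alpha = \alpha_{std} \cdot \text{sd}(Y) / \text{sd}(D)$.

```julia
using Statistics
# Hypothetical illustration: a slope on standardized (Y, D) rescales to
# original units via α = α_std * sd(Y) / sd(D)
Y = [2.0, 4.0, 6.0, 8.0]
D = [1.0, 2.0, 3.0, 4.0]
α_std = 1.0                      # slope on the standardized scale
α = α_std * std(Y) / std(D)      # ≈ 2.0 in original units
```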
References
- DiTraglia, F.J. & Liu, L. (2025). "Bayesian Double Machine Learning for Causal Inference", arXiv:2508.12688v1, Section 4, Algorithm 1.
See also: BDMLBasicModel, BDMLHierarchicalModel, fit!
BDMLModel(data::BDMLData; model_type=:basic)
Create a BDML model from a BDMLData struct.
Convenience constructor for working with the existing BDMLData type.
Examples
data = BDMLData(Y, D, X)
model = BDMLModel(data; model_type=:basic)
BDMLModel(df::DataFrame, y::Symbol, d::Symbol; model_type::Symbol=:basic, x_cols=nothing)
Create a BDML model from a DataFrame, specifying the outcome and treatment columns.
Arguments
- df::DataFrame: The data as a DataFrame
- y::Symbol: Column name for the outcome variable
- d::Symbol: Column name for the treatment variable
Keyword Arguments
- model_type::Symbol=:basic: :basic for BDML-Basic (fixed prior variances); :hier for BDML-Hier (adaptive hierarchical priors)
- x_cols: Column names for covariates. If nothing (default), uses all columns except y and d
Returns
AbstractBDMLModel: Either BDMLBasicModel or BDMLHierarchicalModel
Examples
using DataFrames
# Generate data
df = make_plr_DTL2025(200, 100, 2.0)
# Default: use all columns except :y and :d as covariates
model = BDMLModel(df, :y, :d; model_type=:hier)
# Explicit covariate selection
model = BDMLModel(df, :y, :d; model_type=:hier, x_cols=[:X1, :X2, :X3])
# Using column indices or regex
model = BDMLModel(df, :y, :d; model_type=:hier, x_cols=names(df, r"^X"))
See also: BDMLModel, BDMLBasicModel, BDMLHierarchicalModel
Model Accessors
BayesianDoubleML.nobs — Function
nobs(model::AbstractBDMLModel)
Return the number of observations in the model.
Examples
model = BDMLModel(Y, D, X)
n = nobs(model) # Same as length(Y)
BayesianDoubleML.ncovariates — Function
ncovariates(model::AbstractBDMLModel)
Return the number of control variables (covariates) in the model.
Examples
model = BDMLModel(Y, D, X)
p = ncovariates(model) # Same as size(X, 2)
BayesianDoubleML.model_type — Function
model_type(model::BDMLBasicModel)
model_type(model::BDMLHierarchicalModel)
Return the model type symbol (:basic or :hier).
Useful for dispatch and debugging.
Examples
model = BDMLModel(Y, D, X; model_type=:basic)
model_type(model) # Returns :basic
BayesianDoubleML.standardization_stats — Function
standardization_stats(model::AbstractBDMLModel)
Return the standardization statistics for the model.
Used to transform results back to original scale.
Examples
model = BDMLModel(Y, D, X)
stats = standardization_stats(model)
# Use stats.Y_sd, stats.D_sd, etc.
BayesianDoubleML.isfitted — Function
isfitted(model::AbstractBDMLModel)
Return true if the model has been fitted (i.e., fit!() has been called).
Examples
model = BDMLModel(Y, D, X)
isfitted(model) # Returns false
fit!(model)
isfitted(model) # Returns true
Inference Methods
Abstract and Types
BayesianDoubleML.AbstractInferenceMethod — Type
AbstractInferenceMethod
Abstract type for all inference methods/algorithms.
Subtypes define HOW to fit a BDML problem:
- MCMC methods: NUTS, HMC, etc.
- VI methods: Unified (AdvancedVI), Simple (Turing vi())
Each concrete method type is dispatched on in the fit() function along with the problem type to execute the appropriate algorithm.
See also: MCMCMethod, UnifiedVIMethod, SimpleVIMethod
BayesianDoubleML.MCMCMethod — Type
MCMCMethod <: AbstractInferenceMethod
MCMC (Markov Chain Monte Carlo) inference method.
Supports multiple MCMC algorithms via the algorithm field:
- :nuts - No-U-Turn Sampler (default, recommended)
Fields
- algorithm::Symbol: Which MCMC algorithm (:nuts)
- target_acceptance::Float64: Target acceptance rate for NUTS (default: 0.8)
- max_depth::Int: Maximum tree depth for NUTS (default: 10)
Constructors
MCMCMethod(:nuts; target_acceptance=0.8, max_depth=10)
MCMCNUTS(; target_acceptance=0.8, max_depth=10) # Convenience
Examples
# Default NUTS sampler
method = MCMCMethod(:nuts)
# Custom NUTS settings
method = MCMCMethod(:nuts; target_acceptance=0.9, max_depth=12)
# Convenience constructor
method = MCMCNUTS()
method = MCMCNUTS(; target_acceptance=0.9, max_depth=12)
Notes
NUTS is the recommended MCMC algorithm. It automatically tunes the trajectory length and is robust to step size choices.
See also: MCMCNUTS
BayesianDoubleML.UnifiedVIMethod — Type
UnifiedVIMethod{F<:AbstractVariationalFamily} <: AbstractInferenceMethod
Unified Variational Inference method using AdvancedVI with explicit Bijectors.
This is the primary VI implementation with:
- Explicit bijector transformations (unconstrained → constrained)
- Support for all AD backends (ReverseDiff, Mooncake, Zygote, ForwardDiff)
- Subsampling support for large datasets (n > 10,000)
- Multiple variational families (MeanField, LowRank) via type parameter
- Full control over the optimization process
Type Parameters
F<:AbstractVariationalFamily: The variational family type (MeanField, LowRank)
Fields
- ad_backend::Type{<:AbstractADType}: AD backend (AutoReverseDiff, AutoMooncake, etc.)
- subsample::Union{Bool, Nothing}: Whether to use mini-batch gradients
- batch_size::Int: Mini-batch size (auto-computed if -1)
- n_montecarlo::Int: Number of Monte Carlo samples for gradient estimation (default: 10)
- family::F: Variational family instance (carries type-specific parameters like rank)
Constructors
# Default MeanField
UnifiedVIMethod()
# LowRank with specific rank
UnifiedVIMethod(; family=LowRank(10))
# Convenience aliases
MeanFieldVI() # Explicit mean-field
LowRankVI(10) # Low-rank with rank 10
UnifiedVI() # Backwards-compatible alias (defaults to MeanField)
AD Backend Options
- AutoReverseDiff (default): Most stable; tape compilation, no warmup needed
- AutoMooncake: 5-10x faster after warmup, requires compilation
- AutoZygote: Source-to-source, higher memory usage
- AutoForwardDiff: Forward-mode, good for small p
Subsampling
- subsample=nothing: Auto-enable for n >= 10,000
- subsample=true: Force subsampling with auto batch size
- subsample=false: Force full-batch
- batch_size: Manual control (default: min(256, ceil(n/1000)))
Examples
# Default - MeanField with ReverseDiff
method = UnifiedVIMethod()
# Low-rank with rank 10
method = UnifiedVIMethod(; family=LowRank(10))
# With Mooncake (fast after warmup)
method = UnifiedVIMethod(; ad_backend=AutoMooncake, family=LowRank(10))
# Explicit subsampling control with low-rank
method = UnifiedVIMethod(; subsample=true, batch_size=512, family=LowRank(5))
# Convenience aliases
method = MeanFieldVI() # Explicit mean-field
method = LowRankVI(10) # Low-rank with rank 10
method = UnifiedVI() # Backwards-compatible
Notes
This implementation uses explicit bijectors (LogNormal, Beta) to transform parameters from unconstrained space (where VI optimizes) to constrained space (where model is defined). This provides better AD compatibility than relying on Turing's automatic bijectors.
The variational family determines the covariance structure:
- MeanField: Diagonal only (fastest, assumes independence)
- LowRank: Diagonal + low-rank factors (balances speed and correlation capture)
See also: MeanField, LowRank, SimpleVIMethod
BayesianDoubleML.SimpleVIMethod — Type
SimpleVIMethod <: AbstractInferenceMethod
Simple Variational Inference method using Turing's native vi() function.
This is an alternative VI implementation that:
- Uses Turing's built-in ADVI with automatic bijectors
- Works well with AutoMooncake (5-10x speedup after warmup)
- Has simpler code, less maintenance overhead
- Does NOT support subsampling (Turing limitation)
Best for: Production use with Mooncake when n < 10,000 and you want maximum performance after initial warmup.
Fields
ad_backend::Type{<:AbstractADType}: AD backend (AutoMooncake recommended, AutoReverseDiff works)
Constructor
SimpleVIMethod(; ad_backend=AutoMooncake)
SimpleVI(; kwargs...) # Convenience alias
Examples
# Default - Mooncake (recommended for this implementation)
method = SimpleVIMethod()
# With ReverseDiff
method = SimpleVIMethod(; ad_backend=AutoReverseDiff)
# Convenience alias
method = SimpleVI()
Comparison with UnifiedVIMethod
| Feature | UnifiedVIMethod | SimpleVIMethod |
|---|---|---|
| Bijectors | Explicit | Automatic (Turing) |
| Subsampling | Yes | No |
| Mooncake | Has issues | Works well |
| ReverseDiff | Excellent | Good |
| Large data (n>10k) | Yes | No (memory) |
| Code complexity | More control | Simpler |
See also: UnifiedVIMethod, SimpleVI
MCMC Method Constructors
BayesianDoubleML.MCMCNUTS — Function
MCMCNUTS(; target_acceptance=0.8, max_depth=10)
Convenience constructor for the NUTS (No-U-Turn Sampler) MCMC method.
NUTS is the default and recommended MCMC algorithm. It automatically tunes the trajectory length and is robust to step size choices.
Arguments
- target_acceptance::Float64=0.8: Target acceptance rate (typical range: 0.6-0.9)
- max_depth::Int=10: Maximum tree depth (limits trajectory length)
Examples
# Default NUTS
method = MCMCNUTS()
# Higher target acceptance for better exploration
method = MCMCNUTS(; target_acceptance=0.9)
See also: MCMCMethod
VI Method Constructors
BayesianDoubleML.UnifiedVI — Function
UnifiedVI(; kwargs...)
Convenience alias for UnifiedVIMethod() with the MeanField family.
Backwards-compatible - defaults to mean-field approximation.
See UnifiedVIMethod for full documentation.
BayesianDoubleML.SimpleVI — Function
SimpleVI(; ad_backend=AutoMooncake)
Convenience alias for SimpleVIMethod().
See SimpleVIMethod for full documentation.
BayesianDoubleML.MeanFieldVI — Function
MeanFieldVI(; kwargs...)
Convenience constructor for mean-field VI.
Creates a UnifiedVIMethod with MeanField variational family.
Examples
method = MeanFieldVI() # Default MeanField
method = MeanFieldVI(; ad_backend=AutoMooncake) # With Mooncake
See also: UnifiedVIMethod, LowRankVI
BayesianDoubleML.LowRankVI — Function
LowRankVI(rank::Int; kwargs...)
Convenience constructor for low-rank VI.
Creates a UnifiedVIMethod with LowRank variational family.
Arguments
rank::Int: Number of low-rank factors to use
Examples
method = LowRankVI(10) # Low-rank with 10 factors
method = LowRankVI(5; ad_backend=AutoMooncake) # With Mooncake
See also: UnifiedVIMethod, MeanFieldVI
BayesianDoubleML.LowRankScoreVI — Function
LowRankScoreVI(rank::Int; kwargs...)
Convenience constructor for LowRankScore VI (using the score gradient).
Creates a UnifiedVIMethod with LowRankScore variational family. Uses the score gradient (REINFORCE) estimator with VarGrad control variate.
Arguments
rank::Int: Number of low-rank factors to use
Examples
method = LowRankScoreVI(10) # Low-rank score gradient with 10 factors
method = LowRankScoreVI(5; ad_backend=AutoMooncake) # With Mooncake
See also: UnifiedVIMethod, LowRankVI
Variational Families
BayesianDoubleML.AbstractVariationalFamily — Type
AbstractVariationalFamily
Abstract type for variational distribution families in UnifiedVI.
Subtypes define the shape of the variational posterior approximation:
- MeanField: Diagonal covariance (independent dimensions)
- LowRank: Low-rank plus diagonal covariance structure
Each family type is dispatched on for initialization in initialize_variational_distribution.
BayesianDoubleML.MeanField — Type
MeanField <: AbstractVariationalFamily
Mean-field (factorized) Gaussian variational family.
Uses a diagonal covariance matrix, assuming independence between all parameters in the variational posterior.
Best for: High-dimensional problems, fast inference, when parameters are approximately uncorrelated in the posterior.
BayesianDoubleML.LowRank — Type
LowRank <: AbstractVariationalFamily
Low-rank Gaussian variational family.
Uses a low-rank plus diagonal decomposition of the covariance: Σ = D² + U*U' where U is d×r with r << d.
Best for: Problems with structured correlations, balancing between mean-field and full-rank flexibility.
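The covariance structure can be sketched directly (an illustration of the decomposition, not package code; the dimension and rank are arbitrary):

```julia
using LinearAlgebra, Random
Random.seed!(0)
d, r = 5, 2                      # dimension and rank, chosen for illustration
D2 = Diagonal(rand(d) .+ 0.1)    # positive diagonal part D
U  = randn(d, r)                 # low-rank factors (d × r, with r ≪ d)
Σ  = D2^2 + U * U'               # low-rank plus diagonal covariance
isposdef(Σ)                      # true: D² is positive definite, UU' is PSD
```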
Constructor
LowRank(rank::Int) # rank is the number of low-rank factors
BayesianDoubleML.LowRankScore — Type
LowRankScore <: AbstractVariationalFamily
Low-rank Gaussian variational family using the score gradient estimator (BBVI).
Uses a low-rank plus diagonal decomposition of the covariance: Σ = D² + U*U' where U is d×r with r << d.
Uses the score gradient (REINFORCE) estimator with VarGrad control variate, which can be more stable than the reparameterization gradient for some problems.
Best for: Problems where reparameterization gradient is unstable or when exploring different gradient estimators.
Constructor
LowRankScore(rank::Int) # rank is the number of low-rank factors
Method Traits
BayesianDoubleML.uses_sampling — Function
uses_sampling(method::AbstractInferenceMethod)
Return true if the method uses sampling (MCMC or Monte Carlo VI).
All current methods return true, but this enables future deterministic methods.
BayesianDoubleML.supports_subsampling — Function
supports_subsampling(method::AbstractInferenceMethod)
Return true if the method supports mini-batch subsampling for large datasets.
Only UnifiedVIMethod supports subsampling. MCMC and SimpleVI do not.
BayesianDoubleML.is_deterministic — Function
is_deterministic(method::AbstractInferenceMethod)
Return true if the method is deterministic (no random sampling).
Currently all methods use sampling. Future deterministic methods (e.g., Laplace approximation) would return true.
BayesianDoubleML.default_n_samples — Function
default_n_samples(method::AbstractInferenceMethod)
Return the default number of samples/draws for the method.
Returns 2000 for both MCMC and the VI draw phase.
BayesianDoubleML.default_n_iterations — Function
default_n_iterations(method::AbstractInferenceMethod)
Return the default number of optimization iterations for the method.
MCMC uses iterations as warm-up/tuning. VI uses iterations for optimization.
Fitting Functions
BayesianDoubleML.fit! — Function
fit!(model::AbstractBDMLModel, method::AbstractInferenceMethod; force=false, kwargs...)
Fit a BDML model using the specified inference method, storing results in the model.
This is a mutating function that modifies the model in-place. After fitting, results can be extracted using coeftable(), extract_alpha(), summary(), etc.
If the model has already been fitted, a warning is shown unless force=true is passed to allow refitting.
Arguments
- model::AbstractBDMLModel: The model to fit (data + metadata)
- method::AbstractInferenceMethod: The inference algorithm to use
Keyword Arguments
force::Bool=false: Allow refitting an already-fitted model
For MCMC methods:
- n_samples::Int=2000: Number of posterior samples to draw
- n_chains::Int=4: Number of MCMC chains to run
For VI methods:
- n_iterations::Int=1000: Number of optimization iterations
- n_draws::Int=2000: Number of posterior samples to draw after fitting
Returns
nothing (follows standard Julia mutating function convention)
Examples
# Create model
model = BDMLModel(Y, D, X; model_type=:basic)
# Fit with MCMC (NUTS)
fit!(model, MCMCMethod(:nuts); n_samples=2000, n_chains=4)
# Fit with VI (Unified)
fit!(model, UnifiedVIMethod(); n_iterations=1000)
# Extract results
coef_table = coeftable(model)
summary(model)
# Refit with force (changes the stored result)
fit!(model, SimpleVIMethod(); force=true)
Multiple Dispatch
The actual implementation dispatches on both model and method:
- _fit_impl(::BDMLBasicModel, ::MCMCMethod) - Basic model, MCMC
- _fit_impl(::BDMLHierarchicalModel, ::MCMCMethod) - Hierarchical model, MCMC
- _fit_impl(::BDMLBasicModel, ::UnifiedVIMethod) - Basic model, VI (Unified)
- _fit_impl(::BDMLHierarchicalModel, ::UnifiedVIMethod) - Hierarchical model, VI (Unified)
- _fit_impl(::BDMLBasicModel, ::SimpleVIMethod) - Basic model, VI (Simple)
- _fit_impl(::BDMLHierarchicalModel, ::SimpleVIMethod) - Hierarchical model, VI (Simple)
See also: BDMLModel, MCMCMethod, UnifiedVIMethod, isfitted
fit!(model::AbstractBDMLModel; force=false, kwargs...)
Fit a model using the default method (MCMC with NUTS).
Convenience method that defaults to NUTS sampler.
Result Types
BayesianDoubleML.AbstractBDMLResult — Type
AbstractBDMLResult
Abstract parent type for all Bayesian Double Machine Learning results.
Subtypes include:
- BDMLMCMCResult: Results from MCMC inference
- BDMLVIResult: Results from Variational Inference
All subtypes implement the result interface with methods like:
- extract_alpha(): Extract α samples
- coeftable(): Generate coefficient table
- confint(): Compute credible intervals
BayesianDoubleML.BDMLMCMCResult — Type
BDMLMCMCResult <: AbstractBDMLResult
Results from MCMC inference using NUTS or HMC sampling.
Fields
- chain::MCMCChains.Chains: Full MCMC chain from Turing
- alpha_samples::Vector{Float64}: Causal effect samples (original scale)
- alpha_samples_standardized::Vector{Float64}: Causal effect samples (standardized scale)
- std_stats::StandardizationStats: Statistics for back-transformation
- model_type::Symbol: :basic or :hier
Usage
result = fit(problem, MCMCNUTS())
alpha_mean = mean(result.alpha_samples)
ci = credible_interval(result)
See also: BDMLVIResult, AbstractBDMLResult
BayesianDoubleML.BDMLVIResult — Type
BDMLVIResult{P} <: AbstractBDMLResult
Results from Variational Inference (ADVI) approximation.
Type Parameters
P: The concrete type of the variational posterior distribution
Fields
- variational_posterior::P: Variational distribution from AdvancedVI
- alpha_samples::Vector{Float64}: Causal effect samples (original scale)
- alpha_samples_standardized::Vector{Float64}: Causal effect samples (standardized scale)
- std_stats::StandardizationStats: Statistics for back-transformation
- model_type::Symbol: :basic or :hier
- variational_family::Symbol: :meanfield, :lowrank, or :fullrank
- vi_method::Symbol: :unified or :simple (which VI method was used)
- n_iterations::Int: Number of optimization iterations performed
- elbo_history::Vector{Float64}: ELBO values during optimization
- converged::Bool: Whether convergence criteria were met
- final_elbo::Float64: Final ELBO value
Usage
result = fit(problem, UnifiedVI())
alpha_mean = mean(result.alpha_samples)
println("Converged: ", result.converged)
println("Final ELBO: ", result.final_elbo)
See also: BDMLMCMCResult, AbstractBDMLResult
BayesianDoubleML.BDMLData — Type
BDMLData
Container for BDML input data with standardized types.
Fields
- Y::Vector{Float64}: Outcome variable
- D::Vector{Float64}: Treatment variable
- X::Matrix{Float64}: Control covariates (n×p)
- n::Int: Number of observations
- p::Int: Number of covariates
Constructor
BDMLData(Y, D, X)
Automatically converts inputs to Float64 and validates dimensions.
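The kind of dimension validation described above can be sketched as follows (check_dims is a hypothetical helper for illustration, not part of the package API):

```julia
# Hypothetical sketch of the validation described above; `check_dims`
# is illustrative, not a package function.
function check_dims(Y::Vector, D::Vector, X::Matrix)
    n = length(Y)
    length(D) == n || throw(DimensionMismatch("D must have length $n"))
    size(X, 1) == n || throw(DimensionMismatch("X must have $n rows"))
    return n, size(X, 2)
end

check_dims([1.0, 2.0, 3.0], [0.0, 1.0, 0.0], zeros(3, 2))  # (3, 2)
```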
BayesianDoubleML.StandardizationStats — Type
StandardizationStats
Statistics from data standardization, used for back-transformation.
Fields
- Y_mean::Float64: Mean of outcome before standardization
- Y_sd::Float64: Std dev of outcome before standardization
- D_mean::Float64: Mean of treatment before standardization
- D_sd::Float64: Std dev of treatment before standardization
- X_mean::Vector{Float64}: Means of covariates before standardization
- X_sd::Vector{Float64}: Std devs of covariates before standardization
Used internally to transform results back to original scale after fitting on standardized data.
Extraction Functions
BayesianDoubleML.extract_alpha — Function
extract_alpha(chain::MCMCChains.Chains)
Extract $\alpha$ samples from an MCMC chain using the paper's transformation (Eq. 15).
Mathematical Derivation (DiTraglia & Liu 2025)
The BDML model uses a bivariate reduced form parameterization (Section 4):
Structural equation: $Y = \alpha D + X'\beta + \epsilon$, where $\epsilon \perp V$ (Eq. 6)
Reduced form:
- $Y = X'\delta + U$, where $U = \epsilon + \alpha V$ (Eq. 12)
- $D = X'\gamma + V$ (Eq. 5)
Since $\epsilon$ is uncorrelated with V by assumption, the covariance between the reduced form errors is: $\text{Cov}(U, V) = \text{Cov}(\epsilon + \alpha V, V) = \alpha \cdot \text{Var}(V)$
Therefore, the causal effect $\alpha$ can be recovered from the error covariance: $\alpha = \text{Cov}(U, V) / \text{Var}(V) = \sigma_{UV} / \sigma^2_V$ (Eq. 15)
Using the correlation parameterization $\sigma_{UV} = \rho \cdot \sigma_U \cdot \sigma_V$: $\alpha = \rho \cdot \sigma_U \cdot \sigma_V / \sigma^2_V = \rho \cdot \sigma_U / \sigma_V$
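The derivation above is easy to sanity-check by Monte Carlo with simulated errors (a sketch with hypothetical values, independent of the package):

```julia
using Random, Statistics
Random.seed!(42)
n, α = 100_000, 1.5
ε, V = randn(n), randn(n)   # ε independent of V by construction
U = ε .+ α .* V             # reduced-form outcome error (Eq. 12)
cov(U, V) / var(V)          # ≈ α = 1.5  (Eq. 15)
```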
Implementation
For MCMC with LKJCholesky, the chain stores the Cholesky factor L where: $\rho = L[2,1] / L[1,1]$
Since $L[1,1] = 1$ for correlation matrices, $\rho = L[2,1]$.
This function extracts $\sigma_U$, $\sigma_V$, and $\rho$ from the MCMC chain, then computes: $\alpha = \rho \cdot \sigma_U / \sigma_V$
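The Cholesky identity used here can be verified directly with the standard library (illustrative values):

```julia
using LinearAlgebra
ρ = 0.6
R = [1.0 ρ; ρ 1.0]         # 2×2 correlation matrix
L = cholesky(R).L          # L = [1 0; ρ √(1-ρ²)]
L[2, 1]                    # recovers ρ, since L[1,1] == 1
```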
Arguments
chain::MCMCChains.Chains: MCMC chain containing posterior samples
Returns
Vector{Float64}: Posterior samples of $\alpha$
References
- DiTraglia, F.J. & Liu, L. (2025). "Bayesian Double Machine Learning for Causal Inference", arXiv:2508.12688v1, Section 4, Equation 15.
extract_alpha(result::BDMLVIResult)
Extract $\alpha$ samples from a BDMLVIResult.
extract_alpha(vi_samples::Matrix{Float64}, p::Int, model_type::Symbol)
Extract $\alpha$ samples from a matrix of VI posterior samples using the paper's Equation 15.
Mathematical Derivation (DiTraglia & Liu 2025, Section 4)
From the bivariate reduced form model:
- $Y = X'\delta + U$, where $U = \epsilon + \alpha V$ (Eq. 12)
- $D = X'\gamma + V$ (Eq. 5)
The causal effect $\alpha$ is recovered from the error covariance: $\alpha = \text{Cov}(U, V) / \text{Var}(V) = \sigma_{UV} / \sigma^2_V$ (Eq. 15)
For VI models, the correlation is parameterized as $\rho_{raw} \sim \text{Beta}(2, 2)$ on $[0, 1]$, then transformed to $\rho = 2\rho_{raw} - 1$ on $[-1, 1]$.
Therefore: $\alpha = \rho \cdot \sigma_U / \sigma_V = (2\rho_{raw} - 1) \cdot \sigma_U / \sigma_V$
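A worked instance of this reparameterization (hypothetical values, for illustration only):

```julia
# Worked example of the VI correlation reparameterization
ρ_raw = 0.8                # draw on the Beta(2, 2) support (0, 1)
σU, σV = 1.5, 0.5
ρ = 2ρ_raw - 1             # maps (0, 1) → (-1, 1); here ≈ 0.6
α = ρ * σU / σV            # ≈ 1.8
```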
Arguments
- vi_samples::Matrix{Float64}: Samples from the variational posterior (parameters × samples)
- p::Int: Number of control variables (covariates)
- model_type::Symbol: :basic or :hier
Returns
- Vector{Float64}: Alpha samples ($\alpha = \rho\cdot\sigma_U / \sigma_V$)
Parameter Indexing
For hierarchical models:
- Layout: $[\sigma^2_\delta, \sigma^2_\gamma, \delta_{1:p}, \gamma_{1:p}, \sigma_U, \sigma_V, \rho_{raw}]$
- Indices: $\sigma^2_\delta = 1$, $\sigma^2_\gamma = 2$, $\delta = 3:(2+p)$, $\gamma = (3+p):(2+2p)$, $\sigma_U = 3+2p$, $\sigma_V = 4+2p$, $\rho_{raw} = 5+2p$
For basic models:
- Layout: $[\delta_{1:p}, \gamma_{1:p}, \sigma_U, \sigma_V, \rho_{raw}]$
- Indices: $\delta = 1:p$, $\gamma = (p+1):2p$, $\sigma_U = 2p+1$, $\sigma_V = 2p+2$, $\rho_{raw} = 2p+3$
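The basic-model layout can be sketched as an indexing function (a hypothetical illustration of the layout described above, not the package implementation; it assumes a (2p + 3) × n_draws sample matrix):

```julia
# Hypothetical sketch of the basic-model index layout above.
function alpha_from_basic_draws(samples::AbstractMatrix, p::Int)
    σU = samples[2p + 1, :]
    σV = samples[2p + 2, :]
    ρ  = 2 .* samples[2p + 3, :] .- 1   # ρ_raw ~ Beta(2,2) → ρ ∈ (-1,1)
    return ρ .* σU ./ σV                # Eq. 15, applied per draw
end

# With p = 2: rows 5, 6, 7 hold σU, σV, ρ_raw for each draw
samples = reshape([0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0], 7, 1)
alpha_from_basic_draws(samples, 2)      # [0.5]
```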
References
- DiTraglia, F.J. & Liu, L. (2025). "Bayesian Double Machine Learning for Causal Inference", arXiv:2508.12688v1, Section 4, Equation 15.
extract_alpha(model::AbstractBDMLModel)
Extract alpha samples from a fitted BDML model.
Delegates to the stored result. Throws an error if the model has not been fitted.
Statistical Functions
Coefficient Table
BayesianDoubleML.coeftable — Function
coeftable(result::BDMLMCMCResult; level=0.95)
Compute a coefficient table for MCMC results with HPD credible intervals. Uses MCMCChains for ESS and MCSE calculations.
coeftable(result::BDMLVIResult; level=0.95)
Compute a coefficient table for VI results with HPD credible intervals. Note: ESS and MCSE are not computed for VI results because samples are drawn independently.
coeftable(model::AbstractBDMLModel; level=0.95)
Compute a coefficient table for a fitted BDML model.
Delegates to the stored result. Throws an error if the model has not been fitted.
BayesianDoubleML.BDMLCoeftable — Type
BDMLCoeftable
Coefficient table following StatsAPI conventions for regression results.
Stores parameter estimates, standard errors, credible intervals, and diagnostics for Bayesian Double Machine Learning models.
Fields
- coefnames::Vector{String}: Parameter names
- coef::Vector{Float64}: Point estimates
- stderror::Vector{Float64}: Standard errors
- mcse::Vector{Float64}: Monte Carlo standard errors
- cilower::Vector{Float64}: Lower bound of credible interval
- ciupper::Vector{Float64}: Upper bound of credible interval
- pvalue::Vector{Float64}: Two-sided p-values
- ess::Vector{Float64}: Effective sample sizes (MCMC only)
- elbo::Union{Float64, Nothing}: Final ELBO value (VI only)
- level::Float64: Credible interval level (e.g., 0.95)
- nsamples::Int: Number of posterior samples
- model_type::Symbol: :basic or :hier
- method_type::Symbol: :mcmc or :vi
Usage
result = fit(problem, MCMCNUTS())
ct = coeftable(result)
println(ct) # Pretty-printed table
Diagnostics
BayesianDoubleML.confint — Function
confint(ct::BDMLCoeftable)
Return confidence/credible intervals as a matrix [lower upper].
confint(result::AbstractBDMLResult; level=0.95)
Return confidence/credible intervals for the treatment effect from a BDML result. Computes HPD credible intervals at the specified level.
confint(model::AbstractBDMLModel; level=0.95)
Return confidence/credible intervals from a fitted BDML model.
Delegates to the stored result. Throws an error if the model has not been fitted.
BayesianDoubleML.ess — Function
ess(ct::BDMLCoeftable)
Return effective sample size.
ess(result::BDMLMCMCResult)
Compute effective sample size from the MCMC chain. Uses MCMCChains.ess for proper multi-chain ESS calculation. Returns the minimum ESS across all parameters as a conservative estimate.
ess(model::AbstractBDMLModel)
Return effective sample size from a fitted BDML model (MCMC only).
Delegates to the stored result. Throws an error if the model has not been fitted or if the model was fitted with VI.
BayesianDoubleML.pvalues — Function
pvalues(ct::BDMLCoeftable)
Return p-values.
pvalues(result::AbstractBDMLResult)
Return the p-value for testing H0: α = 0 from a BDML result.
pvalues(model::AbstractBDMLModel)
Return p-values from a fitted BDML model.
Delegates to the stored result. Throws an error if the model has not been fitted.
BayesianDoubleML.hpd_interval — Function
hpd_interval(samples::Vector{Float64}; level::Real=0.95)
Compute the Highest Posterior Density (HPD) interval for a vector of samples. HPD intervals are the shortest intervals containing the specified probability mass.
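The shortest-interval idea can be sketched as follows (a minimal illustration of the concept for unimodal samples, not the package's implementation):

```julia
# Minimal HPD sketch: the narrowest window covering `level` of the
# sorted draws (a common approach for unimodal posteriors).
function hpd_sketch(samples::Vector{Float64}; level::Real=0.95)
    s = sort(samples)
    n = length(s)
    m = max(2, ceil(Int, level * n))          # draws inside the interval
    widths = [s[i + m - 1] - s[i] for i in 1:(n - m + 1)]
    i = argmin(widths)
    return (s[i], s[i + m - 1])
end

hpd_sketch(collect(1.0:100.0); level=0.5)     # (1.0, 50.0)
```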
BayesianDoubleML.mcse — Function
mcse(result::BDMLMCMCResult)
Compute the Monte Carlo Standard Error from the MCMC chain. Uses MCMCChains.mcse for proper multi-chain MCSE calculation. Returns the maximum MCSE across all parameters as a conservative estimate.
mcse(model::AbstractBDMLModel)
Return the Monte Carlo standard error from a fitted BDML model (MCMC only).
Delegates to the stored result. Throws an error if the model has not been fitted or if the model was fitted with VI.
BayesianDoubleML.rhat — Function
rhat(result::BDMLMCMCResult)
Compute R-hat (potential scale reduction factor) from the MCMC chain. R-hat ≈ 1.0 indicates good convergence; values > 1.05 suggest non-convergence. Returns the maximum R-hat across all parameters as a conservative estimate.
rhat(model::AbstractBDMLModel)
Return the R-hat convergence diagnostic from a fitted BDML model (MCMC only).
Delegates to the stored result. Throws an error if the model has not been fitted or if the model was fitted with VI.
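To show what the diagnostic measures, here is a minimal sketch of the classic Gelman-Rubin statistic for one parameter, with chains stored as matrix columns. The function name `rhat_sketch` is illustrative; the package delegates to MCMCChains, which uses a refined (rank-normalized, split-chain) version.

```julia
using Statistics

# Gelman-Rubin potential scale reduction factor for one parameter.
# Chains are columns: n draws × m chains.
function rhat_sketch(chains::Matrix{Float64})
    n, m = size(chains)
    means = vec(mean(chains; dims=1))
    W = mean(var(chains; dims=1))        # average within-chain variance
    B = n * var(means)                   # between-chain variance
    varplus = (n - 1) / n * W + B / n    # pooled variance estimate
    return sqrt(varplus / W)
end
```

Chains that mix well give values near 1; a chain stuck at an offset inflates B and pushes R-hat well above 1.05.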
BayesianDoubleML.rhat_statistic — Function
`rhat_statistic(result::BDMLMCMCResult)`: Alias for `rhat(result)`; compute the R-hat convergence diagnostic. R-hat ≈ 1.0 indicates good convergence.
`rhat_statistic(model::AbstractBDMLModel)`: Alias for `rhat(model)`; compute the R-hat convergence diagnostic (MCMC only).
BayesianDoubleML.chain_info — Function
`chain_info(result::BDMLMCMCResult)`: Get chain summary information as a tuple: `(n_chains, n_samples_per_chain, total_samples)`.
`chain_info(model::AbstractBDMLModel)`: Return chain summary information from a fitted BDML model (MCMC only).
Delegates to the stored result. Throws an error if the model has not been fitted or if the model was fitted with VI.
StatsAPI Functions
BayesianDoubleML.coef — Function
`coef(result::AbstractBDMLResult)`: Module-level wrapper for StatsAPI.coef. See StatsAPI.coef for details.
BayesianDoubleML.stderror — Function
`stderror(result::AbstractBDMLResult)`: Module-level wrapper for StatsAPI.stderror. See StatsAPI.stderror for details.
BayesianDoubleML.vcov — Function
`vcov(result::AbstractBDMLResult)`: Module-level wrapper for StatsAPI.vcov. See StatsAPI.vcov for details.
Utility Functions
BayesianDoubleML.credible_interval — Function
`credible_interval(model::AbstractBDMLModel; level=0.95)`: Return credible intervals from a fitted BDML model.
Delegates to the stored result. Throws an error if the model has not been fitted.
`credible_interval(result::BDMLVIResult; level=0.95)`: Compute the credible interval for α from a BDMLVIResult.
`credible_interval(samples::Vector{Float64}; level=0.95)`: Compute the credible interval for a vector of samples.
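An equal-tailed credible interval is just a pair of quantiles. A minimal sketch (the helper name `credible_sketch` is illustrative, not the package's method):

```julia
using Statistics

# Equal-tailed credible interval: cut probability (1 - level)/2
# from each tail of the posterior draws.
function credible_sketch(samples::Vector{Float64}; level::Real=0.95)
    tail = (1 - level) / 2
    return (quantile(samples, tail), quantile(samples, 1 - tail))
end
```

Unlike the HPD interval, the equal-tailed interval is defined purely by quantiles, so it transforms cleanly under monotone reparameterizations, but it can be wider for skewed posteriors.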
Internal Functions
These functions are primarily for internal use but are documented here for developers.
Model Functions
BayesianDoubleML.bdml_basic — Function
`bdml_basic(Y, D, X)`: Bayesian Double Machine Learning - Basic Model.
Implements the BDML-Basic variation of Algorithm 1 from DiTraglia & Liu (2025), Section 4. This model uses fixed prior variances on the reduced form coefficients.
The Bivariate Reduced Form Model
The BDML approach avoids regularization-induced confounding by using a bivariate reduced form parameterization (Section 4, Equations 13-15):
Outcome equation: $Y = X'\delta + U$ (Eq. 12)
Treatment equation: $D = X'\gamma + V$ (Eq. 5)
Joint error distribution: $[U; V] | X \sim N(0, \Sigma)$ (Eq. 13)
where $\Sigma$ is the $2\times2$ covariance matrix: $\Sigma = [\sigma^2_U, \sigma_{UV}; \sigma_{UV}, \sigma^2_V]$
Causal effect recovery: The causal effect $\alpha$ is recovered from the error covariance via: $\alpha = \text{Cov}(U, V) / \text{Var}(V) = \sigma_{UV} / \sigma^2_V = \rho \cdot \sigma_U / \sigma_V$ (Eq. 15)
This works because $U = \epsilon + \alpha V$ (Eq. 6), where $\epsilon$ is the structural error uncorrelated with V. Therefore: $\text{Cov}(U, V) = \text{Cov}(\epsilon + \alpha V, V) = \alpha \cdot \text{Var}(V)$.
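This identity is easy to check by simulation in plain Julia, independent of the package: generate V and the structural error, form U = ε + αV, and compare Cov(U, V)/Var(V) with the true α.

```julia
using Random, Statistics

Random.seed!(42)
n, alpha = 100_000, 2.0
V   = randn(n)               # treatment reduced-form error
eps = randn(n)               # structural error, independent of V
U   = eps .+ alpha .* V      # Eq. 6: U = ε + αV

alpha_hat = cov(U, V) / var(V)   # Eq. 15 recovery
```

`alpha_hat` lands within Monte Carlo error (on the order of 1/√n) of the true value 2.0.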
This Variation: BDML-Basic
Priors on reduced form coefficients:
- $\delta \sim N(0, 25 \cdot I_p)$ [$\text{Normal}(0, 5^2)$ for outcome coefficients]
- $\gamma \sim N(0, 25 \cdot I_p)$ [$\text{Normal}(0, 5^2)$ for treatment coefficients]
These are fixed-variance priors that provide uniform shrinkage across all covariates. This corresponds to "BDML-Basic" in paper Section 6, Table 1.
Priors on error covariance:
- $\sigma_U \sim \text{Cauchy}^+(0, 2.5)$ [Half-Cauchy for outcome error scale]
- $\sigma_V \sim \text{Cauchy}^+(0, 2.5)$ [Half-Cauchy for treatment error scale]
- $R \sim \text{LKJ}(4)$ [LKJ prior with $\eta=4$ for correlation matrix]
The correlation parameter $\rho$ is extracted from the LKJCholesky factor $R_\text{chol}.L[2,1]$.
For adaptive shrinkage that learns the appropriate regularization from data, use bdml_hier() (BDML-Hierarchical) instead.
References
- DiTraglia, F.J. & Liu, L. (2025). "Bayesian Double Machine Learning for Causal Inference", arXiv:2508.12688v1, Section 4, Algorithm 1, Section 6.
See also: bdml_hier, bdml_basic_vi
BayesianDoubleML.bdml_hier — Function
`bdml_hier(Y, D, X)`: Bayesian Double Machine Learning - Hierarchical Model.
Implements the BDML-Hier variation of Algorithm 1 from DiTraglia & Liu (2025), Section 4. This model uses adaptive hierarchical priors for data-driven shrinkage on the reduced form coefficients.
The Bivariate Reduced Form Model
The BDML approach avoids regularization-induced confounding by using a bivariate reduced form parameterization (Section 4, Equations 13-15):
Outcome equation: $Y = X'\delta + U$ (Eq. 12)
Treatment equation: $D = X'\gamma + V$ (Eq. 5)
Joint error distribution: $[U; V] | X \sim N(0, \Sigma)$ (Eq. 13)
where $\Sigma$ is the $2\times2$ covariance matrix: $\Sigma = [\sigma^2_U, \sigma_{UV}; \sigma_{UV}, \sigma^2_V]$
Causal effect recovery: The causal effect $\alpha$ is recovered from the error covariance via: $\alpha = \text{Cov}(U, V) / \text{Var}(V) = \sigma_{UV} / \sigma^2_V = \rho \cdot \sigma_U / \sigma_V$ (Eq. 15)
This works because $U = \epsilon + \alpha V$ (Eq. 6), where $\epsilon$ is the structural error uncorrelated with V. Therefore: $\text{Cov}(U, V) = \text{Cov}(\epsilon + \alpha V, V) = \alpha \cdot \text{Var}(V)$.
This Variation: BDML-Hier
Hierarchical priors on reduced form coefficients: The key difference from BDML-Basic is the adaptive prior structure:
Hyperpriors on variance hyperparameters:
- $\sigma^2_\delta \sim \text{InvGamma}(2, 2)$
- $\sigma^2_\gamma \sim \text{InvGamma}(2, 2)$
Conditional priors on coefficients:
- $\delta | \sigma^2_\delta \sim N(0, \sigma^2_\delta \cdot I_p)$
- $\gamma | \sigma^2_\gamma \sim N(0, \sigma^2_\gamma \cdot I_p)$
Why this works: This hierarchical structure is equivalent to placing independent $\text{Student-}t(4)$ distributions on each coefficient $\delta_j$ and $\gamma_j$. The $\text{InvGamma}(2, 2)$ hyperprior induces heavy tails that allow some coefficients to escape shrinkage while regularizing others. This provides adaptive shrinkage that learns the appropriate regularization level from the data (Section 6, Table 1).
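The scale-mixture equivalence can be verified numerically without any package dependencies: drawing $\sigma^2 \sim \text{InvGamma}(2, 2)$ and then $\beta \mid \sigma^2 \sim N(0, \sigma^2)$ yields marginal draws matching a Student-t with $\nu = 2 \cdot \text{shape} = 4$ degrees of freedom and unit scale. (An InvGamma(2, 2) draw is the reciprocal of a Gamma(shape = 2, rate = 2) draw, i.e. of a sum of two exponentials, so Base Julia suffices.)

```julia
using Random, Statistics

Random.seed!(7)
N = 200_000

# σ² ~ InvGamma(shape=2, scale=2): reciprocal of Gamma(shape=2, rate=2),
# where a Gamma(2, rate) draw is the sum of two Exp(rate) draws.
sigma2 = [1 / ((-log(rand()) - log(rand())) / 2) for _ in 1:N]

# β | σ² ~ N(0, σ²); marginally β ~ Student-t(ν = 4), unit scale,
# since ν = 2·shape and scale² = scale_param / shape = 2/2 = 1.
beta = sqrt.(sigma2) .* randn(N)
```

The 75th percentile of these draws is ≈ 0.741, the t₄ value, versus 0.674 for a standard normal: the heavier tails are exactly what let large coefficients escape shrinkage.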
Priors on error covariance:
- $\sigma_U \sim \text{Cauchy}^+(0, 2.5)$ [Half-Cauchy for outcome error scale]
- $\sigma_V \sim \text{Cauchy}^+(0, 2.5)$ [Half-Cauchy for treatment error scale]
- $R \sim \text{LKJ}(4)$ [LKJ prior with $\eta=4$ for correlation matrix]
The correlation parameter $\rho$ is extracted from the LKJCholesky factor $R_\text{chol}.L[2,1]$.
Recommendation: The paper's simulation study (Section 6, Table 1) shows BDML-Hier achieves better coverage (0.94) compared to BDML-Basic (0.91-0.93), making it the recommended default choice for most applications.
For fixed shrinkage with simpler interpretation, use bdml_basic() instead.
References
- DiTraglia, F.J. & Liu, L. (2025). "Bayesian Double Machine Learning for Causal Inference", arXiv:2508.12688v1, Section 4, Algorithm 1, Section 6.
See also: bdml_basic, bdml_hier_vi
Data Functions
BayesianDoubleML.DGP.make_plr_DTL2025 — Function
`make_plr_DTL2025(n::Int, p::Int, sigma_epsilon::Real; alpha=2.0, rng=Random.default_rng())`: Generate synthetic data matching the simulation design from DiTraglia & Liu (2025), Section 6, Table 1.
This DGP corresponds to the "fixed" design described in Equations (20)-(21) of the paper, which is used to generate the results reported in Table 1.
Data Generating Process (Equations 20-21)
Covariates: $X_i \sim \text{iid Normal}_p(0, I_p)$
Errors: $(\epsilon_i, V_i)' | X \sim \text{iid Normal}([0; 0], [\sigma^2_\epsilon, 0; 0, 1])$
where:
- $\epsilon_i$: Structural error (uncorrelated with $V_i$)
- $V_i$: Treatment error (variance = 1)
- $\sigma_\epsilon$: Standard deviation of $\epsilon$ (varies across simulations)
Coefficient distribution: $\beta \sim \text{Normal}_p(\mu_\beta, \sigma^2_\beta \cdot I_p)$
Fixed parameters (per paper):
- $\alpha = 2$ (true causal effect)
- $\gamma = \iota_p / \sqrt{p}$ (treatment coefficients)
- $\mu_\beta = -\gamma/2$ (mean of outcome coefficients)
- $\sigma^2_\beta = 1/p$ (variance of outcome coefficients)
Construction:
- $D_i = X_i'\gamma + V_i$ (Equation 4: Treatment reduced form)
- $Y_i = \alpha \cdot D_i + X_i'\beta + \epsilon_i$ (Equation 5: Outcome structural)
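The construction above can be reproduced in a few lines of plain Julia. This is a sketch of the DGP under the stated parameter choices, not the package's `make_plr_DTL2025` implementation:

```julia
using Random

function plr_dgp_sketch(n::Int, p::Int, sigma_eps::Real;
                        alpha::Real=2.0, rng=Random.default_rng())
    X     = randn(rng, n, p)                          # X_i ~ N_p(0, I_p)
    gamma = ones(p) ./ sqrt(p)                        # γ = ι_p / √p (fixed)
    beta  = -gamma ./ 2 .+ randn(rng, p) ./ sqrt(p)   # β ~ N(-γ/2, I_p/p)
    V     = randn(rng, n)                             # Var(V) = 1
    eps   = sigma_eps .* randn(rng, n)                # Var(ε) = σ²_ε
    D     = X * gamma .+ V                            # treatment reduced form
    Y     = alpha .* D .+ X * beta .+ eps             # outcome structural eq.
    return (Y=Y, D=D, X=X)
end
```

Note that with γ = ι_p/√p the index X'γ has unit variance regardless of p, so the signal-to-noise ratio of the treatment equation stays fixed as p grows.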
Arguments
- `n::Int`: Number of observations (paper uses 200)
- `p::Int`: Number of covariates (paper uses 100)
- `sigma_epsilon::Real`: Standard deviation of the structural error, $\sigma_\epsilon \in \{1, 2, 4\}$
- `alpha::Real`: True causal effect (default: 2.0)
- `rng::AbstractRNG`: Random number generator for reproducibility (default: `Random.default_rng()`)
Returns
- `df::DataFrame`: DataFrame containing:
  - `y::Vector{Float64}`: Outcome variable (length n)
  - `d::Vector{Float64}`: Treatment variable (length n)
  - `X1, X2, ..., Xp::Vector{Float64}`: Covariates as columns
Extract the components for use with BDMLModel:

`df = make_plr_DTL2025(n, p, sigma_epsilon; alpha=2.0, rng)`

Paper Reference
Section 6, "Simulation Study", Equations (20)-(21):
DiTraglia, F.J. & Liu, L. (2025). "Bayesian Double Machine Learning for Causal Inference", arXiv:2508.12688v1.
Table 1 Settings
The paper reports results for three values of $\sigma_\epsilon$:
| $\sigma_\epsilon$ | BDML-Hier Coverage | BDML-Hier RMSE | BDML-Basic Coverage | BDML-Basic RMSE |
|---|---|---|---|---|
| 1 | 0.94 | 0.09 | 0.93 | 0.11 |
| 2 | 0.94 | 0.18 | 0.91 | 0.22 |
| 4 | 0.94 | 0.35 | 0.92 | 0.46 |
All settings use n=200, p=100, and $\alpha=2$.
Examples
Basic usage

```julia
# Replicate one row of Table 1 (n=200, p=100, σ_ε=2)
using BayesianDoubleML
using Random

df = make_plr_DTL2025(200, 100, 2.0; rng=MersenneTwister(42))
```
Implementation Notes
Randomness:
- The function generates new random draws for $X$, $\beta$, $\epsilon$, and $V$ in each call
- Pass an `rng` argument (e.g., `MersenneTwister(seed)`) for reproducibility
- Following the paper, each replication should use a fresh seed
Coefficient generation:
- $\gamma$ is fixed at $\iota_p/\sqrt{p}$ (as specified in paper)
- $\beta$ is randomly drawn from $N(\mu_\beta, \sigma^2_\beta\cdot I)$ for each replication
- This mimics the paper's design where $\beta$ varies across replications
Error structure:
- $\epsilon$ and $V$ are independent (covariance matrix is diagonal)
- $\text{Var}(V) = 1$ (normalized)
- $\text{Var}(\epsilon) = \sigma^2_\epsilon$ (varies across simulation settings)
Confounding:
- $X$ affects both $D$ (through $\gamma$) and $Y$ (through $\beta$)
- This creates confounding that BDML is designed to handle
- The specific structure ($\gamma = \iota_p/\sqrt{p}$, $\mu_\beta = -\gamma/2$) creates realistic correlation
Index

`BayesianDoubleML.AbstractBDMLModel`, `BayesianDoubleML.AbstractBDMLResult`, `BayesianDoubleML.AbstractInferenceMethod`, `BayesianDoubleML.AbstractVariationalFamily`, `BayesianDoubleML.BDMLBasicModel`, `BayesianDoubleML.BDMLCoeftable`, `BayesianDoubleML.BDMLData`, `BayesianDoubleML.BDMLHierarchicalModel`, `BayesianDoubleML.BDMLMCMCResult`, `BayesianDoubleML.BDMLVIResult`, `BayesianDoubleML.LowRank`, `BayesianDoubleML.LowRankScore`, `BayesianDoubleML.MCMCMethod`, `BayesianDoubleML.MeanField`, `BayesianDoubleML.SimpleVIMethod`, `BayesianDoubleML.StandardizationStats`, `BayesianDoubleML.UnifiedVIMethod`, `BayesianDoubleML.BDMLModel`, `BayesianDoubleML.DGP.make_plr_DTL2025`, `BayesianDoubleML.LowRankScoreVI`, `BayesianDoubleML.LowRankVI`, `BayesianDoubleML.MCMCNUTS`, `BayesianDoubleML.MeanFieldVI`, `BayesianDoubleML.SimpleVI`, `BayesianDoubleML.UnifiedVI`, `BayesianDoubleML.bdml_basic`, `BayesianDoubleML.bdml_hier`, `BayesianDoubleML.chain_info`, `BayesianDoubleML.coef`, `BayesianDoubleML.coeftable`, `BayesianDoubleML.confint`, `BayesianDoubleML.credible_interval`, `BayesianDoubleML.default_n_iterations`, `BayesianDoubleML.default_n_samples`, `BayesianDoubleML.ess`, `BayesianDoubleML.extract_alpha`, `BayesianDoubleML.fit!`, `BayesianDoubleML.hpd_interval`, `BayesianDoubleML.is_deterministic`, `BayesianDoubleML.isfitted`, `BayesianDoubleML.mcse`, `BayesianDoubleML.model_type`, `BayesianDoubleML.ncovariates`, `BayesianDoubleML.nobs`, `BayesianDoubleML.pvalues`, `BayesianDoubleML.rhat`, `BayesianDoubleML.rhat_statistic`, `BayesianDoubleML.standardization_stats`, `BayesianDoubleML.stderror`, `BayesianDoubleML.supports_subsampling`, `BayesianDoubleML.uses_sampling`, `BayesianDoubleML.vcov`