Using conjugate Beta-Binomial models and Monte Carlo simulation to determine whether a redesigned checkout page increases purchase conversions, with a full posterior analysis and a decision-theoretic framework.
Traditional frequentist A/B testing relies on p-values and fixed sample sizes, often leading to early peeking problems and difficulty interpreting results in business terms. This project applies Bayesian inference to an e-commerce A/B test comparing two checkout page designs, producing direct probability statements about which variant performs better and by how much.
Working with simulated data modeled after realistic e-commerce conversion rates, I built a full Bayesian pipeline: specifying a weakly informative prior from historical data, computing posterior distributions analytically via conjugacy, running Monte Carlo simulations to estimate the probability that the new design wins, and translating the result into expected revenue lift to inform the business decision.
An e-commerce company redesigned its checkout page (Variant B) and wants to know whether it improves purchase conversion rate over the original (Variant A). Over a 14-day test period, traffic was randomly split between the two variants:
| Variant | Visitors | Conversions | Observed Rate |
|---|---|---|---|
| A (Control) | 4,821 | 362 | 7.51% |
| B (Redesign) | 4,756 | 408 | 8.58% |
The observed lift is approximately 1.07 percentage points (a 14.2% relative increase). But is this difference real, or could it be due to random variation? Rather than computing a p-value, we want to answer the question directly: What is the probability that Variant B is truly better than Variant A?
Each visitor either converts or doesn't, so the data follows a Binomial distribution. For the conversion rate parameter θ, we use a Beta prior — the conjugate prior for binomial data, which gives us a closed-form posterior.
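Concretely, if θ ~ Beta(α, β) and we observe x conversions among n visitors, the posterior is θ | data ~ Beta(α + x, β + n − x): the update is nothing more than adding observed counts to the prior's parameters.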
Using historical data from the previous quarter (conversion rate around 7.2% ± 1.5%), I calibrated a weakly informative prior of Beta(7, 90). This encodes our prior belief while letting the data dominate — the effective prior sample size is only 97 versus thousands of real observations.
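As a quick sanity check (a minimal sketch, using the same scipy.stats API as the main analysis below), we can verify the moments implied by Beta(7, 90). Note that the implied standard deviation, about 2.6 percentage points, is deliberately wider than the historical ±1.5%: that extra width is what keeps the prior weakly informative.

```python
from scipy import stats

# Moments implied by the Beta(7, 90) prior
prior = stats.beta(7, 90)
print(f"Prior mean: {prior.mean():.4f}")  # ~0.0722, matching the historical 7.2%
print(f"Prior std:  {prior.std():.4f}")   # ~0.0261, wider than the historical ±1.5%
```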
```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
# Data
n_a, x_a = 4821, 362 # Control
n_b, x_b = 4756, 408 # Redesign
# Weakly informative prior: Beta(7, 90)
alpha_prior, beta_prior = 7, 90
# Posterior parameters (conjugate update)
alpha_a = alpha_prior + x_a # 369
beta_a = beta_prior + n_a - x_a # 4549
alpha_b = alpha_prior + x_b # 415
beta_b = beta_prior + n_b - x_b # 4438
# Posterior distributions
posterior_a = stats.beta(alpha_a, beta_a)
posterior_b = stats.beta(alpha_b, beta_b)
print(f"Posterior A: Beta({alpha_a}, {beta_a})")
print(f" Mean: {posterior_a.mean():.4f}, 95% CI: [{posterior_a.ppf(0.025):.4f}, {posterior_a.ppf(0.975):.4f}]")
print(f"Posterior B: Beta({alpha_b}, {beta_b})")
print(f" Mean: {posterior_b.mean():.4f}, 95% CI: [{posterior_b.ppf(0.025):.4f}, {posterior_b.ppf(0.975):.4f}]")
To compute the probability that B is better than A, I drew 500,000 samples from each posterior and compared them elementwise. This Monte Carlo approach also gives us the full distribution of the difference (θ_B − θ_A), which is far more informative than a single point estimate.
```python
# Monte Carlo comparison
n_simulations = 500_000
np.random.seed(42)
samples_a = np.random.beta(alpha_a, beta_a, size=n_simulations)
samples_b = np.random.beta(alpha_b, beta_b, size=n_simulations)
# Probability B > A
prob_b_wins = (samples_b > samples_a).mean()
print(f"P(θ_B > θ_A) = {prob_b_wins:.4f}")
# Distribution of the lift
lift = samples_b - samples_a
print(f"Expected lift: {lift.mean()*100:.2f} pp")
print(f"95% CI of lift: [{np.percentile(lift, 2.5)*100:.2f}, {np.percentile(lift, 97.5)*100:.2f}] pp")
print(f"P(lift > 0.5 pp) = {(lift > 0.005).mean():.3f}")
Posterior distributions of conversion rates. The separation between the two densities visually confirms the high probability that B outperforms A.
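For reference, a minimal sketch of how a figure like this can be produced, reusing posterior_a, posterior_b, and the matplotlib import from the first code block:

```python
# Plot both posterior densities over the region where they carry mass
theta = np.linspace(0.06, 0.10, 500)
plt.plot(theta, posterior_a.pdf(theta), label="A (control)")
plt.plot(theta, posterior_b.pdf(theta), label="B (redesign)")
plt.xlabel("Conversion rate θ")
plt.ylabel("Posterior density")
plt.legend()
plt.show()
```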
To translate the statistical result into a business decision, I computed the expected revenue impact. With an average order value of $67.50 and approximately 10,000 weekly visitors to the checkout page:
```python
# Decision-theoretic analysis
avg_order_value = 67.50
weekly_visitors = 10_000
revenue_a = samples_a * avg_order_value * weekly_visitors
revenue_b = samples_b * avg_order_value * weekly_visitors
weekly_gain = revenue_b - revenue_a
print(f"Expected weekly revenue gain: ${weekly_gain.mean():,.0f}")
print(f"Annual projected gain: ${weekly_gain.mean() * 52:,.0f}")
print(f"P(annual gain > $20,000) = {(weekly_gain * 52 > 20000).mean():.1%}")
Recommendation: Deploy Variant B. There is a 97.2% posterior probability that the redesigned checkout page outperforms the control, with an expected annual revenue gain of roughly $368,000 under the traffic and order-value assumptions above. The 95% credible interval for the lift, roughly −0.03 to +2.13 percentage points, only just brushes zero at its lower end (consistent with the 97.2% win probability), and the probability that the annual gain exceeds $20,000 is about 96%. The Bayesian framework gave us direct, interpretable probability statements, with no p-value gymnastics required.
This project demonstrates how Bayesian A/B testing provides richer, more decision-relevant output than traditional hypothesis testing. Instead of a binary "significant or not" answer, we obtained a full probability distribution over the effect size, enabling risk-aware decision making. The conjugate Beta-Binomial model made computation trivial while the Monte Carlo simulation extended the analysis to derived quantities like revenue impact.