What is MAB and How Does It Work?

MAB stands for Multi-Armed Bandit, the core optimization algorithm that powers JustAI’s variant testing. Unlike traditional A/B testing that splits traffic evenly for a fixed period, MAB dynamically allocates more traffic to better-performing variants in real time.

JustAI uses Thompson Sampling, a Bayesian approach to the multi-armed bandit problem:

  1. Explore phase — Traffic is distributed relatively evenly across all variants to gather initial performance data
  2. Exploit phase — As performance signals emerge, the system shifts more traffic toward winning variants while maintaining a configurable exploration rate
  3. Continuous learning — Even after winners emerge, a portion of traffic continues testing all variants to catch performance shifts
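The selection step behind this loop can be sketched in a few lines. The snippet below is an illustrative Thompson Sampling routine for Bernoulli (converted / not converted) outcomes, not JustAI's actual implementation: each variant's conversion rate gets a Beta posterior, one sample is drawn per variant, and the send goes to the highest sample. The `stats` shape and function name are assumptions for the example.

```python
import random

def choose_variant(stats, rng=random):
    """Thompson Sampling over Bernoulli conversion rates.

    stats: hypothetical list of (successes, failures) per variant.
    Draws one sample from each variant's Beta(successes+1, failures+1)
    posterior and routes this send to the variant with the highest sample.
    Uncertain variants produce noisy samples (exploration); confident
    winners produce consistently high samples (exploitation).
    """
    samples = [rng.betavariate(s + 1, f + 1) for s, f in stats]
    return max(range(len(samples)), key=samples.__getitem__)

# Variant 1 converts ~10x better, so it should receive most traffic.
stats = [(10, 990), (100, 900)]
picks = [choose_variant(stats) for _ in range(1000)]
```

Because the allocation emerges from posterior sampling rather than a fixed schedule, the explore-to-exploit shift above happens gradually as the posteriors tighten.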

This means you waste less traffic on losing variants and converge on the best content faster than you would with a fixed-split A/B test.

  • The explore/exploit balance is controlled by the epsilon parameter, which is configurable per template — there is no hardcoded exploration phase
  • Variants need sufficient send volume for the bandit to learn. Higher-volume campaigns converge faster
  • The control/treatment split at the top level is still a true randomized A/B test, maintaining statistical validity
  • You can view traffic allocation over time in the template’s Overview dashboard
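One way the configurable epsilon parameter described above can be combined with bandit selection is as an exploration floor: a small slice of traffic is always routed uniformly at random, so every variant keeps collecting data even after a winner emerges. The mixing scheme, names, and default value below are assumptions for illustration, not JustAI's documented behavior.

```python
import random

def route_send(stats, epsilon=0.1, rng=random):
    """Route one send, keeping a guaranteed exploration floor.

    stats: hypothetical list of (successes, failures) per variant.
    With probability epsilon, pick a variant uniformly at random
    (continuous learning); otherwise exploit the Thompson Sampling
    choice over the Beta posteriors.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(stats))
    samples = [rng.betavariate(s + 1, f + 1) for s, f in stats]
    return max(range(len(samples)), key=samples.__getitem__)
```

Under this scheme a clear winner still receives roughly `1 - epsilon + epsilon/k` of traffic for `k` variants, while the remainder guards against performance shifts in the other variants.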

For a deeper explanation, see How the Bandit Works. For information on the underlying algorithms, see Ranking Algorithms.