Reading Results

The results dashboard shows how your experiment is performing across control, treatment, and individual variants. Here’s how to read it.

At the top of the results view, you’ll see an aggregate comparison:

  • Control — Your original copy, served to a fixed percentage of traffic.
  • Treatment — All JustAI variants combined, served to the remaining traffic.

This is the first thing to check. If treatment is outperforming control, JustAI is delivering value. The lift percentage shows exactly how much.

Lift is the percentage improvement of treatment over control on your selected metric: lift = (treatment rate − control rate) ÷ control rate × 100.

Example: Control has a 2.0% click rate. Treatment has a 2.4% click rate. Lift = +20%.
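The calculation above is simple enough to sanity-check yourself. A minimal sketch (the function name is ours, not part of the product):

```python
def lift(control_rate: float, treatment_rate: float) -> float:
    """Percentage improvement of treatment over control."""
    return (treatment_rate - control_rate) / control_rate * 100

# The example above: 2.0% control click rate vs 2.4% treatment click rate.
result = round(lift(0.020, 0.024))  # 20, i.e. +20% lift
```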

Projected results extend this: if you applied the winning treatment to 100% of traffic, what would the total impact be? This helps quantify the business case for shipping.
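The projection is the rate difference scaled to your full traffic. A sketch, assuming clicks as the metric (the function and numbers are illustrative, not the dashboard's exact formula):

```python
def projected_extra_conversions(total_traffic: int,
                                control_rate: float,
                                treatment_rate: float) -> float:
    """Extra conversions expected if 100% of traffic received the
    treatment, compared with everyone staying on control."""
    return total_traffic * (treatment_rate - control_rate)

# 1,000,000 sends at the rates from the lift example:
extra = projected_extra_conversions(1_000_000, 0.020, 0.024)
# roughly 4,000 additional clicks
```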

Within the treatment group, individual variants are ranked by performance on your key metric. The bandit automatically shifts traffic toward top performers, so variant traffic won’t be evenly split after the initial period.
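The exact allocation algorithm isn't documented here, but the general behavior can be illustrated with Thompson sampling over Beta posteriors — an assumption about the algorithm family, not the product's actual implementation:

```python
import random

def pick_variant(stats):
    """stats: {variant_name: (conversions, sends)}.
    Sample a plausible conversion rate for each variant from its Beta
    posterior and serve the variant with the highest sample. Variants
    with strong observed rates win the draw more often, so they receive
    more traffic; weak ones still get occasional exploration."""
    samples = {
        name: random.betavariate(1 + wins, 1 + sends - wins)
        for name, (wins, sends) in stats.items()
    }
    return max(samples, key=samples.get)

random.seed(0)
stats = {"A": (50, 1000), "B": (20, 1000)}  # A converts ~5%, B ~2%
picks = [pick_variant(stats) for _ in range(1000)]
# "A" is served far more often than "B", mirroring the traffic shift
```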

What to look for:

  • Top-performing variants — These are receiving the most traffic and driving the best results.
  • Underperforming variants — These have been deprioritized by the bandit. Consider archiving them to free up the exploration budget.
  • New variants — Recently added variants may still be in the exploration phase with limited data. Give them time before judging.

Results include significance indicators that tell you how confident the system is in the observed differences. A result is meaningful when:

  • The variant has accumulated enough sends (typically 1,000+).
  • The observed difference is unlikely to be due to chance.

Don’t make shipping decisions on results that haven’t reached significance. Early data is noisy.
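The dashboard computes significance for you; as a rough sanity check on exported numbers, a standard two-proportion z-test captures the same idea (a statistical sketch, not necessarily the product's exact method):

```python
import math

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """z-statistic for the difference between two conversion rates.
    |z| > 1.96 corresponds to roughly 95% confidence that the
    observed difference is not due to chance."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# The same 2.0% vs 2.4% gap: noise at 1,000 sends, real at 100,000.
z_small = two_proportion_z(20, 1_000, 24, 1_000)          # |z| < 1.96
z_large = two_proportion_z(2_000, 100_000, 2_400, 100_000)  # |z| > 1.96
```

This is why the same observed lift can be meaningless early on and decisive later: the standard error shrinks as sends accumulate.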

Use date range filters to focus your analysis:

  • After adding new variants — Filter to the period after the new variants went live. Including pre-addition data will skew the comparison.
  • After milestoning — Filter to the current milestone period. Previous milestone data used a different control baseline.
  • Seasonal analysis — Narrow the window to compare performance during specific campaigns or time periods.
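If you export raw events for your own analysis, the same date-window logic is a one-line filter (the event records and field names here are hypothetical):

```python
from datetime import date

events = [
    {"ts": date(2024, 5, 1), "variant": "A", "clicked": True},
    {"ts": date(2024, 6, 2), "variant": "B", "clicked": False},
    {"ts": date(2024, 6, 10), "variant": "B", "clicked": True},
]

# Keep only events after the new variants went live on June 1,
# so pre-addition data doesn't skew the comparison:
live_date = date(2024, 6, 1)
filtered = [e for e in events if e["ts"] >= live_date]  # 2 events remain
```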

The dashboard can break down performance by user attributes (e.g., locale, plan type, lifecycle stage). This reveals:

  • Which segments respond best to which variants
  • Whether a variant that looks average overall is actually a strong winner for a specific audience
  • Opportunities for more targeted personalization
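A segment breakdown is essentially a group-by on a user attribute. A sketch over hypothetical exported events (field names are ours):

```python
from collections import defaultdict

def click_rate_by_segment(events, attribute):
    """Click rate per segment value: clicks / events in that segment."""
    clicks, totals = defaultdict(int), defaultdict(int)
    for e in events:
        totals[e[attribute]] += 1
        clicks[e[attribute]] += e["clicked"]  # True counts as 1
    return {seg: clicks[seg] / totals[seg] for seg in totals}

events = [
    {"variant": "A", "locale": "en", "clicked": True},
    {"variant": "A", "locale": "de", "clicked": False},
    {"variant": "A", "locale": "en", "clicked": True},
    {"variant": "A", "locale": "de", "clicked": False},
]
rates = click_rate_by_segment(events, "locale")
# Variant A averages 50% here, but it's a strong winner for "en"
# and a loser for "de" — exactly the pattern segment views surface.
```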

You can toggle between two counting methods:

  • Unique — Counts each user once, regardless of how many times they triggered the event. Best for understanding reach and per-user behavior.
  • All events — Counts every event instance. Better for understanding total volume (e.g., total revenue, total clicks).

Choose the mode that matches your optimization goal. For most engagement metrics (open rate, click rate), unique is the standard. For revenue, all-events is typically more useful.
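The two modes differ only in whether repeat events from the same user count. A minimal sketch with hypothetical event records:

```python
events = [
    {"user": "u1", "event": "click"},
    {"user": "u1", "event": "click"},  # same user clicked twice
    {"user": "u2", "event": "click"},
]

all_events = len(events)                   # 3: every instance counts
unique = len({e["user"] for e in events})  # 2: each user counted once
```

For a click-rate metric you would typically divide `unique` by unique recipients; for total revenue you would sum over every instance, which is why all-events fits volume metrics.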