Use this panel for tests where the arms trade one-off orders against subscription signups. Enter what each arm produced. Statistical significance is checked per metric in Lockbox; this panel answers a different question: which arm produced more value.
When the test goal is subscription signups, your analytics is nearly blind. It sees single payments, not who became a subscriber, how long they stay, or what their orders are worth over time. So teams fall back on what they can see, and a subscriber gets counted like a one-off conversion: one order, one AOV, done. A variant that trades a few one-off sales for subscription signups looks like a loser on the dashboard while quietly being the most valuable test of the year.
You don't need perfect data to fix this. You need a defensible model with explicit assumptions. Subscription economics reduce to three numbers: order value, order frequency, and monthly churn. Since you rarely know churn, this tool works in scenarios (conservative, base, optimistic) and, for test decisions, inverts the question entirely: how long must a subscriber stay for the variant to win? That break-even number is defensible even with zero subscription data, because it doesn't claim to know the answer. It tells the room exactly what they're betting on.
"It is better to be roughly right than precisely wrong."
ATTRIBUTED TO JOHN MAYNARD KEYNES
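With monthly churn c, the chance a subscriber is still active in month m is (1−c)ᵐ, counting the signup month as m = 0. Expected active months within a horizon of H months: (1 − (1−c)ᴴ) / c, which is the sum of (1−c)ᵐ over m = 0 to H−1.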
The horizon cap keeps the model honest: an uncapped 1/c at 4% churn claims 25 months of revenue, most of it far in the future. The default 24-month cap limits how much the distant tail can inflate today's decision. Churn is assumed constant, which slightly overstates early-month retention for most programs (real churn is highest in months 1–3), so treat results as mildly optimistic.
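To see what the cap does in practice, here is a minimal Python sketch of the capped geometric sum. It assumes the signup month counts as month 0, which is what makes the uncapped limit equal 1/c; the function name and the handling of zero churn are illustrative choices, not the tool's actual code.

```python
def expected_active_months(churn: float, horizon: int = 24) -> float:
    """Sum of (1 - churn)**m for m = 0 .. horizon-1 (capped geometric series)."""
    if not 0.0 <= churn <= 1.0:
        raise ValueError("churn must be between 0 and 1")
    if churn == 0.0:
        # Nobody leaves, so every month in the horizon counts in full.
        return float(horizon)
    return (1.0 - (1.0 - churn) ** horizon) / churn


# 4% monthly churn: the uncapped model claims 25 months,
# the 24-month cap credits roughly 15.6.
uncapped = 1.0 / 0.04
capped = expected_active_months(0.04, horizon=24)
```

The gap between the two figures is the distant tail that the cap refuses to credit to today's decision.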
Computed per scenario. This is the single number that changes how a subscription test is read.
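Assuming the per-scenario number is the expected value of one subscriber (order value × orders per month × expected active months), a rough sketch looks like this. The three churn rates and the example order figures are hypothetical placeholders, not the tool's defaults.

```python
# Hypothetical monthly churn per scenario; replace with your own assumptions.
SCENARIOS = {"conservative": 0.10, "base": 0.06, "optimistic": 0.03}


def expected_active_months(churn: float, horizon: int = 24) -> float:
    """Capped geometric sum of (1 - churn)**m for m = 0 .. horizon-1."""
    if churn == 0.0:
        return float(horizon)
    return (1.0 - (1.0 - churn) ** horizon) / churn


def subscriber_value(order_value: float, orders_per_month: float,
                     churn: float, horizon: int = 24) -> float:
    """Expected revenue from one subscriber within the horizon."""
    return order_value * orders_per_month * expected_active_months(churn, horizon)


# Example: a 30.00 order shipped once a month, valued under each scenario.
values = {
    name: subscriber_value(30.0, 1.0, churn)
    for name, churn in SCENARIOS.items()
}
```

Set next to a single-order AOV, these per-scenario values show how much a subscriber is undercounted when treated as one conversion.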
Per arm, per scenario:
The verdict compares arms in all three scenarios. If the same arm wins in all three, the decision is robust to the churn assumption. If the winner flips between scenarios, the tool says so: the decision then genuinely depends on a number you don't have, and the break-even below is the honest way to present it.
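The comparison can be sketched as follows, assuming arm value is one-off revenue plus subscribers times per-subscriber value. The dictionary keys, scenario churn rates, and example arms are all hypothetical; they are chosen so the winner flips between scenarios.

```python
SCENARIOS = {"conservative": 0.10, "base": 0.06, "optimistic": 0.03}  # hypothetical


def expected_active_months(churn, horizon=24):
    """Capped geometric sum of (1 - churn)**m for m = 0 .. horizon-1."""
    if churn == 0:
        return float(horizon)
    return (1 - (1 - churn) ** horizon) / churn


def arm_value(arm, churn, horizon=24):
    """One-off revenue plus expected subscription revenue for one arm."""
    sub_value = (arm["sub_order_value"] * arm["orders_per_month"]
                 * expected_active_months(churn, horizon))
    return arm["one_off_orders"] * arm["aov"] + arm["subscribers"] * sub_value


def verdict(control, variant, scenarios, horizon=24):
    """Return the winner per scenario and whether all scenarios agree."""
    winners = {}
    for name, churn in scenarios.items():
        c = arm_value(control, churn, horizon)
        v = arm_value(variant, churn, horizon)
        winners[name] = "variant" if v > c else "control" if c > v else "tie"
    robust = len(set(winners.values())) == 1
    return winners, robust


control = {"one_off_orders": 100, "aov": 40.0, "subscribers": 0,
           "sub_order_value": 30.0, "orders_per_month": 1.0}
variant = {"one_off_orders": 80, "aov": 40.0, "subscribers": 2,
           "sub_order_value": 30.0, "orders_per_month": 1.0}

winners, robust = verdict(control, variant, SCENARIOS)
```

With these made-up inputs the control wins under conservative and base churn and the variant wins under optimistic churn, so `robust` is `False` and the break-even is the figure to present.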
When one arm has fewer one-off orders but more subscribers, the value tie-point is the subscriber lifetime that exactly cancels the one-off deficit:
The break-even is reported in months, with the implied churn rate solved numerically from the capped-geometric formula. The claim "the variant wins if subscribers stay at least 4.2 months" requires no subscription data at all. It is a threshold, not an estimate, and the room can judge whether 4.2 months is a safe bet for your product.
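A minimal sketch of the break-even, assuming the tie-point is the one-off revenue deficit divided by the extra subscribers' combined monthly revenue, and using plain bisection for the implied churn. The function names and example figures are illustrative, and the tool's own solver may differ.

```python
def expected_active_months(churn, horizon=24):
    """Capped geometric sum of (1 - churn)**m for m = 0 .. horizon-1."""
    if churn == 0:
        return float(horizon)
    return (1 - (1 - churn) ** horizon) / churn


def break_even_months(one_off_deficit, extra_subscribers, monthly_value):
    """Active months per extra subscriber needed to cancel the one-off deficit."""
    if extra_subscribers <= 0 or monthly_value <= 0:
        raise ValueError("need extra subscribers and a positive monthly value")
    return one_off_deficit / (extra_subscribers * monthly_value)


def implied_churn(target_months, horizon=24, tol=1e-9):
    """Highest monthly churn at which expected active months still reach the target."""
    if target_months > horizon:
        return None  # unreachable within the horizon, even with zero churn
    if target_months <= 1.0:
        return 1.0   # the signup month alone already covers the target
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_active_months(mid, horizon) > target_months:
            lo = mid  # subscribers stay longer than needed, so churn can be higher
        else:
            hi = mid
    return (lo + hi) / 2


# Example: the variant lost 21 one-off orders at 60.00 each (a 1260.00 deficit)
# and gained 10 subscribers paying 30.00 a month.
months = break_even_months(21 * 60.0, 10, 30.0)
churn_ceiling = implied_churn(months)
```

Here the break-even is 4.2 months, and the implied ceiling is a monthly churn of roughly 24%: the variant wins if churn stays below that.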