Why hypotheses and success criteria matter
When you validate an idea, you are not trying to “prove you’re right.” You are trying to reduce uncertainty fast by testing the most important assumptions. A clear hypothesis turns a vague belief (“people will want this”) into a testable statement. Success criteria define what “good enough evidence” looks like before you run the test, so you don’t move the goalposts after seeing results.
Without hypotheses, you collect random feedback and end up with opinions that are hard to act on. Without success criteria, you can rationalize almost any outcome as a win (“some people liked it”), which leads to spending time and money on an idea that hasn’t earned it.
Core definitions (in practical terms)
Hypothesis
A hypothesis is a specific, testable claim about how a customer will behave or what outcome will occur under certain conditions. It should be written so that evidence can clearly support or contradict it.
- Bad (vague): “Users will love our app.”
- Better (testable): “When shown a 60-second demo and pricing, at least 20% of visitors will click ‘Start free trial’.”
Assumption
An assumption is something you believe is true but haven’t verified. Hypotheses are assumptions written in a testable form.
Success criteria
Success criteria are the measurable thresholds you set before running a test to decide whether the hypothesis is supported enough to proceed, revise, or stop. They include:
- Metric: what you will measure (e.g., conversion rate, reply rate, preorders).
- Threshold: the minimum value that counts as success (e.g., ≥ 15%).
- Time window / sample: how long you’ll run the test or how many observations you need.
- Decision rule: what you’ll do if you hit, miss, or land in-between.
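If it helps to keep these four parts honest, you can write them down as a structured record before the test starts. Here is a minimal Python sketch; the field names and example values are illustrative, not a standard:

```python
from dataclasses import dataclass

@dataclass
class SuccessCriteria:
    metric: str       # what you measure, e.g. "trial starts / unique visitors"
    threshold: float  # minimum value that counts as success
    window_days: int  # how long the test runs
    min_sample: int   # observations needed before deciding
    on_hit: str       # decision rule if the threshold is met
    on_miss: str      # decision rule if it is clearly missed

criteria = SuccessCriteria(
    metric="trial starts / unique visitors",
    threshold=0.15,
    window_days=14,
    min_sample=200,
    on_hit="test willingness-to-pay next",
    on_miss="revise the value proposition and rerun",
)
```

Any field you can't fill in yet is a gap in your success criteria.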
What makes a hypothesis “clear”
Clarity comes from removing wiggle room. A clear hypothesis usually includes five elements:
- Who: the audience segment you’re testing (e.g., “independent fitness coaches who sell 1:1 packages”).
- Trigger / situation: the context in which they encounter the offer (e.g., “after seeing a pricing page”).
- Action / outcome: what they will do (e.g., “book a call,” “pay a deposit,” “use the feature weekly”).
- Value mechanism: why they would do it (e.g., “because it saves 2 hours/week on scheduling”).
- Measurable threshold: how much is enough (e.g., “≥ 10 bookings from 200 visitors”).
When any of these are missing, you get ambiguous results. For example, “People will pay for it” is unclear because it doesn’t specify who, how much, or under what conditions.
Types of hypotheses you should write (and why)
Most early-stage validation can be organized into a small set of hypothesis types. Writing them explicitly helps you test in the right order.
1) Demand hypothesis (interest exists)
This tests whether people show meaningful interest when presented with the idea.
- Example: “From targeted outreach to 50 prospects, at least 15 will reply and 5 will request more details within 7 days.”
2) Value hypothesis (the promise is compelling)
This tests whether the specific benefit you claim resonates enough to drive action.
- Example: “When the offer is framed as ‘reduce reporting time from 3 hours to 30 minutes,’ the landing page conversion will be at least 2x higher than the generic ‘all-in-one dashboard’ framing.”
3) Willingness-to-pay hypothesis (money changes hands)
This tests whether interest translates into payment or a strong financial commitment signal.
- Example: “At least 3 out of 10 qualified leads will agree to a $200 deposit for early access after a 20-minute call.”
4) Channel hypothesis (you can reach people predictably)
This tests whether a distribution channel can deliver prospects at a cost and volume that could work.
- Example: “Using LinkedIn messages, we can book 8 calls per 100 messages sent, with no more than 2 hours of founder time per week.”
5) Usability / activation hypothesis (people can get value quickly)
This tests whether users can complete key steps and experience the promised benefit.
- Example: “In a guided prototype test, 80% of participants will complete the core workflow in under 5 minutes without help.”
6) Retention hypothesis (value repeats)
This tests whether the product becomes part of a routine.
- Example: “Among 20 trial users, at least 30% will use the core feature weekly for 4 weeks.”
Step-by-step: how to formulate hypotheses
Step 1: List your assumptions and rank them by risk
Write down everything that must be true for the idea to work. Then rank each assumption by:
- Impact: if this assumption is false, does the idea fail?
- Uncertainty: how much evidence do you already have? (The less evidence, the higher the uncertainty.)
Start with assumptions that are both high-impact and high-uncertainty. This prevents you from optimizing details before you know the core is viable.
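A simple way to run this ranking is to score each assumption on both dimensions and sort by the product. The sketch below assumes a 1–5 scale and made-up assumptions; any consistent scale works:

```python
# Rank assumptions by risk = impact x uncertainty
# (the 1-5 scale and the claims are illustrative).
assumptions = [
    {"claim": "Coaches will pay $99/month", "impact": 5, "uncertainty": 4},
    {"claim": "LinkedIn outreach can book calls", "impact": 4, "uncertainty": 3},
    {"claim": "Users can onboard without help", "impact": 3, "uncertainty": 2},
]

# Highest-impact, highest-uncertainty assumptions come first: test those.
ranked = sorted(assumptions, key=lambda a: a["impact"] * a["uncertainty"], reverse=True)

for a in ranked:
    print(f'{a["impact"] * a["uncertainty"]:>2}  {a["claim"]}')
```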
Step 2: Convert the top assumption into a testable statement
Use a simple template:
We believe that [WHO] will [ACTION/OUTCOME] when [CONDITION/TRIGGER] because [VALUE MECHANISM].
Then add measurement:
We will consider this true if [METRIC] is at least [THRESHOLD] within [TIME WINDOW / SAMPLE].
Example:
We believe that boutique accounting firms will request a demo when shown an offer to cut monthly close time by 30% because it reduces overtime and errors. We will consider this true if at least 8 out of 40 targeted outreach recipients request a demo within 10 days.
Step 3: Define what you will measure (choose one primary metric)
Pick a single primary metric that best represents the behavior you care about. Secondary metrics are allowed, but they should not override the primary decision.
- Primary metric examples: demo requests, deposits paid, email signups, completed onboarding, weekly active usage.
- Secondary metric examples: time on page, qualitative comments, feature preferences.
A common mistake is choosing a metric that is easy to collect but weakly connected to the business (e.g., social likes). Prefer metrics that require effort, commitment, or money.
Step 4: Set a threshold that forces a decision
Success criteria should be ambitious enough to protect you from false positives, but realistic enough that you can hit them if the idea is truly promising.
To set a threshold, use one (or more) of these approaches:
- Benchmark approach: compare to typical conversion rates for similar actions (e.g., cold outreach reply rates, landing page conversion). If you don’t know benchmarks, start with a conservative range and adjust after one test cycle.
- Economics approach: work backward from what you would need for the business to work (e.g., if you need 10 customers/month and expect 10% close rate from calls, you need 100 calls/month; then set criteria for call booking rate).
- Opportunity cost approach: set the bar high enough that passing means the idea deserves more time than your other options.
Also define a failure threshold (clearly not working) and an uncertain zone (needs iteration). This avoids binary thinking and makes next steps obvious.
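Here is the economics approach worked through in code, using the numbers from the bullet above; the channel capacity figure is an added assumption for illustration:

```python
# Work backward from the business target to a testable threshold.
customers_needed_per_month = 10
close_rate_from_calls = 0.10  # assumed: 10% of calls become customers

calls_needed = customers_needed_per_month / close_rate_from_calls
print(f"Calls needed per month: {calls_needed:.0f}")  # 100

# If the channel can send ~1,000 messages/month (illustrative capacity),
# the call-booking rate you must hit is:
messages_per_month = 1_000
required_booking_rate = calls_needed / messages_per_month
print(f"Required booking rate: {required_booking_rate:.1%}")  # 10.0%
```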
Step 5: Specify the test conditions and keep them stable
Write down the exact conditions of the test so results are interpretable:
- What message/offer will be shown?
- What price (if any) will be presented?
- What channel will be used?
- What counts as a qualified lead?
- How long will you run it?
If you change multiple variables at once (message, price, audience, channel), you won’t know what caused the outcome. If you must change something mid-test, treat it as a new test with new success criteria.
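One way to keep conditions stable is to record them as a read-only object at the start of the test, so a mid-test change is forced to be explicit. A minimal sketch with illustrative values:

```python
from types import MappingProxyType

# Freeze the test conditions up front; changing any of these mid-test
# should mean starting a new test with new success criteria.
test_conditions = MappingProxyType({
    "offer": "Cut monthly close time by 30%",
    "price_shown": "$200/month",
    "channel": "cold email to boutique accounting firms",
    "qualified_lead": "firm with 5-50 staff, does monthly closes in-house",
    "duration_days": 10,
})

# test_conditions["price_shown"] = "$150/month"  # would raise TypeError
```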
Step-by-step: how to define strong success criteria
1) Choose a “commitment ladder” metric
Different actions signal different levels of commitment. In general, stronger signals are better for validation because they reduce the chance you’re measuring polite interest.
- Low commitment: “This is cool,” social likes, casual survey answers.
- Medium commitment: email signup, waitlist join, booking a call.
- High commitment: deposit, preorder, paid pilot, signed letter of intent.
Pick the highest-commitment metric that is feasible at your stage. For example, if you can’t ethically take preorders yet, a deposit for a paid pilot or a scheduled call with a clear agenda may be the next best signal.
2) Define the denominator (so the metric is meaningful)
“10 signups” means nothing unless you know how many exposures produced them. Always define the denominator:
- Signups per unique visitors
- Replies per messages sent
- Deposits per qualified calls
- Activations per trial starts
This prevents you from celebrating raw counts that come from tiny samples or unqualified traffic.
3) Set a minimum sample size or time window
Small samples can mislead you. You don’t need advanced statistics to be disciplined; you need a rule that prevents overreacting to noise.
- Example rule: “Run until we have 100 landing page visitors from the intended channel” or “Conduct 15 prototype sessions.”
- Time-based rule: “Run for 7 days” (useful when traffic is steady).
Choose a sample size that you can realistically reach and that matches the decision’s importance. Higher-stakes decisions require more evidence.
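If you want a concrete feel for how noisy a small sample is, a standard 95% Wilson score interval around an observed conversion rate makes the point; the counts below are made up:

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for an observed proportion (z=1.96)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

# Same 10% conversion, very different certainty:
print(wilson_interval(2, 20))    # roughly (0.03, 0.30)
print(wilson_interval(10, 100))  # roughly (0.06, 0.17)
```

Both tests show a 10% conversion, but the smaller sample is consistent with anything from roughly 3% to 30%, which is too wide to decide on.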
4) Add a quality filter (qualification criteria)
Not every response is equal. Define what counts as a qualified signal. For example, a “demo request” might only count if the person matches your target profile and confirms they have the relevant need and authority.
Example qualification checklist for a B2B demo request:
- They currently use a workaround or competitor
- They experience the problem at least weekly
- They can influence purchase
- They confirm budget range or willingness to pay
Then define success criteria using qualified counts, not total counts.
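In code, this just means filtering before you count. A minimal sketch; the checklist field names mirror the B2B example above and are illustrative:

```python
# Count only qualified signals, not raw demo requests.
def is_qualified(lead: dict) -> bool:
    return (
        lead.get("uses_workaround_or_competitor", False)
        and lead.get("problem_frequency_per_week", 0) >= 1
        and lead.get("can_influence_purchase", False)
        and lead.get("confirms_budget_or_wtp", False)
    )

demo_requests = [
    {"uses_workaround_or_competitor": True, "problem_frequency_per_week": 3,
     "can_influence_purchase": True, "confirms_budget_or_wtp": True},
    {"uses_workaround_or_competitor": False, "problem_frequency_per_week": 5,
     "can_influence_purchase": True, "confirms_budget_or_wtp": False},
]

qualified = [r for r in demo_requests if is_qualified(r)]
print(f"{len(qualified)} qualified out of {len(demo_requests)} demo requests")
```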
5) Predefine the decision and next action
Write what you will do for each outcome:
- If success: proceed to the next riskiest hypothesis (often willingness-to-pay or activation).
- If uncertain: revise one variable (message, offer, price framing) and rerun.
- If failure: stop or pivot the assumption (different value mechanism, different channel, different pricing model).
This turns validation into a repeatable process rather than an emotional debate.
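You can make the predefined decision literal by encoding the three zones before the test starts. A minimal sketch with illustrative thresholds:

```python
def decide(observed: float, success_at: float, failure_at: float) -> str:
    """Map an observed metric to a next action decided before the test."""
    if observed >= success_at:
        return "success: proceed to the next riskiest hypothesis"
    if observed <= failure_at:
        return "failure: stop, or pivot the assumption"
    return "uncertain: revise one variable and rerun"

print(decide(observed=0.12, success_at=0.15, failure_at=0.05))
# -> "uncertain: revise one variable and rerun"
```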
Examples: turning vague ideas into hypotheses + success criteria
Example A: Service business (done-for-you)
Vague belief: “Small teams want help with social media.”
Clear hypothesis:
We believe that seed-stage B2B SaaS founders will book a 30-minute call when offered a done-for-you LinkedIn posting service that saves them 3 hours/week because they lack time to write consistently.
Success criteria:
- Metric: qualified calls booked / outreach messages sent
- Threshold: ≥ 6 qualified calls per 80 messages within 10 days
- Qualification: founder confirms they currently post ≤ 1x/week and want to increase frequency
- Decision rule: if ≥ 6 calls, test willingness-to-pay with a paid pilot offer; if 3–5, revise messaging; if ≤ 2, reconsider offer or channel
Example B: Digital product (subscription)
Vague belief: “People will subscribe to our tool.”
Clear hypothesis:
We believe that freelance designers will start a trial after seeing a pricing page for a tool that generates client-ready proposals in under 10 minutes because it reduces admin time and helps them close faster.
Success criteria:
- Metric: trial starts / unique visitors to pricing page
- Threshold: ≥ 8% trial start rate from 300 visitors over 14 days
- Secondary: ≥ 25% of trial users create at least one proposal within 24 hours
- Decision rule: if primary hits but activation misses, improve onboarding; if primary misses, revise value proposition or pricing
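To see how Example B's decision rule plays out mechanically, here is a sketch with made-up results (27 trial starts from 300 visitors, 5 early activations):

```python
# Example B's criteria as data, checked against hypothetical results.
PRIMARY_THRESHOLD = 0.08    # trial starts / unique visitors
SECONDARY_THRESHOLD = 0.25  # trials creating a proposal within 24h

visitors, trial_starts, activated_trials = 300, 27, 5  # made-up outcome

primary = trial_starts / visitors            # 0.09 -> hits
secondary = activated_trials / trial_starts  # ~0.19 -> misses

if primary >= PRIMARY_THRESHOLD and secondary < SECONDARY_THRESHOLD:
    print("Primary hits, activation misses: improve onboarding")
elif primary < PRIMARY_THRESHOLD:
    print("Primary misses: revise value proposition or pricing")
else:
    print("Both hit: proceed")
```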
Example C: Physical product (preorder signal)
Vague belief: “Customers will buy this accessory.”
Clear hypothesis:
We believe that commuters who bike to work will place a preorder for a weatherproof backpack insert that keeps electronics dry because it prevents damage without needing a new bag.
Success criteria:
- Metric: preorders / landing page visitors from targeted ads
- Threshold: ≥ 2.5% preorder rate from 1,000 visitors within 21 days
- Guardrail: refund request rate ≤ 5% after preorder confirmation email
- Decision rule: if preorder rate hits, proceed to supplier quotes; if not, test alternative positioning or price
Common pitfalls (and how to fix them)
Pitfall 1: Testing opinions instead of behavior
“Would you use this?” invites polite answers. Replace it with a behavior-based test and metric (click, book, pay, use). If you must ask questions, tie them to tradeoffs: “Which would you choose today?” and then measure what they actually do next.
Pitfall 2: Success criteria that are too easy
If your threshold is “at least 1 person is interested,” almost every idea passes. Raise the bar until passing would genuinely change your confidence and justify the next investment.
Pitfall 3: Moving the goalposts after results
Decide thresholds in advance and write them down. If you want to change criteria, do it before the next test, not after seeing data.
Pitfall 4: Mixing multiple hypotheses in one test
If you test a new channel, new audience, new message, and new price simultaneously, you can’t diagnose failure. Keep one primary hypothesis per test, and treat other changes as controlled variables.
Pitfall 5: Ignoring negative signals because of a few enthusiastic people
Outliers can be useful, but they can also mislead. Use your success criteria to avoid being swayed by a single excited response. If you see outliers, write a new hypothesis about that niche and test it intentionally.
A practical worksheet you can copy
Use this structure to write each hypothesis and its success criteria in a consistent way:
Hypothesis name: (e.g., Demand via cold outreach, v1)
We believe that [WHO] will [ACTION] when [CONDITION] because [VALUE MECHANISM].
Test method: (landing page / outreach / prototype session / paid pilot)
Primary metric: [metric] = [numerator] / [denominator]
Success threshold: ≥ [X] by [date] or within [N] observations
Failure threshold: ≤ [Y] by [date] or within [N] observations
Uncertain zone: between [Y] and [X] (iteration required)
Qualification rules: (what counts, what doesn’t)
Decision rule:
- If success: next hypothesis to test is [ ... ]
- If uncertain: change [one variable] and rerun
- If failure: pivot [value mechanism / channel / pricing / segment]
How to choose the next hypothesis to test
Once you have one hypothesis and criteria, you need a sequence. A simple rule is: test the assumption that would most quickly make the idea non-viable if false. Often, that means starting with demand and willingness-to-pay signals before polishing features.
You can maintain a “hypothesis backlog” like this:
- H1 (highest risk): People take a meaningful first step (book/pay).
- H2: The price point is acceptable.
- H3: The channel can reach enough people efficiently.
- H4: Users can achieve the promised outcome quickly.
- H5: They repeat usage or renew.
For each, write success criteria that justify moving to the next. This keeps your validation focused and prevents you from building prematurely.