How to Run Your First A/B Test (Step-by-Step)

Q: How much traffic do I need to run a test?

There’s no hard minimum, but you need at least a few hundred conversions per variation to reach statistical significance. If your page gets fewer than 100 visitors per day, focus on high-traffic pages first or plan for a longer test duration.

Q: Can I test more than two variations at once?

Yes. You can create multiple variations, but more variations require more traffic because each gets a smaller share. For your first test, two variations (Original vs. one alternative) is the best approach.

Q: What if my test shows no significant difference?

A neutral result is still a result. It means the change doesn’t meaningfully impact user behavior. Review session replays to understand why, then use that insight to design a better test. Not every hypothesis will be correct.

Q: Will the experiment cause a visible flicker when it loads?

In most cases, changes apply fast enough that users don’t notice. If you’re modifying above-the-fold elements on a page with significant render time, you can use activation settings to control exactly when the experiment loads.

Key Takeaways

A good A/B test starts with a clear hypothesis grounded in real user data, not a random guess
Inspectlet’s visual editor lets you build variations without writing code—just click and edit page elements directly
Choosing the right goal type (URL view, click, custom event, or engagement) determines whether your results are meaningful
Never call a winner until you reach statistical significance—typically 95% or higher—regardless of how promising the data looks early on
Watch session replays of each variation to understand why a version won, not just that it won

Before Your First Test: What to Test and Why

The biggest mistake teams make with A/B testing isn't technical—it's picking the wrong thing to test. A random change to a random page rarely produces meaningful results. Great tests start with evidence.

Finding Test Candidates from Real Data

Before you touch the experiment builder, spend 30 minutes reviewing your existing analytics data. You're looking for pages where users struggle, hesitate, or drop off. The best sources for test candidates are:

Session recordings: Watch 15–20 recordings of users on your highest-traffic conversion pages. Look for patterns: do users scroll past your call-to-action? Do they hover over a button but not click? Do they start filling out a form and abandon it? These behavioral patterns are goldmines for test ideas.
Heatmaps: Click heatmaps show you where users actually click versus where you want them to click. If your "Start Free Trial" button gets fewer clicks than the navigation menu on a landing page, that's a test candidate.
Form analytics: If your signup form has a 60% abandonment rate and most users drop off at the "Company Size" field, you have a clear hypothesis: removing or simplifying that field might increase completions.
Funnel data: Any step in your conversion funnel with a drop-off rate above 40% deserves investigation. That's where an A/B test can have the highest impact on revenue.

Forming a Hypothesis

Every test needs a hypothesis written before you create the experiment. A good hypothesis follows this formula:

“Because we observed [evidence], we believe [change] will [expected outcome], which we’ll measure by [metric].”

For example: “Because we observed that 70% of users on the pricing page never scroll to the comparison table, we believe moving the most popular plan to the top of the page will increase Plan Select clicks by 15%, which we’ll measure by the click-through rate on the plan selection button.”

Without a hypothesis, you're gambling. With one, you're running a scientific experiment—and even a “losing” test teaches you something about your users.

Step 1: Create the Experiment

Once you have your hypothesis, it's time to build the experiment. In Inspectlet, go to your dashboard, navigate to the A/B Testing section, and create a new experiment. Give it a descriptive name that references your hypothesis—something like “Pricing Page — Popular Plan First” rather than “Test 1.”

Using the Visual Editor

The visual editor opens your live site in an editing environment with two modes:

Interact mode: Browse your site normally. Click links, open dropdowns, navigate to the exact page and state you want to test. Use this mode to get the page into the right starting condition before you begin editing.
Edit mode: Click any element on the page to modify it. You can change text, swap colors, hide or show elements, move components around, and adjust styling—all without writing a single line of code.

Switch to Edit mode when you're ready to make changes. Click the element you want to modify and an editing panel appears. You might change a headline, update a button label, rearrange sections, or hide a distracting element. The editor supports undo and redo, so experiment freely—you won't break anything.

Building Your Variation

Your experiment starts with the Original—the page as it exists today. You then create one or more variations that implement your hypothesis. For your first test, stick to a single variation (Original vs. Variation A). This keeps the test clean and requires less traffic to reach significance.

Each variation can include:

Visual changes: Text edits, color changes, element visibility, layout modifications—anything you can do in the visual editor
Custom JavaScript: For more complex changes, you can add JavaScript that runs when the variation loads. This is useful for dynamic content or logic-driven changes.
URL redirects: If your variation is an entirely different page (a complete redesign, for example), you can redirect users to a different URL instead of modifying the existing page

First Test Tip

Change only one thing in your first test. If you change the headline, button color, and layout all at once, you won't know which change drove the result. Single-variable tests are easier to learn from and faster to reach statistical significance.

You can enable or disable individual variations without deleting them. This is useful if you want to pause a variation that's clearly underperforming while letting the remaining ones continue collecting data.

Step 2: Set Your Goals

Goals define what “winning” means for your test. Without goals, you'll have data about how many visitors saw each variation but no way to measure which one performed better. Inspectlet supports four goal types, each suited to different scenarios:

URL View Goals

A URL view goal counts a conversion whenever a visitor reaches a specific page. This is the most common goal type for funnel-based tests. Use it when success means the user navigated somewhere specific:

Reaching a “Thank You” or confirmation page after a form submission
Visiting the checkout page from a product page
Landing on a specific feature page from the homepage

Click Goals

A click goal tracks when a visitor clicks a specific element on the page. This is ideal when your test focuses on engagement with a particular component rather than full-funnel navigation:

Clicking a “Start Free Trial” button
Clicking a pricing plan selection
Clicking a video play button

Custom Event Goals

Custom event goals trigger when your code fires a specific event using Inspectlet’s tagSession method. This gives you full flexibility to define conversions based on any application logic:

// Fire when a user completes a multi-step action
__insp.push(['tagSession', 'completed_onboarding']);

Use custom events for actions that don't have a dedicated URL or clickable element—like completing an in-app tutorial, reaching a scroll depth threshold, or triggering a specific application state.

Engagement Goals

An engagement goal counts any meaningful interaction with the page—specifically, the first mouse-down event. This is a broad goal useful for measuring whether a variation encourages users to interact with the page at all, rather than bouncing immediately.

Choosing the Right Goal

Match your goal type to your hypothesis. If your hypothesis is “moving the CTA above the fold will increase signups,” a URL view goal on the signup confirmation page is the right choice. If it's “a more compelling button label will get more clicks,” use a click goal on that button. Your goal should directly measure the outcome your hypothesis predicts.

You can attach multiple goals to a single experiment. This lets you track both a primary metric (the one tied to your hypothesis) and secondary metrics (engagement, intermediate steps) to get the full picture.

Step 3: Configure and Deploy

Traffic Split

By default, traffic is split evenly between your variations. For a two-variation test (Original vs. Variation A), each gets 50% of visitors. This is the right starting point for most first tests because it reaches statistical significance fastest.

If you're nervous about a dramatic change, you can allocate less traffic to the variation—say 80/20—but know that this will take significantly longer to produce statistically meaningful results. For your first test, stick with an even split.

Activation Settings

Activation settings control when the experiment loads for a visitor. By default, the experiment activates as soon as the page loads. This works well for most tests.

However, if you're testing a page that loads dynamically (a single-page application, for instance), you may need to activate the experiment manually using JavaScript:

__insp.push(['activateExperiment', experimentId]);

This ensures the experiment doesn't fire until the relevant content has actually rendered on the page. We'll cover this in more detail in the advanced section at the end of this guide.

Deploying the Experiment

When your variations are built, your goals are set, and your traffic split is configured, it's time to deploy. Click Save to store your experiment, then Deploy to push it live. The experiment state changes from “Saved” to “Deployed,” and visitors will immediately start seeing your variations.

You can pause and resume a deployed experiment at any time without losing data. This is useful if you spot an issue after launching—pause the experiment, fix the problem, and resume.

Pre-Launch Checklist

Before deploying, verify your experiment in a private/incognito window. Switch between variations in the visual editor to confirm each one looks correct. Check that your goals are properly configured. Test on both desktop and mobile screen sizes. Five minutes of QA can save you from running a broken test for two weeks.

Step 4: Wait for Results

This is where most first-time testers struggle—patience. Once the experiment is live, you need to resist the urge to check results every hour and declare a winner after 48 hours.

How Long to Run a Test

The minimum test duration depends on three factors: your traffic volume, the size of the effect you're measuring, and the baseline conversion rate. As a general rule:

High-traffic pages (1,000+ daily visitors): You might reach significance in 1–2 weeks
Medium-traffic pages (200–1,000 daily visitors): Plan for 2–4 weeks
Low-traffic pages (under 200 daily visitors): You may need 4–8 weeks, and should consider testing a page with more traffic instead

Regardless of traffic, always run a test for at least one full business cycle—typically one or two complete weeks. This accounts for day-of-week effects (weekday vs. weekend behavior can differ dramatically) and avoids skewed results from anomalous days.

Reading the Results Page

Inspectlet’s results page shows you the data that matters for each goal you've configured:

Visitors: How many unique visitors were assigned to each variation
Conversions: How many visitors in each variation completed the goal
Conversion rate: Conversions divided by visitors, expressed as a percentage
Delta vs. control: How the variation's conversion rate compares to the original—shown as a percentage change like “+40%” or “−12%”
Statistical significance: A percentage representing how confident you can be that the difference is real, not random chance. Displayed as a value like “97.2%”

When to Call a Winner

Wait until statistical significance reaches 95% or higher before making any decisions. Below 95%, the observed difference could easily be noise. Here's a practical framework:

Below 80% significance: Too early to draw conclusions. Keep the test running.
80–94% significance: Promising but not conclusive. Continue running unless you've already exceeded your planned duration.
95%+ significance: You can confidently declare a result. If the variation outperforms the original, it's your winner.

If after a reasonable duration (4+ weeks) your test hasn't reached 95% significance, the practical difference between the variations is likely too small to matter. That's still a valid result—it tells you this particular change doesn't meaningfully impact user behavior.

Step 5: Act on Results

Implementing the Winner

When your test produces a clear winner, the next step is making the change permanent. Since Inspectlet experiments run client-side in the browser, the experiment itself can continue serving the winning variation—but best practice is to implement the winning change directly in your codebase. This eliminates the brief flicker that can occur before the experiment loads and removes the dependency on the experiment script for a permanent change.

Watching Session Replays of Each Variation

Here's where Inspectlet's combination of A/B testing and session replay becomes uniquely powerful. Instead of just knowing that Variation A converted 25% better than the Original, you can watch recordings of real users in each variation to understand why.

Filter your session recordings by experiment variation and look for behavioral differences:

Did users in the winning variation scroll less before finding the CTA?
Did users in the losing variation hesitate or hover over confusing elements?
Did the winning variation reduce form abandonment at a specific field?
Were there unexpected interactions—users engaging with elements you didn't change?

This qualitative layer turns a simple “A beat B” into a deep understanding of user behavior that informs your next ten tests, not just this one.

Run Your First A/B Test Today

Visual editor. Built-in session replay. No code changes required.

Start Free

Common First-Test Mistakes

After helping thousands of teams run their first experiments, these are the mistakes we see most often:

1. Testing Too Many Things at Once

Changing the headline, button color, layout, and copy in a single variation makes it impossible to know which change caused the result. If the variation wins, you can't isolate the impact. If it loses, you might discard a good idea because a bad idea dragged it down. Start with single-variable tests and graduate to multivariate testing once you're comfortable.

2. Ending Tests Too Early

Checking your results on day two and seeing “Variation A is up 80%!” at 60% significance is not a result. Early data is volatile—small sample sizes produce wild swings. Teams that end tests early waste their effort because the “winner” often reverses once more data comes in. Commit to your planned duration and significance threshold before launching.

3. Ignoring Qualitative Data

Numbers tell you what happened. Session replays tell you why. A variation might win by 10%, but watching the replays reveals that users were confused by the new layout and succeeded despite it—meaning a cleaner version could win by 30%. Always review session recordings alongside your quantitative results. The CRO guide covers this mixed-methods approach in detail.

4. Not Having a Hypothesis

Without a hypothesis, you're not testing—you're guessing. A hypothesis gives you a framework for interpreting results. If your test “fails” but you had a clear hypothesis, you've learned that your assumption was wrong. That's valuable. If your test “fails” and you had no hypothesis, you've learned nothing.

Five Great First Tests to Try

If you're not sure where to start, these five tests work well for nearly every website. Each targets a high-impact area and is simple enough to set up in under 30 minutes:

1. CTA Button Text

Test your primary call-to-action button copy. Generic text like “Submit” or “Click Here” almost always loses to specific, benefit-oriented text. Try “Start My Free Trial” vs. “Get Started” vs. “See It in Action.” Use a click goal on the button to track engagement, and a URL view goal on the next page to track follow-through.

2. Form Field Reduction

If your signup or contact form has more than four fields, test a shorter version. Remove the fields that aren't essential for the initial conversion (you can ask for additional information later). Every field you remove reduces friction. Use form analytics data to identify which fields have the highest abandonment rates, then remove or combine those first.

3. Headline Variants

Your homepage or landing page headline is the first thing visitors read. Test three approaches: a benefit-focused headline (“Increase Conversions by 30%”), a problem-focused headline (“Stop Losing Customers at Checkout”), and a curiosity-driven headline (“What 10,000 Session Recordings Revealed”). Use an engagement goal to measure whether the headline pulls users into the page.

If you have testimonials, logos, or case study references, test their placement. Move them higher on the page—above the fold, near the CTA, or alongside your pricing. Many teams bury social proof at the bottom of the page where few visitors ever see it. Session replays will show you exactly how far users scroll before converting or bouncing.

5. Pricing Page Layout

The pricing page is typically one of the highest-leverage pages on your site. Test layout changes: horizontal plan cards vs. a vertical comparison table, highlighting a “Most Popular” plan, adding a monthly vs. annual toggle, or simplifying the feature comparison. Use a URL view goal on the checkout or signup page to measure which layout drives more plan selections.

Advanced: Manual Activation for Single-Page Apps

If your site is built with React, Angular, Vue, or another single-page application framework, pages don't reload during navigation. This means a standard experiment that activates on page load may fire at the wrong time—before the target content has rendered.

The solution is manual activation. Instead of letting the experiment activate automatically, you trigger it from your code when the relevant content is ready:

// Activate when the pricing page component mounts
useEffect(() => {
    if (window.__insp) {
        __insp.push(['activateExperiment', 12345]);
    }
}, []);

This pattern works with any framework. The key is calling activateExperiment after the page content your experiment modifies is present and rendered. In React, this typically means inside a useEffect hook. In Angular, it's in ngAfterViewInit. In Vue, use the mounted lifecycle hook.

Manual activation also opens up advanced use cases like activating experiments based on user actions (e.g., only test a new checkout flow for users who have items in their cart) or based on URL parameters.

Since Inspectlet experiments run entirely in the browser with JavaScript, no server-side changes are needed regardless of your tech stack. The visual editor and activation system work with static sites, server-rendered applications, WordPress, Shopify, and every major JavaScript framework.

Frequently Asked Questions

Do I need to write code to run an A/B test?

No. Inspectlet's visual editor lets you build variations by clicking and editing elements directly on your page. You only need code for advanced scenarios like manual activation on single-page apps or custom event goals.

How much traffic do I need to run a test?

There's no hard minimum, but as a practical guideline, you need at least a few hundred conversions per variation to reach statistical significance within a reasonable timeframe. If your page gets fewer than 100 visitors per day, focus on high-traffic pages first or plan for a longer test duration.

Can I test more than two variations at once?

Yes. You can create multiple variations (Original, A, B, C, etc.), but more variations require more traffic because each variation gets a smaller share of visitors. For your first test, two variations (Original vs. one alternative) is the best approach. Save multi-variant tests for after you've built some experience.

What if my test shows no significant difference?

A neutral result is still a result. It means the change you tested doesn't meaningfully impact user behavior—which is useful knowledge. Review session replays to understand why the change had no effect, then use that insight to design a better test. Not every hypothesis will be correct, and that's expected.

Will the experiment cause a visible flicker when it loads?

In most cases, the experiment applies changes fast enough that users don't notice. If you're modifying above-the-fold elements on a page with significant render time, you can use activation settings to control exactly when the experiment loads, minimizing any visual transition.