Experiments

Elevator pitch

PostHog Experiments run on top of Feature Flags, so every experiment is a controlled rollout by default. Multiple metric types — funnels, trends, retention, ratio metrics — let you measure what actually matters, not just what's easy to track. Watch session replays of each variant's users. Measure side effects across your entire product.

For self-driving development, this means PostHog Code can create experiments and re-query the results automatically, which makes experiments the evaluation layer for agents.

Most experiment platforms run tests. PostHog runs tests and closes the loop from signal to fix to evaluation.

The unique belief (in terms of experiments)

In traditional product development, an A/B test is the last step before shipping. In self-driving development, A/B tests are tactically important because they turn product data into a control system. They let the agent make changes without pretending it has perfect judgment. The agent can be wrong safely, because every change has a measured blast radius and a pass/fail signal.

That’s the PostHog-shaped belief: autonomous coding only becomes trustworthy when the agent can measure whether its own work improved the product.

Experiments are how the autonomy loop knows whether it's working. Without experiments, agents can ship changes indefinitely without knowing if anything improved. Experiments are the feedback signal that makes self-driving product development trustworthy — not just fast.

Who this is for

Who this isn't for

Messaging

Message 1: Experiments as the evaluation layer for agents

Problem: An agent that ships code without measuring impact isn't self-driving; it's just automated code generation. The autonomy loop requires a closed feedback cycle: change made → outcome measured → agent learns. Experiments supply the "outcome measured" step of that cycle.

Solution: PostHog Code can scaffold experiments end-to-end. The same metric that triggered the fix signal is used as the experiment goal. Post-merge, PostHog Code re-queries the result and evaluates whether the change worked — without human intervention.
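As a rough illustration of what a pass/fail signal could look like, here is a minimal sketch that turns conversion counts for two variants into a verdict using a generic two-proportion z-test. This is not PostHog's statistics engine and not how PostHog Code evaluates results; the function name `evaluate_experiment`, its parameters, and the three verdict strings are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def evaluate_experiment(control_conversions, control_users,
                        test_conversions, test_users, alpha=0.05):
    """Return "pass", "fail", or "inconclusive" for a conversion metric.

    Hypothetical helper: a textbook two-sided, pooled two-proportion z-test,
    used here only to show how a metric can become a pass/fail signal.
    """
    control_rate = control_conversions / control_users
    test_rate = test_conversions / test_users

    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (control_conversions + test_conversions) / (control_users + test_users)
    std_err = sqrt(pooled * (1 - pooled) * (1 / control_users + 1 / test_users))
    if std_err == 0:
        # No variation at all (e.g. zero conversions everywhere): nothing to conclude.
        return "inconclusive"

    z = (test_rate - control_rate) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    if p_value >= alpha:
        return "inconclusive"
    return "pass" if z > 0 else "fail"
```

An agent-style loop could call this after re-querying the goal metric, for example `evaluate_experiment(100, 1000, 150, 1000)`, and keep, revert, or keep waiting depending on the verdict.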

Supporting features:

Message 2: From flag to experiment to decision in one platform

Problem: Running an experiment can require a flag in LaunchDarkly, analytics in Amplitude, and a session tool in FullStory. Alternatively, you rely on a dedicated tool like Optimizely or VWO with a more limited feature set.

Solution: PostHog Experiments are built on Feature Flags. Every experiment starts as a flag variant. Every result is visible in PostHog Analytics. Every variant's sessions are watchable in PostHog Replay. Everything is in one place, with one data model, and all of it is available to the PostHog MCP and PostHog Code to support agentic workflows.

Supporting features:

Message 3: Watch replays of your experiment variants

Problem: Experiment results tell you which variant won. They don't tell you why. The "why" usually requires separate qualitative research or guesswork about what users were experiencing differently.

Solution: PostHog lets you filter session replays by experiment arm. Watch what users in the control group did. Watch what users in the treatment group did. Understand the behavioral difference, not just the metric difference.

Supporting features:

Battle cards

vs Optimizely

Their approach: Mature web experimentation with no-code visual editor, strong CRO tooling. Enterprise-focused, expensive. No native session replay or product analytics.

Where PostHog wins:

Objections

"We can't run experiments with our traffic volume"

Answer: PostHog shows minimum detectable effect and required sample size before you start an experiment. It won't let you launch an underpowered test without a clear warning. For low-traffic products, PostHog supports Bayesian-style continuous monitoring via sequential testing, reducing the time to a confident result.
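To make the relationship between minimum detectable effect and required sample size concrete, here is a minimal sketch using the standard textbook formula for a two-sided two-proportion z-test. It is an assumption-laden illustration, not the calculation PostHog performs: the function name, the default `alpha` and `power` values, and the choice of a relative (rather than absolute) effect are all hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size_per_variant(baseline_rate, relative_mde,
                                     alpha=0.05, power=0.8):
    """Approximate users needed per variant to detect a relative lift.

    Hypothetical helper based on the classic two-proportion z-test formula;
    a real experimentation tool may use a different method.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)  # conversion rate if the lift is real

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power

    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)
```

With a 10% baseline conversion rate and a 10% relative lift, `required_sample_size_per_variant(0.10, 0.10)` comes out at around 15,000 users per variant, and halving the detectable lift roughly quadruples that number. This is why low-traffic products need either larger effects or longer run times.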

"We already use LaunchDarkly flags for experiments"

Answer: PostHog's experiments use the same flag infrastructure you'd use with LaunchDarkly — but with analytics and session replay built in. You don't have to migrate your flags to start running experiments. Connect PostHog to your event stream, define a goal metric, and you can measure the impact of existing LaunchDarkly flags in PostHog today.

Selling to enterprise

Enterprise experimentation customers get volume discounts, group analytics for account-level experiment analysis, SSO, access controls, EU data residency, and SOC 2. Contracts follow the four-lever framework.

The consolidation pitch is strong: enterprise Optimizely or VWO contracts are often $50k+/year for a tool that only runs tests. PostHog covers experiments, analytics, session replay, feature flags, and error tracking for a comparable or lower total spend, and adds the agent evaluation loop that no pure experimentation vendor offers.

Canonical URL: https://posthog.com/handbook/marketing/positioning/experiments

GitHub source: contents/handbook/marketing/positioning/experiments.md
