What you'll understand by the end of this lesson
- What CRO is, in precise and usable language
- Why the people who know a product best are often the worst judges of it
- The specific cognitive biases that cause smart teams to make expensive mistakes
- Why CRO is accessible to anyone in marketing, not just data specialists
A £150 million lesson
In February 2014, Marks & Spencer launched a complete overhaul of their website. The project took years and cost £150 million.
The results were immediate and brutal. Online sales dropped 8.1% in the first quarter. Customers couldn't find products. Items disappeared from shopping carts. The checkout failed for entire UK cities. Six million registered customers were forced to create new accounts.
The M&S team wasn't careless. They had experienced leadership and a significant budget. What they lacked was a method for understanding how real customers would actually respond before the changes became irreversible.
That is the gap CRO fills.
What CRO means, precisely
CRO is the practice of using data collected from real visitors to understand how changes to a website affect behaviour — and applying that understanding to systematically increase the percentage of visitors who complete a desired action.
A few things worth unpacking:
"Data collected from real visitors" — not from the product team's intuition, not from what competitors appear to be doing. From actual visitors to your actual site.
"How changes affect behaviour" — not whether a change looks better or feels more on-brand. Whether it measurably changes what visitors do.
"Desired action" — this needs to be defined, singular, and measurable before any optimisation begins. A site trying to simultaneously optimise for newsletter signups, purchases, and brand awareness will optimise for none of them.
"Systematically" — this is the word that separates CRO from a lucky experiment. A system means an ongoing cycle: observe, hypothesise, test, learn. Not a one-time project.
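To make "the percentage of visitors who complete a desired action" concrete, here is a minimal sketch in Python. The function name `conversion_rate` and the visitor counts are hypothetical, chosen only for illustration. The point is that the metric is a simple ratio once the desired action has been defined.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Return the share of visitors who completed the one desired action, as a percentage."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    if not 0 <= conversions <= visitors:
        raise ValueError("conversions must be between 0 and visitors")
    return 100.0 * conversions / visitors


# Hypothetical numbers: 10,000 visitors, of whom 230 completed a purchase.
rate = conversion_rate(conversions=230, visitors=10_000)
print(f"{rate:.2f}%")  # prints 2.30%
```

Note that the function takes a single `conversions` count. That mirrors the requirement above that the desired action be singular and measurable before optimisation begins.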
Why your own judgment isn't enough
In 1999, psychologists Justin Kruger and David Dunning published a study that has since become one of the most cited in psychology. Its core finding: people who perform poorly at a task tend to overestimate, often dramatically, how well they are doing.
Something similar happens when internal teams evaluate their own websites. Feeling confident in a judgment is not evidence that the judgment is accurate.
Everyone who has worked on a site for months suffers from the curse of familiarity. You know what every element does. You understand every design decision. You can no longer experience the site the way a first-time visitor does: in a few seconds, with no context, deciding whether it's worth their time.
Two specific biases explain most of the damage:
Confirmation bias
Your brain notices evidence that supports what you already believe and discounts evidence that doesn't. When a team is excited about a redesign, early positive signals feel like validation and warning signs feel like noise.
Novelty bias
New, visually striking ideas feel like they'll work — because they generate internal excitement. But your visitors weren't in the brainstorm. They don't share the excitement. They just see a page and decide whether it answers their question.
When a new design generates a lot of internal excitement, that's exactly the moment to become more rigorous — not less. The excitement is evidence of novelty bias, not evidence of quality.
You are a sample size of one
When you form an opinion about your website, you're contributing one data point from someone maximally unrepresentative of your visitor base. You have deep product knowledge. Your visitors have none of that — and your opinion can't stand in for theirs.
What breaks down without CRO
The M&S case is dramatic, but the same pattern plays out quietly at smaller scale all the time. The costs accumulate across four areas:
Time. Work gets done that produces no measurable improvement, because success was never defined before starting.
Management overhead. After a launch underperforms, senior people get involved and bring their own opinions, each of which is also a sample size of one. A team with data can defend its decisions. A team without data is just hosting a debate.
Borrowed bad ideas. Without a framework for evaluating ideas on your own site, the natural move is to copy what competitors appear to be doing. But you have no visibility into whether those choices are working for them.
Playing it safe. Teams that have been burned enough times stop taking creative risks. Writing becomes generic. Layouts become interchangeable. Playing it safe is the end state of a team that has no safe way to test whether a risk is worth taking.
These effects compound. A few missed quarters lead to pressure, which leads to more opinions and less testing, which leads to more misses. Without a testing habit, the cycle accelerates.
What changes when a team does this well
| Before | After |
|---|---|
| "I think we should change the headline" | "Here's the data and my hypothesis" |
| Big redesigns with months of risk | Small, controlled experiments with clear exit criteria |
| Arguments settled by seniority | Arguments ended by what the data shows |
| Copying what looks successful | Testing what actually works for your visitors |
The last row matters most. When a team tests what actually works for its own visitors, even a test that "fails" is not a waste: it eliminates a hypothesis and improves the next one. The learning compounds.
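One way to picture a "small, controlled experiment with clear exit criteria" is a comparison of two page variants with a decision rule fixed before the test starts. The sketch below uses a pooled two-proportion z-test, which is one common choice and not something this lesson prescribes. The function name, the visitor counts, and the 0.05 threshold are all assumptions made for illustration.

```python
import math
from typing import Tuple


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rate, B minus A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the assumption that A and B convert equally well.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; both variants have 0% or 100% conversion")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Exit criterion, decided before the test runs (assumed convention, not a rule).
ALPHA = 0.05

# Hypothetical results: control (A) vs. new headline (B), 10,000 visitors each.
z, p = two_proportion_z_test(conv_a=230, n_a=10_000, conv_b=275, n_b=10_000)
verdict = "difference is unlikely to be chance" if p < ALPHA else "no reliable difference detected"
print(f"z = {z:.2f}, p = {p:.3f}: {verdict}")
```

Either verdict is useful. A result below the threshold supports the change, and a result above it retires the hypothesis without anyone having to argue from seniority.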
What does this lesson use Marks & Spencer's £150 million website overhaul to illustrate?
If your own judgment is unreliable — how do you turn the ideas your team does have into something actually worth testing?