Which experiment should I do next?

In science, we constantly face this dilemma. Which experiment will advance my research the fastest?

Through my experiences and interactions with different researchers, I’ve noticed a pattern:

Engineers prefer safe experiments—those they know will work.
Scientists often pursue high-risk, high-reward experiments that frequently fail.

So, which approach is correct?

Neither! The optimal strategy for advancing an R&D project is to conduct experiments where each test has a 50% chance of falsifying the null hypothesis.

Let’s break this down with a simple analogy.

A Needle in a Haystack

Imagine you’re searching for a needle in a haystack—a metaphor for making an important discovery.

The haystack contains 1 billion pieces of straw.
The only way to detect the needle is with a metal detector that beeps to indicate "yes" for a given scanned pile.
You can perform one scan per day.

What’s the fastest way to find the needle? The metal detector is an analogy for a scientific experiment, and one scan per day is an analogy for the average time a single scientific experiment takes.

Strategy 1: The Tedious One-by-One Approach

The most basic approach is to scan each piece of straw one at a time.

On average, it would take:

$\frac{1{,}000{,}000{,}000}{2} = 500{,}000{,}000 \text{ days} \approx 1.4 \text{ million years}$

Clearly, this is not an efficient strategy.
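
As a quick sanity check on that arithmetic, here is a minimal Python sketch. The constant and function names are illustrative, and it assumes (as the estimate above does) that on average the needle turns up after about half the pile has been scanned.

```python
HAYSTACK_SIZE = 1_000_000_000  # pieces of straw, from the analogy
DAYS_PER_YEAR = 365.25         # assumption: average calendar year


def one_by_one_expected_days(n_straws: int) -> float:
    """Average number of daily scans when checking one straw at a time."""
    return n_straws / 2  # on average, about half the pile is scanned


days = one_by_one_expected_days(HAYSTACK_SIZE)
years = days / DAYS_PER_YEAR
print(f"{days:,.0f} days, about {years / 1e6:.2f} million years")
```

This lands on roughly 1.4 million years, in line with the estimate above.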

Strategy 2: The Engineer’s Approach

An engineer takes a systematic approach:

Scan the entire pile.
Remove 10% of the haystack and rescan.
Repeat until the needle is found.

The engineer likes this method because the majority of the time, the metal detector beeps, and everything is under control as they make progress.

In the first scan, the needle is always detected (100% probability). After removing 10% and rescanning:

90% chance the needle is still in the larger pile (with 900M pieces of straw).
10% chance the needle is in the removed pile (with 100M pieces of straw).

The expected pile size after the second scan is:

$N = 0.9 \times 900\text{M} + 0.1 \times 100\text{M} = 810\text{M} + 10\text{M} = 820\text{M}$

We generalize this with the recurrence relation:

$N_{n+1} = N_n(2P^2 - 2P + 1)$

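where $P$ is the fraction of the pile kept at each step, which is also the probability that the kept pile still contains the needle. The factor $R = 2P^2 - 2P + 1 = P^2 + (1 - P)^2$ is the expected fraction of the pile that remains after one scan.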
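
The recurrence can be sketched in a few lines of Python. The function names `shrink_factor` and `expected_pile_sizes` are made up for illustration, and the sketch only tracks the expected pile size as the recurrence defines it; it does not simulate individual searches.

```python
from typing import List


def shrink_factor(p: float) -> float:
    """Expected fraction of the pile left after one scan.

    With probability p the needle is in the kept fraction p of the pile;
    with probability 1 - p it is in the removed fraction 1 - p.
    """
    return p * p + (1 - p) * (1 - p)  # algebraically 2p^2 - 2p + 1


def expected_pile_sizes(n0: float, p: float, scans: int) -> List[float]:
    """Apply N_{n+1} = N_n * (2P^2 - 2P + 1) for the given number of scans."""
    sizes = [n0]
    for _ in range(scans):
        sizes.append(sizes[-1] * shrink_factor(p))
    return sizes


sizes = expected_pile_sizes(1e9, 0.9, 2)
print([f"{s / 1e6:.1f}M" for s in sizes])
```

For P = 0.9 the second entry comes out at about 820M, matching the expected pile size computed by hand above.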

For P = 90%, we compute:

$R = 2(0.9)^2 - 2(0.9) + 1 = 0.82$

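The number of scans $i$ needed to shrink the expected pile from $10^9$ pieces of straw down to one satisfies:

$R^i = \frac{1}{10^9}$

Taking the base-10 logarithm of both sides:

$i = \frac{-9}{\log_{10}(0.82)} \approx 104 \text{ scans}$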

Or, 105 scans if we also count that first calibration scan. In any case, the engineer finds the needle (makes the discovery) in a little over 3 months. Much faster than scanning one-by-one!
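
The same calculation as a small Python sketch. The helper `scans_needed` is a hypothetical name, and it assumes 0 < P < 1 (at P = 0 or P = 1 the pile never shrinks). Because the result is a ratio of logarithms, the base does not matter, so natural logs are used here.

```python
import math


def scans_needed(p: float, haystack_size: float = 1e9) -> float:
    """Scans until the expected pile shrinks to a single piece of straw.

    Assumes 0 < p < 1; at p = 0 or p = 1 the shrink factor is 1.
    """
    r = 2 * p**2 - 2 * p + 1  # expected shrink factor per scan
    return -math.log(haystack_size) / math.log(r)


engineer_scans = scans_needed(0.9)
print(f"P = 0.9 -> about {engineer_scans:.0f} scans")
```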

Strategy 3: The Risk-Taking Undergraduate

An undergraduate student, encouraged by their advisor to try “crazy” experiments, has only a 15% chance of success per experiment.

For P = 15%, we get:

$R = 2(0.15)^2 - 2(0.15) + 1 = 0.745$

The number of scans needed:

$i = \frac{-9}{\log_{10}(0.745)} \approx 70 \text{ scans}$

So the undergraduate student is done in a little over 2 months. This is even more efficient than the engineer’s method!

At first, this seems counterintuitive. But in this case, failing more often actually speeds up progress because of the sporadic breakthroughs that make up for the many failed experiments.

The engineer is frustrated.

The analogy here could be that an undergraduate student is testing a new sample prepared in an untested way, using an instrument they have never used before, hoping to see an effect that has been described in the literature. It is unlikely to succeed, but if it does, it will pay off big! Or so they think. By the way, this was me during my undergrad research! Actually, my P was probably closer to 2% than the 15% used in this analogy, since I did a lot of swinging for the fences and not much hitting!

Strategy 4: The Postdoc’s Eureka Moment

A postdoctoral scholar, having experimented with both conservative and risky approaches, asks:

What is the most efficient probability of success?

To find the optimal $P$, she minimizes:

$R = 2P^2 - 2P + 1$

Taking the derivative:

$\frac{dR}{dP} = 4P - 2$

Setting it to zero:

$4P - 2 = 0 \Rightarrow P = \frac{1}{2}$

Eureka! The postdoc realizes that the optimal way to advance research is to conduct experiments with a 50% probability of falsifying the null hypothesis at each step. In the analogy, she divides the pile of hay into two equal-sized piles at each step.
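
The calculus result can be cross-checked numerically. The sketch below simply evaluates $R$ on a grid of candidate values of $P$ and picks the smallest; the grid spacing of 0.01 is an arbitrary choice for illustration. (The second derivative, $4$, is positive, so the stationary point is indeed a minimum.)

```python
def shrink_factor(p: float) -> float:
    """Expected fraction of the pile left after one scan: 2P^2 - 2P + 1."""
    return 2 * p**2 - 2 * p + 1


# Candidate values P = 0.01, 0.02, ..., 0.99
candidates = [k / 100 for k in range(1, 100)]
best_p = min(candidates, key=shrink_factor)

print(f"best P = {best_p}, R = {shrink_factor(best_p)}")
```

The grid search settles on P = 0.5 with R = 0.5, agreeing with the derivative.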

The postdoc would find the needle in around:

$i = \frac{-9}{\log_{10}(0.5)} \approx 30 \text{ scans}$

So the postdoc is done in a month. In this analogy, the postdoc can make about 3 scientific discoveries in the time it takes the engineer to make 1! This is a big difference. Moving from safe experiments to 50/50 experiments can yield roughly a factor of 3 in productivity!
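
To put the three strategies side by side, here is a short Python sketch under the same expected-pile-size model used throughout this post. The names `scans_needed`, `strategies`, and `results` are illustrative.

```python
import math


def scans_needed(p: float, haystack_size: float = 1e9) -> float:
    """Scans until the expected pile shrinks to one straw (0 < p < 1)."""
    r = p**2 + (1 - p) ** 2  # same as 2p^2 - 2p + 1
    return math.log(haystack_size) / -math.log(r)


# P values taken from the three characters in the analogy
strategies = {"engineer": 0.90, "undergraduate": 0.15, "postdoc": 0.50}
results = {name: scans_needed(p) for name, p in strategies.items()}

for name, scans in results.items():
    speedup = results["engineer"] / scans
    print(f"{name:>13}: {scans:6.1f} scans, {speedup:.2f}x the engineer's pace")
```

The output should reproduce the roughly 104, 70, and 30 scans worked out above, with the postdoc coming in at about three and a half times the engineer's pace.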

In this analogy, the postdoc is constantly designing experiments with unknown outcomes. "Will my sample have a greater or smaller signal than my last one? Both are equally likely in my guesstimation, but after this experiment I will have a conclusive answer!" she thinks before performing her next conclusive experiment. Her research makes inevitable progress forwards at a maximum rate.

Takeaways for Researchers

So what does this mean for you as a researcher?

1. Design experiments with a hypothesis.

Have a clear hypothesis and its opposite, a null hypothesis, so that each experiment rules out one or the other.

2. Make experiments conclusive.

Each test should definitively rule out a possibility. Usually, this is falsifying the null hypothesis.

3. Aim for a 50% success rate.

If your experiments always succeed in falsifying the null hypothesis, you’re being too conservative.
If they always fail in falsifying the null hypothesis, you’re wasting time.
The sweet spot? 50%—where each experiment has an equal chance of disproving your null hypothesis (supporting your hypothesis).

By following this approach, you can maximize your research efficiency and accelerate discovery.

Happy experimenting!

random thoughts of a procrastinating researcher

Wednesday, February 12, 2025

The 50% Rule: How to Optimize Your Research Strategy