It was a quiet Friday until a QA guy shouted “Hey, we just hit a showstopper problem!”
The problem was hard to reproduce: say only 5% of trials resulted in the failure. In the follow-up test, QA installed a previous version of the software and ran it 30 times, but they couldn’t reproduce the failure. Now, can we conclude it is a regression bug?
Not really. Even if the previous version has the bug, it may have passed all 30 runs just by luck. Let’s do the math.
If the underlying failure rate is 5%, the number of failures in 30 independent trials follows a binomial distribution, so the probability of zero failures is $0.95^{30} \approx 0.21$. In other words, even if the previous version had the same bug, there would still be a better than 20% chance that all 30 runs passed.
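As a quick sanity check on that number, here is a minimal Python sketch. The function name `prob_zero_failures` is made up for illustration, and it assumes each trial is independent and fails with the same fixed rate.

```python
def prob_zero_failures(failure_rate: float, trials: int) -> float:
    """Chance that every one of `trials` independent runs passes.

    Assumes each run fails independently with probability `failure_rate`,
    which is the binomial case with zero failures.
    """
    return (1.0 - failure_rate) ** trials


# The scenario from the text: a 5% failure rate and 30 clean runs.
print(f"{prob_zero_failures(0.05, 30):.3f}")
```

Changing the second argument shows how slowly that chance shrinks as more clean runs are added.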
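To reasonably conclude it is a regression bug, we may want the chance of a buggy version passing every run by luck to be below 5%. To get to that level, we need 59 trials, because $0.95^{59} \approx 0.048 < 0.05$, while 58 trials still leave $0.95^{58} \approx 0.051$.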
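The trial count can also be found by simply counting up until the all-pass probability drops below the threshold. This is only a sketch under the same assumption of independent trials with a constant failure rate; `trials_needed` and its parameter names are hypothetical.

```python
def trials_needed(failure_rate: float, max_miss_chance: float) -> int:
    """Smallest number of clean runs n with (1 - failure_rate)**n < max_miss_chance.

    Assumes independent trials with a constant failure rate.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < max_miss_chance <= 1.0:
        raise ValueError("max_miss_chance must be in (0, 1]")

    n = 0
    pass_all = 1.0  # chance that n runs in a row all pass
    while pass_all >= max_miss_chance:
        n += 1
        pass_all *= 1.0 - failure_rate
    return n


# 5% failure rate, and we accept less than a 5% chance of being fooled.
print(trials_needed(0.05, 0.05))
```

A rarer bug gets expensive quickly: lowering `failure_rate` makes the required number of clean runs grow roughly in inverse proportion.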