This post is from the CollabNet VersionOne blog and has not been updated since the original publish date.
Agile Lean Metrics – How Big Should Your Experiment Be?
Agile 2013 in Nashville, Tennessee has officially wrapped up. The atmosphere was lively and informative, with vendors and speakers alike making it a great show. If you missed this year’s conference, make sure you start planning today for next year, which will be held in Orlando, Florida! These conferences are a great place to learn about Agile if you are new, or to re-energize your ongoing efforts. They’re also a great opportunity to speak directly with experts and get the specific help you need.
One of the questions that came to me at the conference was about metrics; specifically, how many customers to test in an experiment in order to decide whether a hypothesis has been validated or refuted. In the inquirer’s case, the development team was waiting for each experiment to finish before deciding whether to move forward or to pivot. That wait time had grown very long, so the process didn’t seem fast at all. When I asked how big their test cohorts were, he said they had determined that, in order to make a good decision, they would need to test 50% of their users. Yikes!
Remember, the point of this exercise is to learn quickly so we know whether to move forward or to pivot. On theleanstartup.com site, Eric Ries identifies the 5th principle of The Lean Startup methodology to be the Build-Measure-Learn feedback loop, and states that, “all successful startup processes should be geared to accelerate that feedback loop.”
If we spend too much time in the learn cycle, we can quickly burn through valuable resources and put a damper on the entire effort. So, one of the things we need to make sure of is that our cohorts are sized appropriately so that we get feedback more quickly. One of the most common ways to do this is to group cohorts by signup date. If you are moving a little slower, you may set these cohorts by week. If you are moving quickly, you may even set them by day.
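The grouping above can be sketched in a few lines. This is a minimal illustration, not any particular product's implementation; the signup records and the `cohort_key` helper are hypothetical names chosen for the example.

```python
from datetime import date

# Hypothetical signup records: (user_id, signup_date)
signups = [
    (1, date(2013, 8, 5)),
    (2, date(2013, 8, 7)),
    (3, date(2013, 8, 14)),
]

def cohort_key(signup_date, by="week"):
    """Bucket a signup date into a cohort label, by ISO week or by day."""
    if by == "week":
        year, week, _ = signup_date.isocalendar()
        return f"{year}-W{week:02d}"
    return signup_date.isoformat()  # daily cohorts for faster-moving teams

# Group users into cohorts keyed by their signup week
cohorts = {}
for user_id, d in signups:
    cohorts.setdefault(cohort_key(d), []).append(user_id)

print(cohorts)  # users 1 and 2 share a weekly cohort; user 3 is in the next one
```

Switching from weekly to daily cohorts is just a matter of passing `by="day"`, which is one easy lever for tightening the feedback loop.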
Another thing to consider is statistical significance. We assess it by taking a sample of the population (market size) and testing the hypothesis against that sample. To determine whether the results are significant, meaning it’s reasonable to believe they are representative of the entire population, we can either sharpen our math skills or just use a calculator, like this one, to help us out. If you play around with this, you’ll notice that even if you drastically increase the population size, the sample size necessary to reach the confidence level you want won’t increase all that much.
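What those calculators compute is the standard sample-size formula for a proportion, with a finite-population correction. Here is a sketch of the math behind the observation above, assuming a 95% confidence level (z = 1.96), a 5% margin of error, and the most conservative proportion assumption (p = 0.5):

```python
import math

def sample_size(population, z=1.96, margin=0.05, p=0.5):
    """Required sample size for estimating a proportion.

    z=1.96 corresponds to 95% confidence; p=0.5 is the worst-case
    (most conservative) assumption about the true proportion.
    """
    # Sample size for an effectively infinite population
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    # Finite-population correction shrinks it for smaller populations
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

for pop in (1_000, 10_000, 100_000, 1_000_000):
    print(pop, sample_size(pop))
```

Running this shows the effect: a population of 10,000 needs about 370 respondents, while a population of 1,000,000 needs only about 385. The required sample barely grows, which is exactly why testing 50% of your users is vast overkill.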
A couple of things to keep in mind when doing this: first, recognize that we may not know with absolute certainty who our customers are. That may be the very point of the tests when starting out. Second, we probably don’t know for certain what the market size is. That is another hypothesis that will continue to be tested as we iterate through the Build-Measure-Learn feedback loop.
So, don’t get hung up on validating against every single customer before you consider your learning significant. We want to move quickly through these cycles so we can learn quickly. And now that I’ve put such an emphasis on speed, I’ll say this: speed isn’t everything you should be concerned about. But that’s for another post.