Scalable Agile Estimation and Normalization of Story Points: Estimation Challenges Galore! (Part 2 of 5)
In Part 1 of this multi-part blog series, I introduced the topic of the blog series and provided an overview. Scalable agile estimation methods are required to provide reliable estimates of workload (work effort) and also reliable velocity metrics (both estimated velocity and observed velocity) at the team, program and portfolio levels of large-scale agile projects. Without reliable estimates of workload and reliable velocity metrics at all levels, effective and meaningful determination of costs, return on investments and project prioritization cannot be made. For scalable agile estimation methods to work properly, story points across the space dimension (teams, epic hierarchies, feature group hierarchies, goals, project hierarchies, programs, portfolios, etc.) as well as across the time dimension (sprints, release cycles) need to have the same meaning and the same unit of measure. In other words, story points need to be normalized so they represent the same amount of work across the space and time dimensions.
In this Part 2, I first review the key requirements that must be satisfied for traditional velocity-based planning to work properly. I then present three key challenges associated with traditional velocity-based planning, explain how those challenges are exacerbated as agile projects scale up, and describe what we need to do to mitigate them.
Critical requirements for traditional velocity-based planning to work properly
It must be recognized that the historical velocity of an agile team is a reliable leading indicator of its future velocity only when the following conditions have been met:
- Exactly the same members of a well-jelled agile team will continue for the next sprint. Agile team members are not substitutable commodities. Team members may differ greatly in their individual contributions and even a single team member change may have a major impact on team dynamics and productivity. Bringing in a brilliant prima donna new team member may actually reduce team productivity and velocity.
- The team has demonstrated a stable velocity over the last 3 to 4 sprints. This is required for the past average velocity to serve as a leading indicator for the future velocity.
- The number of work days available in the next sprint is almost the same as the number of work days in the past few sprints; i.e., the capacity of an agile team stays the same sprint after sprint.
- Any technology or platform change efforts will be relatively consistent over time (i.e., no drastic changes from sprint to sprint).
- The same amount of learning effort by team members is required from sprint to sprint (i.e., no drastic changes in learning effort from sprint to sprint).
- No new major constraints or impediments are anticipated or discovered.
- Neither the team nor its members are assigned to multiple concurrent projects.
- Planned work for the team remains consistent across sprints, i.e., the team is not disrupted by unplanned work or unexpected or poorly managed dependencies on other teams or external vendors.
- There is not a significant amount of work carried over from sprint to sprint.
There is a good parallel between this set of requirements and weather forecasting based on “Yesterday’s Weather Model.” If the weather has been stable for the last several days, tomorrow’s weather can be forecasted based on the recent weather pattern. Similarly, under the set of requirements listed above, velocity for the next sprint (tomorrow’s weather) can be estimated to be close to the average velocity over the last 3 to 4 sprints (yesterday’s weather pattern). Therefore, the above set of requirements is popularly referred to as “yesterday’s weather model.” Agile teams produce a more credible forecast of velocity, duration, and costs when they are working in a stable yesterday’s weather model pattern.
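To make the yesterday's weather idea concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the function name `forecast_velocity`, the minimum of 3 sprints of history, and the 15% variation threshold used to decide whether velocity is "stable" are hypothetical choices, not part of any standard or tool.

```python
from statistics import mean, pstdev
from typing import Optional, Sequence


def forecast_velocity(
    history: Sequence[float],
    window: int = 4,        # look at the last 3 to 4 sprints
    min_sprints: int = 3,   # assumption: fewer than 3 sprints is too little history
    max_cv: float = 0.15,   # assumption: "stable" means variation within 15% of the mean
) -> Optional[float]:
    """Forecast next-sprint velocity as the average of recent sprints.

    Returns None when the recent history is too short or too unstable
    for the past average to serve as a leading indicator.
    """
    recent = list(history[-window:])
    if len(recent) < min_sprints:
        return None
    average = mean(recent)
    if average <= 0:
        return None
    # Coefficient of variation: standard deviation relative to the mean.
    variation = pstdev(recent) / average
    if variation > max_cv:
        return None
    return average


print(forecast_velocity([28, 30, 32, 30]))  # stable recent history
print(forecast_velocity([30, 12]))          # too little history
print(forecast_velocity([10, 40, 15, 35]))  # unstable history
```

The sketch checks only the velocity history. The other requirements in the list above (same team members, same capacity, no new impediments, and so on) still have to be verified by the team itself.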
Key challenges with story point estimation and velocity metric
There are three key challenges associated with story point estimation and velocity metrics for large-scale agile projects. These challenges also exist in large enterprises that may have a large number of mostly independent projects, where it is often necessary to aggregate effort estimation and velocity metric data across multiple projects to provide visibility for senior management. Senior management often wants to know organizational velocity history and projected velocity trends, progress barometers and reports aggregated using story points and velocities of a large number of projects. These challenges are less severe and more manageable for small agile projects consisting of a single team or a few teams, but they cannot be ignored for large-scale agile projects.
Challenge 1 – A single story point is unlikely to represent the same amount of work across teams and across sprints: Using agile project management tools (such as VersionOne), enterprises often do story point roll-up (adding up) across both space and time dimensions:
- Space dimension: project hierarchies (which may span different application domains), epic hierarchies which may span different teams or projects, feature group hierarchies, goals, programs, and portfolios
- Time dimension: sprints (iterations), release cycles
Enterprises also do other kinds of story point math, such as story point averaging, % completion in progress bars, burn-up charts showing the accepted number of story points across days in a sprint or across sprints in a release cycle, and velocity trends and projections for teams, programs, portfolios, and the whole enterprise. In addition, many reports are generated that consolidate story point and velocity data across projects in a program or across programs in a portfolio.
For example, if 1 story point for Team A represents 1 ideal day of work, and 1 story point for Team B represents 1.5 ideal days of work, then rolling up (adding) story points across Teams A and B, which together make up a program, does not make sense. In fact, any math involving story points across Team A and Team B does not make sense, because the amount of work represented by a story point is different for each team, i.e., the story point scales used by the two teams are not the same. Simply adding up those story points would be similar to consolidating the financial results of the international divisions of a multi-national company by adding up figures reported in each division’s local currency without any currency conversion, i.e., without currency normalization.
For the same reasons, epic points calculated based on a roll-up of story points of stories making up the epic will not make sense if those stories are estimated by different teams with different story point scales. Program or portfolio or enterprise velocities calculated by rolling up velocity numbers of different teams with different story point scales will not make sense.
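The Team A and Team B example above can be sketched in a few lines of Python. This is only an illustration of the currency-conversion analogy, not a normalization method in itself: the 20-point estimates are made-up numbers, and the `ideal_days_per_point` conversion rates are assumed to be known, which in practice is exactly the hard part.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class TeamEstimate:
    team: str
    story_points: float          # in the team's own story point scale
    ideal_days_per_point: float  # assumed known "exchange rate" for that scale


def naive_rollup(estimates: Iterable[TeamEstimate]) -> float:
    """Add story points as if every team used the same scale."""
    return sum(e.story_points for e in estimates)


def normalized_rollup(estimates: Iterable[TeamEstimate]) -> float:
    """Convert each team's points to ideal days before adding."""
    return sum(e.story_points * e.ideal_days_per_point for e in estimates)


program = [
    TeamEstimate("Team A", story_points=20, ideal_days_per_point=1.0),
    TeamEstimate("Team B", story_points=20, ideal_days_per_point=1.5),
]

print("Naive roll-up (mixed units):", naive_rollup(program))
print("Normalized roll-up (ideal days):", normalized_rollup(program))
```

The naive total treats the two teams' 20 points as interchangeable, while the normalized total shows that Team B's 20 points stand for more work than Team A's. The two totals diverge further as more teams with different scales are added.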
For any story point math or story point reporting to make sense, a single story point of work needs to indicate the same amount of work across teams and sprints. This is a major and critical requirement which is often taken for granted without validation: do story points represent an equal amount of work effort across both space and time dimensions? Is this requirement satisfied? How do you know? Or is it an article of faith born out of blissful ignorance? Unless proven to be true through actual measurements in an enterprise, we must assume that the amount of work represented by one story point across space and time dimensions is not equal.
This challenge will be exacerbated with larger agile projects as there will be a lot more teams, projects, programs, portfolios, epics, etc., giving rise to natural variations in story point sizes and scales.
Challenge 2 – Bottom-up story point data is not available for estimating work during program and portfolio planning: In an enterprise, business initiatives often drive portfolio planning. Business initiatives are realized with large epics that often span multiple release cycles. A portfolio is managed with a set of coordinated programs, where epics are broken down into features during program planning. Each feature may take an entire release cycle and may need multiple sprints to complete. A program is assigned to multiple teams, features are broken down into small stories that can be parceled out to different teams, and each story is small enough to be completed in a single sprint. During portfolio and program planning sessions, most epics and features are not yet broken down into stories. So it is not practical to roll up story point estimates of all lower-level stories (most stories and their estimates are not available yet). As a result, it is very difficult to estimate how much work is involved in an epic hierarchy, program or portfolio.
Challenge 3 – Yesterday’s weather model requirements may not apply: When an agile team is about to plan its very first sprint or has completed only 1 or 2 sprints, there is very little reliable historical velocity data that can be used for forecasting the velocity of an upcoming sprint. This can be characterized as the “start-up phase” problem; the expectation is that a team will reach stable velocity within a few sprints, and start benefiting from the stable environment of yesterday’s weather model. Keep in mind that if a team’s composition changes in a major way at any point in time, it is effectively a new team; it is thrown back to the start-up phase and has to wait for at least a few sprints to regain a stable environment and reestablish a stable velocity. The yesterday’s weather model requirements need to be examined for each sprint (they cannot be taken for granted). Whenever yesterday’s weather model is not applicable, past velocity is not a leading indicator of future velocity for an agile team. In the start-up phase, we need other techniques for forecasting future velocity.
Furthermore, yesterday’s weather model requirements are difficult to hold as agile projects scale up. Even if yesterday’s weather model may be valid for a specific individual team, it becomes really challenging to hold that model as you scale up from a single-team agile project to a multi-team agile program, to a multi-program agile portfolio to a multi-portfolio enterprise. If dependencies among multiple teams are not minimized and managed well, they will adversely impact the velocity of one or more teams. In a large project, it is not enough to resolve team-level impediments. There are impediments that need to be resolved at program and portfolio levels too, which may impact several teams.
Let us assume that for a single team, the probability of holding yesterday’s weather model requirements is 90%, i.e., there is a 10% probability that one or more requirements of yesterday’s weather model will not be satisfied for a single team as it moves from one sprint to the next. What is the probability that yesterday’s weather model is applicable to a program of 4 teams running in parallel (i.e., a so-called Scrum of Scrums of 4 teams)? If we further assume that the teams are independent of one another in this respect, that probability drops from 90% for each individual team to 0.9 x 0.9 x 0.9 x 0.9 = 0.656, or about 66%, for a program of 4 teams. There is only a 66% probability that all the requirements under yesterday’s weather model will hold for the next sprint for an entire program of 4 teams. With similar probability math, you can determine that there is only a 19% probability that all the requirements under yesterday’s weather model will hold for the next sprint for an entire portfolio of 4 programs (a total of 16 teams), and only a 0.1% (almost zero) probability that they will hold for the next sprint for an entire enterprise of 4 portfolios (a total of 64 teams).
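The probability math above is easy to reproduce. The short Python sketch below assumes, as the simple multiplication does, that each team holds the yesterday's weather model requirements independently of the others with the same 90% probability; the function name is my own and is only for illustration.

```python
def probability_all_teams_hold(p_single_team: float, team_count: int) -> float:
    """Probability that every one of `team_count` independent teams
    satisfies the yesterday's weather model requirements."""
    return p_single_team ** team_count


levels = [
    ("Single team", 1),
    ("Program (4 teams)", 4),
    ("Portfolio (16 teams)", 16),
    ("Enterprise (64 teams)", 64),
]

for label, team_count in levels:
    probability = probability_all_teams_hold(0.9, team_count)
    print(f"{label}: {probability:.1%}")
```

In real projects, teams in the same program often share dependencies and impediments, so the independence assumption is a simplification. The overall trend remains the same: the more teams involved, the less likely it is that the requirements hold for all of them at once.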
It may be easier to apply yesterday’s weather model to a small geographic region (a team); it is not easy at all to scale it up to a large geographic area (programs, portfolios, and enterprise), spanning a larger duration of time (multiple sprints and release cycles). Sooner or later, the weather pattern is going to change somewhere over a large space (adversely impacting one or more teams in an enterprise), and at some point in time over a large time span (adversely impacting one or more sprints).
The bottom line: We may not be able to depend on the past velocity data to forecast future velocity for a larger program or portfolio because the basic requirements for stable yesterday’s weather model for programs and portfolios are difficult to hold and sustain. We need other solutions for estimating and forecasting without depending on yesterday’s weather model.
In Part 3 of the blog series, I will explain two existing and published scalable agile estimation methods and present my critique of those methods. The first method is the one covered by Mike Cohn in his Agile Estimating and Planning book. The second method is the story point normalization method used in SAFe. The SAFe method is based on identifying a baseline story of 1 ideal developer day (1 IDD) effort by each team. Therefore, I refer to this method as the “1-IDD Normalization Method” (1NM for short).
In Part 4 I will present a scalable agile estimation method, called Calibrated Normalization Method (CNM). I have developed, taught and applied CNM by working with clients in my agile training and coaching engagements since 2010. Part 4 emphasizes the CNM bottom-up estimation (from teams to programs up to portfolios). I will also explain how CNM can be used by large enterprises that have a large number of projects that may be largely independent of each other, i.e., not necessarily organized into programs and portfolios.
In Part 5 I will explain how CNM performs the top-down estimation (from portfolios to programs down to teams). CNM estimates the scope of work at the portfolio and program levels without the need to know lower-level story point details. In Part 5 I will also compare and relate 1NM with CNM, and explain how CNM fully addresses all three challenges explained here in Part 2.
Acknowledgments: I have greatly benefited from discussions and review comments on this blog series from my colleagues at VersionOne, especially Andy Powell, Lee Cunningham, and Dave Gunther.
Part 1: Introduction and Overview of the Blog Series – published on 14 October 2013.
Part 3: Review of published scalable agile estimation methods – published on 7 November 2013.
Part 4: Calibrated Normalization Method for Bottom-Up Estimation – published on 18 November 2013.
Part 5: Calibrated Normalization Method for Top-Down Estimation – published on 2 December 2013.