Why you might be wasting your time with Story Point estimation

If you and your team use story point estimation then you really need to read this.

Story Point Estimation

Story point estimation is a relative sizing method, commonly carried out using planning poker. The Fibonacci scale is one approach, but there are others, including powers of two (1, 2, 4, 8, 16) and T-shirt sizes (S, M, L, XL).

[Figure: planning poker cards, Fibonacci scale]

Measuring Cycle Time

Cycle time is the measure of how long a story takes to move from the point at which work starts on it to the point at which it is considered done.

[Figure: basic card wall with cycle time label]
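As a minimal sketch of that definition, cycle time is just the elapsed time between two timestamps. The function name `cycle_time_days` and the example dates are made up for illustration; your tracking tool will have its own fields for when work started and when it was done.

```python
from datetime import datetime


def cycle_time_days(started: datetime, done: datetime) -> float:
    """Elapsed calendar time, in days, from work starting to the story being done."""
    if done < started:
        raise ValueError("done must not be earlier than started")
    return (done - started).total_seconds() / 86400  # 86400 seconds per day


# Hypothetical story: work starts Monday 09:00 and is done Thursday 15:00.
started = datetime(2024, 3, 4, 9, 0)
done = datetime(2024, 3, 7, 15, 0)

print(f"Cycle time: {cycle_time_days(started, done):.2f} days")
```

Note that this counts calendar time, including evenings and any time the story spent waiting, which is the point: cycle time measures how long the work took to get done, not how much effort went into it.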

The problem with story points

When teams use planning poker, I’ve observed that in pretty much every case they infer a correlation between relative size and how long a story should take to complete. For example, it wouldn’t be unreasonable to assume that a story estimated at 13 points will take longer to complete than a story estimated at 1 point. In other cases I’ve observed an assumption that defects are generally smaller than stories. Viewed in a chart, these preconceptions might look something like this:

In this idealised view, the amount of time it takes for the work to be done is neatly aligned to the relative size estimates.

Back in the real world, however, the picture is somewhat different: there is often little correlation between story size and the amount of time it takes to do the work.

What does the data tell us?

We collected data from 25 teams across four different organisations. Each dataset contained, on average, 500 data points from a mixture of stories and defects. In total, the data contained the estimated size and total time taken for over 12,500 work items.

The chart below shows the data for one of the teams. This chart is interesting because it shows that the larger the estimate, the less predictable the time taken to complete becomes. The exception is the three 8-point stories, which perhaps should have been sized as 1-point stories?

[Figure: scatter plot of story size against cycle time]
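If you want to check this against your own team's data, one way is to compute a correlation coefficient between estimated size and cycle time. The sketch below uses a hand-rolled Pearson coefficient; the `items` list is invented purely for illustration and is not taken from the dataset described above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical (story points, cycle time in days) pairs, for illustration only.
items = [(1, 2.0), (1, 9.5), (2, 4.0), (3, 1.5), (3, 12.0),
         (5, 3.0), (8, 1.0), (8, 20.0), (13, 6.0)]
points = [p for p, _ in items]
days = [d for _, d in items]

r = pearson(points, days)
print(f"r = {r:.2f}, r squared = {r * r:.2f}")
```

A value of r close to 1 would mean bigger estimates reliably take longer; a value close to 0 means the estimate tells you very little about the time taken. Because story point scales are ordinal rather than linear, a rank-based coefficient such as Spearman's may be a better fit, but the idea is the same.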

I then wondered whether some teams only consider the dev effort when sizing, so I narrowed the definition of done to measure the cycle time only from Ready for Dev to Dev Done/Ready for Test. Now clearly, considering your code to be done when it hasn’t been tested, UAT’d, and deployed to Live is pretty insane and I’m certainly not recommending it. What I’m interested in here, though, is the accuracy of story sizing when considering only dev effort. The following chart plots just the cycle time for stories from Ready for Dev to Dev Done.

[Figure: dev-only sizing vs cycle time]
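Narrowing the measurement to a pair of workflow states can be sketched as follows, assuming you can export each story's state transitions as (state, timestamp) pairs. The function `stage_cycle_time_days` and the `history` data are hypothetical; only the state names come from the text above.

```python
from datetime import datetime


def stage_cycle_time_days(transitions, start_state, end_state):
    """Days between first entering start_state and first entering end_state.

    transitions is a list of (state_name, datetime) pairs.
    Returns None if the story never entered one of the two states.
    """
    first_entered = {}
    for state, when in transitions:
        first_entered.setdefault(state, when)  # keep only the first entry time
    if start_state not in first_entered or end_state not in first_entered:
        return None
    delta = first_entered[end_state] - first_entered[start_state]
    return delta.total_seconds() / 86400


# Hypothetical transition history for one story.
history = [
    ("Ready for Dev", datetime(2024, 3, 4, 9, 0)),
    ("Dev Done", datetime(2024, 3, 6, 9, 0)),
    ("Ready for Live", datetime(2024, 3, 11, 21, 0)),
]

dev_only = stage_cycle_time_days(history, "Ready for Dev", "Dev Done")
to_live = stage_cycle_time_days(history, "Ready for Dev", "Ready for Live")
print(f"Dev only: {dev_only} days, to Ready for Live: {to_live} days")
```

Changing the two state names is all it takes to move between the dev-only view and the wider Ready for Dev to Ready for Live view used for the next charts.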

Again, low correlation. Here’s data from two more teams measuring cycle time from Ready for Dev to Ready for Live:

[Figure: Team 1]

So if sizing of stories can’t give us any indication of how long it will take to deliver work then what’s the point of story sizing? In my next article I’ll cover The Purpose of Story Sizing but for this article let’s look at why story size doesn’t always correlate with cycle time.


To understand why estimated size rarely aligns with time taken, you need to explore the causes of variation within your delivery process. Variation comes in many forms, and each company and team will present their own nuances. Here are just a few sources of variation common to most delivery teams:

  • Lack of visibility of work.
  • Waiting for people to become available.
  • Availability of people in different time-zones (for distributed development teams).
  • Story size variation.
  • Story complexity variation.
  • The time it takes to resolve questions when relying upon other team members.
  • Context switching due to having too much work in progress.
  • Unrealistic timelines.
  • Poor prioritisation of work.
  • Hand-offs to other teams or 3rd parties.
  • Lack of automated test coverage resulting in regression and extended manual testing time.
  • Incorrect sequencing of work resulting in local dependencies.
  • Staff churn / turnover.
  • Variation in capabilities of individuals.
  • Misinterpretation of requirements.
  • Effect of productivity variance.
  • Waiting time when people are diverted to deal with incidents or other work.
  • Technical environment issues.
  • Poorly written user stories.
  • Sharing staff across multiple projects.
  • Release contention across multiple teams.
  • Environment contention – availability of technical environments at the right time.
  • Processes external to the team introducing waiting time.
  • Waiting for sign-off from departments external to the team.
  • Waiting for assets to be delivered from other teams, e.g. UI assets.
  • Build up of technical debt.
  • High defect rates resulting in rework.

There are many tools in the Agile / Kanban toolbox that can help us to reduce variation (you can’t eliminate variation). If you can reduce variation you become more predictable. If you become more predictable it makes planning easier. If you make planning easier you can manage expectations more effectively.
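One hedged way to put a number on "more predictable" is to compare a typical cycle time with a pessimistic one, for example the 50th and 85th percentiles. The sketch below uses the nearest-rank percentile method; the two teams and their cycle times are invented for illustration, and the choice of the 85th percentile is an assumption rather than anything prescribed here.

```python
from math import ceil


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical cycle times in days for two teams with the same median.
team_a = [3, 4, 4, 5, 5, 5, 6, 6, 7, 8]      # low variation
team_b = [1, 2, 3, 4, 5, 5, 9, 14, 20, 30]   # high variation

for name, times in (("Team A", team_a), ("Team B", team_b)):
    p50 = percentile(times, 50)
    p85 = percentile(times, 85)
    print(f"{name}: 50th percentile {p50} days, 85th percentile {p85} days")
```

Both teams have the same typical cycle time, but the second team's 85th percentile is far higher, so any plan that needs a reasonably safe delivery date has to allow much more time. Narrowing that gap is what reducing variation buys you.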

About Ian Carroll

Ian is a consultant, coach, trainer and speaker on all topics related to Lean, Kanban and Agile software development.
