Agile Performance Metrics… Are they useful or not? Which should we use?

Today’s post is about a hot and controversial topic: Agile performance metrics.

Why? Well, as you might be aware, different stakeholders have different opinions when it comes to Agile performance metrics.

  • Why should we use them?
  • What value do they bring?
  • How should they be applied?

In all the discussions I’ve had on this subject, I’ve encountered people with different opinions, as described below:

  • We shouldn’t use performance metrics because they don’t bring any value.
  • We shouldn’t use performance metrics because stakeholders and management will start to compare teams.
  • We shouldn’t use performance metrics because we don’t have indicators that show the real work of the teams.
  • Etc.

Nevertheless, I have also encountered positive opinions, as described below:

  • They are great since we can understand what is going on and tackle issues as soon as possible.
  • We can understand the business and team behaviour.
  • The information is already there, and this will allow us to have a clear view of what is going on.
  • We can be assertive when we need to make decisions since we have detailed information.
  • Etc.

The textbook definition tells us that “A performance metric is that which determines an organization’s behavior and performance. Performance metrics measure an organization’s activities and performance. It should support a range of stakeholder needs from customers, stakeholders to employees.”. [1]

That said, my opinion is that the primary objective is to provide teams and the business with metrics that let them track their performance over time against their own previous performance.

Why should we use it?

  • We should use them because they help us measure the impact of our initiatives.
  • For example, if we are focusing on quality, we expect to see a drop in defects; if not, our strategy isn’t working and we need to try something else.
  • Again, emphasising that this is not for comparing teams: it’s for looking at our wider initiatives and checking whether they are having the desired effect.
  • Nevertheless, this will allow us to achieve our common business objectives by taking action at the right time and increasing performance across all our teams.

Many people believe that, from an Agile perspective, we should only use velocity as a metric, meaning that with this metric alone they can understand business and team behaviour.

Well… I agree that velocity is a good metric, but I also believe that, in order to keep improving, we should implement new indicators to get more accurate information about behaviour.

That said, in my opinion we should use the metrics described below:

  • Delivered Work Items

As you are aware, one of the Agile principles is the continuous delivery of valuable software. [2]

For that we have the acronym INVEST, which originated in an article by Bill Wake in 2003. It helps us remember a widely accepted set of criteria, or checklist, for assessing the quality of a user story. If the story fails to meet one of these criteria, the team may want to reword it, or even consider a rewrite (which often translates into physically tearing up the old story card and writing a new one). [3]

A good user story should be:

  • “I” ndependent (of all others)
  • “N” egotiable (not a specific contract for features)
  • “V” aluable (or vertical)
  • “E” stimable (to a good approximation)
  • “S” mall (so as to fit within an iteration)
  • “T” estable (in principle, even if there isn’t a test for it yet)

That said, in my opinion Delivered Work Items measures how much work a team delivers over time, based on the number of work items accepted per team member.

More work items delivered means more business value has been delivered or more technical debt has been reduced.

Never forget that points are a relative measure for each team, and that each accepted item must pass through the Definition of Done (DoD).

What does this imply?

We want to give value to smaller work items (INVEST – Independent, Negotiable, Valuable, Estimable, Small, Testable).


Throughput is the number of items completed over a period of time per full-time person on the team. In this variant it’s the number of work items moving into the schedule state “Accepted” minus the number of work items moving out of “Accepted” within the timebox, divided by the number of full-time equivalents.

Why is this important?

Throughput shows how well work is moving across the board to accepted. This can be adversely affected if work items are frequently blocked or inconsistently sized.
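The throughput variant described above can be sketched as a small helper. This is a minimal illustration; the function and parameter names are mine, not from any particular Agile tool:

```python
def throughput(moved_into_accepted, moved_out_of_accepted, full_time_equivalents):
    """Net work items reaching the "Accepted" state within the timebox,
    per full-time person on the team."""
    return (moved_into_accepted - moved_out_of_accepted) / full_time_equivalents

# Example: 14 items accepted, 2 reopened, team of 6 FTEs.
print(throughput(14, 2, 6))  # 2.0
```

Note that items moving *out* of “Accepted” (e.g. reopened work) subtract from the total, so throughput reflects work that actually stayed done.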

  • Consistency of Delivered Work Items

Consistency of Delivered Work Items measures how consistent a team is at producing work over time. High consistency means stakeholders can confidently plan when work will be delivered.

This is also an indication that teams are better at estimating their work and flowing consistently-sized increments of value.

The consistency score is a percentile ranking derived from a three-month window of a team’s variability of throughput. Predictability can also be configured to include variability of velocity or the standard deviation of time in progress.

What does this imply?

Variability of throughput is computed by finding the average (mean) and standard deviation of throughput over three-month periods. The coefficient of variance is the standard deviation divided by the mean.
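A minimal sketch of that calculation, using Python’s standard library (the function name is mine):

```python
from statistics import mean, stdev

def variability_of_throughput(monthly_throughput):
    """Coefficient of variance: standard deviation divided by the mean.
    Lower values mean more consistent delivery month to month."""
    return stdev(monthly_throughput) / mean(monthly_throughput)

# Example: three months of throughput values.
print(variability_of_throughput([10, 12, 11]))  # ~0.0909 (very consistent)
```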


For the Throughput StdDev and Throughput Mean we could use the last 3 months.

Why is this important?

Consistency is critical to accurately planning delivery dates and budgeting.

The variability of Delivered Work Items shows if your team delivers the same amount of work each month. This can be adversely affected if Work Items are frequently blocked or inconsistently sized.

  • Work Items Lead Time

Work Items Lead Time measures the ability of a team to deliver functionality soon after it’s started. 

High Lead Time can dramatically impact the ability of businesses to react to changing markets and customer needs.

What does this imply?

The Lead Time score is a ranking calculated by the amount of time required for a team to take a single story or defect from start to finish.


P50 is the median: sort the values and pick the middle one.

Note: it’s the median and not the average because there are often irregularities that would skew the average (for example, a single user story that took two weeks to complete).
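The P50 calculation is straightforward with the standard library; here is a small sketch with illustrative numbers:

```python
from statistics import median

def lead_time_p50(lead_times_days):
    """P50 (median) lead time in business days; robust to outlier stories."""
    return median(lead_times_days)

# Five stories, one of which was a 10-day outlier.
print(lead_time_p50([3, 4, 2, 10, 3]))  # 3
```

The average of the same data would be 4.4 days, pulled up by the single 10-day story, which is exactly why the median is used instead.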

Why is this important?

Low Lead Time is central to Agile. Teams that are more responsive have faster time to market, and react more quickly to customer feedback.

Time in Process is the number of business days a typical user story is in the “In-progress” or “Completed” column. Similar to lead time, cycle time, and time to market.

Accurate planning and budgeting require work items that are estimated to be short, testable and easily deliverable. Time in Process shows how long the median work item stays in process.

  • Quality

Quality measures how disciplined a team is in preventing defects.

What does this imply?


All defects should be assigned to a team according to their Product/Component ownership and the related user story. In the worst case, if we are unable to do that, I believe it’s fairer to count all defects from the Product/Component and then split them across all the teams working on it (meaning every team inside the same Product gets the same quality score).
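The worst-case fallback can be sketched like this (a minimal illustration; the function name is mine):

```python
def quality_fallback(product_defects, teams):
    """When defects can't be attributed to a specific team, split the
    product's total evenly so every team on the product shares the
    same quality score."""
    share = product_defects / len(teams)
    return {team: share for team in teams}

# Example: 12 defects on a product shared by three teams.
print(quality_fallback(12, ["Team A", "Team B", "Team C"]))
# {'Team A': 4.0, 'Team B': 4.0, 'Team C': 4.0}
```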

How can we combine these four metrics in order to understand the global behaviour?

Please find below two examples of how we can combine the four metrics to get a simple, transparent and quick view that can be read by all stakeholders, including management.

  • Performance Chart (Higher is better)

Performance Chart

This chart provides a way for the business and teams to track their performance over time against their previous performance.

  • Overall Chart (Higher is better)

In the Overall Chart, each metric (described above) weighs 25% of the performance, totalling 100%. Once more, it provides a way for the business and teams to track their performance over time against their previous performance.

Overall Performance Chart

It’s important to keep in mind that each metric should be calculated every month. This way we avoid any issues caused by teams having different Sprint start and end dates.
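The equal-weighting scheme for the Overall Chart reduces to a simple average. A minimal sketch, assuming each metric has already been normalised to a 0–100 score (the normalisation step itself is outside this example):

```python
def overall_score(delivered, consistency, lead_time, quality):
    """Overall monthly score: each of the four normalised metric
    scores (0-100) weighs 25% of the total."""
    return 0.25 * (delivered + consistency + lead_time + quality)

# Example month: strong quality, weaker lead time.
print(overall_score(80, 70, 60, 90))  # 75.0
```

Because the weights are equal, a team cannot hide a weak metric behind a strong one: improving the lowest score is always the fastest way to raise the overall number.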

Why is this important?

Sustainable delivery of value requires a consistently high level of Quality for all work delivered to production.


[1] Wikipedia | Performance Metric

[2] Agile Manifesto | Principles behind the Agile Manifesto

[3] Agile Alliance | Invest