Course announcement: FlowViz Fundamentals

FlowViz is a free data visualisation template for Power BI that allows you to better visualise the flow metrics generated by your teams’ Kanban boards in Azure DevOps, Azure DevOps Server, TFS or GitHub Issues. The charts cover industry-recognised flow metrics (Throughput, Work In Progress, Cycle Time, Work Item Age) as well as probabilistic forecasting/Monte Carlo Simulation for answering “when will it be done?” or “what will we get?”. It is inspired by the work of many leaders in our field, such as Larry Maccherone, Troy Magennis, Dan Vacanti and Prateek Singh.

Since launching in 2019, it has been downloaded hundreds of times and continues to be one of the most popular tools in the marketplace for flow metrics. 

I'm pleased to announce the next iteration of the product: FlowViz Fundamentals course.

FlowViz Fundamentals provides a comprehensive overview of every chart within FlowViz: how each is calculated, what to look for in the charts, exercises where different teams’ charts are presented for you to identify what you can observe, and follow-on videos where I explain the patterns/anti-patterns.

The course is online and self-study, allowing you to go at your own pace, and contains over 50 videos for you to work through. No additional downloads are needed: all the videos and example exercises (with sample data) are built into the course.

Project to Product nudges for large organisations

As large and typically “traditional” organisations embark on, maintain momentum on, or pivot in their journey towards new ways of working, the buzzwords “Project to Product” are likely to feature in conversations across your organisation.

One of the primary drivers of this has been Mik Kersten’s book of the same name. Whilst the book is no doubt useful, it does not quite delve into the practical application within your organisation, particularly around various aspects of product management.

In a recent Twitter interaction with Chris Combe, I felt the questions he was asking were very much in-line with questions I had/get asked about in organisations on this journey:

The following is a summary of thinking I’ve been encouraging and questions I’ve been asking in recent weeks (as well as over the past few years in different contexts). These are what I believe to be appropriate nudges in the different conversations that will be happening in the different parts of an organisation on this project to product journey. To be clear, this is not to be confused with a framework, model or method; these are simply prompts (or tricks up your sleeve!) that I believe may help in progressing conversations you may be having.

Defining product

One of the first challenges faced is actually defining what is a product in your context. You don’t need to reinvent the wheel here, with plenty of good collateral out there in the industry already. A starting definition I like to use is:

Product — something that closely meets the demands of a particular market and yields enough value to justify its continued existence. It is characterised by its benefits, features, functions and uses that satisfy the needs of the consumer.

Because many folks will be used to approaching work from a project perspective, it helps to clearly articulate the differences between the two. In its simplest form, I like to go with:

Project — a temporary endeavour undertaken to create or deliver a solution, service or result. A project is temporary in that it has a defined beginning and end in time, and therefore defined scope and resources.

However, definitions alone are not enough and you will find they are left open to interpretation. There is already good information on the key differences out there, such as this paper from IT Revolution, or this neat graphic from Matt Philip’s blog post:

Be prepared to talk to this in detail with folks, explaining how some of the thinking works currently in your organisation and what that might look like with this lens over the top.

IT Applications (Technology Products)

It’s highly likely given the software lens of the Project to Product book that the immediate focus on product definition will be in Technology/IT. If that’s the case, it might help to elaborate one step further with the following definition I like to use:

Technology Product — commercial and/or internal technology product that adds value to the user. Products may be assembled from other products, using platforms and/or unique components.

I find this definition works quite well across the myriad of application, operations, platform and component teams most large organisations start off with. In particular, it allows people to start thinking about what part they may play in the bigger system of work.

SNow way all those are products!

One experience you might have is similar to what I’ve observed (in two large organisations now): folks falling into the trap of thinking that just because you can raise a ticket against something in ServiceNow (SNow), it must be a product. In my role prior to Nationwide, some IT Service Management folks (who were actually among the strongest advocates of moving to this way of working) tried to get ahead of the curve and unfortunately made this same mistake. They extracted the configuration item records from ServiceNow, finding there were over 800 applications in our IT estate. This prompted a flabbergasted view of: “we’ve got 800+ products, so we’re going to have 800+ teams?!?!”

First of all, chances are pretty high that a number of these are not actually products! As part of this move to new ways of working, use it as an opportunity to consolidate your estate (or what will hopefully become known as your product portfolio). For one, you’re likely to find items in the list that are already decommissioned yet the records have not been updated.

Another question is around usage, which is where you refer back to your definition of product. How do you know it’s used? Do you have any telemetry that points to that? In the example I mentioned, there were practically no analytics on usage for nearly all 800+ applications. As a proxy, we took the decision to start with incident data, making the assumption that applications with higher call volumes were likely to have high usage. Those without an incident raised in the last two years became candidates to investigate further and potentially retire, rather than becoming part of our product world.
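As a rough sketch of that kind of triage, a few lines of pandas can surface the retirement candidates. The file names and columns here are hypothetical (your ServiceNow extracts will look different), but the shape of the analysis is the same:

```python
import pandas as pd

# Hypothetical extracts: configuration items and incident records from ServiceNow
apps = pd.read_csv("config_items.csv")    # assumed columns: app_id, app_name
incidents = pd.read_csv("incidents.csv")  # assumed columns: app_id, raised_date

# Keep only incidents raised in the last two years
cutoff = pd.Timestamp.now() - pd.DateOffset(years=2)
recent = incidents[pd.to_datetime(incidents["raised_date"]) >= cutoff]

# Apps with no recent incidents: candidates to investigate for retirement
candidates = apps[~apps["app_id"].isin(recent["app_id"])]
print(f"{len(candidates)} of {len(apps)} applications have had no incident in 2 years")
```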

You can also use this as a learning point when incrementally moving to product teams. For those initial pilots, a key data point should be embedding that product telemetry to validate those assumptions made on those “products”. Depending on your technology, some of this can be more sophisticated (more modern applications will tell you pretty much everything that happens on the user journey), whereas for older, legacy technology you may have to rely on proxy measures (e.g. number of logins).

Grouping products

Even with a consolidated list, this still doesn’t mean a 1:1 ratio for every product and product team. It’s likely a number of these products will be releasing less frequently (maybe monthly or quarterly). This does not mean you are in “BAU” or “maintenance mode” (for more on this — please read Steve Smith’s latest post), which is another unlearning needed. What you can do is take these smaller, less frequently releasing products and group them into a product family. A definition I like to use for this is:

Product Family — a logical grouping of smaller products that address a similar business need in different but related ways. They may or may not align to a specific business unit or department.

What you may then choose to do is logically group these products / product families further. This however is highly context dependent, so be prepared to get this wrong! In Nationwide we use Member Needs Streams, within three long-lived Missions. This absolutely works for those in the Missions working to delight Members’ needs; however, it becomes more troublesome if you’re in an area serving the needs of colleagues rather than members (for example, we have a shared technology platform area with multiple teams). You may choose to go with Value Streams, however this can often be misinterpreted. I’ve had this experience in two organisations: prior to Nationwide, the organisation simply couldn’t get its head around what a value stream truly was. More recently, within Nationwide, some folks have baulked at “value stream” as it is misinterpreted as extracting value from members, i.e. treating them the same as shareholders, which is not what we want and is a distinguishing factor in us being a mutual.

In making this decision, definitely explore different approaches. For example, your organisation may even have a business capability or value chain defined already. Whatever you choose, be mindful that it will likely be misinterpreted. Be armed with real examples of what it is and is not, specific to your context.

Products for other departments (i.e. not just IT!)

Once you’re aligned on product definition, questions will arise from those teams who do not ‘build’ technology (IT) products. What products do they have? Thankfully there is again thinking already in this space, with this great blog from Jeff Gothelf that articulates how other departments in the organisation can think about what they offer to the rest of the organisation as “products”.

Source: Do HR, Finance and Legal make products? (Jeff Gothelf)

When we consider our context within Nationwide, we recently announced changes to our approach to remote working, going with the slogan of Locate for our day. This is essentially formal recognition that hybrid working is here to stay: remote and flexible working is now the norm, however there will be instances where in-person collaboration may be more effective, therefore it’s a collective (“our”) decision on work location. You could articulate this with a product lens, like so:

Defining non-technology products may be something that comes later on the journey (which is fine!) but experience tells me it helps to have applied this thinking to your organisation, so that when the conversation inevitably happens you have some ready-made examples.

Not everything is a product/not everyone will be a Product Manager

Another trap to avoid is going too far with your ‘product’ definition. Consider your typical enterprise applications used by pretty much everyone, taking Microsoft Teams as an example. It would be unfair (and untrue) to put someone in a Product Manager role and say they are the Product Manager for MS Teams. Imagine introducing yourself at a conference to peers as the Product Manager for MS Teams (when working at, say, Nationwide); I’m sure that would confuse a lot of people! If you can’t make priority decisions around what to build, then don’t kid yourself (or let others fool you!) into thinking you’re a Product Manager. This will only damage your experience and credibility when you do move into a product role. Similarly, don’t move people into a Product Manager role without being clear on what the products are. This is again unfair, setting people up to fail and to feel lost in their new role.

Product Teams

With clarity on what a product is and the types of products you have in your organisation, it’s also important to consider what your organisation means by product teams. In my view, the best description of a product team (or what you should be aspiring to get to) is from Marty Cagan, in both the book Inspired and the Silicon Valley Product Group (SVPG) blog (here and here).

Unfortunately, most people don’t have time (or at times, a desire!) to do further reading, therefore something quick and visual for people to grasp is likely to aid your efforts. John Cutler has a handy infographic and supporting blog around the journey to product teams:

In my experience, this is best used to sense-check where you are currently, as well as to inform your future experiments with ‘types’ of product team. It also isn’t a maturity model, so don’t think you need to go just one row down at a time; you can skip ahead a few rows if you want to!

Tailoring language to your context is important too; for example, ‘Run’ may be ‘Ops’ in your world. Help it make sense to people who have never done this before, and avoid new jargon unless you need it to make a clear distinction between x and y. The second-to-last row (Product Team w/ “mini CEO”) may be the end state for a lot of teams in your organisation, but it doesn’t mean you can’t experiment with having some ‘truly’ Product Teams (the last row).

Product Management Capability

A final nudge for me would be focusing on capability uplift within your people. This way of working will excite people, in particular those who like the idea of/have read about Product Management as a career. Whilst you should definitely capitalise on this momentum and positivity, it’s important to ensure you have people experienced in your organisation flying the flag / defining what “good” looks like from a product management perspective.

If your people are learning from YouTube videos (rather than peers) or taking on a role not knowing what Products they’ll be Product Managers for, or your most senior people in the “Product” part of the organisation (Community of Practice or Centre of Excellence) have never been in a product role, chances are your efforts will fall flat. And no, before you think about it, this ‘expertise’ will not come from the big 4! Be prepared to have to recruit externally to supplement that in-house enthusiasm with a group of people who can steer the ship in the right direction.

Summary

Hopefully this helps you and your context with nudging some of your thoughts in the right direction. Like I said, it is by no means a model or “copy and paste” job, simply learnings from a practitioner.

What about your own experiences? Does any of the above align with what you have experienced? Have I missed things you feel are important nudges too? Let me know in the comments :)

There’s no such thing as a committed outcome

A joint article by both myself and Ellie Taylor, a fellow Ways of Working Enablement Specialist here at Nationwide.

The Golden Thread

As we start a new year, focusing on outcomes has been a common topic of discussion in our work in the last few weeks.

Within Nationwide, we are working towards instilling a ‘Golden Thread’ in our work — starting with our strategy, then moving down to three year and in-year outcomes, breaking this down further into quarterly Objectives and Key Results (OKRs) and finally Backlog Items (and if you’d like to go even further — Tasks).

This means that everyone can see how the work they’re doing fits (or maybe does not fit!) into the goals of our organisation. When you consider the work of Daniel Pink, we’re really focusing here on that ‘purpose’ aspect in trying to instil that in everything we do.

As Enablement Specialists, part of our role is to help coach the leaders and teams we work with across our Member Missions to really focus on outcomes over outputs. This is a common mantra you’ll hear recited in the Agile world, with some even putting forth the argument that it should in fact be the other way round. However, orientating around outcomes is our chosen path, and thus we must focus on helping facilitate the gathering of good outcomes.

Yet, in moving to new, outcome-oriented ways of working, a pattern (or anti-pattern) has emerged: the concept of a ‘committed outcome’.

“You can’t change that, it’s a committed outcome”

“We’ve got our committed outcomes and what the associated benefits are”

“Great! We’ve got our committed outcomes for this year!”

This became a hot topic amongst our team — what really is an outcome and is it possible to have a committed outcome?

What is an Outcome?

If we were to look purely at a dictionary definition of the word, an outcome is “something that follows as a result or consequence”. If you want help with what this means in a Lean-Agile context, the book Outcomes over Output helpfully defines an outcome as “a change in human behaviour”. We can then tweak this for our context to mean ‘something that follows as a result or consequence, which could also be defined as a change in member or colleague behaviour’.

However, this definition brings about uncertainty. How can we have certainty over outcomes, given they’re changes in behaviour? Well, a simple way is to call them committed outcomes. That way we’ll know what we’ll be getting. Right?

Well this doesn’t quite work…

Outcomes & Cynefin

This is where those focused on introducing new ways of working, and particularly leadership, should look to leverage Cynefin. Cynefin is a sense-making model originally developed by Dave Snowden to help leaders make decisions by understanding how predictable or unpredictable their problems are. In the world of business agility, it’s often a first port of call to help us understand the particular context teams are working in, and in turn how best to help them maximise their flow of work by selecting a helpful approach from the big bag of tools and techniques.

The model introduces 4 domains: Clear, Complicated, Complex and Chaotic, which can then be further classified as predictable or unpredictable in nature.

There is also a state of Confused where it is unclear which domain the problem fits into and further work is needed to establish this before attempting to identify the solution.

Source: About — Cynefin Framework

Both Clear and Complicated problems are considered by the model to be predictable since they have repeatable solutions. That is, the same solution can be applied to the same problem, and it will always work. The difference between the two is the level of expertise needed to solve them.

The solution to Clear problems is so obvious that a child can solve it; or, if expertise is needed, the solution is still obvious and there’s normally only one way to solve the problem. Here is where it’s appropriate to use “best practice”. [example: tying shoelaces, riding a bike]

In the Complicated domain, more expertise is needed as the problem gets more complicated. The outcome is predictable because it has been solved before, but it will take an expert to get there. [example: a mechanic fixing a car, a watchmaker fixing a watch]

The Complex and Chaotic domains are considered unpredictable. The biggest difference between the two from a business agility perspective is whether it is safe to test and learn; ‘yes’ for Complex and definitely ‘no’ for Chaotic.

Complex problems are ones where the solution, and the way in which to get to that solution (i.e. the practices), emerge over time because we’ve never done them before, in this context, environment, etc. Cause and effect are only understood with hindsight, so you can only possibly know what happens, and any side-effects of a solution, once it is created. The model describes this activity as probing: trying out some stuff to find out what happens. And this is key to why this domain is a sweet spot for business agility. We need to try out something in such a way, typically small, that ensures that if it doesn’t work as expected, the consequences are minimised. This is often referred to in our community as ‘safe to fail’.

And finally, Chaos. This is typically a transient state; it resolves itself quickly (not always in your favour!) and is very unpredictable. This domain is certainly not a place for safe to fail activity. Decisive action is the way forward, but there is also a high degree of choice in the solution and so often novel practices emerge.

Ok, so what’s the issue?

The issue here is that when we think back to focusing on outcomes, and specifically when you hear someone say something is a committed outcome, what’s more likely is that it’s a (committed) output.

It’s something you’re writing down that you’re going to ‘do’ or ‘produce’. Because most of what we do sits in the Complex domain, we can’t possibly know (for certain) whether what we are going to ‘do’ will definitely achieve the outcome we’re after until we do it. We also don’t even know if the outcome we think we are after is the right one. Thus, it is nonsensical (and probably impossible!) to ‘commit’ to it. It is, unfortunately, trying to apply thinking from the Clear domain to something that is Complex. This is a worry, as these outcomes now become something that we’ll do (output) rather than something we’ll go after (outcome).

In lots of Agile Transformation or Ways of Working initiatives, this manifests itself at team level, where large numbers of Scrum teams are stuck in a world where they still fixate on a “committed number of items/story points”, ignoring the fact that this left Scrum ten years ago. Scrum Teams commit to going after both long-term (product) and short-term (sprint) goals, expressed as outcomes, with the work they do in the product/sprint backlog being how they try to go after those. They do this because they know their work is complex. The same goes for the wider, organisation-wide ‘transformation’, which is treated as a programme where, by a pre-determined end date (usually 12 months), we will all be ‘transformed’. This of course can only be demonstrated in output (number of teams, number of people trained and certified, etc.) due to the mindset it is being approached with.

The problem with committing to an outcome (read: output) is that it stifles empowerment, creativity and innovation, turning your golden thread from something meaningful and purposeful that celebrates accountable freedom into output-oriented, feature-factory-measured agile theatre.

Ultimately, this means any new ways of working approach is likely to be sub-optimal at best. It’s a pivot without a pivot, leading to everyone in the system delivering output, delivering this output faster, yet perplexed at the lack of meaningful impact. We neglect the outcomes, and the experimentation with the many possible ways of achieving our real desired results: delighting members and colleagues.

What we want to focus on are the problems we want to solve, which comes back to the member, user or colleague behaviours that drive business results and the things we can do to help nudge these. Ideally, we’d then have meaningful and, where appropriate, flow and value based measures to quantify these and track progress.

Summary

In closing, some key points to reaffirm when focusing on outcomes:

  • Outcomes are something that follows as a result or consequence

  • Cynefin is a sense-making model originally developed by Dave Snowden to help leaders make decisions by understanding how predictable or unpredictable their problems are.

  • Cynefin has 4 domains: Clear, Complicated, Complex and Chaotic, which can then be further classified as predictable or unpredictable in nature.

  • There is also a state of Confused where it is unclear which domain the problem fits into and further work is needed to establish this before attempting to identify the solution.

  • The work we do regarding change is generally in the Complex domain

  • As the work is Complex, there is no way we can possibly ‘commit’ to what an outcome will be as the relationship between the two is not known

  • Outcomes are things that we’d like to happen (more engaged staff, happier members) because of the work that we do

  • When you hear committed outcomes — people most likely mean outputs

  • Use the outputs as an opportunity to focus on the real problems we want to solve

  • The problems we want to solve should come back to the member, user or colleague behaviours that drive business results (which are the actual ‘outcomes’ we want to go after)

What do you think about when referring to outcomes? 

Have you had similar experiences?

Let us know in the comments below or tweet your thoughts to Ellie or myself :)

Capacity planning in less than five minutes

Source: Memes Monkey

Following on from my recent Story Pointless series, I realised there was one element I missed. Think of this like a bonus track from an album (I’ll take Ludacris’ Welcome to Atlanta if we’re talking specifics). Anyway, quite often we’ll find that stakeholders are less interested in Stories, instead focusing on Epics or Features: specifically how ‘big’ they are and, of course, when they will be done and/or how many of them we will get. Over the years, plenty of us have been trained that the only way to work this out is through story point estimation of all the stories within said Epic or Feature, meaning we get everyone in a room and ‘point’ each item. It’s time to challenge that notion.

A great talk I watched recently from Prateek Singh used whisk(e)y to provide an alternative method, distilling the topic (see what I did there) into something straightforward for ease of understanding. I’d firmly encourage anyone using, or curious about, data and metrics to find the time to watch it via the link below:

LAG21 — Prateek Singh — Drunk Agile — How many bottles of whisky will I drink in 4 months? — YouTube

Whilst I’m not going to spoil the session by giving a TLDR, I wanted to focus on the standout slide showing how to approach capacity planning using probabilistic methods:

As you’ll know from previous posts, using data to inform better conversations is something which is key to our role at Nationwide as Ways of Working (WoW) Enablement Specialists. So again let’s use what we have in our context to demonstrate what this looks like in practice.

Putting this into practice

Before any sort of maths, we first start with how we break work down. 

We take a slightly different approach to what Prateek describes in his context (Feature->Story work breakdown structure).

Within Nationwide, we are working towards instilling a ‘Golden Thread’ in our work — starting with our Society strategy, then moving down to strategic (three year and in-year) outcomes, breaking this down further into quarterly Objectives and Key Results (OKRs), then Epics and finally Stories.

This means that everyone can see/understand how the work they’re doing fits (or maybe does not fit!) into the goals of our organisation. When you consider the work of Daniel Pink, we’re really focusing here on that ‘purpose’ aspect in trying to instil that in everything we do.

So with that in mind, we’ll focus on the number of Stories within an Epic, as a replacement for Prateek’s Feature->Story breakdown. Our slightly re-jigged formula for capacity planning looks like so:

How Many (Stories) / Epic Size (Stories per Epic) = Capacity Plan (# of Epics)

How many stories

We start by running a Monte Carlo Simulation for the number of stories in the period we wish to forecast for, which allows us to get a percentage likelihood. For a detailed overview on how to do this, I recommend part two of the Story Pointless series. Assuming you’ve read or already understand this approach, you’ll know that to do this we first of all need input data, which comes in the form of Throughput. We get this from our ThoughtSpot platform via our flow metrics dashboard:

From this chart we have our weekly Throughput data for the last 15 weeks: 11, 36, 7, 5, 6, 6, 13, 6, 7, 4, 2, 7, 5, 4, 5. These will be the ‘samples’ that we feed into our model.

For ease of use, I’ve gone with Troy Magennis’ Throughput Forecaster, which I also mentioned previously. This will run 500 simulations of what our future looks like, using these numbers as the basis for samples. Within that, I enter the time horizon I’d like to forecast for; in this instance, we’re going to forecast our number of stories in the next 13 weeks. We use this as it roughly equates to a quarter, normally the planning horizon we use at an Epic level.

The output of our forecast looks like so:

We’re happy with a little bit of risk so we’re going to choose our 85th percentile for the story count. So we can say that for this team, in the next quarter they are 85% likely to complete 77 stories or more.
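For the curious, here is a minimal sketch of the kind of simulation the Forecaster runs under the hood. This is my approximation rather than the spreadsheet’s actual logic, using the throughput samples above:

```python
import random

# Weekly throughput samples from the dashboard (last 15 weeks)
samples = [11, 36, 7, 5, 6, 6, 13, 6, 7, 4, 2, 7, 5, 4, 5]

def simulate_totals(samples, weeks=13, trials=500):
    """Each trial randomly draws one sample per week and sums them."""
    return sorted(
        sum(random.choice(samples) for _ in range(weeks))
        for _ in range(trials)
    )

totals = simulate_totals(samples)

# "85% likely to complete N or more" is the value exceeded in 85% of trials,
# i.e. the 15th percentile of the sorted totals
n = totals[int(len(totals) * 0.15)]
print(f"85% likely to complete {n} or more stories in the next 13 weeks")
```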

Epic Size

The second part of our calculation is our Epic size. 

From the video, Prateek presented that the best way to visualise this is as a scatter plot, with the Feature sizes (that is, the number of Stories within a Feature) plotted against the Feature completion date. The different percentiles are then mapped against the respective Feature ‘sizes’. We can then take a percentile to add a confidence level against the number of Stories within a Feature.

In the example, 50% of Features have 5 stories or less, 85% have 17 stories or less and 95% have 22 stories or less.

As mentioned previously, we use a different work breakdown, however the same principles apply. Fortunately we again have this information to hand using our ThoughtSpot platform:

So, in taking the 85th percentile, we can say that for this team, 85% of the time, our Epics contain 18 stories or less.
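If you don’t have a dashboard to hand, the same percentile can be pulled straight from a list of completed Epic sizes. A sketch with made-up data:

```python
# Hypothetical sizes (story count) of recently completed Epics
epic_sizes = sorted([3, 5, 8, 2, 12, 6, 4, 18, 9, 5, 7, 15, 4, 6, 10])

# 85th percentile: 85% of completed Epics were this size or smaller
idx = min(int(len(epic_sizes) * 0.85), len(epic_sizes) - 1)
print(f"85% of Epics contain {epic_sizes[idx]} stories or less")
```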

Capacity Planning

With the hard part done, all we need to do now is some simple maths!

If you think back to the original formula:

How Many (Stories) / Epic Size (Stories per Epic) = Capacity Plan (# of Epics)

Therefore:

How many stories (in the next quarter): 77 or more (85% confidence)

Epic size: 18 stories or less

Capacity Plan: 77/18 = 4 (rounded)

This team has capacity for 4 Epics (right sized) in the next quarter.
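Or, as a trivial code equivalent of the formula (numbers from above):

```python
import math

stories_85th = 77    # Monte Carlo simulation, 85th percentile ("77 or more")
epic_size_85th = 18  # scatter plot, 85th percentile ("18 or less")

# Round down: a partial Epic doesn't fit in the quarter
print(f"Capacity plan: {math.floor(stories_85th / epic_size_85th)} Epics")
```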

And that’s capacity planning in less than five minutes ;)

Ok I’ll admit…

There is a lot more to it than this, like Prateek says in the video. Right sizing alone is a particular skillset you need for this to be effective.

#NoTetris

The overarching point is that teams should not be trying to play Tetris with their releases; a release is more like a jar filled with different sized pebbles.

As well as this, we need to be mindful that forecasts can obviously go awry. Throughput can go up or down, priorities can change, Epics/Features can grow significantly in scope and even deadlines change. It’s essential that you continuously reforecast (#ContinuousForecasting) as and when you get new information. Make sure you don’t just forecast once and for any forecasts you do, make sure you share them with a caveat or disclaimer (or even a watermark!).

You could of course take a higher percentile too. You might want to use the 95th percentile for your Monte Carlo simulation (72 stories) and Epic size (25 stories), which would mean only 2 Epics. You can also use it to help protect the team: there might be 26 Epics that stakeholders want, yet we know this is at best 50% likely (104 stories / 4 stories per Epic), so let’s manage expectations from the outset.

Summary

A quick run through of the topic and what we’ve learnt:

  • Capacity planning can be made much simpler than current methods suggest. Most would say it needs everyone in a room and the team to estimate everything in the backlog — this is simply not true.

  • Just like in previous posts, this is an area we can cover probabilistically, rather than deterministically. There are a range of outcomes that could occur and we need to consider this in any approach.

  • A simple formula to do this very quick calculation:

  • How Many Stories (85%) / Epic Size = Capacity Plan

  • To calculate how many stories, run a monte carlo simulation and take your 85th percentile for number of stories.

  • To calculate Epic/Feature size, visualise the total number of stories within an Epic/Feature, against the date completed, plotting the different percentiles and take the 85th one.

  • Divide the two, and that’s your capacity plan for the number of right-sized Epics or Features in your forecasted time horizon.

  • Make sure you continuously forecast, using new data to inform decisions, whilst avoiding playing Tetris with capacity planning

Could you see this working in your context? Or are you already using this as a practice? Reply in the comments to let me know your thoughts :)

Story Pointless (Part 3 of 3)

The final part of this three-part series on moving away from Story Points and how to introduce empirical methods within your team(s).

Part one refamiliarised ourselves with what story points are, a brief history lesson and facts about them, the pitfalls of using them and how we can use alternative methods for single item estimation. 

Part two looked at probabilistic vs. deterministic thinking, the use of burndown/burnups, the flaw of averages and monte carlo simulation for multiple item estimation.

Part three focuses on some common questions and challenges posed with these new methods, allowing you to handle those questions you may get asked when wanting to introduce a new approach in your teams/organisation.

The one question I get asked the most

Would you say story points have no place in Agile?

My personal preference is that just like Agile has evolved to be a ‘better way’ (in most contexts) than Waterfall, the methods described in this series are a ‘better way’ than using Story Points. Story Points make sense to be used in contexts where you have little or no dependencies and spend more time ‘doing’ than ‘waiting’.

Troy Magennis — What’s the Story About Agile Data

The problem is that so few teams in a large organisation like ours have this context yet have been made to “believe” story points are the right thing to do. For contexts like this, teams are much better off estimating the time they will spend ‘blocked’ or ‘waiting’, rather than the active time ‘doing’.

Common questions posed for single item estimation

But the value is in the conversation, isn’t that what story points are about?

Gaining a shared understanding of the work is most definitely important! The problem is that there are much better ways of understanding the problem you’re trying to solve than giving something a Fibonacci number and debating if something is a ‘2’ or a ‘3’ or why someone feels that way about a number. You don’t need a ‘number’ to have a conversation — don’t confuse estimation with analysis! The most effective way to learn and understand the problem is by doing the work itself. This method provides a much more effective approach in getting to that sooner than story points do.

Does this mean all stories are the same size?

No! This is a common misconception you may hear. What we care about is “right sizing” our items, meaning they are no larger than an agreed size. This is derived by using the 85th (or the number of your choice!) percentile, as mentioned in part one.

What about task estimation?


Not using tasks encourages collaboration, swarming and getting stories (rather than tasks) finished. It’s quite ironic that proponents of Scrum encourage sub tasks, yet one of the creators of Scrum (Jeff Sutherland) holds a different view, supported by data. In addition to this, Microsoft found that using estimates in hours had errors as large as ±400% of the estimate.

Should we not use working days and exclude weekends?

Whilst there is nothing to say excluding weekends is ‘bad’, it again comes back to the principle of talking in the language of our customer. If we have a story that we say, on 30th April 2021, has a 14-day cycle time at 85% likelihood, when is it reasonable to expect it? It would be fair to say this is on or around 14th May.

Yet if we meant 14 working days, this would be 21st May (due to the bank holiday), which is a whole week extra! Actual (calendar) days make it easier for our customers/stakeholders to understand, as we’re talking in their language.
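You can sanity-check the two interpretations in a couple of lines (assuming numpy is available, and the UK early-May bank holiday on 3rd May 2021):

```python
from datetime import date, timedelta
import numpy as np

start = date(2021, 4, 30)
print(start + timedelta(days=14))  # calendar days: 2021-05-14

# 14 working days, skipping weekends and the 3rd May bank holiday
print(np.busday_offset("2021-04-30", 14, holidays=["2021-05-03"]))  # 2021-05-21
```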

Do you find stakeholders accepting this? What options do you have when they don’t?

Stakeholders should (IMO) never *tell* a team how to estimate, as that is owned by the team. What I do communicate is the options they have on likelihood. I let them know 85% still carries risk, and if they want less risk (i.e., the 90th/95th percentile) then it means a longer time, but the decision on risk sits with them.

Why choose the 85th percentile?

The 85th percentile is common practice purely as it ‘feels right’. For most customers or stakeholders they’ll likely interpret this as “highly likely”, which will be good enough for them. Feel free to choose a higher percentile if you want less risk (but recognise it will be a longer duration!).

Common questions posed for multiple item estimation

Does this mean all stories are the same size?

No! See above.

Do we need to have lots of data for this?

Before considering how much data, the most important thing is the stability of your system/process. For example, if your work is highly seasonal and future work will be less ‘hectic’, you might want to account for this in the input data to your forecast.

However, let’s get back to the question. You can get started with as little as three samples (three weeks or say three sprints worth) of data. The sweet spot is 7–15 samples, anything more than 15 and you’ll likely need to discard old data as it may negatively impact your forecasts.

With 5 samples we are confident that the median will fall inside the range of those 5 samples (the only way it misses is if all five samples land on the same side of the median, which happens with probability 2 × (1/2)⁵ = 6.25%, leaving a 93.75% chance it sits within the range), so that already gives us an idea about our timing and we can make some simple projections.

(Source: Actionable Agile Metrics For Predictability)

With 11 samples we are confident that we know the whole range, as there is a 90% probability that every other sample will fall in that range.

(Source: German Tank Problem)

What if I don’t have any previous data?

Tools like the Excel sheet from Troy provide the ability to estimate your range in completed stories. Once you start the work, populate with your actual samples and change the spreadsheet to use ‘Data’ as an input.

What about if it’s a new technology/our team has changed?

Good question — throw away old data! Given you only need a few samples you should not let different contexts/team setups influence your forecast.

Should I do this at the beginning of my project/release and send to our stakeholders?

You should do it then and continue to do so as and when you get new samples; do not just do one forecast! Ensure you practice #ContinuousForecasting and caveat that any forecast is a point in time based on current data. Remember, short-term forecasts (i.e., a sprint) will be more accurate than longer ones (e.g., a year-long forecast done at the start of a financial year).

What about alternative types of Monte Carlo Simulation? Markov chain etc.?

This is outside the scope of this article, but please check out this brilliant and thorough piece by Prateek Singh comparing the different types of Monte Carlo Simulation.

So does the opinion of individuals not matter?

Of course they still matter :) These methods just introduce an objective approach into that conversation, getting us away from methods that can easily be manipulated by ‘group think’. Use it to inform your conversation; don’t just treat it as the answer.

Isn’t this more an “advanced” practice anyway? We’re pretty new to this way of working…

No! There is nothing in agile literature that says you have to start with story points (or Scrum/sprints for that matter), nor that you have to have been practicing other methods before this one. The problem with starting with methods such as story pointing is they are starting everyone off in a language no one understands. These other methods are not. In a world where unlearning and relearning is often the biggest factor in any adoption of new ways of working, I’d argue it’s our duty to make things easier for our people where we can. Speaking in a language they understand is key to that.

Conclusions

Story points != Agile. 

Any true Agilista should be wanting to stay true to the manifesto and always curious about uncovering better ways of working. Hopefully this series presents some fair challenges to traditional approaches but, more importantly, alternatives you can put into practice right away in your context.

Let me know in the comments if you liked this series, if it challenged you, anything you disagree with and/or any ways to make it even better.


Story Pointless (Part 2 of 3)

The second in a three-part series on moving away from Story Points and how to introduce empirical methods within your team(s). 

Part one refamiliarised ourselves with what story points are, a brief history lesson and facts about them, the pitfalls of using them and how we can use alternative methods for single item estimation.

Part two looks at probabilistic vs. deterministic thinking, the use of burndown/burnups, the flaw of averages and monte carlo simulation for multiple item estimation.

Forecasting

You’ll have noticed in part one I used the word forecast a number of times, particularly when it came to the use of Cycle Time. It’s useful to clarify some meaning before we proceed.

What do we mean by a forecast?

Forecast — predict or estimate (a future event or trend).

What does a forecast consist of?

A forecast is a calculation about the future that includes both a range and a probability of that range occurring.

Where do we see forecasts?

Everywhere!

Sources: FiveThirtyEight & National Hurricane Centre

Forecasting in our context

In our context, we use forecasting to answer the key questions of:

  • When will it be done?

  • What will we get?

Which we typically do by taking the total amount of work remaining and dividing it by our average rate of completion (velocity/throughput). We then visualize this as a burnup/burndown chart, such as the example below. Feel free to play around with the inputs:

https://observablehq.com/embed/@nbrown/story-pointless?cells=viewof+work%2Cviewof+rate%2Cchart

All good right? Well not really…

The problems with this approach

The big issue with this approach is that the two inputs into our forecast(s) are highly uncertain; both are influenced by:

  • Additional work/rework

  • Feedback

  • Delivery team changes (increase/decrease)

  • Production issues

Neither input can be known exactly upfront, nor can either be simply taken as a single value, due to their variability.

And don’t forget the flaw of averages!

Plans based on average, fail on average (Sam L. Savage — The Flaw of Averages)

The above approach means forecasting using average velocity/throughput which, at best, is the odds of a coin toss!

Source: Math with bad drawings — Why Not to Trust Statistics
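To see why, here is a quick simulation (hypothetical backlog, reusing the earlier weekly throughput samples) comparing the average-based forecast with how often the simulated futures actually hit it:

```python
import math
import random
import statistics

samples = [11, 36, 7, 5, 6, 6, 13, 6, 7, 4, 2, 7, 5, 4, 5]
backlog = 60  # hypothetical number of remaining items

# Deterministic forecast: backlog divided by average weekly throughput
avg_weeks = math.ceil(backlog / statistics.mean(samples))

# How often do simulated futures actually finish within that time?
wins = sum(
    sum(random.choice(samples) for _ in range(avg_weeks)) >= backlog
    for _ in range(10_000)
)
print(f"Average says {avg_weeks} weeks; hit in {wins / 100:.0f}% of simulated futures")
```

With skewed throughput data like this, the average-based date is typically hit only around half the time: the coin toss in action.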

Using averages as inputs to any forecasting is fraught with danger, in particular as it is not transparent to those consuming the information. If it was it would most likely lead to a different type of conversation:

But this is Agile — we can’t know exactly when something will be done!?!…

Source: Jon Smart — Sooner, Safer, Happier

Estimating when something will be done is particularly tricky in the world of software development. Our work predominantly sits in the domain of ‘Complex’ (using Cynefin) where there are “unknown unknowns”. Therefore, when someone asks, “when will it be done?” or “what will we get?” — when we estimate, we cannot give them a single date/number, as there are many factors to consider. As a result, you need to approach the question as one which is probabilistic (a range of possibilities) rather than deterministic (a single possibility).

Forecasts are about predicting the future, but we all know the future is uncertain. Uncertainty manifests itself as a multitude of possible outcomes for a given future event, which is what science calls probability.

To think probabilistically means to acknowledge that there is more than one possible future outcome, which, for our context, means using ranges, not absolutes.

Working with ranges

Communicating such a wide, raw range to stakeholders is not advisable, nor is it helpful. In order to account for this, we need an approach that allows us to simulate lots of different scenarios.

The Monte Carlo method is a method of using statistical sampling to determine probabilities. Monte Carlo Simulation (MCS) is one implementation of the Monte Carlo method, where a probabilistic model is used to describe a real-world system. The model consists of uncertainties (probabilities) in the inputs, which get translated into uncertainties in the outputs (results).

This model is run a large number (hundreds/thousands) of times, resulting in many separate and independent outcomes, each representing a possible “future”. These results are then visualised as a probability distribution of possible outcomes, typically in a histogram.
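As a concrete sketch of such a simulation in the “when will it be done?” direction (hypothetical backlog size and throughput samples):

```python
import random

throughput_samples = [11, 36, 7, 5, 6, 6, 13, 6, 7, 4, 2, 7, 5, 4, 5]
backlog = 60  # hypothetical number of items remaining

def one_possible_future(backlog, samples):
    """Draw weekly throughputs until the backlog is exhausted; return weeks taken."""
    done, weeks = 0, 0
    while done < backlog:
        done += random.choice(samples)
        weeks += 1
    return weeks

futures = sorted(one_possible_future(backlog, throughput_samples) for _ in range(10_000))

# Percentiles of the distribution become the forecast
for p in (50, 85, 95):
    print(f"{p}% likely to finish within {futures[int(len(futures) * p / 100) - 1]} weeks")
```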

TLDR; this is getting nerdy so please simplify

We use ranges (not absolutes) as inputs in the amount of work and the rate we do work. We run lots of different simulations to account for different outcomes (as we are using ranges).

So instead of this:

https://observablehq.com/embed/@nbrown/story-pointless?cells=viewof+work%2Cviewof+rate%2Cchart

We do this:

https://observablehq.com/embed/@nbrown/story-pointless?cells=chart2%2Cviewof+numberOfResultsToShow%2Cviewof+paceRange%2Cviewof+workRange

However, this is not easy on the eye! 

So what we then do is visualise the results on a Histogram, showing the distribution of the different outcomes.

We can then attribute percentiles (aka a probability of that outcome occurring) to the information. This allows us to present a range of outcomes and probability of those outcomes occurring, otherwise known as a forecast.

Meaning we can then move to conversations like this:

The exact same approach can be applied if we had a deadline we were working towards and we wanted to know “what will we get?” or “how far down the backlog will we get to”. The input to the forecast becomes the number of weeks you have, with the distribution showing the percentage likelihood against the number of items to be completed.

Tools to use

Clearly these simulations need computer input to help them be executed. Fortunately there are a number of tools out there to help:

  • Throughput Forecaster — a free and simple to use Excel/Google Sheets solution from Troy Magennis that will do 500 simulations based on manual entry of data into a few fields. Probably the easiest and quickest way to get started; just make sure you have your Throughput and Backlog Size data.

  • Actionable Agile — a paid tool for flow metrics and forecasting that works as standalone SaaS solution or integrated within Jira or Azure DevOps. This tool can do up to 1 million simulations, plus gives a nice visual calendar date for the forecasts and percentage likelihood.

Source: Actionable Agile Demo

  • FlowViz — a free Power BI template that I created for teams using Azure DevOps and GitHub Issues that generates flow metrics as well as monte carlo simulations. The histogram visual provides a legend which can be matched against a percentage likelihood.

Summary — multiple item forecasting

  • A forecast is a calculation about the future that includes both a range and a probability of that range occurring

  • Typically, we forecast using single values/averages — which is highly risky (odds of a coin toss at best)

  • Forecasting in the complex domain (Cynefin) needs to account for uncertainty (which using ‘average’ does not)

  • Any forecasts therefore need to be probabilistic (a range of possibilities) not deterministic (a single possibility)

  • Probabilistic Forecasting means running Monte Carlo Simulations (MCS) — simulating the future lots of different times

  • To do Monte Carlo simulation, we need Throughput data (number of completed items) and either a total number of items (backlog size) or a date we’re working towards

  • We should always continuously forecast as we get new information/learning, rather than forecasting just once

Ok but what about…

I’m sure you have lots of questions, as did I when first uncovering these approaches. To help you out, I’ve collated the most frequently asked questions I get, which you can check out in part three.
