Forecasting is not a skill I use frequently, which means it takes me time to remember each of the steps and the small details I’ve forgotten. There’s a lot of great information out there, but reading it and refreshing my memory takes longer than I’d like it to. My goal here is to describe the process I use to make this easier in the future (i.e. not the why, but the how). And if someone suggests improvements to my process, all the better!

Don’t worry if you aren’t familiar with some of these terms and tools - I’ll link to explanations as each is used.
One final note before we get started: risk management and recording assumptions are a crucial part of delivery forecasting. That should be done in parallel with these steps.
Everyone has their own way to do this part. My preferred method is Story Mapping (see Jeff Patton on User Story Mapping and “Hate Estimating? Try doing a User Story Map instead”) using Miro.
A key component of a forecast is the size of the backlog. Ideally we would estimate the size of every backlog item.
When that’s not feasible, we can use sampling to reduce the backlog-to-be-sized to a manageable amount with minimal impact on forecast quality (see “Sampling, an Introduction” and “Sampling, probability and certainty”). More samples mean higher confidence that the samples are representative of the whole, but they also take more time to size.
| Samples | Probability the next sample falls within the previously seen range | 
|---|---|
| 2 | 33.3% | 
| 3 | 50.0% | 
| 4 | 60.0% | 
| 5 | 66.7% | 
| 6 | 71.4% | 
| 7 | 75.0% | 
| … | … | 
| 11 | 83.3% | 
| … | … | 
| 20 | 90.5% | 
| … | … | 
| 30 | 93.5% | 
When possible I use 20 samples (90.5%), but 7 samples (75.0%) or 11 samples (83.3%) will still get good results.
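The percentages in the table are consistent with the formula (n - 1) / (n + 1), which holds if we assume the samples are drawn independently from the same continuous distribution (the next item is then equally likely to land in any of the n + 1 "slots" around the existing samples, and only the two outermost slots fall outside the seen range). Here is a minimal Python sketch for checking other sample counts; the function name is my own, not something from the linked articles.

```python
from fractions import Fraction


def prob_next_within_range(n_samples: int) -> Fraction:
    """Chance that the next item falls between the min and max of n prior samples.

    Assumption: samples are independent draws from one continuous distribution.
    """
    if n_samples < 2:
        raise ValueError("need at least 2 samples to define a range")
    return Fraction(n_samples - 1, n_samples + 1)


# Reproduce a few rows of the table above.
for n in (2, 7, 11, 20, 30):
    print(f"{n} samples: {float(prob_next_within_range(n)):.1%}")
```

The curve flattens quickly: going from 20 to 30 samples adds about 3 percentage points, which is why a modest sample is usually enough.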
In my experience, the best results come from splitting the backlog items “as small as possible and no smaller” - it’s easier to estimate small stories. If we can take that one step further and make them all “close enough” to the same size, we can simply use story count - no size estimation necessary.
When that’s not feasible, relative sizing works well. I like to use a simple small/medium/large scale. In the following steps we’ll need numbers, however, so I map each label to a number: small (2), medium (4), large (8). You may be wondering “a small story is 2 what? hours? days?” - and the answer is that it doesn’t matter. The important part is that the numbers express the relationship between the sizes of stories. As long as we’re careful to use the same units later on, the math works out.
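As a small illustration of that mapping, the snippet below converts labels to numbers before they go into a forecasting tool. The sampled labels are made up for the example.

```python
# Relative sizes from the scale above; the unit is deliberately unspecified.
SIZE_VALUES = {"small": 2, "medium": 4, "large": 8}

# Hypothetical labels assigned to five sampled backlog items.
sampled_labels = ["small", "medium", "small", "large", "medium"]

sampled_sizes = [SIZE_VALUES[label] for label in sampled_labels]
average_size = sum(sampled_sizes) / len(sampled_sizes)

print(sampled_sizes)   # numeric sizes, in the same order as the labels
print(average_size)    # average size per sampled item
```

Only the ratios matter here: doubling every value would double the backlog size and the velocity alike, and the forecast duration would not change.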
You may be wondering “aren’t these story points?”. I avoided that phrase because there’s so much baggage around it in the Agile world. There’s no need for anything complicated here.
Troy Magennis has a Worksheet with more guidance on methodology and an entire chapter in his book for even more depth. He also discusses the tradeoff between story points and item count.
Download a copy of the Story Count Forecaster. This tool takes as an input the sampled backlog items and their sizes, performs a Monte Carlo analysis, and forecasts the total backlog size.
On tab 2 (Enter Features or Epics Here): enter the list of sampled backlog items along with their size
On tab 3 (Forecast Story Count or Points):
The key here is to keep our units consistent. If the samples were sized with story points, then the output here will be in story points. If we use story count, then the output will be a number of stories. The output value is always pre-split units of work (see step 4 for split rate).
Choose confidence levels that match your client’s risk tolerance. The lower confidence level sets the lower bound of the forecast range for total backlog size, and the higher confidence level sets the upper bound. I usually use 75% and 95%.
Experientially, I’ve found these to be good starting ranges for split rate:
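To make the idea of a Monte Carlo backlog forecast more concrete, here is a minimal Python sketch. It is not the spreadsheet's actual algorithm, which I haven't reproduced; it assumes we know the total number of backlog items (`total_items`, a hypothetical input) and that each unsized item is as likely to match any one of the sampled sizes as any other. The sample sizes are invented for the example.

```python
import random


def simulate_backlog_totals(sample_sizes, total_items, trials=10_000, seed=None):
    """Return sorted simulated totals for a backlog of `total_items` items.

    Assumption: every item's size is drawn at random (with replacement)
    from the sizes observed in the sample.
    """
    rng = random.Random(seed)
    return sorted(
        sum(rng.choices(sample_sizes, k=total_items)) for _ in range(trials)
    )


def value_at_confidence(sorted_values, confidence):
    """Value that roughly `confidence` of the simulated outcomes fall at or below."""
    index = min(len(sorted_values) - 1, int(confidence * len(sorted_values)))
    return sorted_values[index]


# Hypothetical sizes for 7 sampled items, using the 2/4/8 scale.
samples = [2, 2, 4, 2, 8, 4, 2]
totals = simulate_backlog_totals(samples, total_items=60, seed=1)

low = value_at_confidence(totals, 0.75)
high = value_at_confidence(totals, 0.95)
print(f"75% confidence: backlog is at most {low}")
print(f"95% confidence: backlog is at most {high}")
```

The two printed values play the role of the lower and upper bounds described above, in the same units as the samples.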
From that starting point, adjust up or down based on factors such as:
Troy Magennis has a Worksheet with more guidance and an entire chapter in his book for even more depth.
Velocity, throughput, and pace are closely related, but distinct, concepts which are often conflated. Each has its use; what matters is that the units are consistent with steps 2 and 3. I’ll use velocity for now, but mentally substitute the other terms as it makes sense for your context. I think velocity has the largest impact on the accuracy of forecasts, so it’s worth taking some time to get reasonable values here.
It’s best to use historical data when possible, provided that data is recent. If it isn’t available, this is a guess based on experience. Either way, it must be in the same units as backlog size. For the time period I usually use one week to simplify communication with the client. This is just for expectation setting, and the actual development effort can be done in weeks, sprints, etc.
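When recent history is available, one simple way to turn it into a baseline is to look at the spread of the last few weeks. The sketch below is only one possible approach; the eight-week window and the weekly numbers are assumptions for illustration.

```python
import statistics


def velocity_baseline(weekly_completed, recent_weeks=8):
    """Return (low, typical, high) velocity from the most recent weeks.

    `weekly_completed` holds units finished per week, oldest first, in the
    same units as the backlog size (story count or relative size).
    """
    recent = weekly_completed[-recent_weeks:]
    if len(recent) < 2:
        raise ValueError("need at least two weeks of history")
    return min(recent), statistics.median(recent), max(recent)


# Hypothetical history: units completed in each of the last ten weeks.
history = [9, 12, 7, 10, 11, 8, 13, 10, 9, 11]
low_v, typical_v, high_v = velocity_baseline(history)
print(low_v, typical_v, high_v)
```

Keeping the low and high values, rather than a single average, preserves the week-to-week variation that the forecast needs.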
Once we have a baseline velocity, adjust up or down based on factors such as:
Speaking of external dependencies, identifying and understanding them is a prerequisite for forecasting. If that wasn’t done as part of step 1, spend some time on that before proceeding.
Troy Magennis has a Worksheet with more guidance.
Download a copy of the Throughput Forecaster. This tool takes as an input the backlog size, split rate, and velocity, performs a Monte Carlo analysis, and forecasts the duration with varying levels of confidence (answering the question “how long will it take to build this backlog?”). If the client instead wants to know “what scope can I get with a fixed budget?”, use the Multiple Feature Cut Line Forecaster.
On tab 2 (Forecast):
The output is a range of durations based on confidence/likelihood.
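For intuition about what a duration forecast like this involves, here is a minimal Monte Carlo sketch in Python. It is my own simplification, not the Throughput Forecaster's implementation: it assumes backlog size, split rate, and weekly velocity are each drawn uniformly from a low/high range, and all the input numbers are invented.

```python
import random


def simulate_weeks(backlog_range, split_range, velocity_range,
                   trials=10_000, seed=None):
    """Return sorted simulated durations in weeks.

    Assumption: each input is uniformly distributed between its low and
    high value, and velocity varies independently from week to week.
    """
    if velocity_range[0] <= 0:
        raise ValueError("velocity must be positive")
    rng = random.Random(seed)
    durations = []
    for _ in range(trials):
        # Work remaining after stories are split during delivery.
        remaining = rng.uniform(*backlog_range) * rng.uniform(*split_range)
        weeks = 0
        while remaining > 0:
            remaining -= rng.uniform(*velocity_range)
            weeks += 1
        durations.append(weeks)
    return sorted(durations)


def weeks_at_confidence(sorted_weeks, confidence):
    """Duration that roughly `confidence` of simulated runs finish within."""
    index = min(len(sorted_weeks) - 1, int(confidence * len(sorted_weeks)))
    return sorted_weeks[index]


durations = simulate_weeks(
    backlog_range=(180, 220),   # hypothetical backlog size bounds
    split_range=(1.0, 1.5),     # hypothetical split rate bounds
    velocity_range=(8, 12),     # hypothetical units completed per week
    seed=1,
)
for confidence in (0.50, 0.85, 0.95):
    print(f"{confidence:.0%}: {weeks_at_confidence(durations, confidence)} weeks")
```

Higher confidence levels always give equal or longer durations, which is the trade-off to discuss with the client.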
At this point, we have a forecast of duration (i.e. a range of durations along with their likelihood), therefore we can forecast cost. If we have a likely start date, we can also forecast the delivery date. However, none of this is useful unless we can communicate it to the client.
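The arithmetic for cost and delivery date is simple enough to sketch. In the Python below, the weekly team cost, start date, and durations are all hypothetical placeholders, and the cost model (a flat cost per week) is an assumption that may not fit every engagement.

```python
from datetime import date, timedelta


def cost_and_delivery(weeks, weekly_cost, start):
    """Return (total cost, delivery date) for a duration in whole weeks.

    Assumption: cost is a flat amount per week and work runs without gaps.
    """
    return weeks * weekly_cost, start + timedelta(weeks=weeks)


# Hypothetical forecast output: confidence level -> duration in weeks.
weeks_by_confidence = {0.50: 24, 0.85: 28, 0.95: 31}

for confidence, weeks in weeks_by_confidence.items():
    cost, delivery = cost_and_delivery(
        weeks, weekly_cost=20_000, start=date(2024, 1, 8)
    )
    print(f"{confidence:.0%}: {weeks} weeks, cost {cost:,}, delivery {delivery}")
```

Because the duration is a range, the cost and the delivery date come out as ranges too, each tied to a confidence level.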
The most straightforward deliverables would be:
If the client is already familiar with forecasting, we could even give them the tools (e.g. spreadsheets, estimates, etc.) used above.
However, my team has been experimenting with some other ways to communicate the results, which I’ll cover in a future blog post.