Do we size bugs with story points?

I found a question posted in one of the social channels at work: How should one assign story points to bugs/defects that one does not yet know how to fix and that require investigation? The original question asks how, but I think before we go there it would be nice to know whether we need to in the first place. I plan to write about this in two parts: (1) how I might go about it, without explanations but based on past experience as a member of Agile Scrum teams and what I’ve read on the topic, and (2) links and quotes galore.

How I might go about it

  • If it’s a bug found during testing of a user story we’re working on in the sprint AND it’s small enough (implicitly sized) to be fixed within the same sprint: It goes into the sprint backlog. No need to size it. Just prioritize it accordingly.
  • If it’s a bug unrelated to the user stories that we’re testing this sprint (say, from an older feature) OR it’s too big or complex a bug (again implicitly sized) to be fixed within the sprint: It goes to the product backlog. It’ll be groomed like any other user story to give it enough detail for the team to work with. And if it makes its way into Sprint Planning, then size the bug.
  • Now what if the bug that goes into the product backlog requires more investigation than usual? (All bugs require investigation, but in some cases the devs already have an idea of how to fix it, while in others they have no idea at all, hence the extra investigation.) Tag it as a spike (not a term in the Scrum Guide, FYI). If it goes into the Sprint Backlog, meaning the team agrees to invest time in investigating that bug within the Sprint, there’s no need to size it.
    • For that spike in the Sprint, it’ll just mean there’ll be a time-box (1-3 days of effort) for investigating that bug. At the end of the time-box, whoever works on it reports their findings and the team can discuss the next steps.
    • Assuming the team agrees on a resolution, duplicate the bug that carries the spike tag. Close the original one. In the duplicate, remove the Spike label. If it’s to remain in the Sprint Backlog, meaning the team will fix it within the Sprint, then size the bug. Otherwise, the new bug (the duplicate) goes to the Product Backlog and there’s no need to size it yet.
    • But what if there’s still no resolution or identified workaround? The team can opt to extend the time-box. But you can’t just keep extending it forever. Once a threshold is met (is 3 months too long/short?): Tag it with a label your team agrees to use on such items, and then archive it.
  • At the end of the Sprint, the Scrum Master will be able to gather the following data in case they want to use it for some forecasting:
    • User Stories – total story points, bugs per user story
    • Bugs – total story points, total number of bugs
    • Spikes – total number of Spikes worked on, total number of Spikes closed, total points from Spikes that were converted to new bugs
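
For those who think better in code, the triage rules above can be sketched roughly as below. This is only an illustration: the `Bug` fields, the `Action` names, and the `triage` function are all made up for this sketch and don’t come from the Scrum Guide or any tracking tool.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # Hypothetical labels for the three outcomes described in the list above.
    SPRINT_BACKLOG_UNSIZED = "sprint backlog, no sizing, just prioritize"
    PRODUCT_BACKLOG_SIZE_AT_PLANNING = "product backlog, size if it reaches Sprint Planning"
    SPIKE_TIMEBOXED = "tag as spike, time-box the investigation, no sizing"


@dataclass
class Bug:
    # All three fields are assumptions about what the team already knows.
    related_to_current_sprint_story: bool
    fits_in_current_sprint: bool
    needs_extra_investigation: bool = False


def triage(bug: Bug) -> Action:
    """Apply the rules in the order the list gives them."""
    # Rule 1: found while testing a story in this sprint AND small enough.
    if bug.related_to_current_sprint_story and bug.fits_in_current_sprint:
        return Action.SPRINT_BACKLOG_UNSIZED
    # Rule 3: a product-backlog bug nobody knows how to fix yet becomes a spike.
    if bug.needs_extra_investigation:
        return Action.SPIKE_TIMEBOXED
    # Rule 2: everything else waits in the product backlog until planning.
    return Action.PRODUCT_BACKLOG_SIZE_AT_PLANNING


if __name__ == "__main__":
    print(triage(Bug(True, True)).value)
    print(triage(Bug(False, True)).value)
    print(triage(Bug(True, False, needs_extra_investigation=True)).value)
```

What happens after the spike’s time-box (duplicate and size, extend, or archive) is left out on purpose, since that part depends on a team discussion rather than a rule.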

That turned out longer than I expected. The next part is for some links on the topic, which could give you opposing views to help you come up with your own answer.

Links and quotes galore

12 common mistakes made when using Story Points – This has a lot of other interesting points, not just on whether you size bugs or not.

  • “Story Points represent the effort required to put a PBI (Product Backlog Item) live.” So story points are not limited to user stories.
  • “Story Points are about effort. Complexity, uncertainty and risk factors that influence effort but each alone is not enough to determine effort.”
  • [Common mistake #5: Never Story Pointing Bugs] “A bug which is unrelated to the current sprint should just be story pointed. The bug represents work the team needs to complete. This does not apply if the team reserves a fixed percentage of time for working on bugs during the sprint. A bug related to an issue in the sprint should not be story pointed as this is part of the original estimation.”

Should Story Points Be Assigned to a Bug Fixing Story?

  • [I think this is with respect to legacy bugs or when the team is dealing with a large database of agile defects] “My usual recommendation is to assign points to bug fixing the agile defects. This really achieves the best of both worlds. We are able to see how much work the team is really able to accomplish, but also able to look at the historical data and see how much went into the bug-fixing story each sprint.”

Should you ‘Story Point’ everything? – This is a thread in the Scrum.org forum.

  • (No points for bugs) ‘They are called story points for a reason. They are not call[ed] “Item Points”. Ideally you should only have stories in your backlog and the technical tasks should be inside…’
  • (Yes or no points for bugs) “It is critical as a Scrum Master to ensure that story points are being used properly within an organization. They serve two purposes only: to help the Development Team and Product Owner plan future sprints, and to be accumulated for done items at the end of a sprint for velocity calculation purposes. They are not a proxy for value delivery. … That said, it seems there are a number of different items (bugs, technical tasks, spikes) that have a capacity impact on the Development Team each sprint. For planning purposes, if the team prefers to not point these items, a mechanism to determine the capacity impact is still desired….”
  • (No points altogether) ‘I have found, and this may depend on your team, that removing story points entirely helps the team and stakeholders focus on the sprint goal instead of “How many points”….’

What’s a spike, who should enter it, and how to word it? Since I mentioned “spikes”, I’ve put in this other link about it.

  • “A spike is an investment to make the story estimable or schedule-able.”
  • “Teams should agree that every spike is, say, never more than 1 day of research. (For some teams this might be, say, 3 days, if that’s the common situation.) At the end of the time-box, you have to report out your findings. That might result in another spike, but time-box the experiments. If you weren’t able to answer the question before time runs out, you must still report the results to the team and decide what to do next. What to do next might be to define another spike.”
  • “It’s also best if a spike has one very clear question to answer. Not a bunch of questions or an ambiguous statement of stuff you need to look into. Therefore, split your spikes just as you would large user stories.”

Let me know if you find anything more conclusive or helpful.

Ah, estimation!

For as long as I can remember, I’ve been asked to provide test estimates for project proposals. More often than not, the details are very high-level and you are never given enough time to digest the materials. The latest I received was this morning, with 104 bulleted requirements (not yet including other background references), and they expect a reply by today; otherwise they’d assume the test effort would be 25% of the development effort. Such short-notice demands just make me wince. Didn’t they ever do math problems as a kid?

Juan renders 8.5 hours a day to Project X. Last week, Juan was on sick leave and is still feeling slightly under the weather (but this doesn’t really matter). How many minutes must Juan spend on each of the 104 requirements (not part of Project X) so that he does not have to do overtime?
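
For the record, the arithmetic is short. Here is a rough sketch in Python; the function name is made up, and letting Juan drop Project X for an entire day is my own generous assumption, not something anyone offered.

```python
HOURS_PER_DAY = 8.5   # from the problem: Juan's full day, all of it owed to Project X
REQUIREMENTS = 104    # from the problem


def minutes_per_requirement(hours_available: float, requirement_count: int) -> float:
    """Spread the available hours evenly over the requirements, in minutes."""
    return hours_available * 60 / requirement_count


# Strict reading: every hour already belongs to Project X, so nothing is left.
print(minutes_per_requirement(0, REQUIREMENTS))  # 0.0

# Generous reading (assumption): Juan ignores Project X for the whole day.
print(round(minutes_per_requirement(HOURS_PER_DAY, REQUIREMENTS), 1))  # 4.9
```

So even in the generous case it is under five minutes per requirement, reading time included.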

Try as I might though, I can’t do away with test estimation. Estimation is part and parcel of the software engineering process, and is essential when you’re trying to bid for a project. Business needs to put a price on it. There’s no getting around it. You, as a tester, will eventually be asked to estimate!

Lesson 1: Estimates are wrong most of the time

There are so many unknowns. You have to deal with the cone of uncertainty and all that jazz. Try googling why estimates are always wrong and you’d find a lot of results on the topic. So knowing this, what does it tell us? Nope, this doesn’t mean we just go off throwing numbers for the sake of having a deliverable. Personally, it just tells me to be efficient with the effort that I put into estimation. I may only have half a day, or even half an hour, to come up with the numbers. I can’t develop an estimation model in that short a time, and even if I could, the output can’t be validated and would most likely be wrong. Essentially, just estimate: don’t kill yourself over trying to come up with something perfect, and put aside your worries that it will be wrong (because it most likely is anyway). Chill! Don’t sweat it!

Lesson 2: Estimate like you’ll be the one to deliver

The reality is that you often don’t have authority over the final estimates. Countless times, I’ve seen my initial estimates whittled down with each review. You have to remember that those who are working on those proposals are out to win the bid. But as someone who got pulled in to do the estimates, do whatever’s in your power to not shortchange the team who will actually execute. Just keep adding buffers whenever and wherever you can!

Lesson 3: Best case scenarios hardly ever happen

In an ideal setup, there could be a team of 2-3 testers going over the requirements — each giving their estimate for the optimistic, most likely and pessimistic scenarios. In instances where there’s a huge discrepancy, you guys can talk it out until you come to a point where you all agree. From experience though, it’s almost always a lone effort. You don’t get the ideal situation for estimation, so it is safe to assume that you won’t get the ideal situation when it comes to delivery. Buffers and factoring in the pessimistic scenarios are always a good idea.
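
I haven’t said how to turn the optimistic, most likely and pessimistic numbers into one figure. One common option, which is my addition here rather than something prescribed above, is the PERT weighted average: (O + 4M + P) / 6. A minimal sketch, with made-up hours:

```python
def pert_estimate(optimistic: float, most_likely: float, pessimistic: float) -> float:
    """PERT weighted average: the most likely value counts four times."""
    return (optimistic + 4 * most_likely + pessimistic) / 6


def pert_spread(optimistic: float, pessimistic: float) -> float:
    """Rough standard deviation used alongside PERT: (P - O) / 6."""
    return (pessimistic - optimistic) / 6


# Hypothetical numbers for one requirement, in hours.
o, m, p = 10, 16, 40
print(pert_estimate(o, m, p))  # 19.0
print(pert_spread(o, p))       # 5.0
```

Notice how the pessimistic scenario pulls the result above the most likely value. That is the buffer doing its job.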

Lesson 4: Ask for models

For mature organizations, it’s possible that they’ve already collected some past metrics and have already crafted some magical software estimation models. Enter the size in function points and voila, we have the estimates! Enter the development estimates and ta-dah, you’ve got estimates for everything else! Just enter the parameters and it spews out the number! What number? THE number!!! Now I don’t want folks to be all reliant on these numbers, since the conditions under which they worked in the past might be totally different from your current situation. I am wary when they say that testing is just 25% of the dev effort when I know from first-hand experience that the testing effort can go far beyond that. What you can do is ask around and use these numbers as references. Only as references. You have to exercise judgement before deciding whether these numbers are applicable.

Lesson 5: How I estimate all by my lonely self

So after making you painstakingly read through the previous items, here’s the actual content. When pushed into an estimation task for a new system to be built, I’m typically given a list of requirements or functionalities that the new system or app has to cater for. I’d go over each requirement to estimate for the following:

  • Test Execution – This covers the execution of the test cases, reporting the status of the tests, and logging bugs when found.
  • Test Planning – This includes studying the test basis, drafting test cases, identifying the needed test data.
  • Test Plan Rework – This is for reviewing and revising test case drafts.
  • Retest – For verifying bug-fixes.
  • Test Execution using a 2nd, 3rd, nth platform – It’s possible that you need to test the system/app using a different platform or device.

Now that’s for when you have the time to go over each and every requirement. Plan B would be to do the estimates for what would constitute a Simple, Average, or Complex requirement. Then instead of estimating each requirement, you just categorize them and use the corresponding estimates.

Then you also add buffers to consider things like:

  • Things generally go bad, and there’s a general communication overhead
  • Technical difficulty, such as when it is relatively unknown how to test a particular function, or when there’s a learning curve for a tool you’ll need to augment testing
  • Adjustments in case testing will be done by someone less experienced
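
Plan B plus buffers can be sketched as below. Every number here is a placeholder I invented for illustration (hours per category, buffer percentages, and the split of the requirements), and adding the buffers together as percentages of the base is just one simple way to apply them. Plug in your own figures.

```python
from collections import Counter

# Hypothetical hours per requirement, covering planning, rework,
# execution, retest and extra platforms for that complexity level.
HOURS_PER_CATEGORY = {"simple": 2.0, "average": 4.0, "complex": 8.0}

# Hypothetical buffers, expressed as fractions of the base estimate.
BUFFERS = {
    "things_go_bad_and_communication": 0.10,
    "technical_difficulty": 0.10,
    "less_experienced_tester": 0.05,
}


def functional_test_estimate(categories, hours_per_category=HOURS_PER_CATEGORY,
                             buffers=BUFFERS):
    """Sum the per-category hours, then add all buffers on top of the base."""
    counts = Counter(categories)
    base = sum(hours_per_category[name] * count for name, count in counts.items())
    return base * (1 + sum(buffers.values()))


# Hypothetical split of 104 requirements.
requirements = ["simple"] * 60 + ["average"] * 30 + ["complex"] * 14
print(round(functional_test_estimate(requirements), 1))  # 440.0
```

The base here is 352 hours (120 + 120 + 112), and the 25% of combined buffers brings it to 440.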

Now, that was only for the functional test estimates. You’ll also need to estimate for the following:

  • Test Management – Especially for a large testing project, there’ll be management overhead to consider. There are also project deliverables like the Test Plan or Test Strategy document. There would be management meetings to attend, reports to report, and project issues or escalations to deal with. There’d be onboarding of new folks into the team and essentially setting up the test framework, processes, standards, guidelines, etc. to keep all the testers on the same page, and the test team on the same page as the rest of the development team.
  • Integration / System Test – The focus of this test is for the end-to-end scenarios or business flows.
  • User Acceptance Test – The involvement of the test team during UAT will also need to be identified. Will they be facilitating (or would that be handled by the BAs)? Will we just provide support e.g., replicate issues reported, verify change requests, etc.?

And in the process of going over the requirements, you’ll also need to identify other types of testing which might be needed for the project and estimate for those that you are capable of or have experience in. It’s possible that the project could call for usability testing, accessibility testing, performance testing, etc.

Anyways, you’ve read this far. Thank you! I guess it’s too late to give out the disclaimer that I’m no expert. In case you have magical ratios or percentages that you’d like to share, I’d like to hear about them. Please share in the comments! And if there’s one take-away that you get from this super long post, please let it be: Do not shortchange the team who will actually deliver.

Two developers working at a rate of N hrs/day…

A friend told me an odd story. Two developers from another team were allocated to a particular project at a total of 40%. They were given a task which they estimated they could complete by January.

The PM prodded and pushed. “Finish it by December,” he demanded. Eventually, after tireless nagging, the two conceded and said they would deliver by December as requested.

“How will they do that?” I wondered. Apparently, nothing changed except the deadline. The amount of work to be done is still the same. Their measly 0.25 and 0.15 allocations were retained as is. What was the January estimate then? Had they padded their initial estimates such that it would actually be feasible to complete the task by December? Or will they have to render overtime work to cope and meet the deadline?

Instances like these make me wonder whether the PM ever did math problems back when he was in school. Joe works at a rate of N hours per day. A given task takes X hours. How many days will it take for Joe to finish his work? Judith needs to complete a task that requires X hours in N days. How many hours per day will Judith have to work? How many Judiths are needed if each Judith can only work Z hours per day?
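
In case the PM is reading, the three problems fit in a few lines of Python. The function names are made up, and the 8-hour day and 64-hour task in the example are assumptions of mine. Only the 0.25 and 0.15 allocations come from the story.

```python
import math


def days_to_finish(task_hours: float, hours_per_day: float) -> float:
    """Joe: X hours of work at N hours per day."""
    return task_hours / hours_per_day


def hours_per_day_needed(task_hours: float, days: float) -> float:
    """Judith: X hours of work squeezed into N days."""
    return task_hours / days


def people_needed(task_hours: float, days: float, max_hours_per_day: float) -> int:
    """How many Judiths, if each can only work Z hours per day."""
    return math.ceil(task_hours / (days * max_hours_per_day))


# The two developers: 0.25 + 0.15 of an assumed 8-hour day.
combined_hours_per_day = (0.25 + 0.15) * 8
task_hours = 64  # hypothetical task size

print(round(combined_hours_per_day, 1))                             # 3.2
print(round(days_to_finish(task_hours, combined_hours_per_day), 1)) # 20.0
# Cutting the schedule to 15 working days needs more hours per day:
print(round(hours_per_day_needed(task_hours, 15), 2))               # 4.27
```

With the work and the allocation both fixed, the only thing left that can give is the overtime.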