The Last Responsible Moment

We’re often tempted to plan as much as possible in advance and make decisions way before they are actually due. It gives us a sense of security and risk mitigation and preparedness. I urge you to stop doing that.

To understand why, follow this chain of reasoning:

  1. When making decisions, more uncertainty leads to higher risk.
  2. The more knowledge we have, the less uncertainty remains.
  3. (No matter how much you know today, it’s safe to say that) Tomorrow you will know more than today.
  4. Hence every decision you will make tomorrow (next week, next month), will be more informed than the ones you make today.
  5. That’s why making decisions as late as possible lowers risk.

The knowledge you’re accumulating can be pretty much everything, e.g. about:

  • Yourself
  • Your team
  • Your project
  • Your product
  • Your business
  • Your competition
  • The market you’re in
  • The legislative environment you’re dealing with
  • And so on…

Agile software development and Scrum have this way of making late decisions built into the process. What matters most is at the top of the backlog or is worked on during a sprint. Everything else isn’t important for now and most decisions related to everything else don’t need to be made yet.

Keep this in mind. Keep questioning yourself whether decisions need to be made already. And if not, don’t make them now. And maybe don’t even bother to think about them yet.

Making Continuous Delivery work with Scrum and Sprints

Scrum is promoting fixed-length sprints of 1, 2 or 4 weeks. We do 2. That means we plan for 2 weeks, the team works for 2 weeks, and then we’re ending the sprint with regression tests, release preparations, sprint review, a final sign-off, and the release. All engineering activities are set up around this. Now we want to release more often – continuously.

There’s a lot of good reasons to deliver continuously. Robert Johnson of  Facebook gave one of my favorite pep talks about this a while ago with “Facebook: Moving Fast At Scale”.

My requirement here is to stick to sprints of 2 weeks. I worked through a few alternatives, e.g. reducing the sprint length to one week instead. This would give us twice as many releases but that’s still nowhere near Facebook and others, and also wouldn’t solve the problem I wanted to solve since nothing would need to change. Another option that has been coming back regularly was Kanban, mostly because work flows continuously, without being boxed into sprints.

In the following I’m walking through what needs to change in order to make continuous delivery work with Scrum and fixed-length sprints.

Sprints vs. releases or: what are sprints for?

The main reasons for doing short sprints over long-term planning is the ability to respond to change and simply accepting the fact that long-term plans don’t work out anyways.

The reason why we ended up releasing after every sprint is that a) releases always create a certain overhead and it seems to make sense to batch up work and go through the overhead only once. And b) because it fits traditional project management thinking: once the planned work is done, it’s being signed-off and released.

But in the same way as we broke down work from a 12 months project into one sprint at a time during our transition to agile, it sounds reasonable to break things down further from batching up a release every 2 weeks into very small continuous releases.

Sprint review and final sign off

If releases are done every 2 weeks after a sprint is finished, it’s easy to combine sprint review and final sign-off. Again, a way of batching up things. But it doesn’t actually save a lot of time to batch up a sign-off, so we could as well do a quick sign-off after every story is completed. This has advantages anyways because a story is only then truly done after it’s signed-off and released. So the change that needs to happen is to de-couple sign-off of each story from the sprint review at the end of the sprint.

What remains is that the sprint review is an opportunity for the team to brag about the work they’ve done, to get stakeholders involved and updated and to make sure that the actual progress becomes visible and agreed upon. It has also advantages to show the current progress in a live environment because by the time of the sprint review, each story that is done is already running in production.

From a process point of view, the team would now be able to release continuously throughout the sprint. Now let’s get back to that “overhead” I talked about. The technical challenges need to be adressed, otherwise we’ll spend more time on the actual release than on development.

No junk in the trunk and code freeze

When releases are happening every 2 weeks, there’s always a bit of a touch down period at the end. Unfinished code is being finished, the last tickets are verified and time is spent to make sure that the Master is clean and ready to go. This often allows for a certain degree of sloppiness with the Master throughout the sprint. It’s not very critical if unfinished work ends up on the Master because there’s always enough time to fix it and clean it up. This needs to change if releases should happen “whenever we feel like it”.

First of all: No junk in the trunk! Master must always contain work that is finished and built to production quality. There is different branching strategies out there and documented. In this context they mostly come down adding two additional areas in the repository:

  1. Where work in progress happens. I recommend a separate feature branch for each story. Work remains solely on this branch until very high confidence is reached that a feature works.
  2. Where work gets integrated (but not yet pushed to Master). This is where finished stories are integrated with the latest Master – but outside of Master. Here, the remaining issues are caught and regression tests on related existing features are done. This should increase the confidence that a new story works and doesn’t break anything else to 100%. Then it goes into Master.

In an ideal world, this would allow us to get rid of good ol’ Code Freeze altogether. The different branches, the quality gates on each branch, and the integration down towards Master does exactly what a sprint-end code freeze does: make sure Master is clean and ready for a release.

Automate your testing

If each story should end up on a Master that is ready to release, then regression tests must be done for each ticket. Otherwise it’s hard to ensure that Master is really ready and recently added work doesn’t break anything. That’s where lack of test automation really does start to hurt.

There’s zillions of articles and books out there about this, so I’ll keep it short. The essence is: do it from the very beginning if you can. If it’s too late for that, invest some effort and get your regression tests automated, as close to 100% as you can.

This will decrease the overhead related to regression tests to as close to 0 as it gets.

Above I mentioned “quality gates”. These are all the different checks and tests a revision of your software must go through before it’s Done. Depending on your system they may consist of building your software, running unit tests, running static code analysis checks, running regression, UI and load tests and maybe some – hopefully not too many – manual steps. With many CI servers like Jenkins they can be automated and arranged in build pipelines. Such a pipeline runs them consecutively on a certain revision and only if it runs through until the end without failing in between, you’ve got a green light on this revision. I recommend using build pipelines.

Automate your deployments

Regression tests and deployments have a lot in common: if you only do them rarely, manual steps usually don’t hurt enough to automate it to near 100%. Now that we’re about to release very very often, this starts to hurt (= creates overhead). Release must be as lightweight, fast, and robust as it gets. If only 1 out of 10 releases makes even remotely trouble or fails, it’s hard to get the team confident enough to release frequently.

I recommend using a CI server, reducing deployments to a few clicks and add a suite of tests against your live servers into the script that tests the current deployment right after it’s done and fails or succeeds immediately.

I also recommend putting deployments as much as possible into the hands of the engineers who are writing the code or at least bring a member of the operations team into the Scrum team. This removes additional hand-overs and the team being blocked by others. Depending on the environment this is often difficult to achieve, but automating deployments down to a few clicks that never fail certain helps a lot.

Changing your definition of done

If all of the above works and releases indeed start to happen regularly and even after every story is done, it’s time to change the Definition of Done. Both the final sign-off and the release should be in there.

Up your skills!

A transformation to continuous delivery is a big step forward for a team. It requires a variety of skills to cope with all the technical and non-technical challenges. Being intentional about improving the team’s skills and being willing to spend time and money for this will definitely help. You can also learn along the way and learn from mistakes, but focus on better skills pro-actively will help making less mistakes, moving faster and gaining confidence within the team and outside.

Dealing with release problems

A live user environment much more likely breaks during a release, obviously. Hence a very common concern is that more frequent releases will as well introduce more frequent problems and actually increase the total amount of work necessary to deal with and fix all the problems.

But there’s also advantages: continuous releases consist of much smaller change sets. Hence it’s a whole lot easier to regression test them and to release them. And if something breaks, it’s also a whole lot faster to spot, understand and fix the problem and release a hotfix. Not to mention that continuous releases often happen within a day or so after the work is finished, so it’s likely engineers haven’t forgotten all about the details of a change.

There’s more ways of helping out on this, e.g. the ability to release only to a small sub-set of users, observe and then roll-out to all users after. Improvements like this should be evaluated.

Conclusion

Here’s a summary of what I’ve been working through on making continuous releases work with Scrum and fixed-length sprints:

  1. Decide to de-couple sprints from releases. Sprints are for planning, releases are just one more piece of getting work “done”.
  2. Move your final sign-off out of the sprint review (if that has been the case) and move it to the end of each story. Add this to your Definition of Done.
  3. Choose and implement the right branching strategy. Introduce feature branches and an integration branch and make sure work is being properly tested when it leaves a branch. Move only down to Master when things are really working and keep junk out of the trunk.
  4. Automate your testing as much as possible. Besides unit test coverage, implement as much automated tests as you need in order to gain the team’s and the product owner’s confidence, e.g. UI tests, load tests or even automated static code analysis. Use a CI server like Jenkins to tie all your automation together and make use of build pipelines.
  5. Automate your deployments down to a few clicks. Get the ability to execute the release into the Scrum team to avoid hand-overs and the team being blocked on the release.
  6. Release whenever you feel like it, be happy, and adding value to your live users quicker than ever.

What is your opinion about this? Are you doing the same? Did you face the same issues? Did you solve them in a similar way? How are you releasing your software when using Scrum and fixed-length sprints? I would be happy if you take a minute and leave a comment below.

Scrum Introduction Session with Daniel Shupp

Today I’ve joined a Scrum introduction session held by Daniel Shupp, CTO of TechPropulsionLabs, at Highlands Coffee on Nugyen Du in Ho Chi Minh City. In a friendly an open atmosphere Daniel walked us through the basic elements of Scrum and its application in agile software development projects. Besides pure knowledge enriched by helpful real-world examples from Daniel’s extensive software development experience, it also offered plenty of opportunities to ask questions, to right wrong assumptions about what Scrum is and how it works and to get a feeling how an adoption of Scrum in own development processes would be like. Daniel described it as part of what his company wants to give back to the local development community, with more similar events hopefully to come.

TechPropulsionLabs (www.techpropulsionlabs.com) develops software for seed and early-stage startups around the world and is a keen advocate of agile software development and Scrum.

Hopefully this helps to spread the word and raise awareness for the advantages Scrum has to offer. If you’re interested to follow this development in Vietnam, TechPropulsionLabs certainly is a company to keep in mind. Incidentally, my friend Nhan, working as a Scrum Master at Swiss IT Bridge, recently posted his opinion on Scrum in Vietnam on his blog. For a short introduction about Scrum just start with Wikipedia article and go from there.

Thank you again, Daniel, for the insightful and enjoyable session.