Making Continuous Delivery work with Scrum and Sprints

Scrum promotes fixed-length sprints of 1, 2 or 4 weeks. We do 2. That means we plan for 2 weeks, the team works for 2 weeks, and then we end the sprint with regression tests, release preparations, the sprint review, a final sign-off, and the release. All engineering activities are set up around this. Now we want to release more often – continuously.

There are a lot of good reasons to deliver continuously. Robert Johnson of Facebook gave one of my favorite pep talks on this a while ago with “Facebook: Moving Fast At Scale”.

My requirement here is to stick to sprints of 2 weeks. I worked through a few alternatives, e.g. reducing the sprint length to one week. That would give us twice as many releases, but it’s still nowhere near Facebook and others, and it also wouldn’t solve the problem I wanted to solve, since nothing fundamental would change. Another option that kept coming up was Kanban, mostly because work there flows continuously, without being boxed into sprints.

In the following I walk through what needs to change to make continuous delivery work with Scrum and fixed-length sprints.

Sprints vs. releases or: what are sprints for?

The main reasons for doing short sprints instead of long-term planning are the ability to respond to change and simply accepting the fact that long-term plans don’t work out anyway.

The reason we ended up releasing after every sprint is that a) releases always create a certain overhead, so it seems to make sense to batch up work and go through that overhead only once, and b) it fits traditional project management thinking: once the planned work is done, it is signed off and released.

But just as we broke work down from a 12-month project into one sprint at a time during our transition to agile, it seems reasonable to break things down further: from batching up a release every 2 weeks into very small, continuous releases.

Sprint review and final sign off

If releases are done every 2 weeks after a sprint is finished, it’s easy to combine the sprint review and the final sign-off. Again, a way of batching things up. But batching up the sign-off doesn’t actually save much time, so we might as well do a quick sign-off after every story is completed. This has advantages anyway, because a story is only truly done once it’s signed off and released. So the change that needs to happen is to de-couple the sign-off of each story from the sprint review at the end of the sprint.

What remains is that the sprint review is an opportunity for the team to brag about the work they’ve done, to get stakeholders involved and updated, and to make sure that the actual progress becomes visible and agreed upon. It also has the advantage of showing the current progress in a live environment, because by the time of the sprint review, each story that is done is already running in production.

From a process point of view, the team would now be able to release continuously throughout the sprint. Now let’s get back to that “overhead” I talked about. The technical challenges need to be addressed, otherwise we’ll spend more time on the actual release than on development.

No junk in the trunk and code freeze

When releases happen every 2 weeks, there’s always a bit of a touchdown period at the end. Unfinished code gets finished, the last tickets are verified, and time is spent making sure that the Master is clean and ready to go. This often allows for a certain degree of sloppiness with the Master throughout the sprint. It’s not very critical if unfinished work ends up on the Master, because there’s always enough time to fix it and clean it up. This needs to change if releases are to happen “whenever we feel like it”.

First of all: No junk in the trunk! Master must always contain work that is finished and built to production quality. There are different branching strategies out there, and they are well documented. In this context they mostly come down to adding two additional areas to the repository:

  1. Where work in progress happens. I recommend a separate feature branch for each story. Work remains solely on this branch until very high confidence is reached that a feature works.
  2. Where work gets integrated (but not yet pushed to Master). This is where finished stories are integrated with the latest Master – but outside of Master. Here, the remaining issues are caught and regression tests on related existing features are done. This should increase the confidence that a new story works and doesn’t break anything else to 100%. Then it goes into Master.

In an ideal world, this would allow us to get rid of good ol’ Code Freeze altogether. The different branches, the quality gates on each branch, and the integration down towards Master do exactly what a sprint-end code freeze does: make sure Master is clean and ready for a release.
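To make that flow a bit more concrete, here is a minimal sketch in Python that simply wraps the plain git commands behind such a strategy. The branch names (feature/…, integration) and the test script are just illustrative assumptions – your repository layout and quality gates will look different:

```python
import subprocess


def run(*cmd):
    """Run a shell command and stop immediately if it fails."""
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)


def promote_story(feature_branch, test_cmd=("./run_regression_tests.sh",)):
    """Merge a finished story down towards Master, gated by tests.

    Branch names and the test script are placeholders for illustration.
    """
    # 1. Integrate the finished story with the latest Master – but outside of Master.
    run("git", "checkout", "integration")
    run("git", "merge", "origin/master")
    run("git", "merge", "--no-ff", feature_branch)

    # 2. Quality gate: regression tests run on the integration branch.
    run(*test_cmd)

    # 3. Only now does the story go into Master.
    run("git", "checkout", "master")
    run("git", "merge", "--no-ff", "integration")
    run("git", "push", "origin", "master")


# Example: promote_story("feature/signup-form")
```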

Automate your testing

If each story should end up on a Master that is ready to release, then regression tests must be done for each ticket. Otherwise it’s hard to ensure that Master is really ready and that recently added work doesn’t break anything. That’s where a lack of test automation really starts to hurt.

There are zillions of articles and books out there about this, so I’ll keep it short. The essence: do it from the very beginning if you can. If it’s too late for that, invest the effort and get your regression tests automated, as close to 100% as you can.

This will decrease the overhead related to regression tests to as close to zero as it gets.

Above I mentioned “quality gates”. These are all the different checks and tests a revision of your software must go through before it’s Done. Depending on your system they may consist of building your software, running unit tests, running static code analysis, and running regression, UI and load tests, plus maybe some – hopefully not too many – manual steps. With many CI servers, like Jenkins, they can be automated and arranged into build pipelines. Such a pipeline runs them consecutively on a certain revision, and only if it runs through to the end without failing anywhere do you get a green light on that revision. I recommend using build pipelines.
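As an illustration of what such a pipeline does, here is a minimal sketch in Python. The individual stage commands are made-up placeholders; in practice they would live in your Jenkins (or other CI server) pipeline configuration rather than in a script like this:

```python
import subprocess

# Placeholder stages; in practice these live in your CI server's pipeline config.
STAGES = [
    ("build",            ["./build.sh"]),
    ("unit tests",       ["./run_unit_tests.sh"]),
    ("static analysis",  ["./run_static_analysis.sh"]),
    ("regression tests", ["./run_regression_tests.sh"]),
    ("UI tests",         ["./run_ui_tests.sh"]),
]


def run_pipeline(revision):
    """Run all quality gates consecutively on one revision.

    The pipeline stops at the first failing stage; only if every stage
    passes do we get a green light on this revision.
    """
    subprocess.run(["git", "checkout", revision], check=True)
    for name, cmd in STAGES:
        print(f"[{revision}] running stage: {name}")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"[{revision}] RED: stage '{name}' failed")
            return False
    print(f"[{revision}] GREEN: all quality gates passed")
    return True
```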

Automate your deployments

Regression tests and deployments have a lot in common: if you only do them rarely, manual steps usually don’t hurt enough to justify automating them to near 100%. Now that we’re about to release very, very often, this starts to hurt (= creates overhead). Releases must be as lightweight, fast, and robust as they can get. If even 1 out of 10 releases causes trouble or fails, it’s hard to get the team confident enough to release frequently.

I recommend using a CI server, reducing deployments to a few clicks, and adding a suite of tests against your live servers to the deployment script, so that each deployment is tested right after it’s done and fails or succeeds immediately.
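Here is a minimal sketch of what I mean, assuming a hypothetical deploy script and example health-check URLs. The point is only that the smoke tests run immediately after the deployment and fail loudly:

```python
import subprocess
import sys
import urllib.request

# Both the deploy command and the URLs are hypothetical examples.
DEPLOY_CMD = ["./deploy_to_production.sh"]
SMOKE_TEST_URLS = [
    "https://www.example.com/health",
    "https://www.example.com/api/status",
]


def deploy_and_verify():
    # Step 1: the actual deployment, reduced to a single scripted command.
    subprocess.run(DEPLOY_CMD, check=True)

    # Step 2: smoke tests against the live servers, right after the deployment.
    for url in SMOKE_TEST_URLS:
        with urllib.request.urlopen(url, timeout=10) as response:
            if response.status != 200:
                print(f"FAILED: {url} returned {response.status}")
                sys.exit(1)
        print(f"OK: {url}")
    print("Deployment verified.")


if __name__ == "__main__":
    deploy_and_verify()
```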

I also recommend putting deployments as much as possible into the hands of the engineers who are writing the code, or at least bringing a member of the operations team into the Scrum team. This removes additional hand-overs and keeps the team from being blocked by others. Depending on the environment this can be difficult to achieve, but automating deployments down to a few clicks that never fail certainly helps a lot.

Changing your definition of done

If all of the above works and releases indeed start to happen regularly, even after every story is done, it’s time to change the Definition of Done: both the final sign-off and the release should be in there.

Up your skills!

A transformation to continuous delivery is a big step forward for a team. It requires a variety of skills to cope with all the technical and non-technical challenges. Being intentional about improving the team’s skills and being willing to spend time and money on this will definitely help. You can also learn along the way and learn from mistakes, but proactively building better skills will help the team make fewer mistakes, move faster, and gain confidence both within the team and outside of it.

Dealing with release problems

A live user environment is, obviously, much more likely to break during a release. Hence a very common concern is that more frequent releases will also introduce more frequent problems and actually increase the total amount of work needed to deal with and fix them all.

But there are also advantages: continuous releases consist of much smaller change sets, which makes them a whole lot easier to regression test and to release. And if something breaks, it’s also a whole lot faster to spot, understand and fix the problem and release a hotfix. Not to mention that continuous releases often happen within a day or so after the work is finished, so engineers likely haven’t forgotten the details of a change.

There are more ways to help here, e.g. the ability to release only to a small subset of users, observe, and then roll out to all users afterwards. Improvements like this should be evaluated.
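A common building block for such a gradual rollout is a deterministic percentage bucket per user. Here is a minimal sketch; the feature name and rollout percentage are made-up examples:

```python
import hashlib


def in_rollout(user_id: str, feature: str, percentage: int) -> bool:
    """Deterministically decide whether a user sees the new release/feature.

    Hashing user id + feature name gives every user a stable bucket from 0-99,
    so the same users stay in the rollout as the percentage is increased.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percentage


# Example: expose a new checkout flow to 5% of users first,
# observe, then raise the percentage towards 100.
print(in_rollout("user-42", "new-checkout-flow", 5))
```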

Conclusion

Here’s a summary of what I’ve been working through on making continuous releases work with Scrum and fixed-length sprints:

  1. Decide to de-couple sprints from releases. Sprints are for planning, releases are just one more piece of getting work “done”.
  2. Take your final sign-off out of the sprint review (if that’s where it has been) and move it to the end of each story. Add this to your Definition of Done.
  3. Choose and implement the right branching strategy. Introduce feature branches and an integration branch, and make sure work is properly tested before it leaves a branch. Merge down to Master only when things are really working, and keep junk out of the trunk.
  4. Automate your testing as much as possible. Besides unit test coverage, implement as many automated tests as you need to gain the team’s and the product owner’s confidence, e.g. UI tests, load tests or even automated static code analysis. Use a CI server like Jenkins to tie all your automation together and make use of build pipelines.
  5. Automate your deployments down to a few clicks. Bring the ability to execute the release into the Scrum team to avoid hand-overs and the team being blocked on the release.
  6. Release whenever you feel like it, be happy, and add value for your live users quicker than ever.

What is your opinion about this? Are you doing the same? Did you face the same issues? Did you solve them in a similar way? How are you releasing your software when using Scrum and fixed-length sprints? I would be happy if you take a minute and leave a comment below.

A/B Testing on iOS and Android (and others)

A/B testing (a.k.a. split testing) has been around on the Web for a long time. The reason to do it is simple: much of how you design your user interface and your user experience is based on assumptions, and no matter how good you think your assumptions are, you don’t know what works better in reality. And let’s be honest, most assumptions aren’t thought through carefully in the first place. After all, nothing can tell you better than real data from real users.

You’re already measuring what users are doing? That’s not quite enough yet. The reason it’s “A/B testing” and not just “A testing plus analytics” is simple: numbers alone don’t always say much (e.g. vanity metrics). Comparing 2 numbers, on the other hand, is easy and reliable, and even if neither of the 2 is really good, at least you know which one is the lesser of 2 evils and the one you should continue with (read: continue with and iterate again, fast).
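To make that “comparing 2 numbers” part concrete, here is a minimal sketch in Python comparing the conversion rates of two variants; the counts are made-up example data:

```python
# Made-up example data: (conversions, total users) per variant.
variants = {
    "A": (120, 2400),   # existing design
    "B": (156, 2350),   # new design
}

for name, (conversions, users) in variants.items():
    rate = conversions / users
    print(f"Variant {name}: {conversions}/{users} = {rate:.2%}")

best = max(variants, key=lambda v: variants[v][0] / variants[v][1])
print(f"Continue with variant {best} (and iterate again, fast).")

# In practice you'd also check statistical significance (e.g. a two-proportion
# z-test or a chi-squared test) before declaring a winner.
```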

One presentation that caught my interest early on was Marissa Mayer’s keynote at Google I/O in 2008. She gives insight into how far Google actually goes to fine-tune and optimize their user interfaces using A/B testing. That way, Google found proof for details that would usually be decided by a design or UX specialist, even down to the “right” whitespace between different parts of the page on google.com:

(…) you can test interfaces and be able to tell in a mathematical way which one works better for your users.

Fast forward to today. A/B testing is old news for websites, but it’s still only just beginning for native mobile apps. It’s easy to come up with reasons for that, like:

  • Native apps get bundled as binaries; no code runs on your own servers where you can easily control it, so implementing different behaviors of the same feature is arguably more difficult.
  • Release cycles are different. New versions still get “shipped” (just like software in the old days) and you don’t want to send your users through this every other day or so.
  • And even if you wanted to, on iOS it’s Apple who prevents you from doing so, with a 1 to 2 week approval time.
  • It’s just relatively new: there isn’t much written about it, there aren’t many tools and frameworks, and there isn’t much of a best practice or universally established mindset around it yet.
  • And more…

The bright side: it’s changing now.

Tools and frameworks are appearing and making this whole story a whole lot easier. Some of the ones that look great are clutch.io, arise.io or Pathmapp. And even Amazon has joined in with its own A/B testing service.

I’m excited to give these tools a shot and see what they can do. We’ve recently been accepted into Pathmapp’s beta program and I’m thrilled to see how well it works. I’m going to follow up with my experiences in later posts. In the meantime I’d be more than happy to hear about your experiences and recommendations.

A New Attempt

It’s been 492 days since I published my last blog post. Recently I’ve become motivated to do more blogging again. I moved away from hosting the blog myself at hendrikbeck.com and I’ve cleaned up and renewed everything a little bit.

One more change: I always wanted to keep my blog dedicated to articles related to my work and use other resources (like hendrikbeck.tumblr.com) for personal posts. Now I think that’s just BS, given that I don’t blog every day. So I’m just gonna combine it somehow and hope that I keep up with it this time and don’t stop again after some months. Not sure how much faith I’ve got in myself… ;)