Building the Right Product with Hypothesis-Driven Development

In my previous post about Making Continuous Delivery work with Scrum and Sprints I wrote about how to shorten release cycles significantly by changing your process and adding the necessary amount of test and release automation.

A comment challenged that by basically saying “Well, this might help you build your product right (and in shorter cycles), but building the right product is a whole different question. And maybe the more important one.” Hard to disagree.

I wanted to dig deeper. These days you can’t go wrong by starting in the vicinity of Lean Startup if you’re looking for how to build the right product efficiently. As an engineer I’m familiar with a lot of X-driven development techniques, but then I came across one I hadn’t heard about before: Hypothesis-Driven Development.

The basic idea is simple:

  • Instead of requirements, you formulate assumptions, or hypotheses
  • At the same time you define a measurable signal that will tell you whether you were right or wrong in a reasonably short amount of time

This sounds like a great starting point for a structured approach that factors the question of the right product into your development.
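To make the two bullet points above a bit more tangible, here is a minimal sketch of how a hypothesis and its signal could be written down together. The class name, the field names, and the example values are my own assumptions for illustration; they don't come from any particular tool or from the presentation mentioned in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """One product assumption plus the signal that confirms or refutes it."""
    statement: str       # what we believe, phrased so that it can be wrong
    metric: str          # name of the measurable signal
    threshold: float     # minimum value that counts as "we were right"
    deadline_days: int   # how long we wait before judging

    def evaluate(self, observed: float, days_elapsed: int) -> str:
        """Return 'confirmed', 'refuted', or 'pending' for an observation."""
        if observed >= self.threshold:
            return "confirmed"
        if days_elapsed >= self.deadline_days:
            return "refuted"
        return "pending"


# Hypothetical example values, not taken from a real product.
signup = Hypothesis(
    statement="A shorter sign-up form increases completed registrations",
    metric="signup_completion_rate",
    threshold=0.40,
    deadline_days=14,
)
```

The point of the deadline is the "reasonably short amount of time": a hypothesis that can stay pending forever doesn't tell you anything.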

But of course building the right product and building the product right aren’t mutually exclusive. Nor would I say one is more important than the other. They both are. Where hypothesis-driven development guides you to make sure you’re being intentional about your assumptions and the need to test them, good old-fashioned engineering techniques like test-driven development and test automation make sure you’re implementing your hypotheses right. Without being able to successfully (bug-free and all) deliver an increment of your software that aims at testing an assumption, you’re not going to get the right answers either.

The article I stumbled upon also links to a great presentation about Replacing Requirements with Hypotheses.

Making Continuous Delivery work with Scrum and Sprints

Scrum promotes fixed-length sprints of 1, 2 or 4 weeks. We do 2. That means we plan for 2 weeks, the team works for 2 weeks, and then we end the sprint with regression tests, release preparations, sprint review, a final sign-off, and the release. All engineering activities are set up around this. Now we want to release more often – continuously.

There are a lot of good reasons to deliver continuously. Robert Johnson of Facebook gave one of my favorite pep talks about this a while ago with “Facebook: Moving Fast At Scale”.

My requirement here is to stick to sprints of 2 weeks. I worked through a few alternatives, e.g. reducing the sprint length to one week instead. This would give us twice as many releases but that’s still nowhere near Facebook and others, and also wouldn’t solve the problem I wanted to solve since nothing would need to change. Another option that has been coming back regularly was Kanban, mostly because work flows continuously, without being boxed into sprints.

In the following I’m walking through what needs to change in order to make continuous delivery work with Scrum and fixed-length sprints.

Sprints vs. releases or: what are sprints for?

The main reason for doing short sprints over long-term planning is the ability to respond to change, and simply accepting the fact that long-term plans don’t work out anyway.

We ended up releasing after every sprint for two reasons: a) releases always create a certain overhead, so it seems to make sense to batch up work and go through the overhead only once, and b) it fits traditional project management thinking: once the planned work is done, it gets signed off and released.

But in the same way as we broke down work from a 12 months project into one sprint at a time during our transition to agile, it sounds reasonable to break things down further from batching up a release every 2 weeks into very small continuous releases.

Sprint review and final sign off

If releases are done every 2 weeks after a sprint is finished, it’s easy to combine sprint review and final sign-off. Again, a way of batching up things. But it doesn’t actually save a lot of time to batch up a sign-off, so we could just as well do a quick sign-off after every story is completed. This has advantages anyway, because a story is only truly done once it’s signed off and released. So the change that needs to happen is to de-couple the sign-off of each story from the sprint review at the end of the sprint.

What remains is that the sprint review is an opportunity for the team to brag about the work they’ve done, to get stakeholders involved and updated, and to make sure that the actual progress becomes visible and agreed upon. It’s also an advantage to show the current progress in a live environment, because by the time of the sprint review, each story that is done is already running in production.

From a process point of view, the team would now be able to release continuously throughout the sprint. Now let’s get back to that “overhead” I talked about. The technical challenges need to be addressed, otherwise we’ll spend more time on the actual release than on development.

No junk in the trunk and code freeze

When releases are happening every 2 weeks, there’s always a bit of a touch down period at the end. Unfinished code is being finished, the last tickets are verified and time is spent to make sure that the Master is clean and ready to go. This often allows for a certain degree of sloppiness with the Master throughout the sprint. It’s not very critical if unfinished work ends up on the Master because there’s always enough time to fix it and clean it up. This needs to change if releases should happen “whenever we feel like it”.

First of all: No junk in the trunk! Master must always contain work that is finished and built to production quality. There are different, well-documented branching strategies out there. In this context they mostly come down to adding two additional areas in the repository:

  1. Where work in progress happens. I recommend a separate feature branch for each story. Work remains solely on this branch until very high confidence is reached that a feature works.
  2. Where work gets integrated (but not yet pushed to Master). This is where finished stories are integrated with the latest Master – but outside of Master. Here, the remaining issues are caught and regression tests on related existing features are done. This should increase the confidence that a new story works and doesn’t break anything else to 100%. Then it goes into Master.

In an ideal world, this would allow us to get rid of good ol’ Code Freeze altogether. The different branches, the quality gates on each branch, and the integration down towards Master do exactly what a sprint-end code freeze does: make sure Master is clean and ready for a release.
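The flow described above, from feature branch to integration area to Master, can be sketched as a tiny state machine. The branch names and the `promote` function are illustrative assumptions, not part of Git or of any branching tool.

```python
# Order in which a story travels towards a release (names are illustrative).
BRANCH_FLOW = ["feature", "integration", "master"]


def promote(current: str, gate_passed: bool) -> str:
    """Return the branch a story sits on after its quality gate has run.

    The story moves one step towards master only if the gate on its
    current branch passed; otherwise it stays where it is.
    """
    position = BRANCH_FLOW.index(current)
    if not gate_passed or position == len(BRANCH_FLOW) - 1:
        return current
    return BRANCH_FLOW[position + 1]
```

A story with a failing gate on the integration branch simply stays there, which is the whole idea: nothing unfinished can reach Master.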

Automate your testing

If each story should end up on a Master that is ready to release, then regression tests must be done for each ticket. Otherwise it’s hard to ensure that Master is really ready and recently added work doesn’t break anything. That’s where lack of test automation really does start to hurt.

There are zillions of articles and books out there about this, so I’ll keep it short. The essence is: do it from the very beginning if you can. If it’s too late for that, invest some effort and get your regression tests automated, as close to 100% as you can.

This will decrease the overhead related to regression tests to as close to 0 as it gets.

Above I mentioned “quality gates”. These are all the different checks and tests a revision of your software must go through before it’s Done. Depending on your system they may consist of building your software, running unit tests, running static code analysis checks, running regression, UI and load tests and maybe some – hopefully not too many – manual steps. With many CI servers like Jenkins they can be automated and arranged in build pipelines. Such a pipeline runs them consecutively on a certain revision and only if it runs through until the end without failing in between, you’ve got a green light on this revision. I recommend using build pipelines.
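To illustrate what "runs them consecutively and only gives a green light if nothing fails in between" means, here is a minimal sketch in plain Python. It is not how Jenkins is configured; the gate names and the stand-in checks are assumptions made up for this example.

```python
from typing import Callable, List, Optional, Tuple

# A gate is a name plus a check that returns True when the revision passes.
Gate = Tuple[str, Callable[[str], bool]]


def run_pipeline(revision: str, gates: List[Gate]) -> Optional[str]:
    """Run gates in order and return the name of the first failing gate.

    Returns None when every gate passed, i.e. the revision is green.
    """
    for name, check in gates:
        if not check(revision):
            return name  # stop here, later gates never run
    return None


# Stand-in checks; real ones would call the build tool, test runner, etc.
example_gates: List[Gate] = [
    ("build", lambda rev: True),
    ("unit tests", lambda rev: True),
    ("static analysis", lambda rev: rev != "bad-rev"),
    ("regression tests", lambda rev: True),
]
```

Ordering the cheap gates first is a deliberate choice: a broken build is reported within minutes instead of after a long regression run.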

Automate your deployments

Regression tests and deployments have a lot in common: if you only do them rarely, manual steps usually don’t hurt enough to automate them to near 100%. Now that we’re about to release very, very often, this starts to hurt (= creates overhead). Releases must be as lightweight, fast, and robust as they get. If even 1 out of 10 releases causes trouble or fails, it’s hard to get the team confident enough to release frequently.

I recommend using a CI server, reducing deployments to a few clicks, and adding a suite of tests against your live servers to the script, so that it tests the current deployment right after it’s done and fails or succeeds immediately.

I also recommend putting deployments as much as possible into the hands of the engineers who are writing the code, or at least bringing a member of the operations team into the Scrum team. This removes additional hand-overs and keeps the team from being blocked by others. Depending on the environment this is often difficult to achieve, but automating deployments down to a few clicks that never fail certainly helps a lot.
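A suite of tests against the live servers can start out very small. The following is a sketch of such a post-deployment smoke test, assuming that a plain HTTP 200 on a handful of paths is a good enough first signal; the paths and function names are hypothetical and need to be replaced with whatever matters for your system.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Sequence

# Hypothetical paths; replace with endpoints that matter for your system.
SMOKE_PATHS = ("/health", "/login", "/api/version")


def smoke_test(base_url: str,
               fetch_status: Callable[[str], int],
               paths: Sequence[str] = SMOKE_PATHS) -> Dict[str, int]:
    """Request each path and return the ones that did not answer 200."""
    failures = {}
    for path in paths:
        status = fetch_status(base_url.rstrip("/") + path)
        if status != 200:
            failures[path] = status
    return failures


def http_status(url: str) -> int:
    """Fetch a URL with the standard library and return its status code."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # server answered, but with an error status
    except urllib.error.URLError:
        return 0  # server not reachable at all
```

In a deployment script you would call `smoke_test(live_url, http_status)` as the last step and let the job fail if the returned dictionary isn't empty. Passing the fetcher in as a parameter keeps the check itself testable without a network.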

Changing your definition of done

If all of the above works and releases indeed start to happen regularly and even after every story is done, it’s time to change the Definition of Done. Both the final sign-off and the release should be in there.

Up your skills!

A transformation to continuous delivery is a big step forward for a team. It requires a variety of skills to cope with all the technical and non-technical challenges. Being intentional about improving the team’s skills and being willing to spend time and money on this will definitely help. You can also learn along the way and learn from mistakes, but focusing on better skills pro-actively will help you make fewer mistakes, move faster, and gain confidence within the team and outside.

Dealing with release problems

A live environment is, obviously, much more likely to break during a release. Hence a very common concern is that more frequent releases will also introduce more frequent problems and actually increase the total amount of work necessary to deal with and fix all the problems.

But there are also advantages: continuous releases consist of much smaller change sets. Hence it’s a whole lot easier to regression test them and to release them. And if something breaks, it’s also a whole lot faster to spot, understand and fix the problem and release a hotfix. Not to mention that continuous releases often happen within a day or so after the work is finished, so it’s likely engineers haven’t forgotten all about the details of a change.

There are more ways of helping out with this, e.g. the ability to release only to a small sub-set of users, observe, and then roll out to all users afterwards. Improvements like this should be evaluated.
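Releasing to a small sub-set of users first can be as simple as putting every user into a stable bucket. This is a minimal sketch under the assumption that users have a stable ID; the function name and the hashing scheme are my own illustration, not a reference to a specific feature-flag product.

```python
import hashlib


def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Decide deterministically whether a user sees a new release.

    The user and feature name are hashed into a bucket from 0 to 99;
    the user is included when the bucket is below `percent`.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent
```

Because the bucket only depends on the user and the feature, raising `percent` from 5 to 50 keeps the first group in and only adds new users, so nobody flips back and forth between versions during the roll-out.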


Here’s a summary of what I’ve been working through on making continuous releases work with Scrum and fixed-length sprints:

  1. Decide to de-couple sprints from releases. Sprints are for planning, releases are just one more piece of getting work “done”.
  2. Move your final sign-off out of the sprint review (if that has been the case) and move it to the end of each story. Add this to your Definition of Done.
  3. Choose and implement the right branching strategy. Introduce feature branches and an integration branch and make sure work is being properly tested when it leaves a branch. Move only down to Master when things are really working and keep junk out of the trunk.
  4. Automate your testing as much as possible. Besides unit test coverage, implement as many automated tests as you need in order to gain the team’s and the product owner’s confidence, e.g. UI tests, load tests or even automated static code analysis. Use a CI server like Jenkins to tie all your automation together and make use of build pipelines.
  5. Automate your deployments down to a few clicks. Get the ability to execute the release into the Scrum team to avoid hand-overs and the team being blocked on the release.
  6. Release whenever you feel like it, be happy, and add value for your live users quicker than ever.

What is your opinion about this? Are you doing the same? Did you face the same issues? Did you solve them in a similar way? How are you releasing your software when using Scrum and fixed-length sprints? I would be happy if you take a minute and leave a comment below.

Eating your own dog food

Early feedback is important. The earlier in the life cycle of development feedback comes in, the faster you can iterate, figure out what is working and what is not working, improve, and iterate again. You should release early and release often.

Releasing early and often usually aims at release cycles of something like 2 weeks. Depending on your kind of system, this can be shorter, but especially for native apps, much shorter release cycles aren’t really feasible. An even quicker way to get feedback is to put your software into the hands of your own colleagues and selected testers – constantly. Within your own organization, nobody prevents you from releasing continuously, as often as multiple times a day, without the overhead of an official release. You can then take the feedback of your own peers to iterate even faster. In modern tech slang this has become known as “eating your own dog food”.

Here at Klamr we try to get ongoing development into the hands of all our colleagues as fast as possible. The key to doing this is Continuous Integration; that’s where everything ties in. Here’s how we do it:

  1. Jenkins: We’re using Jenkins as our continuous integration server and use it to automate most of our tasks. For each project there is a Jenkins job that pulls the latest code regularly, builds it, tests it, and then distributes it. Jenkins is amazingly easy to set up and configure, yet incredibly flexible and powerful. Ever since we started using it, it has grown with us into dozens of very different jobs for pretty much every project we’re working on.
  2. GIT branching strategy: while working on new features we need to decide when exactly changes should be made available internally. The general requirements are never to break builds altogether and not to break core functionality. We don’t pull every single change that is made anywhere in the project. We hook our Jenkins job into our GIT branching strategy to give the responsibility to decide which change is ready to our engineers. They have control over it by pulling changes into certain branches when they are ready.
  3. Schedule your distribution: depending on the project we either distribute immediately on every new change, or nightly. This is configured in Jenkins. My personal rule of thumb is: the more transparent new versions are for your (internal) users, the quicker and easier distributions/deployments are, and the less frequent commits to your distribution branch are, the better it is to distribute changes immediately. When starting a new project, I generally start with this. Once problems appear that can be solved by slowing down, go to nightly distributions. Everything running server-side, like a web app, is completely transparent for users (just as it is in your production environment), so new versions aren’t disrupting anybody. That’s a good candidate for very frequent distributions. An iOS application, on the other hand, needs to be installed manually, hence pushing out 20 new versions every day tends to be disruptive for everybody. The last thing we want to do is make our co-workers feel disrupted and annoyed; that just leads to less and worse feedback.
  4. Distribute: the actual deliveries are all automated, but differ quite a bit depending on the type of software. Some examples of what we do:
    • Backend application: this gets built and deployed to internal servers. This is the most complex deployment process we’ve got; things like database migrations in particular don’t make it exactly trivial.
    • Web application: our web application is deployed on every new change to an internal, protected web server. It is then connected to our live database, so everybody in the company can use this web application instead of our live production web application. Changes have sometimes been finished for only a few minutes before they become available here.
    • Android: our Android app is distributed in two ways: new APK files are sent out directly via email (Android makes installing new APKs directly from email attachments so much easier than iOS) and via the service Appaloosa Store. The latter has some nice advantages like providing a custom store app and push notifications for new versions.
    • iOS: our iOS app is distributed via Testflight. There are a few catches for iOS, for example that you need to build on a machine running Mac OS. That’s why we have a separate Jenkins instance only for building the iOS app. Most other Jenkins jobs are running on one Linux-based instance hosted on Amazon EC2. Also, devices must be explicitly registered in your ad-hoc provisioning and Apple restricts the number of internal devices to 100. No rocket science once it’s all set up, but a few extra hoops to jump through.
  5. Real data: It’s important to allow internal users to use these early builds against their real Production data. Our web application, for example, runs on an internal URL, but is configured against our Production servers and database. This allows us to test drive new features early on with our real accounts. This leads to much better feedback than asking people to test features on isolated servers with fake data and helped a lot with internal acceptance.
  6. Automate: the key to all of this is automation. If it’s not automated, regular distribution either doesn’t happen, or it wastes valuable engineering time. And as mentioned already above, this all ties into continuous integration. Much of the process and infrastructure described above should be in place anyways to continuously build and test your software in an automated way.
  7. Release notes: for us it proved incredibly helpful to automate release notes for each internal distribution. Remember that one of the main reasons to do all this in the first place is to get early feedback. Without release notes, it’s not possible for anybody to know what has changed and which part of your apps to pay attention to. We’re not doing this everywhere, but where we do, we’re using GIT commit comments. They aren’t suitable for end users, but they are more than good enough for internal users.
  8. Respect: although these builds are only internal, we highly respect them. This means we try never to break them (see above), we try to make using and updating them as easy as possible for our co-workers, and our engineers react quickly to any kind of feedback that comes in.
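As an illustration of the release notes step in the list above, here is a minimal sketch of how commit comments could be turned into notes for an internal build. It is not our actual Jenkins job; the function names and the output layout are assumptions. The only external command used is `git log` with its `--no-merges` and `--pretty=format:%s` options.

```python
import subprocess
from typing import List


def commit_subjects(since_ref: str, until_ref: str = "HEAD") -> List[str]:
    """Return commit subject lines between two refs using `git log`."""
    output = subprocess.run(
        ["git", "log", "--no-merges", "--pretty=format:%s",
         f"{since_ref}..{until_ref}"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in output.splitlines() if line.strip()]


def format_release_notes(version: str, subjects: List[str]) -> str:
    """Turn commit subjects into a plain-text list for internal users."""
    if not subjects:
        return f"Build {version}\n(no changes)"
    lines = [f"Build {version}"] + [f"- {s.strip()}" for s in subjects]
    return "\n".join(lines)
```

A build job could call `format_release_notes(build_number, commit_subjects(last_distributed_ref))` and attach the result to the email or upload that goes out with the build, where `last_distributed_ref` is whatever tag or commit you recorded for the previous distribution.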

Regular internal distribution helps us to keep the feedback cycle as short as possible, sometimes even down to minutes. Automation of all the tasks involved helps us to keep moving fast, even as the number of systems and their complexity grows. I would highly recommend trying to automate as much as possible right from the start.

Are you eating your own dog food? What is your experience with this? Are you using different techniques and tools? Leave a comment, I’m very interested to hear what you’re doing.