DIY dark-launching feature toggle in 16 lines of Ruby

Dark launching and soft launching functionality are important ingredients for continuous shipping. By now, even two days of code changes piling up feels like too much to me. Often enough, changes pile up because they aren’t complete enough yet to go into production, let alone be shown to anyone.

Approaches like dark launches and soft roll-outs solve this, but they often require code and tooling changes that get put off until very late. That’s quite a bummer, since dark launching is so helpful in moving fast, from both a technical and a product management point of view.

  • Developers are happy when they can get code into Production (“nice, it works, check!”)
  • Product Managers are happy when they can get early feedback, at least from a few select customers (and without having to sacrifice anything by publicly releasing something unfinished)
  • (The right group of) Users are happy if they get early access to features

Sounds like reason enough to stop NOT doing it right away, huh? ;)

I was in this very situation on a relatively new project, with a million things to do, so I decided to hack a DIY solution together and see where that would get me. Turns out it only takes a few lines of actual code and is already much, much better than nothing. Here’s the premise:

  • I just wanted to be able to hide access to a feature, i.e. hide the link in the nav bar that leads users to it (yeah, I know, but it’s ok in this case, and I bet in a lot of cases it is)
  • We’re on Heroku so I thought Heroku config vars would be a great way to control it (no code deploy necessary, but also no overhead with databases and backend access etc., Heroku already provides everything)
  • Toggling only on a per-user basis (we were THAT small, yes), no other fanciness like user groups, geographic distribution, load balancing or whatever (yet)

And here’s the code that made it work:

Quick run-down:

  • I called the class DarkLaunch, more because I liked the sound of it than because the name is correct ;)
  • It has a single feature toggle method that can be used to surround links etc. a la “if DarkLaunch.feature_visible(…)”. It returns true whenever a particular user should see the feature at that moment, and false otherwise
  • It always returns false if there’s no current_user (since we’re toggling on a per-user basis)
  • It always returns true for Development and Test (which leads to other problems but for the moment I liked it to have everything visible on Dev)
  • Each feature gets an identifier (like UPLOAD_PHOTOS) that is passed when calling feature_visible()
  • It expects a Heroku config var named after the feature, e.g. FEATURE_UPLOAD_PHOTOS
  • This config var is expected to contain a comma-separated list of the user IDs that should have access to the feature
  • feature_visible returns true if the ID of the given user is in that list
  • Or once we’re ready to make a feature public to everybody, we can just set the variable to “PUBLIC”. In that case it always returns true, without checking user IDs anymore
  • And otherwise it returns false, blocking the feature for everybody else
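
The run-down above can be sketched roughly as follows. This is a minimal reconstruction from the bullet points, not the original 16 lines. The `env_name:` and `config:` keyword arguments are assumptions added so the sketch runs outside of Rails and Heroku; in a Rails app they would probably just be `Rails.env` and `ENV`. The order of the first two checks (missing user before environment) is also a guess.

```ruby
# Hypothetical reconstruction of the toggle described in the run-down.
class DarkLaunch
  PUBLIC = "PUBLIC"

  # feature  - identifier such as :UPLOAD_PHOTOS
  # user     - any object responding to #id, or nil if nobody is logged in
  # env_name - assumption: stands in for Rails.env
  # config   - assumption: anything Hash-like; defaults to the process ENV,
  #            which is where Heroku config vars show up
  def self.feature_visible(feature, user,
                           env_name: ENV["RAILS_ENV"] || ENV["RACK_ENV"] || "production",
                           config: ENV)
    # Toggling is per user, so without a user there is nothing to show.
    return false if user.nil?
    # Everything is visible in Development and Test.
    return true if %w[development test].include?(env_name.to_s)

    value = config["FEATURE_#{feature}"].to_s.strip
    # "PUBLIC" opens the feature to everybody.
    return true if value == PUBLIC

    # Otherwise only the listed user IDs get access.
    value.split(",").map(&:strip).include?(user.id.to_s)
  end
end
```

With something like this in place, granting access to users 12 and 57 would be a matter of running `heroku config:set FEATURE_UPLOAD_PHOTOS=12,57`, and going public would mean setting the same variable to `PUBLIC`.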

Usage is then dead simple, as long as “launching” is as simple as showing something on the UI or hiding it:
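
As a rough illustration of that kind of usage, here is a sketch that wraps a nav bar link in the toggle check. The template, the `/photos` paths and the `render_nav` helper are made up for the example, and the tiny `DarkLaunch` stand-in is only there so the snippet runs on its own; it skips the Development/Test shortcut.

```ruby
require "erb"

# Stand-in toggle so the snippet is self-contained (hypothetical, simplified).
class DarkLaunch
  def self.feature_visible(feature, user)
    return false if user.nil?
    value = ENV["FEATURE_#{feature}"].to_s.strip
    value == "PUBLIC" || value.split(",").map(&:strip).include?(user.id.to_s)
  end
end

# Hypothetical nav bar: the upload link is only rendered for users who are
# allowed to see the feature.
NAV_TEMPLATE = ERB.new(<<~HTML, trim_mode: "-")
  <ul class="nav">
    <li><a href="/photos">Photos</a></li>
  <%- if DarkLaunch.feature_visible(:UPLOAD_PHOTOS, current_user) -%>
    <li><a href="/photos/upload">Upload photos</a></li>
  <%- end -%>
  </ul>
HTML

# Renders the nav bar for the given user (nil when nobody is logged in).
def render_nav(current_user)
  NAV_TEMPLATE.result_with_hash(current_user: current_user)
end
```

In a Rails view the same `if DarkLaunch.feature_visible(...)` check would simply go around the link in the layout. Keep in mind that this only hides the link; the page behind it is still reachable for anyone who knows the URL.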

It’s a bit of a quick hack, of course, and far from a complete or well-done (or flexible, or …) solution in so many ways. But it was great to see that a few lines of code added so much value to rolling out a feature. Feel free to let me know what you think or if you’re interested in using more of this. And who knows, maybe it becomes a little Gem… :-)


  • Hi Hendrik!
    I wouldn’t call this dark launching, as the new code is not exercised. I would simply call this a (binary) feature toggle. For it to be a dark launch, I expect the new system or code to receive calls, so that you can observe whether the “dark” system/code will handle the load it will receive once public, and whether the results it produces are correct. These results, however, aren’t made public. Hence the system operates in the “dark”, in the shadows of the real system that the public can see.

    • hendrikbeck

      Hey Fredrik,

      thanks for your comment.

      I don’t actually want to try too hard to defend what I did here; it wasn’t really my finest hour and is more of a little hack than anything serious.

      But for the sake of argument: I would say you could use the toggle to limit access to new features on different levels. For example, the pieces of code that receive calls can be open and put load onto the system, while at the same time certain UI features that give users access to the new data stay limited. In that case it would satisfy your definition of dark launching, wouldn’t it?

      So, my point is that it’s rather up to how you use it whether it’s dark launching or just toggling a few lines of code. What do you think?

      Cheers
      Hendrik

      • Well, I’ve seen code _very_ similar to that work just fine. The biggest challenge I’ve seen with toggles is knowing what you’ve tested, since all those flippers/switches/toggles/gates can be … toggled/switched on/off, and typically at runtime. So what code paths did we actually massage in our pipeline(s), and were they relevant to what we’ll have running in production? (The second biggest issue is maintenance and knowing when to remove switches, but that’s highly tied to the context and whether you have control of all production sites or not (not = you deliver to other organisations who in turn develop software based on what you’ve built).)

        I think the point of “dark launching” is that whatever you just deployed (let’s call it the “next” version) is actually taking on load from real production data (calls, requests, queues, …), so you can observe the “next” version’s behavior without exposing it publicly (so, “in the dark”). (Thus, if the code deployed is not processing production data, I wouldn’t call it a “dark launch”.) This way, you could say that the problem above – “what did we test, and does it say anything about what will run in production?” – is handled by letting the “next” version run alongside the “current” one until we’re sufficiently happy that it will work and perform as needed.

        Does this make sense to you?