A/B Testing on iOS and Android (and others)

A/B testing (aka. split testing) has been around on the Web for a long time. The reason to do it is simple: much of how you design your user interface and your user experience is based on assumptions and no matter how good you think your assumptions are, you don’t know what works better in reality. And let’s be honest, most assumptions aren’t thought through carefully in the first place. After all, nothing can tell you better than real data from real users.

You’re already measuring what users are doing? That’s not quite enough yet. The reason why it’s “A/B testing” and not just “A testing plus analytics” is simple: numbers alone don’t always say much (e.g. vanity metrics). Comparing 2 numbers on the other hand is easy and reliable and even if neither of the 2 is really good, at least you know which one is the lesser of 2 evil and the one you should continue with (read: continue with and iterate again, fast).

One presentation that caught my interest early on was Marissa Mayer’s keynote at Google I/O in 2008. She gives insights on how far Google actually goes to fine-tune and optimize their user interfaces using A/B testing. That way, Google found proof for details that would usually be decided upon by a design or UX specialist, even down to the “right” whitespace between different parts of the page on google.com:

(…) you can test interfaces and be able to tell in a mathematical way which one works better for your users.

Fast forward to today. A/B testing is old news for websites, but still only just beginning for native mobile apps. It’s easy to get started on reasons, like:

  • Native apps get bundled as binaries, no code is running on your own servers where you can easily control them, implementing different behaviors of the same feature is arguably more difficult.
  • Release cycles are different. New versions still get “shipped” (just like software in the old days) and you don’t to send your users through this every other day or so.
  • And even if you would want to, on iOS it’s Apple who prevents you from doing so with a 1 to 2 weeks approval time.
  • It’s just relatively new and there isn’t much written, there aren’t many tools and frameworks and there isn’t much of a best-practice and universally established mindset existant around it.
  • And more…

The bright side: it’s changing now.

Tools and frameworks are appearing and making this whole story a whole lot easier. Some of the ones that look great are clutch.ioarise.io or Pathmapp. And even Amazon joins with their own A/B testing service.

I’m excited to give these tools a shot and see what they can do. We’ve recently been accepted into Pathmapp’s beta program and I’m thrilled to see how good it works. I’m going to follow up with experiences in later posts. In the meantime I’d be more than happy to hear your experiences and recommendations.