A/B Testing on iOS and Android (and others)

A/B testing (a.k.a. split testing) has been around on the Web for a long time. The reason to do it is simple: much of how you design your user interface and your user experience is based on assumptions, and no matter how good you think your assumptions are, you don’t know what actually works better in reality. And let’s be honest, most assumptions aren’t thought through carefully in the first place. After all, nothing can tell you what works better than real data from real users.

You’re already measuring what users are doing? That’s not quite enough yet. The reason why it’s “A/B testing” and not just “A testing plus analytics” is simple: numbers alone don’t always say much (e.g. vanity metrics). Comparing 2 numbers, on the other hand, is easy and reliable, and even if neither of the 2 is really good, at least you know which one is the lesser of 2 evils and the one you should continue with (read: continue with and iterate on, fast).
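To make that concrete, here is a minimal sketch with made-up numbers: two variants of the same screen, and the only question that matters is which one converts better.

```swift
// Hypothetical counts for two variants of the same signup screen;
// neither conversion rate means much on its own, but the comparison does.
struct Variant {
    let name: String
    let users: Int        // users who saw this variant
    let conversions: Int  // users who completed the key action
    var rate: Double { Double(conversions) / Double(users) }
}

let a = Variant(name: "A", users: 1_240, conversions: 87)
let b = Variant(name: "B", users: 1_198, conversions: 121)

let winner = a.rate > b.rate ? a : b
print("A: \(a.rate), B: \(b.rate) -> continue with variant \(winner.name)")
```

In practice you’d also want a significance test (e.g. a two-proportion z-test) before trusting the difference, but the principle stays the same: the comparison carries the information, not either number alone.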

One presentation that caught my interest early on was Marissa Mayer’s keynote at Google I/O in 2008. She gave insights into how far Google actually goes to fine-tune and optimize its user interfaces using A/B testing. That way, Google found proof for details that would usually be decided upon by a design or UX specialist, even down to the “right” whitespace between different parts of the page on google.com:

(…) you can test interfaces and be able to tell in a mathematical way which one works better for your users.

Fast forward to today. A/B testing is old news for websites, but still only just beginning for native mobile apps. It’s easy to come up with reasons for that, like:

  • Native apps ship as compiled binaries; no code runs on your own servers where you could easily control it, so implementing different behaviors of the same feature is arguably more difficult.
  • Release cycles are different. New versions still get “shipped” (just like software in the old days) and you don’t want to send your users through this every other day or so.
  • And even if you wanted to, on iOS it’s Apple who prevents you from doing so with an approval time of 1 to 2 weeks.
  • It’s just relatively new: there isn’t much written about it, there aren’t many tools and frameworks, and there isn’t much of a best practice or universally established mindset around it yet.
  • And more…

The bright side: it’s changing now.

Tools and frameworks are appearing and making this whole story a whole lot easier. Some of the ones that look great are clutch.io, arise.io or Pathmapp. And even Amazon has joined in with its own A/B testing service.
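Roughly, these services all work around the binary problem the same way: ship every variant inside the app and let a server-side flag decide at runtime which one a given user sees. Here is a hedged sketch of that idea; the endpoint, JSON shape and variant names are hypothetical, not any particular service’s API.

```swift
import Foundation

// Which variant of the checkout flow should this user see?
// Both variants ship inside the binary; the server only flips the switch.
enum CheckoutVariant: String, Decodable {
    case oneStep = "one_step"
    case twoStep = "two_step"
}

struct ExperimentConfig: Decodable {
    let checkoutVariant: CheckoutVariant
}

func fetchAssignedVariant() async -> CheckoutVariant {
    // Hypothetical config endpoint; a real setup would talk to one of the
    // services above or to your own backend.
    guard let url = URL(string: "https://example.com/experiments/config"),
          let (data, _) = try? await URLSession.shared.data(from: url),
          let config = try? JSONDecoder().decode(ExperimentConfig.self, from: data)
    else {
        return .oneStep // safe default when the fetch or decode fails
    }
    return config.checkoutVariant
}
```

The point is that the decision lives on the server, so you can change the split (or end the test) without going through another release and review cycle.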

I’m excited to give these tools a shot and see what they can do. We’ve recently been accepted into Pathmapp’s beta program and I’m thrilled to see how well it works. I’m going to follow up with my experiences in later posts. In the meantime I’d be more than happy to hear your experiences and recommendations.

3 steps I should have taken to get started with Web and App analytics

Over the last weeks I wrapped my head around client-side analytics here at Klamr. Looking back, probably the biggest gotcha was the realization that I would have been better off ignoring all the cool out-of-the-box stats and colorful graphs. It’s all too easy to get paralyzed by the sheer amount of information each tool provides you. The answer you’re looking for isn’t hidden among visits and pageviews and referrers and user flows and mobile data and custom segments, so don’t spend your time looking for it there (just yet). Eric Ries calls them “Vanity Metrics” and tells you why they are dangerous. I should have listened earlier.

It actually starts with making sure you know what your app is supposed to do and what user actions are valuable for you. That’s all that really matters in the beginning. Whether you’ve got 50 or 5000 sessions and whether your median session length is 34 or 58 seconds, that doesn’t tell you much yet. So, better start with what matters and ask yourself: what are the few important actions I want my users to do with my app?

Hence, if I were to start over again, I would do the following:

  1. Make a list with the 3, 4, 5 most valuable user actions in your app, on a very high level. For a shopping app, a completed purchase should obviously be on this list. For a photo sharing app, a shared photo should be on it. A blogging platform would include creating a new blog and publishing a post. Note that this list should contain the successful final action at the end of an activity, not the activity leading up to it. Analytics tools often call these “events”. A purchase, for example, usually consists of a relatively long flow, but for now only the successful completion matters.
  2. Determine where and how these events happen in your app, i.e. the lines of code that actually perform them. Find out whether you want to instrument your client application (web or native) or whether you’re better off doing it on your server backend. Instrumenting your client often tends to be easier tool-wise, but sometimes it’s just better done on the server. Keep in mind, though, that you’re trying to find out how to improve your client. Getting the number of purchases, for example, isn’t the actual goal here. What you actually want to do in later iterations is find out where you can improve user flows, UI and UX. That’s why server-side instrumentation doesn’t work out well later.
  3. Instrument your system with the tool of your choice, e.g. murm.io, Google Analytics or Flurry. Then release it and wait for data. Instead of tons of numbers, you should now see just your few major numbers showing up in the tool (see the sketch after this list).
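Here is what these 3 steps could look like in code — a hedged sketch, not any particular SDK’s API: the event names and the Analytics wrapper are made up, and behind the wrapper you’d call whatever event-logging function your chosen tool provides.

```swift
// Step 1: the few actions that matter, as a closed list.
enum KeyEvent: String {
    case purchaseCompleted = "purchase_completed"
    case photoShared       = "photo_shared"
    case postPublished     = "post_published"
}

// Step 3: a thin wrapper around the SDK you picked.
enum Analytics {
    static func log(_ event: KeyEvent) {
        // Forward to your SDK's event-logging call here
        // (each tool has its own; check its docs).
        print("tracked: \(event.rawValue)")
    }
}

// Step 2: instrument the successful completion,
// not the steps leading up to it.
func handlePurchaseSuccess() {
    // ... finalize the order ...
    Analytics.log(.purchaseCompleted)
}
```

Keeping the event list a closed enum is a deliberate choice: it keeps the set of tracked actions small and forces a discussion before anyone adds a new one.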

See what this gives you and then iterate. You can add more (refined) events, categorize them like a pirate (Dave McClure’s AARRR metrics), add funnels and goals, and go deeper. And in one of your next iterations you can look at what you can do to improve.
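For a taste of what a funnel gives you once the events are in place, here is a minimal sketch with invented counts; the step names follow the hypothetical events above, and the drop-off between steps is the signal.

```swift
// Invented counts per funnel step, from first touch to completed purchase.
let funnel: [(step: String, users: Int)] = [
    ("opened_checkout",    1_000),
    ("entered_payment",      430),
    ("purchase_completed",   280),
]

// Print how many users survive each transition between steps.
for (i, stage) in funnel.enumerated() where i > 0 {
    let previous = funnel[i - 1]
    let kept = Double(stage.users) / Double(previous.users)
    print("\(previous.step) -> \(stage.step): \(Int(kept * 100))% continue")
}
```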