The following post was contributed by entrepreneur-turned-VC Eden Shochat.
“There are three kinds of lies: lies, damned lies, and statistics.”
— Mark Twain
One of the great benefits of being an investor is that I get the opportunity to work with people who are incredibly smart about running Internet businesses. Product and UX are still something of a black art in Israel, and so I found myself writing this blog post on a cold Saturday afternoon.
I noticed a common thread running through the groups that iterate quickly and have a track record of successful apps. These groups are very self-aware: they measure everything about their applications, from individual user actions up to indicators of the application's overall success, and they use those measurements to drive their engineering. Strangely enough, the metrics are used to approximate the user, rather than traditional user stories and design personas.
Unlike an enterprise software company, where long sales cycles abound and analytics boil down to an Excel file with a weighted deal flow, running an Internet company is great for a numbers person who loves statistics. There are really just two key performance indicators (KPIs) that you should be optimizing:
- Customer acquisition cost: The amount of money you spend to acquire a user who produces income. It seems simple at first sight, but gets very complicated when you need to measure word-of-mouth expenses and effects. If the source of each user can be tracked, measure on a per-user basis; otherwise, take the entire user-acquisition budget and divide it by the number of users. Referrals that span more than one level are harder to measure, but this is a close enough approximation even without them.
- Lifetime value of each user: Lifetime value is hard to estimate before the monetization machine has run for a while, so you should usually focus on a different number here – the retention rate. Again, easier said than done, as you'd need to define what it means for a user to count as retained; what usually works is whether the user used the application during the past week.
That's an oversimplification, of course, since acquisition generally revolves not only around cost but also around who assumes the risk for the traffic. Given the same cost, acquisition where the traffic supplier takes all the risk (CPA) is better than acquisition where you assume the risk (CPM). When I evaluate investments I try to take this into account, as well as the "business risk".
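To make the two KPIs concrete, here is a minimal sketch of how they might be computed from a simple user log. The function names, data shapes, and numbers are all hypothetical; the "retained = active in the past week" rule follows the definition above.

```python
from datetime import date, timedelta

def blended_cac(acquisition_spend, new_users):
    """Blended customer acquisition cost: total spend divided by total
    new users. Use per-channel attribution instead when each user's
    source can be tracked."""
    return acquisition_spend / new_users

def weekly_retention(last_active_dates, today):
    """Share of users active during the past week -- the proxy the post
    suggests for lifetime value before real monetization data exists."""
    week_ago = today - timedelta(days=7)
    retained = sum(1 for last in last_active_dates if last >= week_ago)
    return retained / len(last_active_dates)

# Hypothetical numbers: $5,000 spent for 20,000 installs.
print(blended_cac(5000, 20000))  # 0.25 -> 25 cents per user

# Two users, one active yesterday, one dormant for weeks.
print(weekly_retention([date(2016, 1, 1), date(2016, 1, 9)],
                       today=date(2016, 1, 10)))  # 0.5
```

In practice both numbers would come out of your analytics pipeline rather than hand-built lists, but the arithmetic is this simple.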
You must be pushing back at this stage, saying that this doesn't take into account critical factors such as virality, engagement, monetization strategy, the number of active users, or that magic ingredient that makes a user go wow. Well, it does. If you have a viral application (congratulations! that's the hardest type to create), then your customer acquisition cost will be marginal to zero. Engagement? Same thing, plus lifetime value (if you keep the user, you make more money). Monetization strategy? Lifetime value. It works so much better when you only need to look at a small number of factors in order to make the hardest decision of a startup: where to expend resources.
As a side note, this is also a great way to manage your investors, if they have a clue. Send them weekly reports of the KPIs. Don't bother sending weekly/monthly/before-useless-board-meetings update decks. If the KPIs you measure are descriptive of the business, they will be the best information you could share. This could be a fun post too: "Managing your investors".
It usually starts with a top-level requirement coming in. When practicing result-centric product design you always start with the user, as the results are always about the user. This means that you should ask yourself:
- Who is your user in general? What do they care about? You should already know this. Does this new requirement change it? Should it? Is this even a requirement this user needs?
- What is the goal that the user is trying to accomplish with this feature?
- Does this feature require them to interact with other users? Seemingly a second-level question, but this is key to finding the viral potential of this new feature. I still don't understand why Shazam doesn't let me tell my friends about the song it just recognized. That's silly.
- What will surprise them and cause them pleasure? When I bought my MacBook Air, the magnetic power cord – which keeps the computer from falling if someone stumbles on the cable – delighted me. When I saw Any.DO for the first time and typed "Call" into a task, it autocompleted from my contact list. That surprised me. I immediately told both stories to anybody who would listen. That's a reduction in customer acquisition costs right there.
This is mostly standard good consumer-Internet product management practice. The key difference in result-centric product design is that you are shooting for a specific result on the key KPIs with the product changes. What is the impact you want this feature to have on your users? What's your goal in terms of numbers? It should be something along the lines of: "increase retention by 20%" or "reduce acquisition costs to 25 cents by increasing virality".
Yes. It’s hard.
Once you write it down, though, you can easily compare which features you should focus on. Yes, this assumes you are predicting correctly. More about that later.
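Once predictions are written down, comparing features can be almost mechanical. A hypothetical sketch of that comparison follows; the feature names, predicted lifts, and costs are all invented for illustration:

```python
# Each candidate feature carries a written-down prediction: which KPI it
# moves, by how much, and what it costs in engineering time.
# All values here are invented.
features = [
    {"name": "widget",        "kpi": "retention", "predicted_lift": 0.25, "eng_weeks": 3},
    {"name": "share to feed", "kpi": "cac",       "predicted_lift": 0.10, "eng_weeks": 2},
    {"name": "dark mode",     "kpi": "retention", "predicted_lift": 0.02, "eng_weeks": 4},
]

# Rank by predicted lift per engineering week -- a crude version of the
# "where to expend resources" decision described above.
ranked = sorted(features,
                key=lambda f: f["predicted_lift"] / f["eng_weeks"],
                reverse=True)
for f in ranked:
    print(f["name"], round(f["predicted_lift"] / f["eng_weeks"], 3))
```

The ranking is only as good as the predictions feeding it, which is exactly why the post-mortem step later in the post matters.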
Now translate this into measurable metrics. In a perfect world you could do specific A/B testing for each feature you add. This means that for an iteration with any number of new features, you'd run multiple tests, each with only one addition turned on, and measure the impact of that specific modification. Fast-forward to the real world, where there are cross-dependencies between the different features, and measuring only the top-level KPIs doesn't work. In such a case you look at second-level indicators:
- Soft indicators don't directly relate to the top-level KPIs but should give you a sense of whether the "dogs are eating the dogfood".
Is there usage of the feature at all?
Repeat usage? Is it tapering over time?
- Hard indicators are the top-level KPIs themselves, split by the specific subset of users who actively used the feature.
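One way to compute such a hard indicator is to split the retention KPI by feature usage. A minimal sketch, assuming a per-user record of feature usage and retention (the field names are hypothetical):

```python
def retention_by_cohort(users):
    """Split the top-level retention KPI by whether each user actively
    used the new feature -- the 'hard' indicator described above.
    `users` is a list of dicts with hypothetical keys."""
    cohorts = {True: [0, 0], False: [0, 0]}  # used_feature -> [retained, total]
    for u in users:
        bucket = cohorts[u["used_feature"]]
        bucket[1] += 1
        if u["retained"]:
            bucket[0] += 1
    return {k: (v[0] / v[1] if v[1] else None) for k, v in cohorts.items()}

# Invented sample: three feature users, two non-users.
users = [
    {"used_feature": True,  "retained": True},
    {"used_feature": True,  "retained": True},
    {"used_feature": True,  "retained": False},
    {"used_feature": False, "retained": True},
    {"used_feature": False, "retained": False},
]
print(retention_by_cohort(users))  # True -> ~0.67, False -> 0.5
```

A gap between the two cohorts is suggestive rather than conclusive (users who adopt a feature may simply be more engaged to begin with), but it is the split the methodology calls for.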
Key to the success of this methodology is determining success KPIs in advance: what they are, how you measure them, and what would constitute success. This seems simple enough, but if you don't do it, you are susceptible to the greatest sin, cognitive dissonance, or "hindsight is always 20/20". You convince yourself that the measurements which are (or seem to be) improving are the ones that really measure the impact of a specific feature.
There are two by-products of this approach:
- You will invest the time required to instrument your code to monitor these KPIs, so there won’t be an issue of data collection down the line when you actually want to understand whether the feature made sense.
- By determining the KPIs in advance, you will have fewer arguments about adding more instrumentation or modifying the success-evaluation criteria after you launch. Everyone interprets the numbers differently when an alternate interpretation can support their opinion (hence: lies, damned lies, and statistics).
Instrumenting always seems like a waste of time, but at the end of the day, the overhead on top of the feature implementation is marginal when you want to know whether the primary investment (the feature) made sense.
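Instrumentation can be as simple as emitting a named event at every interesting user action. A minimal sketch, where the in-memory list is a stand-in for whatever analytics backend you actually use, and the event names are invented:

```python
import time

EVENT_LOG = []  # stand-in for a real analytics backend

def track(event, user_id, **props):
    """Record a single instrumentation event. Counting these events
    later yields the soft indicators: any usage? repeat usage?"""
    EVENT_LOG.append({"event": event, "user": user_id,
                      "ts": time.time(), **props})

# Hypothetical calls sprinkled through the feature code:
track("widget_added", user_id=42, source="install_prompt")
track("widget_task_completed", user_id=42)

# Soft indicator: is the feature used at all?
usage = sum(1 for e in EVENT_LOG if e["event"].startswith("widget"))
print(usage)  # 2
```

The point is that the `track` calls go in while the feature is being built, so the data already exists when the post-launch question arrives.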
Again, this is easier said than done for mobile applications. There are some unique considerations affecting mobile apps, mostly because they are a hybrid of client/server applications that have an auto-update mechanism but where users don't have to upgrade. There is a delay in measurable impact caused by version-uptake times: when deploying mobile apps, it can take a week before you have significant numbers, which underscores the importance of both instrumenting the app and choosing approximate soft indicators that you can use in the early days to evaluate a feature's success or failure.
For this approach to work you need to be brutally honest with yourself. Every feature should have a post-mortem (maybe a bad name; it's not where all features go to die) where you compare the actual impact the feature had on your product against what you originally thought it would have. Even if the feature had a positive outcome, think back to the drivers of your prediction: did it not fully meet them? Why? What made you mis-predict? How can you improve your prediction accuracy for future features? Or did it surpass them? Why? This can be a great basis for making decisions about future features. Only if you do this process religiously can you use it as the basis for feature selection in the future. This also means that some features should be retired. Complexity is the root of many retention issues, and you shouldn't be shy about killing or retiring features that don't contribute to the top-level KPIs. Sometimes counter-intuitive, but think Instagr.am/Picplz vs. the larger social networks.
One example of result-centric product design could be a new widget for an imaginary Android project management application. The primary goal for the widget would be retention, as it's visible most of the time, unlike applications that require the user to launch them. Our users are people who have too much on their minds, want to get things done, and need help focusing. Many of them try to get organized in a variety of ways but aren't necessarily power users. For them, the widget should be a way to better organize a short-term to-do list that they look at more regularly than the overall list of things they need to do within the project. Most of the task lists in our lives and projects quickly become unmanageable, turning into a source of discomfort rather than a tool that makes us feel better organized.
Google describes Android widgets as: "A widget displays an application's most important or timely information at a glance, on a user's Home screen. The most effective widgets display your application's most useful or timely data in the smallest widget size. Users will weigh the usefulness of your widget against the portion of the Home screen it covers, so the smaller the better."
We want our users to quickly feel more organized when they add our widget. Let's take the top-priority and nearest-due-date items and populate the widget with a short list of things that must be done today. Rather than simply providing another view of the same task list, the user will be able to determine a "cut-off" point after which the rat race for the day is over. This mental shift will help users feel that they accomplished something today. That, by itself, is designing for pleasure.
The increase we can expect depends on how many users actually go ahead and add the widget, which in turn drives our user-experience design: the application should offer to add the widget to the home page, so that users who install the application are exposed to it at all times. There are a few KPIs of interest: how many users added the widget immediately after installing the application, how many added it from the configuration screens or the Android menus, and how many of these users kept it. These are all soft metrics. The overall retention percentage is of key importance, and there we will differentiate between users who added the widget and those who didn't (hard metrics). Before starting to code, we write down the prediction: 50% of users will add the widget (assuming the application can offer to add it automatically), and among the users who activate it we expect a 25% bump in retention.
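The widget prediction above can be checked mechanically once real numbers come in. A sketch of that post-mortem comparison, with all post-launch figures invented:

```python
# Written-down prediction for the widget, per the text: 50% of users
# add it, and adopters retain 25% better than non-adopters.
predicted = {"adoption": 0.50, "retention_lift": 0.25}

def evaluate(n_installs, n_added, retention_with, retention_without):
    """Compare actual post-launch numbers against the pre-registered
    prediction. Returns {metric: (predicted, actual)}."""
    actual = {
        "adoption": n_added / n_installs,
        "retention_lift": retention_with / retention_without - 1,
    }
    return {k: (predicted[k], round(actual[k], 2)) for k in predicted}

# Hypothetical post-launch numbers: 10,000 installs, 4,000 widget adds,
# 36% retention with the widget vs. 30% without.
print(evaluate(n_installs=10000, n_added=4000,
               retention_with=0.36, retention_without=0.30))
```

Here the feature under-shot both predictions (40% adoption vs. 50%, a 20% lift vs. 25%), which is exactly the kind of gap the post-mortem should explain.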
By monitoring the metrics we know whether the widget is effective and, if it is, whether the methods we used within the app to get users to adopt it serve their purpose. If the statistics don't support the original goals, your engineering effort is likely better spent on other features that can achieve that goal. There will always be fewer engineers than tasks; use them to make a difference.