Data-driven people ask questions with intent
"Look in our analytics data and find a few audience segments."
I hate this question. So much.
Your goal should never be to simply analyse the data. Rather, your goal should be first to ask better questions of your data, and second to action the results.
Asking better questions of your analytics data is the obvious but difficult practice in data analysis. If you aren't clear about the question you are asking, the results will be noisy, possibly leading you to draw false conclusions. But this article isn't about asking better questions; it's about committing to actioning the results.
Before you go through the iterative process of asking better questions, you first need to establish your boundaries of influence. These are the limits of what you feasibly have the resources to change when you answer your question.
Boundaries of influence
I'm an amateur student of Stoic philosophy. One of the core ideas in Stoicism is the Dichotomy of Control: the assertion that some things are 'up to us' (within our power), and others are 'not up to us' (not within our power). Putting aside the greater importance of this precept in our personal lives, it's incredibly important when tackling the practice of data analysis.
Put simply, it's silly to ask a question of your data when you don't have the resources to directly action or influence any change once your question is answered. The limit of what you can change is your boundary of influence.
If you work in a team, you probably have a job role that outlines the parts of your team's product or process you are expected to influence, or have the autonomy to change. This may be creating a new design or improving an existing product or process.
If you are a programmer, you probably won't be changing marketing campaigns on Facebook that influence the traffic mix to your website. If you're a paid media marketer, you are unlikely to be changing the onboarding experience when a user signs up to a new product.
We will take a look at this example in more detail below.
Understanding these constraints is vital. Without them, the analysis paralysis party is waiting for you to turn up.
Confused is okay, overwhelmed is not.
We all can get a bit lost when looking at raw data. Sometimes the cause is poor naming conventions in the schema design or inconsistencies in a sample you are looking at. Perhaps the relationships between different concepts are too complex to understand without a diagram.
This confusion is okay.
Overwhelm, on the other hand, tends to come from not understanding the significance of a single data point, or not having any idea what might cause the data point to change.
For example, suppose I told you a webpage gets 1000 sessions per day. What does this mean to you? This metric (sessions), across two dimensions (a particular page, bounded by day), has no universal significance. If someone makes this assertion, I'll immediately ask:
- What's a session? Is it tied to a user, a device (do we even care)? What does it mean for a session to 'start' and 'end'? Can it be paused and restarted?
- 1000 sessions per day implies an aggregation. Is this the sum of sessions in one day? Is it the average sessions per day over a given time period? What's the sample period? How many observations did we make? Are we calculating this at a coarser resolution (e.g. per month) and breaking it down to days? Or did we sample at a finer resolution (e.g. per hour) and extrapolate upwards?
- Do we need to care about seasonality?
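To see how much the choice of aggregation matters, here's a minimal sketch in pandas. The session log, column name, and dates are assumptions made up purely for illustration; the point is that the same raw data yields different 'sessions per day' figures depending on which interpretation you pick.

```python
import pandas as pd

# Hypothetical session log: one row per session, with its start timestamp.
# (The column name and values are illustrative assumptions, not a real schema.)
sessions = pd.DataFrame({
    "session_start": pd.to_datetime([
        "2024-03-01 09:10", "2024-03-01 11:45", "2024-03-02 08:05",
        "2024-03-02 14:30", "2024-03-02 21:15", "2024-03-07 10:00",
    ])
})

# Count sessions per calendar day across the sample window.
daily = sessions.set_index("session_start").resample("D").size()

# Interpretation 1: sessions on one specific day.
print(daily.loc["2024-03-02"])    # 3

# Interpretation 2: average over only the days that had any traffic.
print(daily[daily > 0].mean())    # 6 sessions / 3 active days = 2.0

# Interpretation 3: average over every calendar day in the window.
print(daily.mean())               # 6 sessions / 7 days ≈ 0.86
```

Three defensible calculations, three different numbers. Which one is 'sessions per day' depends entirely on the question that was actually asked.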
At this point you'll realise I'm not such a great guest at a dinner party.
Putting everything above aside, there is still a single fundamental question that needs to be answered:
What changes could I make with this information?
Let's go back to the example of the Programmer who doesn't do marketing and the Growth Marketer who only looks at the top of the funnel.
What change will you make?
Suppose you are the Programmer. 'What is the cost and retention rate of each acquisition channel?' is really not a great question to ask. Why? Because the cost of acquisition and the retention rate per channel are metrics you're unlikely to have any power to change.
On the other hand, suppose you are the Growth Marketer. 'What percentage of users complete the onboarding before and after the recent site update?' is not an actionable question to ask. It's unlikely that you were part of the site redesign process, understand the test coverage, or have any voice in rolling back the site.
These two examples illustrate my core argument: asking questions outside of your boundaries of influence will lead to overwhelm. Unfortunately, this is how most teams approach analytics. They open Google Analytics and experience the dreaded sense of overwhelm from the chaos of metrics outside of their boundaries of influence.
It's worth noting that for many analytics projects, we need cross-functional teams solving a single problem. It might be that the Growth Marketer discovers that acquiring customers through podcast sponsorship is cost-effective but these users aren't retained. They might need the Programmer to care about their channel mix problem to jointly dig deeper and help formulate better questions to ask of the data.
How to commit to the results
Before firing up your analytics tool, stop for a moment and ask yourself two questions:
- What question am I trying to answer?
- What will I do next after I get the answer?
It can be helpful to contemplate the possible answers to the question you are asking. This will help you see whether your question is specific enough.
For example, if your question was 'What is the retention rate of our users?', you might realise that possible answers of 5%, 10%, or 90% do not give you any practical next steps. A better question might be: 'What is the 7-day retention rate for all users compared to the 7-day retention rate of those who completed the onboarding?'
The possible answers, with corresponding actions, might be:
- no difference -> let's not invest in improving the onboarding in the next release
- big positive difference -> follow up with a new question quantifying the financial impact
- big negative difference -> run an A/B test to verify whether the historical data is correct
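To make the comparison concrete, here is a minimal sketch in pandas of how that answer might be computed. The table, column names, and values are assumptions for illustration only; your own schema will differ.

```python
import pandas as pd

# Hypothetical user table (column names and values are illustrative only):
# whether each user completed onboarding, and whether they came back
# at least once within 7 days of signing up.
users = pd.DataFrame({
    "user_id":              [1, 2, 3, 4, 5, 6],
    "completed_onboarding": [True, True, True, False, False, False],
    "returned_within_7d":   [True, True, False, False, True, False],
})

# 7-day retention for all users.
overall = users["returned_within_7d"].mean()

# 7-day retention split by whether onboarding was completed.
by_onboarding = users.groupby("completed_onboarding")["returned_within_7d"].mean()

print(f"overall 7-day retention: {overall:.0%}")   # 50%
print(by_onboarding)   # compare the True row against the False row
```

Whatever the real numbers turn out to be, the gap between the two groups maps directly onto one of the three actions above, which is exactly what makes the question worth asking.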
Asking better questions is an entire topic in and of itself. But hopefully this post has provided some insight into the value of committing to actionable results.