I love Twitter. I’m really addicted to those witty, deeply insightful, haiku-like aphorisms some people can squeeze into 140 characters. I love this little community in my timeline where people share what they love, live through, and think. In addition, my UX feed regularly brings up interesting news and blogposts. Recently, I came across this one by Andrew Chen, a blogger focused on mobile products, metrics, and user growth: New data shows losing 80% of mobile users is normal, and why the best apps do better.

In this post, Andrew shows so-called retention curves – the percentage of an app’s users who are still active daily, tracked over time from the day of installation. Wait – percentage over time? Doesn’t this sound exactly like survival functions? So why don’t we try some survival analysis on them?

Let’s first see what Andrew says. In a nutshell, most apps lose daily active users rapidly – 80% within the first few days is normal, then the retention curves flatten out. However, the top ten apps’ curves are consistently above the crowd – they retain many more users, and for a longer time. According to Andrew, this process begins at day one, in the very first moments people spend using an app. Later on, Andrew writes, the loss of active users happens at about equal rates. To quote:

“…users find the top apps immediately useful, use it repeatedly in the first week, and the drop off happens at about the same speed as the average apps. Fascinating.”

I’d suggest you take a break and look up Andrew’s original post – he has a lot of valid advice worth reading. However, let’s check his conclusions by applying some survival analysis – more specifically, a Weibull analysis.

In a previous post, I explained how to perform a Weibull analysis of web site visit durations. For user retention analysis, the approach is very similar – first, you check whether the data follow a Weibull distribution. If they do, you have not only insight into the process that generated those data, but also parameters to quantify it.

Fortunately, Andrew published the data underlying his analysis, so we can use them to create a so-called probability plot (learn more about probability plots here). For a Weibull analysis plot, we plot the natural logarithm of times on the horizontal axis. On the vertical axis, we need to calculate a little more – let S denote the percentages, then we plot ln[ln(1/S)]. With these transformations, the data points will line up straight if and only if the data follow a Weibull distribution. Let’s check (Figure 1):
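As a minimal sketch of this transformation (not the tool used to draw the figures in this post), the following Python snippet builds the plot coordinates and fits a straight line through them. The retention values are synthetic, generated from a Weibull survival function with made-up parameters purely to keep the example self-contained; `weibull_plot_coords` and `fit_line` are hypothetical helper names.

```python
import math

def weibull_plot_coords(days, retention):
    """Map (t, S) pairs to Weibull-plot coordinates (ln t, ln ln(1/S)).

    Requires t > 0 and 0 < S < 1, otherwise the logarithms are undefined.
    """
    xs = [math.log(t) for t in days]
    ys = [math.log(math.log(1.0 / s)) for s in retention]
    return xs, ys

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2

# Synthetic retention data (NOT Andrew Chen's numbers):
# S(t) = exp(-(t / scale) ** shape) with assumed scale and shape.
true_scale, true_shape = 5.0, 0.3
days = [1, 3, 7, 14, 30, 60, 90]
retention = [math.exp(-((t / true_scale) ** true_shape)) for t in days]

xs, ys = weibull_plot_coords(days, retention)
slope, intercept, r2 = fit_line(xs, ys)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.4f}")
```

Because the synthetic data are exactly Weibull, the fit recovers the assumed shape as the slope. With real retention data, day 0 (where S = 100%) has to be left out, since neither ln 0 nor ln[ln 1] is defined.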



Figure 1: Weibull probability plot for user retention data reported in New data shows losing 80% of mobile users is normal, and why the best apps do better (accessed June 23, 2015)

All data series line up perfectly straight – the Weibull distribution model describes user retention curves perfectly. I added regression equations for the top 10 and average apps; the R² values mean that those equations explain 99% of the variance! What’s more, the lines are almost parallel: what does this mean?

The Weibull distribution model describes “aging” processes. In reliability analysis, a frequent observation is that technical parts fail at a rate that increases over time. The increase in failure rates can be read from the so-called shape parameter of the Weibull distribution (more on that in a minute). Here, the “aging” is inverted: users abandon apps at a decreasing rate. The longer someone uses an app, the more likely it is that they keep using it for an even longer time.

We can read the extent to which this “negative aging” happens directly from the slope of the regression lines in the Weibull plot. As we can see, it is almost identical for all apps – in fact, the slope of the top 10 apps line is ever so slightly less steep than that of the average line. “Negative aging” happens to a slightly greater extent in top 10 apps than in average apps. What does this mean?

Well, first of all, the process of losing daily active users over time is not really different between the top 10 apps and the rest of the crowd – it’s just happening more slowly. What’s more, whatever the reason is for users abandoning apps, its influence is constant over time! Whatever the top apps do for retaining users, they do it along the entire lifecycle. They are not only doing better in the first minutes, as Andrew Chen writes, but they keep doing better over time. We can even quantify just how much better they do.

Let’s look a little deeper into the Weibull distribution model. It comes with two parameters, a scale parameter α and a shape parameter γ. The scale parameter indicates the magnitude of times involved – it is equal to the time when 63.2% of individuals “died”, that is, abandoned daily use of the app. The shape parameter quantifies the direction and amount of aging: the smaller the shape parameter, the more negative aging happens – the longer people use an app, the longer they will keep using it.
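For reference, the standard two-parameter Weibull model behind these statements can be written as follows, using the symbols α and γ from above. This is the textbook form of the survival function S(t) and the hazard (abandonment) rate h(t), not something derived from the app data.

```latex
S(t) = \exp\!\left[-\left(\frac{t}{\alpha}\right)^{\gamma}\right],
\qquad
S(\alpha) = e^{-1} \approx 0.368,
\qquad
h(t) = \frac{\gamma}{\alpha}\left(\frac{t}{\alpha}\right)^{\gamma-1}
```

At t = α, 36.8% of users are still active, so 63.2% have “died”. For γ < 1 the hazard h(t) falls as t grows, which is the “negative aging” described here; γ = 1 gives a constant rate, and γ > 1 gives the increasing failure rate familiar from reliability analysis.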

The numerical values of the scale and shape parameters can be determined from the regression equation in the Weibull plot. γ is simply the slope of the regression line. α is the time at which the line intersects the time axis, that is, where ln[ln(1/S)] = 0; equivalently, α = exp(−offset/γ), because the regression equation’s offset equals −γ·ln α. For the top 10 apps, this so-called “characteristic time” is 935 days; for the average app, it is not even half a day (0.44). The difference between top and average apps in α is dramatic, that in γ almost nonexistent – same shape, different scale.
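Taking logarithms of the Weibull survival function twice gives ln[ln(1/S)] = γ·ln t − γ·ln α, so the fitted line’s slope is γ and its intercept is −γ·ln α. Here is a minimal sketch of that conversion, with hypothetical function names. The intercept in the example is back-computed from the α and γ quoted for the top 10 apps, not read off the original figure.

```python
import math

def weibull_params_from_line(slope, intercept):
    """Convert a Weibull-plot regression line into (alpha, gamma)."""
    gamma = slope                          # shape parameter = slope
    alpha = math.exp(-intercept / slope)   # scale: where the line crosses y = 0
    return alpha, gamma

def line_from_weibull_params(alpha, gamma):
    """Inverse mapping: (alpha, gamma) -> (slope, intercept)."""
    return gamma, -gamma * math.log(alpha)

# Round trip using the top 10 apps' parameters quoted in the text.
slope, intercept = line_from_weibull_params(alpha=935.0, gamma=0.186)
alpha, gamma = weibull_params_from_line(slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"alpha={alpha:.1f} days, gamma={gamma:.3f}")
```

Because α enters through an exponential, small changes in slope or intercept move α a lot, which is one reason the characteristic times differ so dramatically while the slopes barely do.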

So let’s return to my initial question: just how addictive is Twitter? I found a user retention dataset published by The Information.com on their blogpost “Which Apps Retain Their Users—And Which Ones Don’t”, which includes data from Twitter. Let’s consider the probability plot for those data (Figure 2):


Figure 2: Weibull probability plot for Twitter user retention data reported in Which Apps Retain Their Users—And Which Ones Don’t (accessed June 23, 2015)

Again, the Weibull model fits the data very well, explaining 99% of the variance. The characteristic time α is 38 days: typically, users actively use Twitter on a daily basis for about a month. The shape parameter γ is 0.177 – slightly lower than the top 10 apps’ γ calculated from Andrew Chen’s data, which is 0.186, and noticeably lower than the average apps’ (0.208).
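To see what these parameters imply, one can plug them into the Weibull survival function S(t) = exp[−(t/α)^γ]. The sketch below uses only the α and γ values quoted in this post. The function name and the chosen days are illustrative, and the printed percentages are model predictions, not measured retention figures.

```python
import math

def weibull_retention(t_days, alpha, gamma):
    """Predicted share of users still active after t_days under a Weibull model."""
    return math.exp(-((t_days / alpha) ** gamma))

# (alpha in days, gamma) as quoted in the text above.
apps = {
    "Twitter": (38.0, 0.177),
    "Top 10 apps": (935.0, 0.186),
    "Average app": (0.44, 0.208),
}

for name, (alpha, gamma) in apps.items():
    row = ", ".join(
        f"day {t}: {weibull_retention(t, alpha, gamma):.0%}" for t in (1, 7, 30, 90)
    )
    print(f"{name:12s} {row}")
```

By construction, each curve passes through about 36.8% at its own characteristic time α, which is where the “63.2% have left” reading of the scale parameter comes from.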

So Twitter loses daily active users faster than the top 10 listed by Andrew. But to those who stay, it’s more addictive! Looks like I’m not the only one hooked on it…
