Retention and loss indicators

1, Background and significance

Pay attention to the stickiness of the product corresponding to users, judge whether the APP can retain users at the initial stage and the growth of active users, especially in the AARRR model, retention is separately described as a user operation data index.

However, in the middle and later stages of APP, we are more concerned about the user stability and revenue transformation of the product, so the loss index can better reflect the stage of the product life cycle. It also helps to analyze the life cycle of active users and the changes of channel quality.

2, Data and index calculation

1. Data sources

1.1 database table: user table / login table (user registration time, latest login time, login location)

1.2 user buried point data: considering that some products, especially overseas products, are available in non login status, buried point data is used to report data.

2. Retention indicators

2.1 retained users, retention rate

Derivative indicator: day 1 retention ratio

Definition: the proportion of users who use (log in) on the specified date who continue to use (log in) after the next day (specified date + 1).

Similar indicators: 3-day retention rate and 7-day retention rate.

Development and Implementation:

SELECT 
   COUNT(DISTINCT CASE WHEN DATEDIFF($day, event_day) <= 0 THEN user_id  ELSE null END ) AS day_user_num,   --Number of users of the day
   COUNT(DISTINCT CASE WHEN DATEDIFF($day, event_day) <= 1 THEN user_id  ELSE null END ) AS 1day_user_num,  --Number of users retained in the next day
   COUNT(DISTINCT CASE WHEN DATEDIFF($day, event_day) <= 1 THEN user_id  ELSE null END ) / COUNT(DISTINCT CASE WHEN DATEDIFF($day, event_day) <= 0 THEN user_id  ELSE null END ) as 1day_ratio, --Next day retention rate

   COUNT(DISTINCT CASE WHEN DATEDIFF($day, event_day) <= 2 THEN user_id  ELSE null END ) AS 3day_user_num,  --3 Number of users retained per day
   COUNT(DISTINCT CASE WHEN DATEDIFF($day, event_day) <= 2 THEN user_id  ELSE null END ) / COUNT(DISTINCT CASE WHEN DATEDIFF($day, event_day) <= 0 THEN user_id  ELSE null END ) as 3day_ratio,  --3 Daily retention rate

   COUNT(DISTINCT CASE WHEN DATEDIFF($day, event_day) <= 6 THEN user_id  ELSE null END ) AS 7day_user_num, --7 Number of users retained per day
   COUNT(DISTINCT CASE WHEN DATEDIFF($day, event_day) <= 6 THEN user_id  ELSE null END ) / COUNT(DISTINCT CASE WHEN DATEDIFF($day, event_day) <= 0 THEN user_id  ELSE null END ) as 7day_ratio  --7 Daily retention rate
FROM
   user_event 
WHERE
    DATEDIFF($day, event_day) <= 6
AND DATEDIFF($day, event_day) >= 0



---- ok of sql

SELECT 
    t_day,
    count(distinct user_id) as day_user_num,
    count(distinct case when day_interval=1 then user_id else null end) as  1day_user_num,
    count(distinct case when day_interval=1 then user_id else null end)/ count(distinct user_id) as 1day_ratio,
    count(distinct case when day_interval=3 then user_id else null end) as  3day_user_num,
    count(distinct case when day_interval=3 then user_id else null end)/ count(distinct user_id) as 3day_ratio,
    count(distinct case when day_interval=7 then user_id else null end) as  7day_user_num,
    count(distinct case when day_interval=7 then user_id else null end)/ count(distinct user_id) as 7day_ratio
FROM 
(
    SELECT 
        t.day as t_day,
        t.user_id,
        t.n_day_user_id,
        DATEDIFF(TO_DATE(t.n_day,'yyyyMMdd'), TO_DATE(t.day,'yyyyMMdd')) as day_interval
    FROM
    (
        SELECT  a.did AS user_id
                ,b.did AS n_day_user_id
                ,a.day AS day
                ,b.day AS n_day
        FROM    dwd_user_event_di AS a
        LEFT JOIN dwd_user_event_di AS b
        ON      a.did = b.did
        WHERE   
            DATEDIFF(TO_DATE(a.day,'yyyyMMdd'), TO_DATE('20211001','yyyyMMdd'))<=6 
        AND 
            DATEDIFF(TO_DATE(a.day,'yyyyMMdd'), TO_DATE('20211001','yyyyMMdd'))>=0 
    ) t
)
GROUP BY t_day
;

It can also be compared by industry data or average retention, for example:

Based on the retention rate, the data gradually decreases from high to low over time. When the downward trend is stable, it is the start time of user loss. As shown in the figure below, the retention rate is smooth from week 5, so the loss cycle of 5 weeks, i.e. 35 days, can be set.

3. Loss indicators

3.1 lost users, loss rate

Indicator: daily turnover rate, day 1 Chun ratio, the proportion of users who log in / use the product on the statistical day and do not log in / use the product in the next 7 days to users on the statistical day.

Weekly loss rate Week Chrun Ratio, the proportion of users who logged in / used the product last week, but did not log in / used this week in last week's (used / active) users.

Monthly churn rate: the proportion of users who logged in / used the product last month but did not log in / use it this month in the last month (used / active).

Loss usually depends on weekly loss or monthly loss indicators. be used for

  • Active user life cycle analysis;
  • Changes of channel users;
  • The operation means to drive revenue, and the impact of version update on the loss of users;
  • The loss was high in that period;
  • Industry loss comparison, product life cycle, that is, mid-term evaluation.

Posted on Thu, 25 Nov 2021 16:42:50 -0500 by barneybarney68