[Analytic Block] Daily, Weekly, Monthly Active Users

About This Block

Active Users is a common metric to measure how customers are engaging with your product over different timeframes. In this pattern, we use a rolling timeframe to calculate a daily count of how many users made an interaction with the product in the prior 30 days, 7 days, and on the same day. This block will allow you to answer questions like:

  • How is my active customer base growing over time?
  • What are the characteristics of my active users?
  • How do changes to my product affect my active customer base?

Ideal Data Types

This block is a great way to analyze event data, which can come from Google Analytics, Segment, Snowplow or a custom event tracking system (like a table that has a row for every user event).

Expected Output

This block helps us see our customers as monthly, weekly, and daily active users (i.e. for each date, the users who have made a purchase in the last 30 days, in the last 7 days, or on that day).

Please note that this is a static data set so we set the filters for specific timeframe.

Explore Data in Full Screen

By joining these data with the user data, we can explore active users by any category for which we have data. For example, here are the weekly active users over the past 90 days cohorted by their signup month:

Explore Data in Full Screen

Try it Yourself

###How It’s Done:
This pattern requires building a date table. The construction of this table can vary based on the SQL dialect being used. In the example below we use Redshift. We have discourse articles for how to create this type of table in MySQL and PostgreSQL as well.

We start by making this table. We are going to take advantage of the fact that we have a large table (here it’s order_items) with more rows than we need for dates and use the window function row_number to generate a series of integers to build our dates table.

Next, we will construct a simple daily_use table that has the user_id and date for every day a user made a purchase. We cross join the table of calendar dates with the daily_use table where the calendar date falls between the usage date and the usage date plus 30 days. This allows us to determine, for each calendar date, if a user has been active up to 30 days prior. We select the user_id, the calendar date, and how many days it’s been since the user’s last activity to create a table that looks like this:

Here we see that user 1 was active on January 7, but for each day after that was not. So each new calendar date increments the days since last action. This table is written back as a Persistent Derived Table (PDT) to increase query performance.

Finally, we add dimensions and measures that allow us to count and drill in on monthly, weekly, and daily active users.

###Try it Out:

explore: active_users {
  join: users {
    type: left_outer
    sql_on: ${active_users.user_id} = ${users.id} ;;
    relationship: many_to_one
  }
}

view: dates {
  derived_table: {
    distribution_style: all
    sortkeys: ["date"]
    sql_trigger_value:  GETDATE() ;;
    sql: -- ## 1) Create a Date table with a row for each date.
      SELECT '2001-01-01'::DATE + d AS date
      FROM
        (SELECT ROW_NUMBER() OVER(ORDER BY id) -1 AS d FROM orders ORDER BY id LIMIT 20000) AS  d
       ;;
  }
}

view: active_users {
  derived_table: {
    sql_trigger_value: GETDATE();;
    distribution: "user_id"
    sortkeys: ["date"]
    sql: WITH daily_use AS (
        -- ## 2 ) Create a table of days and activity by user_id
        SELECT
          user_id
          , DATE_TRUNC('day', created_at) as activity_date
        FROM orders
        GROUP BY 1, 2
      )
      --  ## 3) Cross join activity and dates to build a row for each user/date combo with
      -- days since last activity
      SELECT
            daily_use.user_id
          , wd.date as date
          , MIN(wd.date::date - daily_use.activity_date::date) as days_since_last_action
      FROM ${dates.SQL_TABLE_NAME} AS wd
      LEFT JOIN daily_use
          ON wd.date >= daily_use.activity_date
          AND wd.date < daily_use.activity_date + interval '30 day'
      GROUP BY 1,2
       ;;
  }

  dimension: date {
    type: date
    sql: ${TABLE}.date ;;
  }

  dimension: user_id {
    type: number
    sql: ${TABLE}.user_id ;;
  }

  dimension: days_since_last_action {
    type: number
    sql: ${TABLE}.days_since_last_action ;;
    value_format_name: decimal_0
  }

  dimension: active_this_day {
    type: yesno
    sql: ${days_since_last_action} <  1 ;;
  }

  dimension: active_last_7_days {
    type: yesno
    sql: ${days_since_last_action} < 7 ;;
  }

  measure: user_count_active_30_days {
    label: "Monthly Active Users"
    type: count_distinct
    sql: ${user_id} ;;
    drill_fields: [users.id, users.name]
  }

  measure: user_count_active_this_day {
    label: "Daily Active Users"
    type: count_distinct
    sql: ${user_id} ;;
    drill_fields: [users.id, users.name]

    filters: {
      field: active_this_day
      value: "yes"
    }
  }

  measure: user_count_active_7_days {
    label: "Weekly Active Users"
    type: count_distinct
    sql: ${user_id} ;;
    drill_fields: [users.id, users.name]

    filters: {
      field: active_last_7_days
      value: "yes"
    }
  }
}
9 Likes

Thank you for an excellent post here by @jakeyoos!

If you need it for BigQuery:

view: dates {
  derived_table: {

    sql_trigger_value: SELECT CURRENT_DATE() ;;
    sql: 
      SELECT cast(date as date) as date
FROM UNNEST(GENERATE_DATE_ARRAY(DATE_SUB(CURRENT_DATE, INTERVAL 5 YEAR), CURRENT_DATE)) date
       ;;
  }
}

view: active_users {
  derived_table: {
  sql_trigger_value: SELECT CURRENT_DATE();;

    sql: WITH daily_use AS (
        SELECT
          user_id as user_id
          , cast(TIMESTAMP_TRUNC(transactiondate,day) as date) as activity_date
        FROM users
        GROUP BY 1, 2
      )
      
      SELECT
            daily_use.user_id
          , wd.date as date
          , MIN( DATE_DIFF(wd.date, daily_use.activity_date, day) ) as days_since_last_action
      FROM ${dates.SQL_TABLE_NAME} AS wd
      CROSS JOIN daily_use
        WHERE wd.date BETWEEN daily_use.activity_date AND DATE_ADD(daily_use.activity_date, INTERVAL 30 DAY)
      GROUP BY 1,2
       ;;
  }

dimension_group: date {
    type: time
    timeframes: [date,month,quarter,quarter_of_year,year,raw]
    sql: CAST( ${TABLE}.date AS TIMESTAMP);;
  }

  dimension: user_id {
    type: number
    sql: ${TABLE}.user_id ;;
  }

  dimension: days_since_last_action {
    type: number
    sql: ${TABLE}.days_since_last_action ;;
    value_format_name: decimal_0
  }

  dimension: active_this_day {
    type: yesno
    sql: ${days_since_last_action} <  1 ;;
  }

  dimension: active_last_7_days {
    type: yesno
    sql: ${days_since_last_action} < 7 ;;
  }

  measure: user_count_active_30_days {
    label: "Monthly Active Users"
    type: count_distinct
    sql: ${user_id} ;;
    drill_fields: [users.id, users.name]
  }

  measure: user_count_active_this_day {
    label: "Daily Active Users"
    type: count_distinct
    sql: ${user_id} ;;
    drill_fields: [users.id, users.name]

    filters: {
      field: active_this_day
      value: "yes"
    }
  }

  measure: user_count_active_7_days {
    label: "Weekly Active Users"
    type: count_distinct
    sql: ${user_id} ;;
    drill_fields: [users.id, users.name]

    filters: {
      field: active_last_7_days
      value: "yes"
    }
  }
}
2 Likes

Here is a Snowflake View for DAU, WAU, MAU.

view: event_active_users_pdt {
  derived_table: {
    sql:
      WITH dau AS (
      SELECT TO_DATE(CREATED_AT) as event_date, COUNT(DISTINCT USER_ID) AS dau
      FROM SCHEMA.EVENTS
      WHERE -- <.. filter criteria...>
      GROUP BY event_date
     )
    SELECT event_date, dau,
             (SELECT COUNT(DISTINCT USER_ID)
              FROM SCHEMA.EVENTS
              WHERE -- <... filter criteria...>
              AND TO_DATE(CREATED_AT) BETWEEN DATEADD(DAY, -29, dau.event_date) AND dau.event_date
             ) AS mau,
             (SELECT COUNT(DISTINCT USER_ID)
              FROM SCHEMA.EVENTS
              WHERE -- <... filter criteria...>
              AND TO_DATE(CREATED_AT) BETWEEN DATEADD(DAY, -7, dau.event_date) AND dau.event_date
             ) AS wau
    FROM dau
    ;;
    datagroup_trigger: events_datagroup
  }

  dimension: event_active_users_pk {
    type: string
    primary_key: yes
    hidden: yes
    sql: TO_VARCHAR(${event_date}, 'YYYYMMDD');;
  }

  dimension_group: event {
    type: time
    timeframes: [
      raw,
      date,
      day_of_week,
      day_of_month,
      day_of_year,
      week,
      week_of_year,
      month,
      quarter,
      year
    ]
    sql: ${TABLE}."EVENT_DATE" ;;
  }

  dimension: dau {
    label: "DAU"
    type: number
    sql: ${TABLE}."DAU" ;;
  }

  dimension: mau {
    label: "MAU"
    type: number
    sql: ${TABLE}."MAU" ;;
  }

  dimension: wau {
    label: "WAU"
    type: number
    sql: ${TABLE}."WAU" ;;
  }

  dimension: dau_mau {
    label: "DAU/MAU"
    type: number
    value_format_name: decimal_2
    sql: 1.0 * (${dau} / nullif(${mau}, 0)) ;;
  }

  dimension: wau_mau {
    label: "WAU/MAU"
    type: number
    value_format_name: decimal_2
    sql: 1.0 * (${wau} / nullif(${mau}, 0)) ;;
  }

  dimension: dau_wau {
    label: "DAU/MAU"
    type: number
    value_format_name: decimal_2
    sql: 1.0 * (${dau} / nullif(${wau}, 0)) ;;
  }

  measure: count {
    label: "Number of Records"
    type: count
    drill_fields: [detail*]
  }

  measure: average_dau_mau {
    label: "Average DAU/MAU"
    type: average
    sql: ${dau_mau} ;;
    value_format_name: decimal_2
    drill_fields: [detail*]
  }

  measure: average_wau_mau {
    label: "Average WAU/MAU"
    type: average
    sql: ${wau_mau} ;;
    value_format_name: decimal_2
    drill_fields: [detail*]
  }

  measure: average_dau_wau {
    label: "Average DAU/WAU"
    type: average
    sql: ${dau_wau} ;;
    value_format_name: decimal_2
    drill_fields: [detail*]
  }

  measure: min_dau_mau {
    label: "Min DAU/MAU"
    type: min
    sql: ${dau_mau} ;;
    value_format_name: decimal_2
    drill_fields: [detail*]
  }

  measure: min_wau_mau {
    label: "Min WAU/MAU"
    type: min
    sql: ${wau_mau} ;;
    value_format_name: decimal_2
    drill_fields: [detail*]
  }

  measure: min_dau_wau {
    label: "Min DAU/WAU"
    type: min
    sql: ${dau_wau} ;;
    value_format_name: decimal_2
    drill_fields: [detail*]
  }

  measure: max_dau_mau {
    label: "Max DAU/MAU"
    type: max
    sql: ${dau_mau} ;;
    value_format_name: decimal_2
    drill_fields: [detail*]
  }

  measure: max_wau_mau {
    label: "Max WAU/MAU"
    type: max
    sql: ${wau_mau} ;;
    value_format_name: decimal_2
    drill_fields: [detail*]
  }

  measure: max_dau_wau {
    label: "Max DAU/WAU"
    type: max
    sql: ${dau_wau} ;;
    value_format_name: decimal_2
    drill_fields: [detail*]
  }

  set: detail {
    fields: [event_date, dau, mau, wau]
  }
}

2 Likes

Thanks @JeffH for the Snowflake implementation. Could you please assist me with a purely Oracle SQL way of accomplishing the same. We use Oracle SQL at work and if I can crack this logic, it will make my work so much easier.

It may not even be a full implementation; even good pointers in the right direction would be a great help. In fact, if you have ever come across a purely Oracle SQL implementation, a link would be helpful. I have scoured StackOverflow for months and I haven’t found the right logic.