[Analytic Block] Sessionization Funnel

low_priority
done
reply
(Anicia Santos) #1

Please note that this Funnel Block provides funnel analysis for sessionized event data. For funnel analysis based on transaction data, please see our Simple Funnel Block.

About This Block

Once your event data is sessionized, a simple funnel explorer Block can help you visualize custom flows through your site. This Block’s flexibility is highly useful when you want to compare analogous flows to see which one usually leads to better conversion or retention on the site. For example:

  • If a user can navigate to both pages B and C from page A, which page leads to more users visiting and making a purchase on page D?
  • If page B leads to more conversions on page D, we should make page B easily accessible from page A.

Ideal Data Types

As noted above, this block is built sessionized event data. Please see this Looker Block to get your event data sessionized if you have not already. The event_id field is assumed to be sequential (i.e. higher event_id implies it occurred later). If not, the raw event times should be used and time based dimension groups should be used.

Expected Output

By allowing users to filter event sequences with custom events, we give them the power to look at variable funnel flows. It’s highly unlikely that a user can ONLY go to one other page from a given landing page, so the funnel explorer is as flexible as our pages themselves.

In the example below, we evaluate the flow from a Category page to a Brand page to a Checkout page.

Explore Data in Full Screen

And what makes this analysis with Looker so powerful, is that any of those three pages can easily be substituted out in the filters to look at a different custom flow. As always, each data aggregation or point on a visualization can be drilled into, down to the individual event or session level.

We can also look at a custom 2-page sequence over time as a monitor. If we see a spike in the number of users going from our checkout page to cancel rather than converting, that can be an early indicator of a bug!

Explore Data in Full Screen

Try it Out!

The block below was created using Redshift. The window functions used to create sessionization using the Sessionization Block pattern are not compatible with MySQL databases. For a pattern compatible with MySQL, try substituting in subqueries for the window functions.

We only need to create one view for this block that we will call ‘Funnel Explorer’.

The view also has a templated filter based on event date. We can set a default value in the model under the explore.


# include all views in this project
include: "*.view"

# include all dashboards in this project
include: "*.dashboard"

################################################################
# Includes Sessionization Block and Sessionization Funnel Block
################################################################

explore: events_sessionized {
  label: "Web Session Data"
  persist_for: "24 hours"

  join: sessions {
    relationship: many_to_one
    type: left_outer
    sql_on: ${events_sessionized.unique_session_id} = ${sessions.unique_session_id} ;;
  }

  join: session_facts {
    relationship: many_to_one
    type: inner
    view_label: "Sessions"
    sql_on: ${sessions.unique_session_id} = ${session_facts.unique_session_id} ;;
  }
}

explore: funnel_explorer {
  persist_for: "24 hours"

  always_filter: {
    filters: {
      field: event_time
      value: "30 days"
    }
  }
}

################################################################
# Sessions View
################################################################

view: sessions {
  derived_table: {
    sql_trigger_value: SELECT DATE(CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', GETDATE())) ;;
    distribution: "user_id"
    sortkeys: ["session_start"]
    sql: WITH lag AS
        (SELECT
                  logs.created_at AS created_at 
                , logs.user_id AS user_id
                , logs.ip_address AS ip_address
                , DATEDIFF(
                    minute,
                    LAG(logs.created_at) OVER ( PARTITION BY logs.user_id, logs.ip_address ORDER BY logs.created_at)
                  , logs.created_at) AS idle_time
              FROM public.events as logs
              WHERE ((logs.created_at) >= (DATEADD(day,-59, DATE_TRUNC('day',GETDATE()) ))
                    AND (logs.created_at) < (DATEADD(day,60, DATEADD(day,-59, DATE_TRUNC('day',GETDATE()) ) ))) -- optional limit of events table to only past 60 days
              )
        SELECT
          lag.created_at AS session_start
          , lag.idle_time AS idle_time
          , lag.user_id AS user_id
          , lag.ip_address AS ip_address
          , ROW_NUMBER () OVER (ORDER BY lag.created_at) AS unique_session_id
          , ROW_NUMBER () OVER (PARTITION BY COALESCE(lag.user_id::varchar, lag.ip_address) ORDER BY lag.created_at) AS session_sequence
          , COALESCE(
                LEAD(lag.created_at) OVER (PARTITION BY lag.user_id, lag.ip_address ORDER BY lag.created_at)
              , '6000-01-01') AS next_session_start
        FROM lag
        WHERE (lag.idle_time > 60 OR lag.idle_time IS NULL)  -- session threshold (currently set at 60 minutes)
       ;;
  }

  measure: count {
    type: count
    drill_fields: [detail*]
  }

  dimension_group: session_start_at {
    type: time
    hidden: yes
    convert_tz: no
    timeframes: [time, date, week, month]
    sql: ${TABLE}.session_start ;;
  }

  dimension: idle_time {
    type: number
    value_format: "0"
    sql: ${TABLE}.idle_time ;;
  }

  dimension: unique_session_id {
    type: number
    value_format_name: id
    primary_key: yes
    sql: ${TABLE}.unique_session_id ;;
  }

  dimension: session_sequence {
    type: number
    value_format_name: id
    sql: ${TABLE}.session_sequence ;;
  }

  dimension_group: next_session_start_at {
    type: time
    convert_tz: no
    timeframes: [time, date, week, month]
    sql: ${TABLE}.next_session_start ;;
  }

  measure: count_distinct_sessions {
    type: count_distinct
    sql: ${unique_session_id} ;;
  }

  set: detail {
    fields: [
      session_start_at_time,
      idle_time,
      unique_session_id,
      session_sequence,
      next_session_start_at_time
    ]
  }
}

################################################################
# Events View
################################################################

view: events_sessionized {
  view_label: "Events"

  derived_table: {
    sql_trigger_value: SELECT DATE(CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', GETDATE())) ;;
    distribution: "event_id"
    sortkeys: ["created_at"]
    sql: SELECT
      ROW_NUMBER() OVER (ORDER BY log.created_at) AS event_id
    , log.ip_address
    , log.user_id
    , log.os
    , log.uri
    , log.event_type
    , log.browser
    , log.traffic_source
    , log.created_at
    , sessions.unique_session_id
    , ROW_NUMBER () OVER (PARTITION BY unique_session_id ORDER BY log.created_at) AS event_sequence_within_session
    , ROW_NUMBER () OVER (PARTITION BY unique_session_id ORDER BY log.created_at desc) AS inverse_event_sequence_within_session
FROM public.events AS log
INNER JOIN ${sessions.SQL_TABLE_NAME} AS sessions
  ON log.user_id = sessions.user_id
  AND log.ip_address = sessions.ip_address
  AND log.created_at >= sessions.session_start
  AND log.created_at < sessions.next_session_start
WHERE
  ((log.created_at) >= (DATEADD(day,-59, DATE_TRUNC('day',GETDATE()) ))  AND (log.created_at) < (DATEADD(day,60, DATEADD(day,-59, DATE_TRUNC('day',GETDATE()) ) )))
 ;;
  }

  measure: count {
    type: count
    drill_fields: [detail*]
  }

  dimension: event_id {
    primary_key: yes
    type: number
    value_format_name: id
    sql: ${TABLE}.event_id ;;
  }

  dimension: user_id {
    type: number
    sql: ${TABLE}.user_id ;;
  }

  dimension: unique_session_id {
    type: number
    value_format_name: id
    hidden: yes
    sql: ${TABLE}.unique_session_id ;;
  }

  dimension: page_name {
    type: string
    sql: ${TABLE}.uri ;;
  }

  dimension: event_type {
    type: string
    sql: ${TABLE}.event_type ;;
  }

  dimension: traffic_source {
    type: string
    sql: ${TABLE}.traffic_source ;;
  }

  dimension: event_sequence_within_session {
    type: number
    value_format_name: id
    sql: ${TABLE}.event_sequence_within_session ;;
  }

  dimension: inverse_event_sequence_within_session {
    type: number
    value_format_name: id
    sql: ${TABLE}.inverse_event_sequence_within_session ;;
  }

  set: detail {
    fields: [
      event_id,
      #ip_address,
      user_id,
      #os,
      traffic_source,
      #event_time_time,
      unique_session_id,
      event_sequence_within_session,
      inverse_event_sequence_within_session,
      #user_first_session_time,
      #session_landing_page,
      #session_exit_page
    ]
  }
}

################################################################
# Session Facts View
################################################################

view: session_facts {
  derived_table: {
    sql_trigger_value: SELECT DATE(CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', GETDATE())) ;;
    distribution: "unique_session_id"
    sortkeys: ["session_start"]
    sql: WITH session_facts AS
        (
          SELECT
             unique_session_id
            , logs_with_session_info.created_at
            , user_id
            , ip_address
            , uri
            , event_id
            , event_type
            , COALESCE(user_id::varchar, ip_address) as identifier
            , FIRST_VALUE (created_at) OVER (PARTITION BY unique_session_id ORDER BY created_at ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS session_start
            , LAST_VALUE (created_at) OVER (PARTITION BY unique_session_id ORDER BY created_at ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS session_end
            , FIRST_VALUE (event_type) OVER (PARTITION BY unique_session_id ORDER BY created_at ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS session_landing_page
            , LAST_VALUE  (event_type) OVER (PARTITION BY unique_session_id ORDER BY created_at ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS session_exit_page
          FROM
              ${events_sessionized.SQL_TABLE_NAME} AS logs_with_session_info
          GROUP BY 1,2,3,4,5,6, 7
          ORDER BY unique_session_id asc
        )
      SELECT
        session_facts.unique_session_id
        , session_facts.identifier
        , session_facts.session_start
        , session_facts.session_end
        , session_landing_page
        , session_exit_page
        , ROW_NUMBER () OVER (PARTITION BY session_facts.identifier ORDER BY MIN(session_start)) AS session_sequence_for_user
        , ROW_NUMBER () OVER (PARTITION BY session_facts.identifier ORDER BY MIN(session_start) desc) AS inverse_session_sequence_for_user
        , count(1) as events_in_session
      FROM session_facts
      INNER JOIN
        ${events_sessionized.SQL_TABLE_NAME} AS logs_with_session_info
      ON
        logs_with_session_info.created_at = session_facts.session_start
        AND logs_with_session_info.unique_session_id = session_facts.unique_session_id
      GROUP BY 1,2,3,4,5,6
      ORDER BY session_start asc
       ;;
  }

  dimension: unique_session_id {
    hidden: yes
    primary_key: yes
    type: number
    value_format_name: id
    sql: ${TABLE}.unique_session_id ;;
  }

  dimension_group: session_start_at {
    type: time
    convert_tz: no
    timeframes: [time, date, week, month]
    sql: ${TABLE}.session_start ;;
  }

  dimension_group: session_end_at {
    type: time
    convert_tz: no
    timeframes: [time, date, week, month]
    sql: ${TABLE}.session_end ;;
  }

  dimension: session_sequence_for_user {
    type: number
    sql: ${TABLE}.session_sequence_for_user ;;
  }

  dimension: inverse_session_sequence_for_user {
    type: number
    sql: ${TABLE}.inverse_session_sequence_for_user ;;
  }

  dimension: number_of_events_in_session {
    type: number
    sql: ${TABLE}.events_in_session ;;
  }

  dimension: session_landing_page {
    type: string
    sql: ${TABLE}.session_landing_page ;;
  }

  dimension: session_exit_page {
    type: string
    sql: ${TABLE}.session_exit_page ;;
  }

  dimension: session_length_seconds {
    type: number
    sql: DATEDIFF('sec', ${TABLE}.session_start, ${TABLE}.session_end) ;;
  }

  dimension: session_length_seconds_tier {
    type: tier
    tiers: [
      0,
      15,
      30,
      45,
      60,
      75,
      100
    ]
    sql: ${session_length_seconds} ;;
  }

  measure: average_session_length_seconds {
    type: average
    sql: ${session_length_seconds} ;;
  }

  measure: session_facts_count {
    type: count
    drill_fields: [detail*]
  }

  set: detail {
    fields: [
      unique_session_id,
      session_start_at_time,
      session_end_at_time,
      session_sequence_for_user,
      inverse_session_sequence_for_user,
      number_of_events_in_session,
      session_landing_page,
      session_exit_page
    ]
  }
}


################################################################
# Funnel Explorer View
################################################################

view: funnel_explorer {
  # In this query, we retrieve, for each session, the first and last instance of each event in our sequence. If,
  # for each event, its first instance occurs before the last instance of the next event in the sequence, then
  # that is considered a completion of the sequence.
  derived_table: {
    sql: SELECT sessions.unique_session_id as unique_session_id
        , events_sessionized.user_id
        , sessions.session_start AS session_start
        , MIN(
            CASE WHEN
              {% condition event_1 %} events_sessionized.event_type {% endcondition %}
              THEN events_sessionized.created_at
              ELSE NULL END
            ) AS event_1
        , MIN(
            CASE WHEN
              {% condition event_2 %} events_sessionized.event_type {% endcondition %}
              THEN events_sessionized.created_at
              ELSE NULL END
            ) AS event_2_first
        , MAX(
            CASE WHEN
              {% condition event_2 %} events_sessionized.event_type {% endcondition %}
              THEN events_sessionized.created_at
              ELSE NULL END
            ) AS event_2_last
        , MIN(
            CASE WHEN
              {% condition event_3 %} events_sessionized.event_type {% endcondition %}
              THEN events_sessionized.created_at
              ELSE NULL END
            ) AS event_3_first
        , MAX(
            CASE WHEN
              {% condition event_3 %} events_sessionized.event_type {% endcondition %}
              THEN events_sessionized.created_at
              ELSE NULL END
            ) AS event_3_last
        , MIN(
            CASE WHEN
              {% condition event_4 %} events_sessionized.event_type {% endcondition %}
              THEN events_sessionized.created_at
              ELSE NULL END
            ) AS event_4_first
          , MAX(
            CASE WHEN
              {% condition event_4 %} events_sessionized.event_type {% endcondition %}
              THEN events_sessionized.created_at
              ELSE NULL END
            ) AS event_4_last
      FROM ${events_sessionized.SQL_TABLE_NAME} AS events_sessionized
      LEFT JOIN ${sessions.SQL_TABLE_NAME} AS sessions
        ON events_sessionized.unique_session_id = sessions.unique_session_id
      WHERE {% condition event_time %} created_at {% endcondition %}
      GROUP BY 1,2,3
       ;;
  }

  filter: event_1 {
    suggest_dimension: events_sessionized.event_type
    suggest_explore: events_sessionized
  }

  filter: event_2 {
    suggest_dimension: events_sessionized.event_type
    suggest_explore: events_sessionized
  }

  filter: event_3 {
    suggest_dimension: events_sessionized.event_type
    suggest_explore: events_sessionized
  }

  filter: event_4 {
    suggest_dimension: events_sessionized.event_type
    suggest_explore: events_sessionized
  }

  filter: event_time {
    type: date_time
  }

  dimension: unique_session_id {
    type: string
    primary_key: yes
    #     hidden: TRUE
    sql: ${TABLE}.unique_session_id ;;
  }

  dimension: user_id {
    type: number
    #     hidden: TRUE
    sql: ${TABLE}.user_id ;;
  }

  dimension_group: session_start {
    type: time
    #     hidden: TRUE
    convert_tz: no
    timeframes: [
      time,
      date,
      week,
      month,
      year,
      raw
    ]
    sql: ${TABLE}.session_start ;;
  }

  dimension_group: event_1 {
    description: "First occurrence of event 1"
    type: time
    convert_tz: no
    timeframes: [time]
    hidden: yes
    sql: ${TABLE}.event_1 ;;
  }

  dimension_group: event_2_first {
    description: "First occurrence of event 2"
    type: time
    convert_tz: no
    timeframes: [time]
    hidden: yes
    sql: ${TABLE}.event_2_first ;;
  }

  dimension_group: event_2_last {
    description: "Last occurrence of event 2"
    type: time
    convert_tz: no
    timeframes: [time]
    hidden: yes
    sql: ${TABLE}.event_2_last ;;
  }

  dimension_group: event_3_first {
    description: "First occurrence of event 3"
    type: time
    convert_tz: no
    timeframes: [time]
    hidden: yes
    sql: ${TABLE}.event_3_first ;;
  }

  dimension_group: event_3_last {
    description: "Last occurrence of event 3"
    type: time
    convert_tz: no
    timeframes: [time]
    hidden: yes
    sql: ${TABLE}.event_3_last ;;
  }

  dimension_group: event_4_first {
    description: "First occurrence of event 4"
    type: time
    convert_tz: no
    timeframes: [time]
    hidden: yes
    sql: ${TABLE}.event_4_first ;;
  }

  dimension_group: event_4_last {
    description: "Last occurrence of event 4"
    type: time
    convert_tz: no
    timeframes: [time]
    hidden: yes
    sql: ${TABLE}.event_4_last ;;
  }

  dimension: event1_before_event2 {
    type: yesno
    hidden: yes
    sql: ${TABLE}.event_1 < ${TABLE}.event_2_last ;;
  }

  dimension: event1_before_event3 {
    type: yesno
    hidden: yes
    sql: ${TABLE}.event_1 < ${TABLE}.event_3_last ;;
  }

  dimension: event1_before_event4 {
    type: yesno
    hidden: yes
    sql: ${TABLE}.event_1 < ${TABLE}.event_4_last ;;
  }

  dimension: event2_before_event3 {
    type: yesno
    hidden: yes
    sql: ${TABLE}.event_2_first < ${TABLE}.event_3_last ;;
  }

  dimension: event2_before_event4 {
    type: yesno
    hidden: yes
    sql: ${TABLE}.event_2_first < ${TABLE}.event_4_last ;;
  }

  dimension: event3_before_event4 {
    type: yesno
    hidden: yes
    sql: ${TABLE}.event_3_first < ${TABLE}.event_4_last ;;
  }

  dimension: reached_event_1 {
    hidden: yes
    type: yesno
    sql: (${event_1_time} IS NOT NULL)
      ;;
  }

  dimension: reached_event_2 {
    hidden: yes
    type: yesno
    sql: (${event_1_time} IS NOT NULL AND ${event_2_first_time} IS NOT NULL AND ${event_1_time} < ${event_2_last_time})
      ;;
  }

  dimension: reached_event_3 {
    hidden: yes
    type: yesno
    sql: (${event_1_time} IS NOT NULL AND ${event_2_last_time} IS NOT NULL AND ${event_3_last_time}  IS NOT NULL
      AND ${event_1_time} < ${event_2_last_time} AND ${event_1_time} < ${event_3_last_time} AND ${event_2_first_time} < ${event_3_last_time})
       ;;
  }

  dimension: reached_event_4 {
    hidden: yes
    type: yesno
    sql: (${event_1_time} IS NOT NULL AND ${event_2_last_time} IS NOT NULL AND ${event_3_last_time}  IS NOT NULL AND ${event_4_last_time} IS NOT NULL
      AND ${event_1_time} < ${event_2_last_time} AND ${event_1_time} < ${event_3_last_time} AND ${event_1_time} < ${event_4_last_time} AND ${event_2_first_time} < ${event_3_last_time} AND ${event_2_first_time} < ${event_4_last_time} AND ${event_3_first_time} < ${event_4_last_time})
 ;;
  }

  dimension: furthest_step {
    label: "Furthest Funnel Step Reached"

    case: {
      when: {
        sql: ${reached_event_4} = true ;;
        label: "4th"
      }

      when: {
        sql: ${reached_event_3} = true ;;
        label: "3rd"
      }

      when: {
        sql: ${reached_event_2} = true ;;
        label: "2nd"
      }

      when: {
        sql: ${reached_event_1} = true ;;
        label: "1st"
      }

      else: "no"
    }
  }

  measure: count_sessions {
    type: count_distinct
    drill_fields: [detail*]
    sql: ${unique_session_id} ;;
  }

  measure: count_sessions_event1 {
    label: "Event 1"
    type: count_distinct
    sql: ${unique_session_id} ;;
    drill_fields: [detail*]

    filters: {
      field: furthest_step
      value: "1st,2nd,3rd,4th"
    }
  }

  measure: count_sessions_event12 {
    label: "Event 2"
    description: "Only includes sessions which also completed event 1"
    type: count_distinct
    sql: ${unique_session_id} ;;
    drill_fields: [detail*]

    filters: {
      field: furthest_step
      value: "2nd,3rd,4th"
    }
  }

  measure: count_sessions_event123 {
    label: "Event 3"
    description: "Only includes sessions which also completed events 1 and 2"
    type: count_distinct
    sql: ${unique_session_id} ;;
    drill_fields: [detail*]

    filters: {
      field: furthest_step
      value: "3rd, 4th"
    }
  }

  measure: count_sessions_event1234 {
    label: "Event 4"
    description: "Only includes sessions which also completed events 1, 2 and 3"
    type: count_distinct
    sql: ${unique_session_id} ;;
    drill_fields: [detail*]

    filters: {
      field: furthest_step
      value: "4th"
    }
  }

  set: detail {
    fields: [unique_session_id, user_id, session_start_time]
  }
}


Once you save your model changes, don’t forget to run the LookML validator and test out a few explorations before pushing your changes to production!

4 Likes

JOIN 2017 - Deep Dive - Event Sessionization
(Akhill Chopra) #2

This is incredibly helpful @anicia!

Implemented this last night (with some much-needed help from @kenny) using event timestamps instead of IDs (as we’re UNION’ing different sources of events thereby rendering event_id boolean comparisons moot), event_names instead of page_path, and joined in a session_facts table to look by device, utm_ parameters etc…

This has massively enhanced an internal debate around funnel drop offs and analytics in general and is directly translating to fresh eyes on a potentially very valuable opportunity.

Many many thanks!! Awesome work!

4 Likes

(Anicia Santos) #3

Thanks Akhill!

This was a lot of fun to build and the different ways event data is collected makes the implementation a little tricky. Obviously the more data you can collect around a user’s session makes the investigation that much more interesting. I am very glad to hear @kenny was able to help you get that set up!

0 Likes

(Menashe Hamm) #4

Nice block. Note though that it counts a user as having reached event 3 even if his sequence of events was event 2, event 1, event 3, and only then event 2 again.

0 Likes

(Bharat Ayyar) #5

The example links no longer appear to work!

0 Likes

(Izzy) #6

Which ones, Bharat? I just tested them out and they seemed to load alright.

0 Likes

(Bharat Ayyar) #7

Hmmm… that’s odd.
Well, if I normally click on the blocks I get an error and the page doesn’t load.

This page isn’t working
**blocks.looker.com**  didn’t send any data.
(ERR_EMPTY_RESPONSE) 

However, if I Command + Click and open in a new tab, I’m able to view the Look. That’s a new one! Thanks, Izzy.

0 Likes

(Izzy) #8

You’re right! Those URLs were weird… I replaced them with more normal looking ones and I think it works alright now. Thanks for flagging that, Bharat.

0 Likes