Arbitrary Period Comparisons

time
lookml

(ernesto ongaro) #1

Google Analytics has a date selector that allows for comparison of two arbitrary time periods on the same chart. Period over Period analysis is useful for looking at this 30 days vs the previous 30 days, but it doesn’t let you compare how two arbitrary periods might compare.

For example: What did registration for next week’s webinar look like in comparison to a similar one we did 3 months ago?

Arbitrary time period analysis in Google Analytics

One way to do this in Looker is to use liquid variables to create an explore that looks like this:

There’s a few things going on here:

  1. First Period Filter & Second Period Filter: View filters to take input from the users
  2. Days from First Period: The magic to make this possible; conditional logic to find the lowest non-negative value from each period.
  3. Period Selected: used to pivot the chart and create the First Period and Second Period series

And here’s the LookML to put it together:


filter: first_period_filter {
    group_label: "Arbitrary Period Comparisons"
    type: date
  }
  
  filter: second_period_filter {
    group_label: "Arbitrary Period Comparisons"
    type: date
  }
  
  dimension: days_from_start_first {
    hidden: yes
    type: number
    sql: DATEDIFF('day',  {% date_start first_period_filter %}, ${created_date}) ;;
  }
  
  dimension: days_from_start_second {
    hidden: yes
    type: number
    sql: DATEDIFF('day',  {% date_start second_period_filter %}, ${created_date}) ;;
  }
  
  dimension: days_from_first_period {
    group_label: "Arbitrary Period Comparisons"
    type: number
    sql:
      CASE
       WHEN ${days_from_start_second} >= 0
       THEN ${days_from_start_second}
       WHEN ${days_from_start_first} >= 0
       THEN ${days_from_start_first}
      END;;
  }
  
  
  dimension: period_selected {
    group_label: "Arbitrary Period Comparisons"
    type: string
    sql:
        CASE
          WHEN ${created_raw} >=  {% date_start first_period_filter %}
          AND ${created_raw} <= {% date_end first_period_filter %}
          THEN 'First Period'
          WHEN ${created_raw} >=  {% date_start second_period_filter %}
          AND ${created_raw} <= {% date_end second_period_filter %}
          THEN 'Second Period'
          END ;;
  }


Dashboards with 'compare to' date filters
(Ben) #2

Hi @ernesto

I am looking to implement exactly this functionality but having some problems copying over your code.

I am getting the error:

Failed to retrieve data - A valid date part argument is required, but found mydata.Date at [13:77]

I replaced ‘created_date’ in your code with ‘session_date’ from my data, where sessions_date is a time dimension group. Any idea where I am going wrong? Does this data field need to be a specific type?

Ben


(ernesto ongaro) #3

Hey @Ben! (where’s Holly btw?)

Which database? I wrote this for Redshift which expects DATEDIFF ( datepart, {timestamp}, {timestamp} ) but some databases the datepart is at the end.
`


(Ben) #4

Hi @ernesto

Aha, of course! I have updated the code BigQuery and shared below. However, when I run this my results aren’t looking quite right - how do you get the data to align (row wise) on days from first period as this is different for each date range?

Code for BigQuery:

  filter: first_period_filter {
    group_label: "Arbitrary Period Comparisons"
    type: date
  }

  filter: second_period_filter {
    group_label: "Arbitrary Period Comparisons"
    type: date
  }

dimension: days_from_start_first {
hidden: yes
type: number
sql: DATE_DIFF( ${session_date}, CAST({% date_start first_period_filter %} AS DATE), DAY) ;;
  }

  dimension: days_from_start_second {
hidden: yes
type: number
sql: DATE_DIFF(${session_date}, CAST({% date_start second_period_filter %} AS DATE), DAY) ;;
  } 

    dimension: days_from_first_period {
type: number
sql:
  CASE
   WHEN ${days_from_start_first} >= 0
   THEN ${days_from_start_first}
   WHEN ${days_from_start_second} >= 0
   THEN ${days_from_start_second}
  END;;

}

  dimension: period_selected {
    group_label: "Arbitrary Period Comparisons"
    type: string
    sql:
        CASE
          WHEN ${session_raw} >=  {% date_start first_period_filter %}
          AND ${session_raw} <= {% date_end first_period_filter %}
          THEN 'First Period'
          WHEN ${session_raw} >=  {% date_start second_period_filter %}
          AND ${session_raw} <= {% date_end second_period_filter %}
          THEN 'Second Period'
          END ;;
  }

(ernesto ongaro) #5

@Ben- thanks for sharing a working BQ version, very cool!

I think all you’re missing is to turn off Plot Null Values


(Ben) #6

@ernesto I tried that, but I still have a big gap between my date periods, rather than then stacking on top of one another like in your example. Here’s a screenshot of my data, where I don’t have any alignment in pivoted columns, because they all have different Days from first period, does this make sense? :slight_smile:

Ben


(ernesto ongaro) #7

Could you try reversing the logic in days_from_first_period ?

  dimension: days_from_first_period {
    type: number
    sql:
      CASE
       WHEN ${days_from_start_first} >= 0
       THEN ${days_from_start_first}
       WHEN ${days_from_start_second} >= 0
       THEN ${days_from_start_second}
      END;;
  }

I think the other dialect intricacy is that date diff arguments for start/end are swapped in BigQuery vs Redshift. So you can do the above or flip arguments.


(Ben) #8

Hi @ernesto

I tried this and the results are the same. The BigQuery docs say date_diff:

Returns the number of date_part boundaries between the two date_expression s. If the first date occurs before the second date, then the result is non-positive.

I am wondering if I need to change the code below too?

  dimension: days_from_start_second {
    hidden: yes
    type: number
    sql: DATE_DIFF(CAST({% date_start second_period_filter %} AS DATE), ${session_date}, DAY) ;;
  }

What is ${session_date} (my date field) doing here? Should this be today’s date, or something else?

Thanks for your help with this!
Ben


(ernesto ongaro) #9

Hi @Ben,

I recreated in BigQuery and you will need to swap your arguments for DATE_DIFF around - no need to swap the days_from_first_period around after you’ve done this.

dimension: days_from_start_first {
    hidden: yes
    type: number
    sql: DATE_DIFF( ${session_date}, CAST({% date_start first_period_filter %} AS DATE), DAY) ;;
  }

  dimension: days_from_start_second {
    hidden: yes
    type: number
    sql: DATE_DIFF(${session_date}, CAST({% date_start second_period_filter %} AS DATE), DAY) ;;
  } 

(Ben) #10

Amazing! Thanks again for your help getting this working with BigQuery - Holly will be delighted! :fairy: