Month to Date and Year to Date Analysis


(Zachary Michel) #1

Month-to-Date (MTD) and Year-to-Date (YTD) analyses are useful when conducting timeframe-based comparisons.

For example, while you are part of the way through a month, you may want to compare your progress with earlier months. However, it would not be useful to compare part of the current month with the entirety of earlier months. Instead, it would make more sense to only look at the days in earlier months that have already transpired in the current month.

The best way to execute this type of analysis in Looker is to create a type: yesno dimension that allows you to limit the query to days that have already passed in the given time frame. Getting this to work properly in a variety of scenarios takes some clever SQL and Looker usage.

MTD and YTD Analysis in Looker

Let’s assume that the dimension group on which we want to perform these analyses looks like this:

- dimension_group: created
  type: time
  timeframes: [time, date, week, year]
  sql: ${TABLE}.created_at

Our goal is to create a dimension of type: yesno that will return yes if the date is anytime before today, and no otherwise. We will have to base our dimension on the time frame in which we are interested. Let’s consider MTD to begin with, then we can easily convert this to YTD with some simple replacements.

MySQL / Postgres is_before_mtd

- dimension: is_before_mtd
  type: yesno
  sql: |
    (EXTRACT(DAY FROM ${created_time}) < EXTRACT(DAY FROM CURRENT_TIMESTAMP)
      OR
      (
        EXTRACT(DAY FROM ${created_time}) = EXTRACT(DAY FROM CURRENT_TIMESTAMP) AND
        EXTRACT(HOUR FROM ${created_time}) < EXTRACT(HOUR FROM CURRENT_TIMESTAMP)
      )
      OR
      (
        EXTRACT(DAY FROM ${created_time}) = EXTRACT(DAY FROM CURRENT_TIMESTAMP) AND
        EXTRACT(HOUR FROM ${created_time}) <= EXTRACT(HOUR FROM CURRENT_TIMESTAMP) AND
        EXTRACT(MINUTE FROM ${created_time}) < EXTRACT(MINUTE FROM CURRENT_TIMESTAMP)
      )
    )

Redshift is_before_mtd

- dimension: is_before_mtd
  type: yesno
  sql: |
    (EXTRACT(DAY FROM ${TABLE}.created_at) < EXTRACT(DAY FROM GETDATE())
      OR
      (
        EXTRACT(DAY FROM ${TABLE}.created_at) = EXTRACT(DAY FROM GETDATE()) AND
        EXTRACT(HOUR FROM ${TABLE}.created_at) < EXTRACT(HOUR FROM GETDATE())
      )
      OR
      (
        EXTRACT(DAY FROM ${TABLE}.created_at) = EXTRACT(DAY FROM GETDATE()) AND
        EXTRACT(HOUR FROM ${TABLE}.created_at) <= EXTRACT(HOUR FROM GETDATE()) AND
        EXTRACT(MINUTE FROM ${TABLE}.created_at) < EXTRACT(MINUTE FROM GETDATE())
      )
    )

MS SQL is_before_mtd

- dimension: is_before_mtd
  type: yesno
  sql: |
    (DATEPART(DAY, ${created_time}) < DATEPART(DAY, CURRENT_TIMESTAMP)
      OR
      (
        DATEPART(DAY, ${created_time}) = DATEPART(DAY, CURRENT_TIMESTAMP) AND
        DATEPART(HOUR, ${created_time}) < DATEPART(HOUR, CURRENT_TIMESTAMP)
      )
      OR
      (
        DATEPART(DAY, ${created_time}) = DATEPART(DAY, CURRENT_TIMESTAMP) AND
        DATEPART(HOUR, ${created_time}) <= DATEPART(HOUR, CURRENT_TIMESTAMP) AND
        DATEPART(MINUTE, ${created_time}) < DATEPART(MINUTE, CURRENT_TIMESTAMP)
      )
    )

##Adjusting for YTD

To do the analysis on YTD instead of MTD, we simply have to extract a different part of the date dimension. Depending on which dialect we’re in, we may need to change syntax.

Replacements by Dialect

# MySQL 
# from:
EXTRACT(DAY FROM [date])
# to: 
DAYOFYEAR([date])

#Postgres / Redshift
# from
EXTRACT(DAY FROM [date])
# to
EXTRACT(DOY FROM [date])

#MS SQL
# from
DATEPART(DAY , [date])
# to
DATEPART(dayofyear , [date])

Creating the Query

To execute the YTD or MTD analysis, simply select your Is Before YTD/MTD dimension as a filter and set it to yes.

Example

In this example we’re using:

  • ORDERS Created Month as a dimension
  • ORDERS Count, ORDERS Total Revenue, and ORDERS New Customer Revenue as measures
  • ORDERS Is Before Mtd = Yes as a filter

Now we can get an apples-to-apples comparison of these metrics between the current month and previous months.

MTD Total Order Comparison


[Analytic Block] Dynamic Previous Period Analysis using date_start, date_end
YTD LY comparison
(Vaite Leprince) #2

Hi,
This is awesome, thanks!
With Redshift, I had an error message when reproducing the above query. I changed ${created_time} to ${created_time}::timestamp, that solved the issue.


(Nicole Beyer) #3

@vaite This is because for type: time looker outputs a string rather than a timestamp in Redhsift


(Christopher Poulos) #4

This was incredibly helpful. Thank you Zach!


(Sca4) #5

Heads up, it looks like CURRENT_TIMESTAMP has been deprecated from the Redshift compute node.

Your Redshift example will no longer work. You must use the function GETDATE() which has the same functionality as CURRENT_TIMESTAMP but is included on the compute nodes (meaning it can be used in conjunction with user defined tables).


(lloyd tabb) #6

Thanks, adding @mtoy and @dmarcotte.


(Michael Toy) #7

actually it doesn’t return the same thing.

current_timestamp returns a timestamp with a timezone

getdate returns a timestamp without a timezome

    select GETDATE();
       getdate       
---------------------
 2015-03-31 00:33:10
(1 row)

test_db=# select CURRENT_TIMESTAMP;
          timestamptz          
-------------------------------
 2015-03-31 00:33:21.997411+00

depending on how your database connection is set up, this difference may or may not matter.


(Sca4) #8

Good point it isn’t exactly the same, however the function change is still necessary or else you keep getting hit with this error: Specified types or functions (one per INFO message) not supported on Redshift tables.


(Alison) #9

Zach, this is great if you are being forced into month-to-date and year-to-date analysis (i.e. financials). :thumbsup: Really impressed you got this into 1 dimension.

@enelson came up with another way to write this for month-to-date in Redshift. This specific dimension LookML definition assumes you a) aren’t converting all datetimes to a timezone in your Looker connection configuration, b) are hiding it so it’s only available to Looker developers for dashboard creation, and c) that you want 2 months ago (i.e. January when you’re in March):

- dimension: mtd_this_day_two_months_ago
  label: 'III MTD This Day Two Months Ago (Yes/No)'
  type: yesno
  hidden: true
  sql: | 
    CASE
      WHEN ${TABLE}.created_at BETWEEN
        (DATE_TRUNC('month',CONVERT_TIMEZONE('America/Chicago', GETDATE())) - INTERVAL '2 months' AND 
       DATE_TRUNC('day', CONVERT_TIMEZONE('America/Chicago',GETDATE())) - INTERVAL '2 months') THEN TRUE
      ELSE FALSE
      END

You can then create measures with this dimension as well by adding the following to the measure definition:

filters:
  mtd_this_day_two_months_ago: true

We however find problems with month-to-date reporting at the beginning of each month when there is a) no data and/or b) the previous month started on a different day of the week. There are also problems with months not having the same number of days at the end of a month. When possible, I encourage looking at week-compared-to-week graphs. However, business users seem to have all their mental benchmarks in months, so we took a different path to do month-compared-to-month analysis. Credit goes to @enelson for this solution too. It assumes you have a primary key defined in the view because it’s not specifying a field to count.

- measure: count_60_days_ago_for_30_days
  label: 'III Total Calls - 60 Days Ago for 30 Days'
  type: count
  filters: 
    created_date: 60 days ago for 30 days
  detail: phone_calls_details*
  hidden: true

Measures like this allow us to report on 3 equal 30 day time segments in the same graph.


(brettg) #10

This can also be done with scalar subqueries on redshift. Below is an example filtering at the date granularity level:

- view: order_items
  sql_table_name: order_items

  - filter: is_before_order_date
    label: Before Order Date (Day Of Year)
    description: Use this to just filter on days of years before the most recent order
    type: yesno
    sql: ${created_day_of_year} <= DATE_PART(dayofyear, (SELECT MAX(order_items.created_at) FROM order_items))

  - dimension_group: created
    view_label: 'Orders'
    type: time
    timeframes: [time, hour, date, week, month, year, day_of_year]
    sql: ${TABLE}.created_at


(Morgan Imel) #11

Quarter to Date dimensions are much trickier than the above, but doable! You can make the following modification:

##Postgres / Redshift:

# from:
EXTRACT(DAY FROM ${created_time})
# to: 
CASE EXTRACT(QUARTER FROM ${created_time})
WHEN 1 THEN EXTRACT(DOY FROM ${created_time})
WHEN 2 THEN EXTRACT(DOY FROM ${created_time}) - 90
WHEN 3 THEN EXTRACT(DOY FROM ${created_time}) - 181
WHEN 4 THEN EXTRACT(DOY FROM ${created_time}) - 273
END
- (CASE WHEN (MOD(CAST(EXTRACT(YEAR FROM ${created_time}) AS INT),4)=0) THEN 1 ELSE 0 END)
# the last line is to account for leap years

##MySQL:

# from:
EXTRACT(DAY FROM ${created_time})
# to: 
CASE QUARTER(${created_time})
WHEN 1 THEN DAYOFYEAR(${created_time})
WHEN 2 THEN DAYOFYEAR(${created_time}) - 90
WHEN 3 THEN DAYOFYEAR(${created_time}) - 181
WHEN 4 THEN DAYOFYEAR(${created_time}) - 273
END
- (CASE WHEN (MOD(YEAR(${created_time}),4)=0) THEN 1 ELSE 0 END)
# the last line is to account for leap years

Note that the same would apply to the current date as well.

Not super pretty, but gets the job done!


(Joao Ramos) #12

I’m getting the error

Failed to retrieve data - Encountered " "FROM" "FROM "" at line 32, column 392. Was expecting: ")" ...

which I take to mean that Looker is picking up the FROM in EXTRACT(DAY FROM [date]) as a clause. Any ideas on what I might be doing wrong?


(sam) #13

Hey @Joao_Ramos, we’d love to help you get to the bottom of this! Please email support@looker.com with details on what your LookML looks like and we can work through this via email.


(Parijat Talkad) #14

Not sure where else to put this, but the above image of the graph, is there a way to move the date/year below the bar/column, as opposed to the left side of the bar/column?


(romain.ducarrouge) #16

For MySQL we can also use the following in New LookML:

dimension: is_before_mtd {
    type: yesno
    sql: ( ${TABLE}.date_field  not between  DATE_FORMAT(NOW() ,'%Y-%M-01') AND NOW() ) ;;
}
dimension: is_before_ytd {
    type: yesno
    sql: ( ${TABLE}.date_field not between  DATE_FORMAT(NOW() ,'%Y-01-01') AND NOW() ) ;;
}

for Postgres:

dimension: is_before_mtd {
    type: yesno
    sql: ( ${TABLE}.date_field now() not between date_trunc('month', now()) and now() ) ;;
}
dimension: is_before_ytd {
    type: yesno
    sql: ( ${TABLE}.date_field  now() not between date_trunc('year', now()) and now() ) ;;
}

and for Redshift:

dimension: is_before_mtd {
    type: yesno
    sql: ( ${TABLE}.date_field not between DATE_TRUNC('month', CURRENT_TIMESTAMP) and CURRENT_TIMESTAMP ) ;;
}
dimension: is_before_ytd {
    type: yesno
    sql: ( ${TABLE}.date_field  not between DATE_TRUNC('year', CURRENT_TIMESTAMP) and CURRENT_TIMESTAMP ) ;;
}

(Caitlin Moorman) #17

A simpler way to handle day of quarter in Redshift that avoids the CASE statement is:
DATEDIFF('day',date_trunc('quarter',${created_time}),${created_time}) + 1