Dimensionalize a Measure: Cohort Tiers on a Count


(andy) #1

The purpose of this article is to understand how to group by information in your database which is only available as a measure. Examples include:

  • Cohort on a lifetime value such as all orders a customer has ever made or count of sales a salesperson made each quarter
  • Compare an order’s date to that customer’s first order date (when first_order_date doesn’t exist as it’s own column in the users table)

Let’s dive a little deeper into the first example. Below is a sample portion of an orders table in your ecommerce database.

A sales manager asks, “Do my salespeople who sell the fewest orders actually bring in the most revenue?” The task becomes cohorting my salespeople into groups based on how many orders they have sold. The final metric would look like this:

You’ll notice in the original orders table we don’t actually have a column which tells us how many sales that salesperson has made. Because of this, we cannot calculate a measure based on a cohort, since SQL doesn’t allow grouping by the results of an aggregate function. Put another way, we can only group by dimensions. What we need is a way to change the count of lifetime sales into a dimension, and then create a tier dimension from it.

We call this “dimensionalizing a measure.” This is because we are taking a measure, in this case a count of sales, and using it as a dimension. In Looker, we accomplish this by creating a derived table which expresses the measure we want in the derived table SQL.

  1. Start by setting up our results in the explore interface. In this case, we’d choose the Salesperson ID dimension and the Count of Sales measure. The result might look like this:
  2. Choose “Open in SQL Runner” from the SQL interface in the Results area.*
  3. Once in the SQL Runner, you can choose “Add to Project” from the gear menu. Don’t forget to remove any row limit clause in the derived table SQL*
  4. Add a tier dimension to cohort the salesperson’s lifetime sales into groups. In this case, 0-10, 11-20, and 21 or more.
  5. Join the table into your original explore via the field used in the group by clause. In this case that would be the salesperson ID.

*UPDATE: Steps 2 and 3 can also be done with a Native Derived Table instead of a SQL Derived Table. NDTs offer improved governance through interacting with your existing model, and are very useful after a brief learning curve. Learn more about NDTs here.

Now the salesperson’s lifetime orders is available as a cohort dimension in the original explore. The resulting table in our example looks like this:

Now that our explore includes this extra dimensionalized measure, we can produce the original desired table by using the Lifetime Sales Cohort from the derived table as our dimension, and a sum of the order values as a measure.

For another look at this topic from our founder Lloyd Tabb, check out this discourse article:


How to pivot or group by a measure?
Max or Min Date Measure
The Ability to Leverage Result from One Tile in Another Tile
(Christine Kelly) #2

Hello, I was wondering if it is possible to have another approach than to create a derived table?
A new derived table means another view and it looks like clutter so it is confusing for users because they can get lost and miss data.
In addition it is a lot of hassle just to convert one measure into a dimension which we might need to do quite regularly for ad hoc analysis.
I know a work around to get a dimension into a measure is to multiply by 1 or divide by 0 ect, is the opposite way not possible somehow?

Please kindly advice.
Thank you in advance,
Christine


(david.hughes) #3

Hey Christine

Yes we can take another approach to do this,

Custom Fields SQL Runner Explore Approach:
How To Steps

  • Select measure you want to dimensionalize on the explore page

  • Open in SQL Runner and run query

  • Gear menu - Explore

  • Create custom measure, select dimensionalized measure as the Field to Measure

Advantages:
-You can achieve many common use cases (Make tiers out of a measure, Perform aggregations on a measure, Pivot on a measure) without ever changing or even accessing the LookML project, preventing model bloat
-Much simpler steps requiring no LookML knowledge to get to the desired result

Disadvantages:
-Less reusable than derived tables
-Limited by what you can do with custom fields (currently no number type measures, limited date functionality, tiers require lots of if functions, etc.)

Regards
Dave


(Christine Kelly) #4

Hi Dave,
But I will need to save it as a project after as I do not have access to all my other dimensions, it is exactly what I was doing already.I can not do it without the derived table.
Once I click on explore and add the custom measure I will not have access to my other fields.


(andy) #5

Can you clarify your question, Christine? I am confused why you wouldn’t have access to important dimensions if you take the custom field approach.