The purpose of this article is to understand how to group by information in your database which is only available as a measure. Examples include:
- Cohort on a lifetime value such as all orders a customer has ever made or count of sales a salesperson made each quarter
- Compare an order’s date to that customer’s first order date (when first_order_date doesn’t exist as it’s own column in the users table)
Let’s dive a little deeper into the first example. Below is a sample portion of an orders table in your ecommerce database.
A sales manager asks, “Do my salespeople who sell the fewest orders actually bring in the most revenue?” The task becomes cohorting my salespeople into groups based on how many orders they have sold. The final metric would look like this:
You’ll notice in the original orders table we don’t actually have a column which tells us how many sales that salesperson has made. Because of this, we cannot calculate a measure based on a cohort, since SQL doesn’t allow grouping by the results of an aggregate function. Put another way, we can only group by dimensions. What we need is a way to change the count of lifetime sales into a dimension, and then create a tier dimension from it.
We call this “dimensionalizing a measure.” This is because we are taking a measure, in this case a count of sales, and using it as a dimension. In Looker, we accomplish this by creating a derived table which expresses the measure we want in the derived table SQL.
- Start by setting up our results in the explore interface. In this case, we’d choose the Salesperson ID dimension and the Count of Sales measure. The result might look like this:
- Choose “Open in SQL Runner” from the SQL interface in the Results area.*
- Once in the SQL Runner, you can choose “Add to Project” from the gear menu. Don’t forget to remove any row limit clause in the derived table SQL*
- Add a tier dimension to cohort the salesperson’s lifetime sales into groups. In this case, 0-10, 11-20, and 21 or more.
- Join the table into your original explore via the field used in the group by clause. In this case that would be the salesperson ID.
*UPDATE: Steps 2 and 3 can also be done with a Native Derived Table instead of a SQL Derived Table. NDTs offer improved governance through interacting with your existing model, and are very useful after a brief learning curve. Learn more about NDTs here.
Now the salesperson’s lifetime orders is available as a cohort dimension in the original explore. The resulting table in our example looks like this:
Now that our explore includes this extra dimensionalized measure, we can produce the original desired table by using the Lifetime Sales Cohort from the derived table as our dimension, and a sum of the order values as a measure.
For another look at this topic from our founder Lloyd Tabb, check out this discourse article: