Creating Dynamic Tiers


#1

Tiers can be a great way to bucket values, but with LookML type: tier those buckets are predefined and static.

Sometimes we will want to create a dynamic tier that allows us to change the bucket size. We can do this using parameters!

First, we will create a parameter of type number for the user to input the bucket size they would like. Then, we will use the parameter within the dimension to determine the various buckets.

In this example, we will build a dynamic age tier.

parameter: age_tier_bucket_size {
  type: number
 }

dimension: dynamic_age_tier {
  type: number
  sql: TRUNCATE(${TABLE}.age / {% parameter age_tier_bucket_size %}, 0) 
     * {% parameter age_tier_bucket_size %} ;;
 }

To understand the logic, let’s take the example of creating 10-year buckets. We take the age, say 25, and divide by 10 to get 2.5 which is then truncated to 2. We then multiply this truncated number by 10 to get 20 which becomes the bucket. So any value between 20 - 29 would become part of the 20 bucket in this case.

Here’s what the output of this example would look like:


Histograms?
(jesse.carah) #2

I created a similar pattern which uses the modulo operation to build dynamic tiers. The big difference is that you get concatenated tiers like 10-20, 20-30, etc. This is what the tiers look like:

Mathematical Approach:

Like I said before, I use the modulo operation (denoted by %) to calculate the tiers. The approach can be explained with some simple math:

  • Let r represent the user-defined bucket size
  • Let n represent the the value for which we want to assign a bucket

We’ll first calculate the lower bound of the bucket with the following formula:

n - n % r

We then calculate the upper bound with this formula:

n - n % r + r

Example:

Let r = 5
Let n = 33
n % r = 33 % 5 = 3

This implies:

Lower bound = 33 - 3 = 30
Upper bound = 33 - 3 + 5 = 35
When then concatenate these two values and get 30 - 35 as our bucket.

The LookML

First, I create a parameter that enables the end user to select his or her desired bucket size:

  parameter: bucket_size {
    default_value: "10"
    type: number
  }

I then build a dimension that takes the user input from the bucket_size parameter and calculates even-sized buckets.

 dimension: dynamic_bucket  {
    sql:
        concat(${sale_price} - mod(${sale_price},{% parameter bucket_size %}),
          '-', ${sale_price} - mod(${sale_price},{% parameter bucket_size %}) + {% parameter bucket_size %})
      ;;
  order_by_field: dynamic_sort_field   
  }

One annoyance is that this dimension is a string, which can cause issues with sorting. For example, the value 100-110 would come before 90-100 if we applied an ascending sort on our bucket. For this reason, we’ll need to build a sort field which determines – you guessed it – how our dimension will sort.

  dimension: dynamic_sort_field {
    sql: 
      ${sale_price} - mod(${sale_price},{% parameter bucket_size %});;
      type: number
    hidden: yes
  }

Because our sort field is now a number instead of a string, the dynamic bucket column will sort as intended. It’s also worth noting that this dimension has the same cardinality as our bucket, which ensures that our result set does not fan out. Check out this post to learn more about fanouts.

–

That’s it! Go ahead and bucket away to your heart’s desire.


#3

or if anyone is using sql server which doesn’t have the exact same MOD and TRUNCATE functions, here is a pattern that provides a concatenated label as well as the numerical sort:

dimension:  MEASURE_raw  {
    type: number
    group_label: "MEASURE"
    hidden: no
    sql: ${TABLE}.MEASURE
  }

parameter: MEASURE_bucket_size {
    default_value: "2"
    type: number
  }

  dimension: MEASURE_dynamic_bucket  {
    description: "tiers of MEASURE"
    group_label: "MEASURE"
    sql:
    cast(
    ROUND(${MEASURE_raw}/ {% parameter MEASURE_bucket_size %}, 0) * {% parameter MEASURE_bucket_size %}
    as varchar(256)
    )
    +'-'+
    cast(
    ROUND(${MEASURE_raw}/ {% parameter MEASURE_bucket_size %}, 0) * {% parameter MEASURE_bucket_size %}
    + {% parameter MEASURE_bucket_size %}
    as varchar(256)
    )
    +'%'

    ;;
    order_by_field: MEASURE_bucket_sort
  }

  dimension: MEASURE_bucket_sort {
    group_label: "MEASURE"
    sql:
      ROUND(${MEASURE_raw}/ {% parameter MEASURE_bucket_size %}, 0) * {% parameter MEASURE_bucket_size %};;
    type: number
    hidden: no
  }