Sql_distinct_key: in symmetric aggregates (3.20+)

(lloyd tabb) #1

Note: sql_distinct_key is usually used when there is no primary key on the table you are aggregating over. In most cases, you will not need sql_distinct_key if you have a primary key.

In Release 3.20, type sum_distinct and type avg_distinct can now optionally take a new parameter sql_distinct_key.

Normally, sum_distinct and avg_distinct measures, use the declared primary key of the view they are contained in as unique value associated with the number on which they are performing the operation. This new parameter lets sum_distinct and avg_distinct work situations where there is no primary key or the view is denormalized (a very wide table with subtables embedded).

Suppose, for example you had a table that looked like:

| order_time | order_id | order_shipping_amount | customer_id | item_id | item_name | item_amount |

In this particular table, there is no primary key and each row is repeated for every item in an order. A single order may consist of multiple rows in this table. Summing order_shipping_amount would be incorrect because it would be repeated for each item. You could use LookML’s sum_distinct to perform this:

- measure: total_shipping_amount
  type: sum_distinct
  sql: ${TABLE}.order_shipping_amount
  sql_distinct_key: ${TABLE}.order_id

Error: Non-unique value/primary key error
Looker 3.20 Release Notes
(Zev Lebowitz) #2

Cool! If my distinct key is a combination of multiple columns, do I have to define a new dimension first, or can I write the concatenation in the sql_distinct_key field like this:

- measure: total_shipping_amount
  type: sum_distinct
  sql: ${TABLE}.order_shipping_amount
  sql_distinct_key:  ${TABLE}.order_id || ' - ' || ${TABLE}.customer_id

(lloyd tabb) #3

Yes, any valid sql expression will work. It must be unique and the value generated in the sql statement must remain constant for each unique key.

(Swapnil Pimple) #4

Hey I wanted to do distinct sum my input is following
Rule — Amount
1 — 100
2 — 200
1 – 100
but output coming like
Rule — Amount
1 – - 200
2 — 200
Can you help to get distinct sum

(lloyd tabb) #5

How are you writing it? It should something like the measure below.

measure: total_amount {
  type: sum_distinct
  sql_distinct_key: CAST(${rule} as STRING) ;;
  sql: ${amount} ;;