Percentile of a dimension grouped by another dimension

done
low_priority
reply

(Soren Rehn) #1

I’m looking to generate a table that accomplishes a the equivalent of

select
 server_name,
 percentile_cont(0.95) within group (order by response_time) as time_95_pct
from time_tracking
group by server_name
;

I’ve tested with the percentile function, but there doesn’t seem to be a way to calculate it within a group.
It doesn’t look like I can pivot on the group either because the value I want to get the percentile is a dimension.


(Izzy) #2

Is that query throwing an error?

I wasn’t quite sure what to make of this, so I googled it and found this which might be useful: https://stackoverflow.com/questions/19546604/use-percentile-cont-with-a-group-by-statment-in-t-sql

If that query’s acting up, you also may be able to do the percentile in a subquery, and then group by it in an outer query.

Or, perhaps you just need to add a PARTITION BY clause to partition by group?


(Soren Rehn) #3

Sorry - The query I posted is working fine. But I’m wondering how I could accomplish the same thing with a calculated filed in a Look. It appears the percentile function only operates across an entire column by default, and I can’t figure out how to look only within a subset of rows.


(Izzy) #4

Aha, thanks for clarifying. This is tough to do, since it’s basically like a SQL window function in a table calculation. I feel like I keep tagging @chris.seymour in these, but he came up with a wizardly way to do these grouped calculations in table calcs that I’m sure he’d be happy to share.


(Fabio) #5

Maybe I’m misunderstanding the question, but it seems to me you could do this by adding it as a measure to your time_tracking view -

measure: time_95_pct {
  type: number
  sql: percentile_cont(0.95) within group (order by ${TABLE}.response_time) ;;
}

Just be aware that if this measure is used with a non 1:1 join, it may not work as a “symmetric”/“distinct” measure