Percentile of a dimension grouped by another dimension


(Soren Rehn) #1

I’m looking to generate a table that accomplishes a the equivalent of

 percentile_cont(0.95) within group (order by response_time) as time_95_pct
from time_tracking
group by server_name

I’ve tested with the percentile function, but there doesn’t seem to be a way to calculate it within a group.
It doesn’t look like I can pivot on the group either because the value I want to get the percentile is a dimension.

(Izzy) #2

Is that query throwing an error?

I wasn’t quite sure what to make of this, so I googled it and found this which might be useful:

If that query’s acting up, you also may be able to do the percentile in a subquery, and then group by it in an outer query.

Or, perhaps you just need to add a PARTITION BY clause to partition by group?

(Soren Rehn) #3

Sorry - The query I posted is working fine. But I’m wondering how I could accomplish the same thing with a calculated filed in a Look. It appears the percentile function only operates across an entire column by default, and I can’t figure out how to look only within a subset of rows.

(Izzy) #4

Aha, thanks for clarifying. This is tough to do, since it’s basically like a SQL window function in a table calculation. I feel like I keep tagging @chris.seymour in these, but he came up with a wizardly way to do these grouped calculations in table calcs that I’m sure he’d be happy to share.

(Fabio) #5

Maybe I’m misunderstanding the question, but it seems to me you could do this by adding it as a measure to your time_tracking view -

measure: time_95_pct {
  type: number
  sql: percentile_cont(0.95) within group (order by ${TABLE}.response_time) ;;

Just be aware that if this measure is used with a non 1:1 join, it may not work as a “symmetric”/“distinct” measure