How do I count duplicate values in a table?

(Wayne Henry) #1

I’m looking to count (and show the count) for duplicate values in a table, similar to how in Excel I can use the @ symbol to refer to a cell in the current row, then count how many times that row appears in the column, i.e. COUNTIF([CustomerName],[@CustomerName])

How do I replicate this in looker, using table calculations?

(sara.leon) #2

Hi Wayne!

One way to do this using table calculations is to wrap a sum function around an if statement. An example of this is:


Here I’ve got a city name (type string), Woodridge, that I’m using boolean logic on. I’m then summing all the 1’s that occur, so all the instances where the city name is Woodridge.

Hope that helps!

Sara L.

(Wayne Henry) #3

Hey Sara!

Thanks a lot for your response, which would be useful in some circumstances, but as I’m dealing with 1000s of rows, I was after something similar but dynamic, so in other words, I wouldn’t have to specify “Woodridge” but rather “count how many times the specified cell in this row appears in the specified column”, so going down the column, the formula counts occurrences of “Woodridge”, “Libertyville”, “Richmond” etc, without me having to specify for each one. Does that make sense and do you know of a way to do that?

Thanks in advance!

(Sara Guzman) #4

Hi Henry,

We can count the Frequency of a Value in a Column by, modifying the table calcs for subtotals described here then, we can use the table calc below:

count(${view.field}) - match(${view.field}, offset(${view.field}, count(${view.field}) - row() * 2 + 1)) + 2

match(${view.field}}, ${view.field})

Please let us know if that helps!

(Wayne Henry) #5

Hi Sara!

My apologies for the delay in replying to your post, due to being on vacation. Thanks a lot for that calculation, it was exactly, what I was looking for and actually opened a lot more possibilities. Thanks a lot!

(Sarah) #6

We had a similar duplicate count request, and we used the above steps, however the table is messy. We are currently using the suggested partition_row_number and duplicate count custom fields. We are trying to get a total number of ‘# of reviews’ for each Project ID. Right now, when a project is reviewed more than once the project ID shows up the number of time it was reviewed. An example of what we would like to show is to have the project ID show up on one line with the TOTAL # of reviews on the same line. We are trying to count the number of times a project ID was in Workscope Review.

Example of Current Visualization:
Project ID Review Review Task Details Task Name # of Reviews
1 Passed Review 1
1 Error Review 2
2 Error Review 1
2 Error Review 2
2 Error Review 3
2 Error Review 4
2 Error Review 5

This is what we would like the Visualization to look like:
Project ID Review Review Task Details Task Name # of Reviews
1 Passed Review 2
2 Passed Review 5

Thank you,

(Sarah) #7

Here is a better screenshot of the tables. I recreated them in Excel so I could remove sensitive information. Thanks again!

(Sarah) #8

I was able to answer my own question via: Filtering in Visualizations using Table Calculations (3.28+)


(Izzy) #9

Nice! That’s exactly the solution I would recommend too :smile: