SUMIFS in looker

Hi there,

I want to understand how to do SUMIFS in Looker. Example of expression:

SUM “Value Column 1” when “Dimension 1” = “X” AND “Dimension 2” = “Y” AND “Dimension 3”=“Z”.

I’ve been trying calculated fields like the below but it hasn’t worked as it sums more rows than what it should:

if(${dimension 1}=${dimension column X} AND ${dimension 2}=${dimension column X}, sum(${column1}), ${column1})

Please let me know

Thanks

Oriol

1 13 24.2K
13 REPLIES 13

Hi @Oriol,

In order to get the calculation you want, try wrapping your if statement in the SUM function, like this: SUM(if()). This will compute the condition of the if statement first, and then sum. Let me know if this gets you the results you want!

Quinn

Quinn’s example works if you want to do this using the ‘Calculations’ feature of a result set.

If you want SUMIFs to work in SQL you can create a new measure filtered in the model. Here is a great article that describes all the interesting ways to use the feature:

[Retired] Filtering Measures in LookML

Hi Quinn,

I’m afraid your recommendation doesn’t work. I’ve tried to replicate Microsoft Excel’s “SUMIFS” formula in several ways:

(goal is to sum “value_column1” when 2 conditions are met, otherwise leave the value of “value_column1” as is)

  1. if(${dimension_column1}=1 AND ${dimension_column2}=${dimension_column2},sum(${value_column1}), ${value_column1})

this option ignores the “IF” statement completely and just sums everything in “value_column1”.

  1. sum(if(${dimension_column1}=1 AND ${dimension_column2}=${dimension_column2},${value_column1},0))

this option sums all “value_column1” values where ${dimension_column1}=1 but ignores the second IF statement. The problem with the second statement I guess is that I can’t reference a specific cell within “dimension_column2”, which is what I want to do. It also means I can’t leave “value_column1” as is, I have to leave it as “0” or something else.

I have to do this using “Calculations” or just do it in excel as I’m currently doing. Would be good to have the option to do this in looker as we could leverage all the other great features of the software (dashboards and so on).

Thanks,

Oriol

Hey @Oriol
Could you give this table calculation a try:

    sum(if(${dimension_1}=1 AND ${dimension_2}=1, ${measure_1}, null)), 
${measure_1})

it should apply the sum on the rows that match your two conditions only!
Alternatively we can also implement a filtered measure for this use case as Lloyd suggested above.
Please let us know if that worked for you!
Cheers,
-Romain

Hi Romain,

This doesn’t work either as I can’t refer to one single cell for every particular row. The functionality I’m looking for here is exactly the same as a “SUMIFS” in excel and it doesn’t seem to be doable with calculations in looker.

Thanks

Oriol

Hi @Oriol,

We would love to dive further into your use case and find a workable solution for you. Could you please visit help.looker.com with screenshots of the calculation you are working on? I think this extra context will be very beneficial in this case.

Thanks!

Quinn

Another way of doing it: you can filter a measure in the Develop part.

 measure: red_total {
    type: sum
    sql: ${amount} ;;
    filters: {
      field: color
      value: "red"
    }
  }
 measure: blue_total {
    type: sum
    sql: ${amount} ;;
    filters: {
      field: color
      value: "blue"
    }
  }

Hello one and all!

I too have been looking for a dynamic way of re-creating sumif/countif in Looker calculations.

I’m using Looker for example, to show which US State a contact is living in. I have 1000 rows of contacts, each has a US State listed as a dimension (oh, and dropping the other dimensions is not a solution, we need the other dimensions)

Using MATCH(Dimenson1:Dimension1), you can identify the first instance of every US State, and with another calculation, you can INDEX that instance to produce a second clean list of every US State. So far, so good.

Where I hit a wall is here.

Let’s say the first states that show up in the above mentioned second column are NY, CA, and TX. Right now, I can only isolate the SUM one argument at a time: IF(NY, then NY, IF(CA, then CA, and so on. (This mostly takes shape around the row() of the result).

What I’ve not figured out yet is if you can lock what you’re looking at row by row in the second column, and have it look at every instance in the first column. In Excel, this is easy by applying a “$” symbol next to the row or column you want to keep.

So far in Looker, when I ask the second column to find what matches it on the first column, it only identifies instances where the item match on the same line, not across multiple lines.

Any idea if I’m missing something or this is what everyone has found?

Thanks for your interest. Have a nice day!

Hi Joseph – I just sent you a follow up email. I’ll post back here with the solution we come up with.

While this isn’t directly solving the above question, it might be handy for those who happen upon this post looking to do a type of SUMIF().
Parameters can be a great help for doing somewhat dynamic SUMs while keeping things user-friendly. Here’s a flavour of something we see implemented frequently:

parameter: brand_selection {
    allowed_value: {
      value: "Calvin Klein"
      label: "Calvin Klein"
    }
    allowed_value: {
      value: "Volcom"
      label: "Volcom"
    }
  }
  parameter: category_selection {
    allowed_value: {
      value: "Jeans"
      label: "Jeans"
    }
    allowed_value: {
      value: "Accessories"
      label: "Accessories"
    }
  }
  parameter: retail_price_upper_bound {
    type: unquoted
    allowed_value: {
      value: "100"
      label: "Less Than $100"
    }
    allowed_value: {
      value: "75"
      label: "Less Than $75"
    }
    allowed_value: {
      value: "50"
      label: "Less Than $50"
    }
  }
  
  measure: sum_if_conditional_measure {
    type: sum
    sql: CASE WHEN ${product_brand} = {% parameter brand_selection %} 
    AND ${product_category} = {% parameter category_selection %}
    AND ${product_retail_price} <= {% parameter retail_price_upper_bound %}
    THEN 
    ${product_retail_price}
    ELSE NULL
    END;;
    value_format_name: usd
  }

Which gives the user 3 basic filter options with finite value selections (expand as necessary) and voila:

GenDemo
Participant V

Thnx, this is exactly what I was looking for.

GenDemo
Participant V

to give an example - the attached image is how I use it. It works fine if you have very few records. But I am currently working on something where I am moving an Excel file into Looker: where there are number of SUMIFs on a table of accounts with various filters applied. I started doing this in Looker by using the filters, but its getting to complex. So I tried using these sum(if()) formulae. They work for few records, but with this case I am working on now, the row limit is affecting the calculation. So I will be looking to move this to LookML calculations in the SQL statements rather.

2020-02-07 14_13_13-Receivables Snaphot

Have a look at this webinar: https://training.looker.com/table-calculations-tips-and-tricks/277136, and start at 9:40. It covers SUMIF there.

Top Labels in this Space
Top Solution Authors