Templated filters: Show names in the suggestions but put id in the SQL

(Andrew Paterson) #1

As is probably common with relational databases, I am generating a report on a table with a structure that looks like this:

Event:
    Date
    Parameter
    Occurrences
                      Site:
    Site_Id     --->    Site_Id
                        Name

                      Section:
    Section_Id  --->    Section_Id
                        Name

                      Group:
    Group_Id    --->    Group_Id
                        Name

    etc.

I want to allow users to filter by site, section and/or group but this is a statistical report (total number of occurrences by date and parameter) and the site, section and group do not actually appear in the report, so I am using templated filters in a derived table. The issue that I have is that the users will pick values for the site/section/group by name, not id (which is meaningless to them, of course) but this means that, to make the filters work, I will have to join the Site, Section and Group tables to the Event table to get the names that are used in the filter. There are two problems with this:

  1. There are (literally) millions of entries in the unfiltered Event table so joining in other tables is costly.

  2. The site, section and group names are filter-only fields so the joins to the Site, Section and Group tables are unnecessary when not filtering by those fields.

N.B. I tried using a templated filter with a “sql” parameter to add a WHERE clause to the query but, unfortunately, it did not work because the WHERE clause was applied to the result of the derived table whereas it needed to be embedded inside the derived table.

Our in-house reporting system (that we are replacing with Looker) shows the user the list of names but uses the id in the SQL select - e.g. “event.site_id IN (123,456,789)” - so, if there isn’t a way to do this in Looker, I would like to suggest a possible enhancement. You could add another “suggest_…” parameter (e.g. suggest_list) to be used with the suggest_explore and suggest_dimension parameters. For example:

- filter: site_name
    suggest_explore: site
    suggest_dimension: site.site_id
    suggest_list: site.name

In this example, the site name would appear in the filter drop-down list but the site id would be used in the SQL. This would make the join unnecessary and the report would run faster whether filtered or not.

Thanks,
Andrew

0 Likes

The Podium — May 1st, 2019
(lloyd tabb) #2

Hey @apaterson, you can move the filter into the derived table using a {% condition %} block. The usage is a little tricky, but it works very well. If no filter is provided, the block is substituted with a ‘true’ value so you can provide multiple filters this way.

0 Likes

(Andrew Paterson) #3

Hi @lloydtabb, many thanks for the suggestion. I am not sure that I fully understand it right now but I will try it out when I have had chance to read through the documentation.

0 Likes

(sam) #4

To elaborate on Lloyd’s idea:

Option 1: Using Templated Filters

Let’s take advantage of the fact that the “sql” parameter of a “filter” field directly injects the WHERE clause. We create a filter which pulls suggestions from the name field, but then actually grabs the id from the ID field.

filter: name_to_id {
  suggest_explore: users
  suggest_dimension: users.name
  type: string
  sql: ${id} IN (SELECT users.id FROM users WHERE
       {% condition name_to_id %} users.name {% endcondition %}) ;;
}

This will generate a subquery in the WHERE clause which returns all IDs which match when the filter criteria is applied to “name”.

The downside of this approach is that a subquery isn’t especially performant.

Option 2: Hard Coding with Parameters

To avoid a lookup query like the one in Option 1, we can hard code the mapping directly into a “parameter” field.

parameter: name_to_id {
type: string
  allowed_value: {
    label: "1"
    value: "Lloyd"
  }
  allowed_value: {
    label: "2"
    value: "Sam"
  }
# More allowed_value objects afterwards if desired
}

This could be a lot of work to setup and maintain, but would improve performance by eliminating the need for a lookup function. There’s a neat python script by Drew Gillson which can help automate this process of updating the LookML allowed_value objects based on results from a Look.

1 Like