Allow ability to add query hints and join hints, e.g. HASH join etc

done
low_priority
reply

(Matthew Darwin) #1

It would be really useful to have a mechanism to add query hints to the generated SQL from LookML. I have some queries where the optimiser for SQL Server is struggling due to the casting of date columns to character and back to facilitate month date ranges, this is resulting in a cardinality estimate issue, which in turn is resulting in a nested loop join instead of a hash join.

Simply specifiying the HASH hint in the join clause sorts the problem out completely; so having the ability to ensure that these are added via LookML would be very beneficial.


(Izzy) #3

This is a lovely idea— I will move it over to the feature requests section where it can be tracked. Having an options parameter in LookML would be cool.

That said, you might want to explore using the sql: parameter: https://docs.looker.com/reference/explore-params/sql-for-join. It’s been pushed down in favor of the much better sql_on:, but if you want total control over your join SQL, that parameter gives it to you.

You could do something like

join: table {
    view_label: "sweet table"
    relationship: many_to_one
    sql: LEFT OUTER HASH JOIN table ON ${table.id} = ${othertable.id} ;;
  }

or even

join: table {
    view_label: "sweet table"
    relationship: many_to_one
    sql: LEFT OUTER JOIN table ON ${table.id} = ${othertable.id} OPTION(HASH JOIN) ;;
  }

I don’t have a SQL server db to mess around with, so this is all conjecture— Would love to know if it works!


(Matthew Darwin) #4

I’ll give that a whirl; I wasn’t aware of that as an option. It should be very very useful in a lot of scenarios here!


(Izzy) #5

It’s kinda funny, I literally last week was like “nobody uses this darn parameter anymore” and made an edit to the product code to further deprecate it/add warnings for it. Guess I’ll delete that pull request…

Glad that looks promising!


(Matthew Darwin) #6

Yes, that works. Had some messing about to do with some aliasing, as this is also in an extended view, but got things working ok. So thanks!


(Matthew Darwin) #7

izzy, I have a further question on this. We’re currently using this in an extended model, where the same table is referred multiple times, once in the original model, the second in the extended part (to join back to the original model).

This is in order to tie data back to a calendar view, and allow the same logical joins of data, but would allow that same logical set to be queried based on different event dates, e.g. application date, deal date and so on. The reason we used an extended model is so we didn’t need to maintain the logical joins more than once, just the anchor for the date selection.

The original SQL output for this would actually result in just a single join to that table, as the lookML implementation seemed clever enough to realise that’s what we’re doing.

However, with using sql: in the extension we now get errors for that table where it’s complaining about having both a sql: and sql on: parameter for the same table. I can get around this by using a different name in the extension, however this then results in two joins. Which is fine, but has a bit of an impact on performance and readability.


(Izzy) #8

This might be part of the reason why that parameter is now deprecated in favor of sql_on:. With the sql: parameter, Looker just kind of gives you carte blanche and assumes you know what you’re doing— We don’t parse stuff out from there in the same clever way as we do with sql_on.

I’ll give this a real test tomorrow morning and see if I can’t figure out a way around it… What’s the actual wording of the error you’re seeing before renaming?


(François Jehl) #9

Totally agree with this feature, especially since this is quite consistent cross engines (you don’t have that much differences). Except SQL Server that uses the OPTION keyword in SELECT’s BNF, most others (including Oracle, MySQL, Vertica…) use the SELECT /*+hint*/ syntax.


(Matthew Darwin) #10

Sorry Izzy, missed your reply. I’ll try and grab the error and code for replication in a moment.


(lloyd tabb) #11

Its kind of unfortunate, but if you are replacing a sql: parameter within a join you need to either stick with sql:. If you start with an sql_on: you need to stick with an sql_on: in the extended version in both the base and extended declarations. Extends isn’t smart enough to replace a sql: with an sql_on: when overriding the declaration.

The extends works in a context free way over objects. The same problem exists with sql_table_name: and derived_table: declarations in views. If you start with a sql_table_name:, you can’t extend it to be a derived table. This is one of the regrets for me in LookML.

There is a reasonable work around. You can create a version an ancestor without either sql: or sql_on: defined and force the declaration in subsequent uses. Not ideal, but probably more readable.

explore: foo_base {
   extension: required
   join: some_view {
   }
}

explore foo1 {
   extends: [foo_base]
   join: some_view {
     sql: .... ;;
   }
}

explore foo2 {
   extends: [foo_base]
   join: some_view {
     sql_on: .... ;;
   }
}

(Matthew Darwin) #12

OK, that can work. The original join will therefore look something like this:-

  join: dim_all_allocations_with_departmentgroup {
view_label: "Applications"
#sql_on: ${dim_all_allocations_with_departmentgroup.application_id} = ${dim_loan_applications.application_id}
#       -- AND ${dim_all_allocations_with_departmentgroup.finance_exec_id} IS NOT NULL
#  ;;
#type: left_outer
sql: LEFT OUTER HASH JOIN DIM.AllAllocations AS dim_all_allocations_with_departmentgroup
      ON ${dim_all_allocations_with_departmentgroup.application_id} = ${dim_loan_applications.application_id};;
relationship: one_to_many
  }

and the extended view like this:-

  join: dim_all_allocations_with_departmentgroup  {
    view_label: "Applications"
    #sql_on: CONVERT(date,${dim_all_allocations_with_departmentgroup.allocated_from_raw}) = ${dim_calendar.date_raw} ;;
    #type: inner
    # switching to sql: in order to specify the join hint
    sql: INNER HASH JOIN DIM.AllAllocations AS dim_all_allocations_with_departmentgroup
    ON CONVERT(date,${dim_all_allocations_with_departmentgroup.allocated_from_raw}) = ${dim_calendar.date_raw};;
    #AND ${dim_all_allocations_with_departmentgroup.application_id} = ${dim_loan_applications.application_id};;
    relationship: one_to_many
  }

  join: dim_loan_applications {
    view_label: "Applications"
    sql_on: ${dim_all_allocations_with_departmentgroup.application_id} = ${dim_loan_applications.application_id};;
    type: inner
    relationship: many_to_one
  }

Seems to be working ok in the intial testing, so thanks for that!