Reduce multiple joins


(Swetha Santosh) #1


In our explores we are using multiple joins on several views with derived tables. These joins range from 7 to 12 on an average/explore and Model contains 4 to 5 of such explores. I am looking to merge these tables and form a large physical table on redshift or have a PDT defined to cache it and triggered every 24hours to increase query performance and also reduce validation time.

Does anyone have any thoughts on this? or a better approach?

(Kevin Mc Carthy) #2

Hello Swetha,

This is a great question that comes up a lot, and the correct approach will depend on the specific scenario.

My perspective is that creating a pre-joined wide table is definitely a tactic to consider to help with performance.

Some of the downsides to consider include:

  • Uses that don’t touch the most granular data might actually perform worse.
  • If the PDT takes exceptionally very to build, that can also be problematic.
  • Depending on your database, there may be costs associated with additional physical storage.

I’m interested to hear others’ considerations and suggestions.