Looker Community

'Point In Time' actions modeling

This is a bit long-winded but please bear with me on this one.

I’m not sure our modeling journey has been as straightforward and seamless as we would have hoped.

Our use case is as follows.

We have 6 separate tables that contain information from actions that our company takes on a regular basis.

Direct Sales

These tables contain the date each action took place, along with a session_id to identify the session.
The way our model works is that these numbers need to be reported side by side and are a reflection of our ‘Point in Time’ performance. To be more specific, they don’t share any common identifiers other than the date the actions took place, so a typical join (session_id = session_id) won’t work in this instance.

Our current solution is to combine these tables via UNION ALL into a PDT using standard SQL, giving each table an ‘Event’ type. See the example below (rough mockup):

SELECT
	utc_date_abandoned AS "Activity Date"
	,client_number AS "Client"
	,'Abandoned' AS "Event"
	,COUNT(DISTINCT session_id) AS "Number of Abandonments"
	,NULL AS "Number of Sends"
	,NULL AS "Number of Opens"

FROM data.abandonments

	WHERE utc_date_abandoned >= '2019-10-10'
		     AND utc_date_abandoned < '2019-10-15'
		     AND client_number = 12345

	GROUP BY 1, 2, 3

UNION ALL

SELECT
	utc_date_sent AS "Activity Date"
	,client_number AS "Client"
	,'Sends' AS "Event"
	,NULL AS "Number of Abandonments"
	,COUNT(DISTINCT session_id) AS "Number of Sends"
	,NULL AS "Number of Opens"

FROM data.sends

	WHERE utc_date_sent >= '2019-10-10'
		     AND utc_date_sent < '2019-10-15'
		     AND client_number = 12345

	GROUP BY 1, 2, 3


The final result gives us the total number of Abandonments, Sends, and any other metric we choose to report on by adding another table to the union.

This solution works well for us but has a few downsides that we have discovered over time.

1.) We ran the UNION in the PDT without date restrictions overnight and cached the results.
- This gave really quick query times but was heavy on Snowflake usage and burned far too many credits.

2.) We added restrictions to the WHERE clause, which reduced the volume of data but meant the union ran at query time and took almost 2 minutes.

My initial suspicion is that the route we’ve gone down is not the best solution for either performance or cost.

Has anyone had a similar use case or does anyone have any ideas on how we might be able to do this in a more efficient manner?

Hey @Paul_Walton - I would probably use UNION ALL too. But to save on Snowflake credits you could implement an ‘incremental’ rebuild of the PDT - take a look at How to: Simple incremental PDTs using create_process.

This’ll allow you to only process a subset of your new events, whilst maintaining a complete view in the ‘cached’ table.
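To give a feel for the shape of that pattern, the sketch below maintains a persistent staging table and appends only new rows each build. This is only a rough illustration, not the exact recipe from the linked how-to: the datagroup, schema, table names, and column types are all placeholders, and the `CREATE TABLE` / `INSERT` / high-water-mark logic would need adapting to your real event tables.

```lookml
view: events_union {
  derived_table: {
    datagroup_trigger: nightly_datagroup   # placeholder datagroup
    create_process: {
      # 1) A persistent staging table that survives between PDT rebuilds
      sql_step:
        CREATE TABLE IF NOT EXISTS analytics.events_union_staging (
          activity_date DATE, client NUMBER, event STRING,
          number_of_abandonments NUMBER, number_of_sends NUMBER, number_of_opens NUMBER
        ) ;;
      # 2) Append only events newer than the current high-water mark,
      #    so each rebuild scans a small slice of the source table
      sql_step:
        INSERT INTO analytics.events_union_staging
        SELECT utc_date_abandoned, client_number, 'Abandoned',
               COUNT(DISTINCT session_id), NULL, NULL
        FROM data.abandonments
        WHERE utc_date_abandoned >
              (SELECT COALESCE(MAX(activity_date), '1970-01-01')
               FROM analytics.events_union_staging)
        GROUP BY 1, 2, 3 ;;
      # 3) Expose the up-to-date staging table as this build's PDT
      sql_step:
        CREATE TABLE ${SQL_TABLE_NAME} AS
        SELECT * FROM analytics.events_union_staging ;;
    }
  }
}
```

You would repeat step 2 for each event source (Sends, Opens, etc.), one INSERT per table in the union.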


Hi Paul,

This approach may be useful for you if you want to avoid a PDT altogether and allow users to hit raw data, but perform the “union” at query time: https://help.looker.com/hc/en-us/articles/360023686574-A-Cure-For-the-one-to-many-Blues
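As I understand that article, the shape of the query-time approach is to join each event view onto a shared date spine and let Looker’s symmetric aggregates keep the distinct counts correct despite the fanout. A minimal sketch, with hypothetical view and field names:

```lookml
explore: daily_activity {
  view_name: dates                      # hypothetical date-spine view
  join: abandonments {
    sql_on: ${dates.activity_date} = ${abandonments.utc_date_abandoned} ;;
    relationship: one_to_many
  }
  join: sends {
    sql_on: ${dates.activity_date} = ${sends.utc_date_sent} ;;
    relationship: one_to_many
  }
}
```

Each view would then define its own `count_distinct` measure on session_id, so the measures can be reported side by side without a PDT.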

Morning Fabio,

Thanks for your suggestion. We’ve implemented your model, and we’re finding that as soon as we bring in measures from different views we begin to lose performance. If we bring in the 2 measures separately, we get results in around 2 seconds; as soon as we combine them, this jumps to around 4 minutes.

Any help would be great.

Hi Paul,

It would help to see a query plan if you have one available.

Off the top of my head, it could be either a join algorithm issue, or a data scanning issue.

For the join algorithm, for example, Snowflake may be choosing to implement the one-to-many joins as nested loops, when a hash join should still work. That might be pretty specific to Snowflake’s query planner and would require some trial and error to try to overcome.

For scanning issues, it could be that you’re applying a filter in the outer where clause and Snowflake is not pushing it down to the table scans, which we can get around to a certain extent by doing it ourselves using liquid inside of our LookML joins.
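To illustrate the pushdown idea: a templated filter can be injected into each join’s `sql_on`, so the user’s date filter is applied directly at each table scan rather than only in the outer WHERE clause. This is a sketch with placeholder view and field names; `dates.date_filter` is assumed to be a `filter:` field on the date-spine view.

```lookml
join: abandonments {
  relationship: one_to_many
  sql_on: ${dates.activity_date} = ${abandonments.utc_date_abandoned}
    AND {% condition dates.date_filter %} ${abandonments.utc_date_abandoned} {% endcondition %} ;;
}
```

When no filter is applied, `{% condition %}` renders as `1=1`, so the join still works for unfiltered queries; when a date range is applied, each scan gets its own predicate that Snowflake can use for partition pruning.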

Hi Paul, just curious if you found anything else about where the problem is coming from?