Data need to be refreshed periodically in Looker Derived tables

done
low_priority
lookml
reply

(Sadhana Ampalam) #1

Hello Everyone,

We have a requirement in Looker to refresh derived table on Periodic basis.Around 80 derived tables which needs to be refreshed at different schedules.When users are execting the Looks, they should get the data from the cache and it shouldn’t hit warehouse.

To achieve this we have used below code to refresh a single DT:
sql_trigger:
select case
when (((extract(day from current_date))=1) and (extract(hour from current_timestamp(0))=16 and (extract(minute from current_timestamp(0))=00 )) then current_timestamp
when (((SELECT day_of_week from sys_calendar.calendar where
calendar_date = current_date) = 1)
AND (extract(hour from current_timestamp(0)) = 20)
AND (extract(minute from current_timestamp(0))=00) ) then
current_timestamp
END ;;

Problems we are facing with the above code:

  1. After each successful refresh an unnecessary refresh will occur.(The reason for this is Case statement will return null value when it does not meet any condition .Null is not equal to Sql trigger value of Looker hence unnecessary refresh will occur).
  2. These Unnecessary built are causing utilization issues in data base end.
    Our dialect is Teradata dialect.
    Can anyone please suggest a way to meet this requirement.

Regards,
Sadhana


(Izzy) #2

This is the root of the issue. If you can get the case to return the previous trigger value, or never return nulls, you’d be set. I wonder if instead of doing a complex case when + extract statement, you could instead generate a calendar table with the dates + times you want refreshes to happen. Then, once all of your reference points for rebuilding are in a table, you could do a select MAX from that table WHERE the date /time is < the current day which should trigger rebuilds on the schedule of your choice.

Thinking of if there’s a more dynamic way to do it…


(sam) #3

The Problem

I agree with izzy, the problem is the usage of CASE WHEN and the subsequent nulls you get from the ELSE.

For example, if you want the table to build on the first of the month, the following code is faulty because it will build on the second and third day of the month.

SELECT CASE
  WHEN (((extract(day from current_date))=2)
  THEN CURRENT_TIMESTAMP
  ELSE NULL
  END

Using Offsets

I think it will help a lot to start with the cadence you want, and then adjust the time the rebuild happens.

For example, to only build on the first of the month, do something like:

SELECT  (extract(month) from current_timestamp) ;;

Easy, right? The value changes every month, so it will rebuild only on the 1st.

To rebuild on the second of the month, just use a little offset:

SELECT  (extract(month) from add_days(current_timestamp, -1) ;;

Note the subtraction of one day. Now, the 1st of the month will be computed as the last day of the previous month. So the month value won’t change until the 2nd of the month!

Combining multiple expressions

Say we want “Every 4th of the month AND every Tuesday.” This isn’t much harder; we can now just concatenate both values together. When one value changes, the concatenation will also change.

SELECT  
(extract(month) from add_days(current_timestamp, -3) ||
(extract(week) from add_days(current_timestamp, -2)  ;;

Notice that we subtract 3 from the timestamp in the first phrase so that the 3rd of the month will still appear to be “last month”, but the 4th will be considered as the 1st of the new month.

And we subtract 2 from the timestamp in the second phrase so that Monday appears as “last week”, but Tuesday appears as “this week”.

What do y’all think?


(sam) #4

Another example:

Every Monday at 2:00 AM, and the 1st of each month at 10:00 PM.

For the Monday at 2:00 am, this is a weekly refresh with a 26 hour offset. Normally, the refresh would happen at midnight Sunday, but we want to delay it by a whole day (24 hours) and then 2 more hours to get to 2 AM Monday.

For the 1st of the month at 10pm, this is just a monthly refresh with a 22 hour offset.

SELECT  
(extract(week) from add_hours(current_timestamp, -26)) ||
(extract(month) from add_hours(current_timestamp, -22)) ;;

(Sadhana Ampalam) #5

Hi Sam,

Thanks for your prompt reply. Seems like this might work for us. Will give it a try.

Regards,
Sadhana


(Izzy) #6

@sam with the remedy for my lazy Friday afternoon reply! I came back to it this morning to figure out a better answer but I don’t think it gets a lot better than that :smiley: