Create view from table does not consider timezone for datetime fields (BigQuery)

Consider the following BigQuery view:

SELECT CURRENT_DATETIME as dt

It produces a single column of type datetime

Then use “create view from table” to generate the view. It produces the following code:

view: test_datetime {
  sql_table_name: views.test_datetime ;;

  dimension_group: dt {
    type: time
    timeframes: [
      raw,
      time,
      date,
      week,
      month,
      quarter,
      year
    ]
    sql: CAST(${TABLE}.dt AS TIMESTAMP) ;;
  }

  measure: count {
    type: count
    drill_fields: []
  }
}

CAST(${TABLE}.dt AS TIMESTAMP) is actually incorrect if the application timezone is not UTC, because this cast assumes the DATETIME is in UTC. One way to fix it is to use:
TIMESTAMP(${TABLE}.dt, "{{ _query._query_timezone }}"), which produces the right timestamp.
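For illustration, here’s a minimal BigQuery query (the literal value and the Asia/Kolkata timezone are chosen purely as examples) showing how differently the two conversions behave:

SELECT
  CAST(DATETIME '2008-12-25 18:30:00' AS TIMESTAMP) AS assumed_utc,             -- 2008-12-25 18:30:00 UTC
  TIMESTAMP(DATETIME '2008-12-25 18:30:00', 'Asia/Kolkata') AS app_tz_applied   -- 2008-12-25 13:00:00 UTC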

A further question: why is the conversion needed at all? The field could be kept as DATETIME.

Do you consider the above behavior a bug?

Thanks

Interesting. Would this not be rectified by the timezone conversion that runs when the SQL is generated, unless you’ve added convert_tz: no to that dimension? Even if the conversion isn’t present in the sql: parameter of the dimension, if you run a query from an Explore, it will convert it properly at runtime.
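For reference, that parameter would look roughly like this on the dimension_group (just a sketch), and it would skip the runtime conversion for the field:

dimension_group: dt {
  type: time
  timeframes: [raw, time, date, week, month, quarter, year]
  convert_tz: no
  sql: CAST(${TABLE}.dt AS TIMESTAMP) ;;
}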

Let me know if I’m missing the mark with that.

Hi Izzi and thanks for looking into this.

Looker produces this SQL:

FORMAT_TIMESTAMP('%F %T', TIMESTAMP(FORMAT_TIMESTAMP('%F %T', CAST(test_datetime.dt AS TIMESTAMP) , 'Asia/Kolkata')))

My understanding is that this assumes the underlying DATETIME is in UTC and formats it accordingly. If that is the case it will work, but since DATETIME is designed to be independent of the timezone, it is more likely that the application timezone needs to be applied at the time of the first conversion.

In any case it’s going to require some assumption. So I’m thinking: why not just keep it as DATETIME and not interpret the timezone at all?

Please share your thoughts on this…

I was wondering - any thoughts on this?

Hey there!

DATETIME and TIMESTAMP are both datatypes in BigQuery, and neither of them stores timezone information. CURRENT_DATETIME is a BigQuery function that allows an optional timezone argument; its return type is DATETIME. The value of CURRENT_DATETIME is the current time in the specified timezone, or UTC if no timezone is specified.

Now, the reason that Looker adds syntax to cast that DATETIME to a TIMESTAMP is to use it in a dimension_group - a DATETIME datatype won’t work with a dimension_group, as there are conflicts in the generated SQL. With this casting there isn’t any timezone conversion happening; we’re only converting the datatype.

One way to see this is to run SELECT CAST(CURRENT_DATETIME("America/New_York") AS TIMESTAMP) on your db or in SQL Runner. This will return the current time in NY as a TIMESTAMP datatype.
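For concreteness, here’s a small sketch of that check (the column aliases are just for illustration):

SELECT
  CURRENT_DATETIME("America/New_York") AS ny_wall_clock,                       -- DATETIME, no timezone attached
  CAST(CURRENT_DATETIME("America/New_York") AS TIMESTAMP) AS ny_wall_clock_ts  -- the same digits, now typed as TIMESTAMP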

Let me know if there is anything I can clarify!

Cheers,
Adina

Hi Adina,

Thank you for writing back. The issue I’m raising is different - it’s about DATETIME fields possibly being improperly cast to TIMESTAMP with the CAST function. This can happen when the user/application timezone is not UTC.

Your comment on CURRENT_DATETIME() is correct but not related to this case. We are talking about any DATETIME field, not just the result of this function.

And about dimension_group not working with the datetime datatype - according to the documentation, “The default value for datatype is datetime”, so it is presumably expected to work. In any case, I have reported that behaviour separately here.

Thanks for your help!

Did you have an example of a DATETIME field being improperly cast? I’m not sure I follow what effect the user/application timezone has on this, since neither DATETIME nor TIMESTAMP stores timezone information. But I’d be happy to look into this more if you have a specific example!

I’m definitely with you that a datetime datatype should work with a dimension_group, and I will certainly pass this issue on to our engineering team!

Cheers,
Adina

Thanks Adina,

Let me try to illustrate the case.

I’m arguing that the way Looker chooses to convert DATETIME to TIMESTAMP using the CAST() function for a dimension_group may have undesired results.

Consider an application/user in GMT+5:30 (India). We have the following view:

view: test_datetime {
  derived_table: {
    sql: SELECT DATETIME(2008, 12, 25, 18, 30, 00) as dt ;; # DATETIME: 2008-12-25 18:30 (Date is 25th)
  }
  
  dimension_group: dt {
    type: time
    timeframes: [
      raw,
      time,
      date,
      week,
      month,
      quarter,
      year
    ]
    sql: CAST(${TABLE}.dt AS TIMESTAMP) ;; # This is automatic typecast by Looker
  }
}

Then we go to the Explore and just select dt Date as a row. The generated SQL is the following, and we expect the date to be 2008-12-25, right?

WITH test_datetime AS (SELECT DATETIME(2008, 12, 25, 18, 30, 00) as dt )
SELECT 
	CAST(TIMESTAMP(FORMAT_TIMESTAMP('%F %T', CAST(test_datetime.dt AS TIMESTAMP) , 'Asia/Kolkata')) AS DATE) AS test_datetime_dt_date
FROM test_datetime

GROUP BY 1
ORDER BY 1 DESC
LIMIT 500

So here’s the actual result - 2008-12-26. It’s logical, because 18:30 UTC on the 25th IS 00:00 on the 26th in the IST timezone, but it’s not what we expect, as the date is clearly the 25th.

If the casting were done instead with TIMESTAMP(${TABLE}.dt, "{{ _query._query_timezone }}"), the result would be as expected. And this is not an isolated case - the CAST approach breaks for any non-UTC application timezone.
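To see both conversions side by side, here’s a minimal query that can be run in SQL Runner (with the query timezone hard-coded to Asia/Kolkata just for this sketch):

WITH test_datetime AS (SELECT DATETIME(2008, 12, 25, 18, 30, 00) as dt)
SELECT
  -- current casting: the DATETIME is read as UTC, so the IST date rolls over to the 26th
  CAST(TIMESTAMP(FORMAT_TIMESTAMP('%F %T', CAST(dt AS TIMESTAMP), 'Asia/Kolkata')) AS DATE) AS date_via_cast,          -- 2008-12-26
  -- proposed casting: the DATETIME is read in the query timezone first, so the date stays on the 25th
  CAST(TIMESTAMP(FORMAT_TIMESTAMP('%F %T', TIMESTAMP(dt, 'Asia/Kolkata'), 'Asia/Kolkata')) AS DATE) AS date_via_tz_fn  -- 2008-12-25
FROM test_datetime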

Hope this explanation helps but of course just let me know if it doesn’t make sense.

I’ve been looking at the BigQuery docs and they’re super confusing on this. I’m trying to figure out where exactly things are breaking down in these conversions so I know what to run with.

This is what I think is happening. Please jump in and correct me if I botch it.

When this is executed:
SELECT DATETIME(2008, 12, 25, 18, 30, 00) as dt
I don’t think this has any timezone associated with it yet. No problems yet.

Then, when you do this:
CAST(${TABLE}.dt AS TIMESTAMP)
BigQuery automatically gives it a UTC timestamp, since we don’t give it a timezone - their docs say “If no timezone is specified, the default timezone, UTC, is used.” After this step, everything is offset, since it’s converting from the assumed UTC.
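A quick way to confirm that step in isolation (using the example value from above):

SELECT CAST(DATETIME(2008, 12, 25, 18, 30, 00) AS TIMESTAMP) AS dt_as_ts  -- 2008-12-25 18:30:00 UTC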

Is that the step where you think we should be applying TIMESTAMP(${TABLE}.dt, "{{ _query._query_timezone }}") in the LookML generated from the view? I think I follow; just making sure that this is specifically about the generated LookML, and not about the SQL generator.

Also, just out of curiosity, why wouldn’t

dimension_group: dt {
    datatype: datetime
    type: time
    timeframes: [
      raw,
      time,
      date,
      week,
      month,
      quarter,
      year
    ]
    sql: ${TABLE}.dt ;; 
  }

work? Would the auto-generated SQL proceed with the cast anyway?

Thanks for the detailed explanations on this!

Oh, I see. My idea at the end wouldn’t work because the value needs to be a TIMESTAMP for FORMAT_TIMESTAMP to work.

This would be the solution:

  dimension_group: dt_query_cast {
    type: time
    timeframes: [
      raw,
      time,
      date,
      week,
      month,
      quarter,
      year
    ]
    sql: CAST(${TABLE}.dt AS TIMESTAMP,"{{ _query._query_timezone }}") ;;
  }

That makes sense. I wonder how we could generate this in a clean and user-friendly way.

Hi Izzi,

Thank you so much for looking into this!

  1. SELECT DATETIME(2008, 12, 25, 18, 30, 00) as dt
    I don’t think this has any timezone associated with it yet. No problems yet. - Yes, correct.

  2. CAST(${TABLE}.dt AS TIMESTAMP) - Is that the step where you think we should be applying TIMESTAMP(${TABLE}.dt, "{{ _query._query_timezone }}") in the LookML generated from the view? - Yes, correct; it’s about the LookML.

  3. About the code below - IT’S THE MAIN ISSUE: IT DOESN’T WORK, as reported here, and that’s the main point I’m trying to make.

dimension_group: dt {
    datatype: datetime
    type: time
    timeframes: [
      raw,
      time,
      date,
      week,
      month,
      quarter,
      year
    ]
    sql: ${TABLE}.dt ;; 
  }

Oh, that makes total sense - it ties back to your other post, which we determined was a bug that we’ll fix. Even once that’s fixed, it looks like this is still something we’d want to fix in our auto-generated LookML. I’ll take this to the engineers who work on our model and see what they make of it.

Thanks for bearing with me as I figure this one out :slightly_smiling_face:.

I don’t think BQ will accept the above, but sql: TIMESTAMP (${TABLE}.dt, "{{ _query._query_timezone }}") ;; works just fine.
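For completeness, the corrected dimension_group from the example above would look roughly like this (sketch only):

  dimension_group: dt {
    type: time
    timeframes: [
      raw,
      time,
      date,
      week,
      month,
      quarter,
      year
    ]
    sql: TIMESTAMP(${TABLE}.dt, "{{ _query._query_timezone }}") ;;
  }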

Just to confirm, I think these are two separate, weakly related issues - one about the generated SQL, one about the generated LookML - so I reported them separately. If any further testing or explanation is needed, just let me know.

Super thanks!

Totally, I think I’ve got it through my head now! I’ll keep each thread updated as I get information.