Concatenating Nested Repeated Records

dimension
modeling
bigquery
lookml

(Brent Shreve) #1

Hi all!

I’m trying to accomplish something similar to this in an Explore:

https://stackoverflow.com/questions/29486388/concatenate-the-values-of-a-nested-field-on-bigquery

Here’s an example of our current BigQuery schema and data:

example_schema

This is what I’d like to achieve:

Is this possible without a change to the underlying data model? If not, is it best practice to create a DT or PDT when working with nested fields?

I’ve tried using some of the BigQuery string concatenation functions but they’re aggregations which don’t play nicely with dimensions.

Any tips or tricks would be appreciated!

Thanks,
Brent


(rufus) #2

@brent_shreve,

This is certainly possible without changing the underlying data. BigQuery provides some great advantages when it comes to nested data and we can take advantage of them natively without pdts.

You could use the group_concat approach to create the dimensions you need. Something like:

dimension: old_value {
    type: string
    sql: (SELECT group_concat(old_value) FROM UNNEST(${TABLE}.log_changes)) ;;
  }

A similar approach that would give you a lot more flexibility in terms of grouping on the front end would be to create individual dimensions for each piece of the address. Something like this:

dimension: old_address1 {
    type: string
    sql: (SELECT old_value FROM UNNEST(${TABLE}.log_changes) where property = 'address1') ;;
  }

From there you could also concat those dimensions together to give you one old_value field.

For reference there are some examples of how that works here - Modeling nested data in Big Query Standard SQL

Also the BigQuery docs have some examples of unnesting through multiple levels of struct/array combos. You can get pretty advanced very quickly which is why BigQuery’s nesting functionality is so powerful - Querying Array Type Fields in a Struct

Hope that helps!