How do you deploy non-LookML views?

(George Fraser) #1

We have a few views at Fivetran that we want to reference both from Looker and from other tools, for example we’re using Census to sync data back into Salesforce. Our current trick is we’re tracking view definitions in the same Github repo as looker, and we’ve set up CircleCI to deploy these view definitions using this configuration:

version: 2
      - image: google/cloud-sdk
      - checkout
      - run: |
          echo $GCLOUD_SERVICE_KEY | gcloud auth activate-service-account --key-file=-
      # TODO in a perfect world, we would deploy pull-request views to their own schema and looker development mode would know where to find them
      - deploy:
          command: |
            if [ "${CIRCLE_BRANCH}" == "master" ]; then
              bq query --project_id=GCP_PROJECT_ID --dataset_id=scratch --use_legacy_sql=false "$(cat views/scratch/test.sql)"
              bq query --project_id=GCP_PROJECT_ID --dataset_id=share_looker --use_legacy_sql=false "$(cat views/share_looker/github_issue.sql)"
              bq query --project_id=GCP_PROJECT_ID --dataset_id=github_views --use_legacy_sql=false "$(cat views/github_views/github_issues_looker_view.sql)"
              bq query --project_id=GCP_PROJECT_ID --dataset_id=github_views --use_legacy_sql=false "$(cat views/github_views/issue_priority_level.sql)"
              bq query --project_id=GCP_PROJECT_ID --dataset_id=share_census --use_legacy_sql=false "$(cat views/share_census/github_issue.sql)"
              bq query --project_id=GCP_PROJECT_ID --dataset_id=share_census --use_legacy_sql=false "$(cat views/share_census/github_junction.sql)"

The main shortcoming of this approach is that it doesn’t interoperate with Looker’s “dev/production” mode. If you want to change both your “native view” and your LookML, you have to:

  • edit the native view
  • merge to master
  • wait for it to deploy
  • then you can edit the LookML

It’s not the end of the world, but I’m wondering how other people manage “native views” in conjunction with LookML?

(Izzy) #2

@George_Fraser, I’ve never done this, but I figured I’d pop in with an idea… It might be less performant, more expensive, and is definitely more looker-centric (but hey, I work for Looker, so who can blame me), but what if you just created your “views” as PDTs and then used those as the tables that you sync back to salesforce?

That way there’d still be that all-powerful “single source of truth” in the form of the derived table definition (in LookML), and if you wanted to edit the query, you’d have to edit it in only one place, in LookML, and you’d maintain the dev/prod versioning…

It might be that views are just way more performant, or querying the looker_scratch tables isn’t optimal for your setup, but food for thought!

Anyone else have experience doing this with actual views instead of my way?

(Jacopo Chiapparino) #3

You’re doing the right thing George.

You should write migrations for your native tables in your own migration manager (in this case CircleCI).

Building PDTs on top of looker like in the suggestion above is not a good industry standard practice: the moment you will need to run those migrations internally against eg a new database (because you are migrating to some new stack) or your team may need a table to build a new data product, that table won’t be available.

Being in control on how the data is materialized more upstream than your BI tool and leave the PDTs to analysts is a more future proof approach in our experience.

(Izzy) #4

Yeah I actually agree on second thought :slight_smile:
Another issue I thought of with my suggestion is that if you happened to use a persist_for instead of a sql_trigger_value, you might not even find the PDT table when you went to query it from outside of Looker, and with the changing hashes it could get confusing even if it was there. Probably not a good idea to use one tool to generate components that affect something “upstream” as you said.

Thanks for the input!

(George Fraser) #5

Even though these views go to other places, Looker is still the primary way we interact with data, so in a perfect world Looker’s branching mechanism would somehow interlock with the deployment of new versions of these views. But given the constraints of CircleCI/BigQuery/Looker, I suspect we’ll just follow a procedure where we change the views first, then change the LookML.

I suppose this workflow is inevitable when you’re sending the data to multiple places…for example, I’m a big fan of data studio explorer and would love to be able to interact with the same views from Looker in 90% of cases but drop down to data studio explorer in 10% of cases when I want to look at something that doesn’t fit the LookML model.

(Izzy) #6

That’s cool to hear! Yeah, I think until Looker becomes the center of the known universe (we’re workin on it!) there’ll be a few situations like this where the workflow has to be a bit more clunky. I wonder if there’s some complex setup with Git webhooks that you could create to at least automate it entirely–Maybe that’s what CircleCI does already, I’m unfamiliar.