Creating a Data Dictionary Using Looker's API

low_priority
done
reply
(Haarthi Sadasivam) #1

You can make use of certain API calls to create a data dictionary based on fields definitions and descriptions defined in the Looker Modeling Layer.

Step 1:

Connect to our API. View Documentation on how to connect to the API

Step 2:

Define Fields with descriptions and other parameters within the Model in your looker instance.

Step 3:

Call the Method (lookML_model_explore) via the API and pull in the appropriate fields that you would want to identify within your Data Dictionary. Sample Ruby code is provided.

#Get Data Dictionary 

require 'looker-sdk'

module ApplicationHelper

	def self.api_auth
      sdk = LookerSDK::Client.new(
      # Looker/API Credentials
        :client_id => ENV['API_CLIENT_ID'],
	    :client_secret => ENV['API_SECRET'],
	    :api_endpoint => ENV['API_ENDPOINT']
      )
      return sdk
	end

	def self.get_field_values(model_name, explore_name)

		sdk = self.api_auth()
		fields = {:fields => 'id, name, description, fields'}

		#API Call to pull in metadata about fields in a particular explore
		fields = sdk.lookml_model_explore(model_name, explore_name, fields)


		my_fields = []

		#Iterate through the field definitions and pull in the description, sql, and other looker tags you might want to include in  your data dictionary. 
		fields[:fields][:dimensions].to_a.each do |x|
			dimension = {
				:field_type => 'Dimension',
				:view_name => x[:view_label].to_s,
				:field_name => x[:label_short].to_s, 
				:type => x[:type].to_s,
				:description => x[:description].to_s,
				:sql => x[:sql].to_s
			}
			my_fields << dimension
		end

		fields[:fields][:measures].to_a.each do |x|
			measure = {
				:field_type => 'Measure',
				:view_name => x[:view_label].to_s,
				:field_name => x[:label_short].to_s, 
				:type => x[:type].to_s,
				:description => x[:description].to_s,
				:sql => x[:sql].to_s
			}

			my_fields << measure
		end

		return my_fields
	end

end

Step 4:

Format the results from the get_field_values call. Consider using Plugins to allow for Search and Sort functionality. In our example, we use the DataTable JS Plugin.

Examples:

Default Dictionary with Pagination and Sorts

Dictionary filtered to fields that contain the word “lifetime”

6 Likes

Writing a simple data dictionary to csv using the Looker API and the Python requests library
Generating a Data Dictionary in Google Sheet
Building a data dictionary in Python
(Michael Erasmus) #2

If I understand this correctly you still need to know which fields are in your explores/views for this to work (they are hardcoded in your example)?

Any way to discover the field names automatically?

EDIT:

Ah sorry about that, I think I misunderstood. Those are the field attributes, sorry!

0 Likes

(John Norman) #3

I think this is neat, but why not parse the LookML in a checkout of the repo? I bet it would be radically faster than doing API calls.

1 Like

(Timothy Burke) #4

@John_Norman (or any others) did you attempt this? I imagine it is indeed possible to parse the repo and identify definition statements for measures and dimensions based on text patterns; producing a similar output, and it may be a lower learning curve than using Ruby and the API.

We’d love to do a general audit of our codebase for consistent measure names, dimension names, use of value formats, drill fields, etc., so having this data dictionary would be great, but generating it seems like somewhat of a project of its’ own.

1 Like

(John Norman) #5

We have something in progress that looks at the repo – it’s very fast.

When it’s a bit further, I’ll share it.

2 Likes

(Michael Ghen) #6

Where will it be shared @John_Norman? Sounds like really great tooling

0 Likes

(John Norman) #7

When we get it a bit further, I’ll put it on Github.

0 Likes

(Timothy Burke) #8

We ended up having success with this approach which uses Python:

0 Likes

(Ritesh Muhatte) #9

It would be great if someone can also provide the DataTable JS Plugin setup in this example.

0 Likes

(Izzy) #10

I’ll see if I can rustle that up internally. In the meantime, @John_Norman, did you end up using the datatable JS plugin for your implementation?

1 Like

(Ritesh Muhatte) #11

@Haarthi @izzy
I set up the files from the GIT (https://github.com/llooker/python_api_samples), especially for DATA Dictionary. I successfully generated the csv. In the csv file I had and blank line after each record? Did anyone see the same result set ? This is making the csv file too long unnecessary. Am I missing something in here ?
Eg output:

1,A,AB,ABC

2,B,BC,BCD

3,C,CD,CDE

0 Likes

(Izzy) #12

I haven’t tested that api script, but I’m curious if it actually increases size of the csv measurably/messes with loading it into programs. I don’t think a blank line should cause problems when importing into programs / db’s, but if it does, I’ll check it out and make sure there’s nothing broken there.

Is it just a cosmetic annoyance, or is it actually messing with your workflow? Thanks for trying it out and reporting back!

0 Likes

(Ritesh Muhatte) #13

So far it did not mess up anything, But it did minutely reduce the size after I cleared the blanks. You could say it is more of a cosmic annoyance having to scroll through.

0 Likes

(Ritesh Muhatte) #14

Were you able to rustle up something ? Also hoping to hear from @John_Norman getting curious what was his approach for this.

0 Likes