Generating a PowerPoint presentation from all Looks in a Space

sdk
api
python

(jonathan.walls) #1

A nice example of how you can use Looker’s APIs to expand the capabilities of the data platform is to generate new kinds of content. Even data-driven businesses sometimes want to put things into a PowerPoint presentation:

With just a little bit of scripting, you can automatically generate a slide deck by downloading PNG images via the API.

1. Install the required Python modules.
This code uses the python-pptx module. There is good documentation online, including a lot more you can do with PowerPoint slides: http://python-pptx.readthedocs.io/.

While there isn’t an official Python Looker SDK client, you can install my build (from using Swagger Codegen) directly from GitHub: https://github.com/ContrastingSounds/looker_sdk_30/.

Instructions for building your own client can be found here: Generating Client SDKs for the Looker API.

Note that for Python 3 users there is a slight glitch in Swagger Codegen 2, which doesn’t handle APIs that return different content types very well. See the change history on the GItHub example above to see the necessary changes.

pip install python-pptx
pip install git+https://github.com/ContrastingSounds/looker_sdk_30

2. Generate your Looker API key
See the instructions under API3 Keys on the Users Admin documentation.

3. Prepare an empty PowerPoint slidedeck to act as a template
When we’re producing a presentation for others, we usually want to start with some standard branding and colour palettes. You can re-use an existing slide deck by just creating a copy and deleting all the slides – the deck will be empty, but all the slide layouts will be available. You’ll need the index number of the layout to use for the generated slides.

from pptx import Presentation
from pptx.util import Cm

pptx = Presentation(powerpoint_template)
for idx, layout in enumerate(pptx.slide_layouts):
    print(idx, layout.name)

In a default PowerPoint file, the available layouts will be as listed below. It’s probable that the exact contents of your own slide deck are different, but you’ll usually be able to find a layout still called something like “Title Only”.

0 Title Slide
1 Title and Content
2 Section Header
3 Two Content
4 Comparison
5 Title Only
6 Blank
7 Content with Caption
8 Picture with Caption
9 Title and Vertical Text
10 Vertical Title and Text

4. Set the number of the title layout

title_only_layout = pptx.slide_layouts[5]

5. Set the parameters and authenticate the client
Note that you need the number of the Space you want to convert to a PowerPoint. This can be found at the end of its URL:
https://your-company.looker.com/spaces/351

looker_instance = 'your-company.looker.com'
target_space = 351 # 'Period over Period' Space on the Looker instance
powerpoint_template = 'looker_template.pptx'

client_id = 'xxxxxxxx'
client_secret = 'xxxxxxxx'

# instantiate Auth API
unauthenticated_client = looker_client.ApiClient(configuration=None)
unauthenticated_client.configuration.host = f'https://{looker_instance}:19999/api/3.0/'
unauthenticated_authApi = looker_client.ApiAuthApi(unauthenticated_client)

# authenticate client
token = unauthenticated_authApi.login(client_id=client_id, client_secret=client_secret)
client = looker_client.ApiClient(header_name='Authorization', header_value='token ' + token.access_token)
client.configuration.host = f'https://{looker_instance}:19999/api/3.0/'

6. Get the details of the Space and the Looks it contains

space = looker_client.SpaceApi(client).space(target_space)
looks = looker_client.SpaceApi(client).space_looks(target_space)

7. Generate the PowerPoint

The PowerPoint itself is fairly simple to generate. Note that you might want to remove the print() statements or replace them with proper logging.

The loop below will:

  1. Generate an API request for each Look, saving the PNG image generated
  2. Add a new slide, give it the same title as the Look
  3. Set the slide’s title with the name of the Look
  4. Add a hyperlink back to the Look itself on your Looker instance

The presentation will then be saved to disk, with the same name as the original Space.

for idx, look in enumerate(looks):
    print(idx, look.id, look.title)
    
    look_request = {
        "look_id": look.id, 
        "result_format": 'png', 
        "image_width": 960, 
        "image_height": 540
    }
    try:
        image = looker_client.LookApi(client).run_look(**look_request)
        image_file = ''.join([str(look.id), '.png'])
        shutil.move(image, image_file)
    except:
        print(f'Look failed {look.id}: {look.title}')
        image_file = None
        
    pptx.slides.add_slide(title_only_layout)
    pptx.slides[idx].shapes.title.text = f'{look.title} – Look #{look.id}'
    
    try:
        pptx.slides[idx].shapes.add_picture(image_file, Cm(5.4), Cm(4.95), width=Cm(23)) # image, left, top, width
    except:
        print('Failed to add image to slide')
        
    text_box = pptx.slides[idx].shapes.add_textbox(Cm(1.0), Cm(17.6), width=Cm(23.0), height=Cm(1.0))

    p = text_box.text_frame.paragraphs[0]
    run = p.add_run()
    run.text = ''.join([f'https://{looker_instance}/looks/', str(look.id)])
    run.hyperlink.address = run.text

pptx.save(space.name + '.pptx')

(Tim Pansino) #2

I’m attempting to do nearly exactly this, but I need to change a filter that exists on every look before it runs. Do you know of a proper way to filter when running a look? I’ve tried to pick apart the generated sdk for the api but have had no luck.


(Will Savage) #3

@Tim_Pansino A look contains a single query_id, which which will contain all the filters that were saved in the original look. However, you could extract the query defintion from the Look API endpoint, and then modify the query via the Query API endpoint – would that work for your needs?


(jonathan.walls) #4

To modify the query in a Look, you need to use the Query API rather than the Look API. There’s an article on that here: Can you apply a filter to a pre-defined look via API3?

The basic steps are:

  1. Get the query definition via the Look API
  2. Edit the filters as necessary
  3. Run the updated query via the Query API

Getting the query:

look = lookApi.look(770, fields='query')
look_query = look.to_dict()['query']

If you just look at the filter values, you’ll see something like this:

{'view_name.yesno_dimension_name': 'Yes',
 'view_name.date_dimension_name_': '2 years'}

The full query dictionary contains some read-only values, so you’ll need to update a few values. Here’s an example:

look_query_dict.pop('client_id')
look_query_dict['vis_config'] = {
    'type': 'looker_column',
    'colors': "['palette: Looker Classic']",
    # 'series_colors': '{}',
    'series_types': '',
    'show_dropoff': 'True',
    'show_view_names': 'False',
    # 'totals_color': '#808080',
} 
look_query_dict['filters'] = {
     'view_name.yesno_dimension_name': 'Yes',
     'view_name.date_dimension_name_': '3 years'
}

Try experimenting a bit with that – notice it’s not a perfect round-trip on the visualisation options currently. Here’s the code to run the query, including renaming the image file as per the original article:

image = queryApi.run_inline_query(result_format='png', image_width=960, image_height=540, body=look_query_dict)

image_file = ''.join([image, '.png'])
shutil.move(image, image_file)

Hope that helps!

Final note: I updated the generated SDK to handle the same response type bug in Swagger Codegen found in the Look API. Best to reinstall the looker client module from GitHub.


(Tim Pansino) #5

What client module GitHub are you referring to? I built my sdk with the swagger.json file and swagger-codegen 2.1.6. I noticed there’s a bug when receiving png files so I put a work around into the sdk to get my png file out.

Additionally, the main issue I’ve come across is that when I update the query without changing the vis_config section the visualizations are completely broken and won’t run. Do you happen to know what sections work and what do not? I could theoretically strip out just the broken sections and see if the graph looks similar enough.


(jonathan.walls) #6

For GitHub, I’m referring to the pip install instructions under step 1 in the article – but it sounds like you’ve caught and resolved the same issue for binary files in your own SDK code from swagger.

For the vis_config, I tried just re-using the unmodified vis_config section and you’re right, there are some bugs to work out. I’ve chatted to our engineers and we’re taking a closer look. Below is the simple example I used similar to the screencaps in the article. You can see it doesn’t necessarily need much to replicate your Look.

{
	    'type': 'looker_column',
	    'colors': ['palette: Looker Classic'],
	    'series_colors': {},
	    'series_types': '',
	    'show_dropoff': True,
	    'show_view_names': False,
	    'totals_color': '#808080',
	}

This Discourse provides some good examples (based on including vis_config settings in a LookML model): More Powerful Data Drilling


(Tim Pansino) #7

I actually found the bugs that cause the visualizations to not work properly and patched a work around in, but now my queries fail when rendering. I can run the same query with a json result format, but running a render produces this error:

query failed: NoMethodError: undefined method `scan’ for 97:Fixnum

97 is the value of my changed filter here. I can’t wrap my head around why it would throw an error only on a render for this. Any ideas?


(jonathan.walls) #8

The only changes I’ve had to make are the ones discussed so far: handling binary downloads, and sanitising the vis_config input. And you can avoid the first change by using the _preload_content parameter to flag binary files, like at the end of this post: Using the Looker API to generate PDFs e.g.

results = renderTask.render_task_results(task_id, _preload_content = False)

Using that approach might be a good way to check no bugs have been introduced via your patches. You can then test the API calls just by modifying the vis_config dictionary.


(Tim Pansino) #9

I’ve fixed everything now, so I’ll write up what I did for anyone with these issues in the future.

Inside the query, dynamic fields is returned as a list of every character instead of the correct data type. The solution is this

look["query"]["dynamic_fields"] = "".join(look["query"]["dynamic_fields"])

The next issue is removing read only fields from the query. I removed the following fields.

look["query"].pop("client_id")
look["query"].pop("filter_config")
look["query"].pop("filter_expression")
look["query"].pop("id")

Next I changed a filter, but the filter was a number field and that threw internal errors when visualizing for some reason. The solution was to make it a string with a leading space. I did that like this.

look["query"]["filters"]["filter_to_change"] = " {}".format(new_value)

Next, I realized the vis_config had some attributes that were strings when they needed to be dicts or lists, and some that were “True” instead of True and “False” instead of False. By evaluating them and turning strings into corresponding booleans, the visualizations were perfect. Here’s the function I made to patch them up.

def fix_vis_config(config):
    fixed = dict()
    for key,val in config.items():
        if len(val) > 0:
            if val[0] == "[" or val[0] == "{":
                fixed[key] = ast.literal_eval(val)
            elif val == "False":
                fixed[key] = False
            elif val == "True":
                fixed[key] = True
            else:
                fixed[key] = val
        else:
            fixed[key] = ""
    return fixed

After that you can create the query and run it regularly, make it a render task, or run the query inline. Getting the result back can be tricky if you’re generating your SDK yourself, so I would recommend using the one mentioned above.

If Looker is interested in any more information on how to reproduce these issues to potentially patch them, feel free to reach out to me.


(jonathan.walls) #10

Thanks Tim, I’ll add all that detail to the internal ticket.

Filter expressions are strings, but we would like to be forgiving to predictable issues like submitting a filter as an integer when that is the intended logic. Very useful input on the cases you’ve identified where values are not accepted as strings.

We are looking at developing official clients for more languages, although I’m afraid there’s no formal commitments to share at the moment.


(Charlie Walsh) #11

@jonathan.walls Have their been any updates or improvements to the API since Tim submitted his comments which would impact how we apply dynamic filters during the export to powerpoint workflow?


(Peggy Beard) #12

Hi @Charlie_Walsh I don’t see anything in the recent release notes to indicate any changes to this, so the code above should still be the best way to do this!