Generating a PowerPoint presentation from all Look...

A nice example of how you can use Looker’s APIs to expand the capabilities of the data platform is to generate new kinds of content. Even data-driven businesses sometimes want to put things into a PowerPoint presentation:

With just a little bit of scripting, you can automatically generate a slide deck by downloading PNG images via the API.

1. Install the required Python modules.
This code uses the python-pptx module. There is good documentation online, including a lot more you can do with PowerPoint slides: http://python-pptx.readthedocs.io/.

While there isn’t an official Python Looker SDK client, you can install my build (from using Swagger Codegen) directly from GitHub: https://github.com/ContrastingSounds/looker_sdk_30/.

Instructions for building your own client can be found here: Generating Client SDKs for the Looker API.

Note that for Python 3 users there is a slight glitch in Swagger Codegen 2, which doesn’t handle APIs that return different content types very well. See the change history on the GItHub example above to see the necessary changes.

pip install python-pptx
pip install git+https://github.com/ContrastingSounds/looker_sdk_30

2. Generate your Looker API key
See the instructions under API3 Keys on the Users Admin documentation.

3. Prepare an empty PowerPoint slidedeck to act as a template
When we’re producing a presentation for others, we usually want to start with some standard branding and colour palettes. You can re-use an existing slide deck by just creating a copy and deleting all the slides – the deck will be empty, but all the slide layouts will be available. You’ll need the index number of the layout to use for the generated slides.

from pptx import Presentation
from pptx.util import Cm

pptx = Presentation(powerpoint_template)
for idx, layout in enumerate(pptx.slide_layouts):
    print(idx, layout.name)

In a default PowerPoint file, the available layouts will be as listed below. It’s probable that the exact contents of your own slide deck are different, but you’ll usually be able to find a layout still called something like “Title Only”.

0 Title Slide
1 Title and Content
2 Section Header
3 Two Content
4 Comparison
5 Title Only
6 Blank
7 Content with Caption
8 Picture with Caption
9 Title and Vertical Text
10 Vertical Title and Text

4. Set the number of the title layout

title_only_layout = pptx.slide_layouts[5]

5. Set the parameters and authenticate the client
Note that you need the number of the Space you want to convert to a PowerPoint. This can be found at the end of its URL:
“https://your-company.looker.com/spaces/351”

import looker_client_30 as looker_client

looker_instance = 'your-company.looker.com'
target_space = 351 # 'Period over Period' Space on the Looker instance
powerpoint_template = 'looker_template.pptx'

client_id = 'xxxxxxxx'
client_secret = 'xxxxxxxx'

# instantiate Auth API
unauthenticated_client = looker_client.ApiClient(configuration=None)
unauthenticated_client.configuration.host = f'https://{looker_instance}:19999/api/3.0/'
unauthenticated_authApi = looker_client.ApiAuthApi(unauthenticated_client)

# authenticate client
token = unauthenticated_authApi.login(client_id=client_id, client_secret=client_secret)
client = looker_client.ApiClient(header_name='Authorization', header_value='token ' + token.access_token)
client.configuration.host = f'https://{looker_instance}:19999/api/3.0/'

6. Get the details of the Space and the Looks it contains

space = looker_client.SpaceApi(client).space(target_space)
looks = looker_client.SpaceApi(client).space_looks(target_space)

7. Generate the PowerPoint

The PowerPoint itself is fairly simple to generate. Note that you might want to remove the print() statements or replace them with proper logging.

The loop below will:

Generate an API request for each Look, saving the PNG image generated
Add a new slide, give it the same title as the Look
Set the slide’s title with the name of the Look
Add a hyperlink back to the Look itself on your Looker instance

The presentation will then be saved to disk, with the same name as the original Space.

for idx, look in enumerate(looks):
    print(idx, look.id, look.title)
    
    look_request = {
        "look_id": look.id, 
        "result_format": 'png', 
        "image_width": 960, 
        "image_height": 540
    }
    try:
        image = looker_client.LookApi(client).run_look(**look_request)
        image_file = ''.join([str(look.id), '.png'])
        shutil.move(image, image_file)
    except:
        print(f'Look failed {look.id}: {look.title}')
        image_file = None
        
    pptx.slides.add_slide(title_only_layout)
    pptx.slides[idx].shapes.title.text = f'{look.title} – Look #{look.id}'
    
    try:
        pptx.slides[idx].shapes.add_picture(image_file, Cm(5.4), Cm(4.95), width=Cm(23)) # image, left, top, width
    except:
        print('Failed to add image to slide')
        
    text_box = pptx.slides[idx].shapes.add_textbox(Cm(1.0), Cm(17.6), width=Cm(23.0), height=Cm(1.0))

    p = text_box.text_frame.paragraphs[0]
    run = p.add_run()
    run.text = ''.join([f'https://{looker_instance}/looks/', str(look.id)])
    run.hyperlink.address = run.text

pptx.save(space.name + '.pptx')

4 25 6,489

25 REPLIES 25

I’m attempting to do nearly exactly this, but I need to change a filter that exists on every look before it runs. Do you know of a proper way to filter when running a look? I’ve tried to pick apart the generated sdk for the api but have had no luck.

@Tim_Pansino A look contains a single query_id, which which will contain all the filters that were saved in the original look. However, you could extract the query defintion from the Look API endpoint, and then modify the query via the Query API endpoint – would that work for your needs?

To modify the query in a Look, you need to use the Query API rather than the Look API. There’s an article on that here: Can you apply a filter to a pre-defined look via API3?

The basic steps are:

Get the query definition via the Look API
Edit the filters as necessary
Run the updated query via the Query API

Getting the query:

look = lookApi.look(770, fields='query')
look_query = look.to_dict()['query']

If you just look at the filter values, you’ll see something like this:

{'view_name.yesno_dimension_name': 'Yes',
 'view_name.date_dimension_name_': '2 years'}

The full query dictionary contains some read-only values, so you’ll need to update a few values. Here’s an example:

look_query_dict.pop('client_id')
look_query_dict['vis_config'] = {
    'type': 'looker_column',
    'colors': "['palette: Looker Classic']",
    # 'series_colors': '{}',
    'series_types': '',
    'show_dropoff': 'True',
    'show_view_names': 'False',
    # 'totals_color': '#808080',
} 
look_query_dict['filters'] = {
     'view_name.yesno_dimension_name': 'Yes',
     'view_name.date_dimension_name_': '3 years'
}

Try experimenting a bit with that – notice it’s not a perfect round-trip on the visualisation options currently. Here’s the code to run the query, including renaming the image file as per the original article:

image = queryApi.run_inline_query(result_format='png', image_width=960, image_height=540, body=look_query_dict)

image_file = ''.join([image, '.png'])
shutil.move(image, image_file)

Hope that helps!

Final note: I updated the generated SDK to handle the same response type bug in Swagger Codegen found in the Look API. Best to reinstall the looker client module from GitHub.

What client module GitHub are you referring to? I built my sdk with the swagger.json file and swagger-codegen 2.1.6. I noticed there’s a bug when receiving png files so I put a work around into the sdk to get my png file out.

Additionally, the main issue I’ve come across is that when I update the query without changing the vis_config section the visualizations are completely broken and won’t run. Do you happen to know what sections work and what do not? I could theoretically strip out just the broken sections and see if the graph looks similar enough.

For GitHub, I’m referring to the pip install instructions under step 1 in the article – but it sounds like you’ve caught and resolved the same issue for binary files in your own SDK code from swagger.

For the vis_config, I tried just re-using the unmodified vis_config section and you’re right, there are some bugs to work out. I’ve chatted to our engineers and we’re taking a closer look. Below is the simple example I used similar to the screencaps in the article. You can see it doesn’t necessarily need much to replicate your Look.

{
	    'type': 'looker_column',
	    'colors': ['palette: Looker Classic'],
	    'series_colors': {},
	    'series_types': '',
	    'show_dropoff': True,
	    'show_view_names': False,
	    'totals_color': '#808080',
	}

This Discourse provides some good examples (based on including vis_config settings in a LookML model): More Powerful Data Drilling

I actually found the bugs that cause the visualizations to not work properly and patched a work around in, but now my queries fail when rendering. I can run the same query with a json result format, but running a render produces this error:

query failed: NoMethodError: undefined method `scan’ for 97:Fixnum

97 is the value of my changed filter here. I can’t wrap my head around why it would throw an error only on a render for this. Any ideas?

The only changes I’ve had to make are the ones discussed so far: handling binary downloads, and sanitising the vis_config input. And you can avoid the first change by using the _preload_content parameter to flag binary files, like at the end of this post: Using the Looker API to generate PDFs e.g.

results = renderTask.render_task_results(task_id, _preload_content = False)

Using that approach might be a good way to check no bugs have been introduced via your patches. You can then test the API calls just by modifying the vis_config dictionary.

I’ve fixed everything now, so I’ll write up what I did for anyone with these issues in the future.

Inside the query, dynamic fields is returned as a list of every character instead of the correct data type. The solution is this

look["query"]["dynamic_fields"] = "".join(look["query"]["dynamic_fields"])

The next issue is removing read only fields from the query. I removed the following fields.

look["query"].pop("client_id")
look["query"].pop("filter_config")
look["query"].pop("filter_expression")
look["query"].pop("id")

Next I changed a filter, but the filter was a number field and that threw internal errors when visualizing for some reason. The solution was to make it a string with a leading space. I did that like this.

look["query"]["filters"]["filter_to_change"] = " {}".format(new_value)

Next, I realized the vis_config had some attributes that were strings when they needed to be dicts or lists, and some that were “True” instead of True and “False” instead of False. By evaluating them and turning strings into corresponding booleans, the visualizations were perfect. Here’s the function I made to patch them up.

def fix_vis_config(config):
    fixed = dict()
    for key,val in config.items():
        if len(val) > 0:
            if val[0] == "[" or val[0] == "{":
                fixed[key] = ast.literal_eval(val)
            elif val == "False":
                fixed[key] = False
            elif val == "True":
                fixed[key] = True
            else:
                fixed[key] = val
        else:
            fixed[key] = ""
    return fixed

After that you can create the query and run it regularly, make it a render task, or run the query inline. Getting the result back can be tricky if you’re generating your SDK yourself, so I would recommend using the one mentioned above.

If Looker is interested in any more information on how to reproduce these issues to potentially patch them, feel free to reach out to me.

Thanks Tim, I’ll add all that detail to the internal ticket.

Filter expressions are strings, but we would like to be forgiving to predictable issues like submitting a filter as an integer when that is the intended logic. Very useful input on the cases you’ve identified where values are not accepted as strings.

We are looking at developing official clients for more languages, although I’m afraid there’s no formal commitments to share at the moment.

@jonathan_walls Have their been any updates or improvements to the API since Tim submitted his comments which would impact how we apply dynamic filters during the export to powerpoint workflow?

Hi @Charlie_Walsh I don’t see anything in the recent release notes to indicate any changes to this, so the code above should still be the best way to do this!

Thanks so much for posting this!

Hi, this is a very interesting tutorial. However, I am having issues with the Looker Authentication code. The error it gives me refers to the following line: unauthenticated_client = looker_client.ApiClient(configuration=None). The error is: NameError: name ‘looker_client’ is not defined. How is this metric defined as I was not able to see it from the tutorial either. Can someone help please? Thanks in advance.
PS: I am working within Databricks.

Oh, there’s a line missing. You need to have this import statement at the top of your Python script:

import looker_client_30 as looker_client

That will define the looker_client for later. I’ll update the article to include this.

Yay, it worked! The only issue I have is this line:
client = looker_client.ApiClient(header_name=‘Authorization’, header_value='token ’ + token.access_token)

AttributeError: ‘str’ object has no attribute ‘access_token’

Do you know what this means? Also, the token is a long string, so I get one which is good but why does the token.access_token not work?

Hi @jonathan_walls

While there isn’t an official Python Looker SDK client

Actually, now there is ! Check out https://pypi.org/project/looker-sdk/

which is created by the Looker SDK codegen project: https://github.com/looker-open-source/sdk-codegen

Thanks,
-Danny

Hi Danny,
Could you please assist? I posted a week back and didn’t get any help. I am wondering at this point how your script connects these two commands:

powerpoint_template = ‘looker_template.pptx’
pptx.save(space.name + ‘.pptx’)

The first one is to create a template but then how is it referenced later when the pptx is saved (second command)? I dont see the connection in your tutorial. Can you please clarify?

Hi tintin84,

I followed the above steps and was able to produce a pptx file with png’s from looks.
I believe this line
powerpoint_template = ‘looker_template.pptx’
is not for creating a template, but for loading a template. For example you have some company wide pptx template that is used for all presentations.
I just put a path to the template file there like this:
powerpoint_template = r"C:\Users\*********\*******\*****\template.pptx"
replace ** with your actual path.

You can also omit the template if you don’t have one. The python-pptx module has a default template which is used if you provide no template in this line:
pptx = Presentation(powerpoint_template)
just leave the brackets empty like this:
pptx = Presentation()
See python-pptx documentation here: link to docs

This line
pptx.save(space.name + ‘.pptx’)]
just saves to file.

I hop this helps. Let me know if you need any help.

Hey, is this package still functioning with Python 3.7? When I try to run “import looker_client_30 as looker_client”, I get the error " async=params.get(‘async’),
SyntaxError: invalid syntax". After a bit of googling, it looks that async is now a reserved keyword in Python 3.7 so I’m thinking this may cause the issue. Thank you!

Can you do something like this with google slides instead of powerpoint?

Echoing that last one as well...what is the integration with google slides?

Above mentioned code is for downloading looks reports in PPTX format. What are the changes needed if we want to download dashboards reports? How to select “target_space = 18 # 'Period over Period' Space on the Looker instance”. Please reply @Tim_Pansino @jonathan_walls

Also interested in a solution for Google Slides

Hey, is this package still functioning with Python 3.7? When I try to run “import looker_client_30 as looker_client”, I get the error " async=params.get(‘async’),

SyntaxError: invalid syntax". After a bit of googling, it looks that async is now a reserved keyword in Python 3.7 so I’m thinking this may cause the issue. Thank you!

Hey, I'm having the same issue here! When I tried to import looker_client_30 it didn't work. I am trying to find a way to avoid doing copy and paste in Google Slides, but right now it seems impossible to me. If someone did it recently, please, share with us.

@pereiraf @AnaLimaPS @jason_ce @LisaW1 you should check out Rollstack. It allows you to connect Looker to Slides and automate the process of creating and updating slide decks and documents from Looker, Tableau, Metabase, etc.

never-displayed

Generating a PowerPoint presentation from all Looks in a Space