As of version 4.6, Looker supports Amazon Athena, an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage. You are charged only for the queries that are run.
This article describes how to connect to an Amazon Athena instance.
Before starting, make sure that you have:
A pair of Amazon AWS access keys
An S3 bucket
The Amazon AWS access keys must have read-write access to this bucket.
Knowledge of where your Amazon Athena instance data is located. The Region Name can be found in the upper-right portion of the Amazon console as shown below.
In Looker, go to Admin > Connections > New Database Connection.
Looker displays the form for specifying database connection details.
Fill out the database connection details:
Specify the name of the connection and how it will be referred to in LookML projects.
Select Amazon Athena.
Specify the name of the host and port. As described in the Athena documentation on the JDBC URL format, the host should be a valid Amazon endpoint (like athena.eu-west-1.amazonaws.com), and the port should stay at 443. An up-to-date list of endpoints that support Athena can be found here.
Specify the default database that you would like modeled. Other databases can be accessed, but Looker treats this database as the default database.
Specify the AWS Access key ID.
Specify the AWS Secret access key.
For example, the following settings set up an Athena connection using the US East data center:
In the last field, Additional Parameters, you must configure the bucket that was used for staging by using the
staging-bucket is the name of the staging bucket. See the Athena documentation on JDBC Driver Options for more information and a list of all available JDBC driver options.
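The host and port rule described above can be sketched in a few lines of Python. This is only an illustration, not part of Looker or the Athena driver: the helper name athena_host and the region-name check are hypothetical, and the pattern assumes the standard athena.<region>.amazonaws.com endpoint form shown in the example host. Always confirm the endpoint against the current AWS endpoint list.

```python
import re

# The article states that the port should stay at 443.
ATHENA_PORT = 443


def athena_host(region: str) -> str:
    """Build an Athena endpoint host name from an AWS region name.

    Hypothetical helper: assumes the standard
    athena.<region>.amazonaws.com form, as in the
    athena.eu-west-1.amazonaws.com example.
    """
    # Loose sanity check for names such as eu-west-1 or us-east-1.
    if not re.fullmatch(r"[a-z]{2}(-[a-z]+)+-\d", region):
        raise ValueError(f"unexpected region name: {region!r}")
    return f"athena.{region}.amazonaws.com"


if __name__ == "__main__":
    print(athena_host("eu-west-1"), ATHENA_PORT)
```

The region passed in would be the Region Name read from the Amazon console, and the resulting host and port are the values to enter in the connection form.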
Modify the following options as desired or leave them at their default values:
Max Connections
By default, Max Connections is set to 5 because this is the maximum number of concurrent queries to Amazon Athena per account. See the Athena service limits documentation for more details about the service limits. See this page for more information about the Max Connections field.
Connection Pool Timeout
Specify the connection pool timeout. By default, the timeout is set to 120 seconds. See this page for more information on the Connection Pool Timeout field.
Database Time Zone
Specify the timezone used by the dates in the database. Leave this field blank if you do not want timezone conversion. See this page for more information on the Database Time Zone field.
Query Time Zone
The desired timezone for Looker queries. Leave this field blank if you do not want timezone conversion. See this page for more information about the Query Time Zone field.
The log_level and log_path JDBC driver options are available for debugging connections. To use them, add &log_level=DEBUG&log_path=/tmp/athena_debug.log to the end of the Additional Params field and test the connection again.
If Looker is hosting the instance, then support or your analyst will need to retrieve this file to continue debugging.
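As a rough illustration of the debugging step above, the following Python sketch appends the two logging options to an existing Additional Params string. The function name with_debug_logging and the placeholder option in the usage line are hypothetical; only log_level, log_path, and the /tmp/athena_debug.log path come from this article. Handling of an empty field (no leading ampersand) is an assumption.

```python
def with_debug_logging(additional_params: str,
                       log_path: str = "/tmp/athena_debug.log") -> str:
    """Return additional_params with the debug logging options appended.

    Hypothetical helper: joins options with '&', as in the
    &log_level=DEBUG&log_path=... string shown above.
    """
    debug = f"log_level=DEBUG&log_path={log_path}"
    if not additional_params:
        # Assumption: an empty field needs no leading '&'.
        return debug
    # Avoid a doubled '&' if the existing value already ends with one.
    return f"{additional_params.rstrip('&')}&{debug}"


if __name__ == "__main__":
    # 'existing_option=value' is a placeholder, not a real driver option.
    print(with_debug_logging("existing_option=value"))
```

Once debugging is finished, removing the two options from the field returns the connection to its normal logging behavior.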
Looker’s ability to provide some features depends on the database dialect’s ability to support that functionality. In the current Looker release, Amazon Athena supports the following Looker features: