Looker Community

Configuring Looker to use Impala JDBC Connector for Cloudera Enterprise

Looker 4.12 added the ability to configure Looker to connect to Cloudera Impala using the Impala JDBC Connector for Cloudera Enterprise.

By default, Looker will connect to Cloudera Impala using open source Hive drivers.

The Cloudera-provided drivers are sometimes preferable, specifically for security configurations that the open source Hive drivers do not support, such as:

  • Kerberos with delegation
  • Using SSL certificates with load balanced configurations

This article explains the steps involved in configuring Looker to use these Cloudera-provided drivers:

###1. Download the drivers from www.cloudera.com or https://www.cloudera.com/documentation/other/connectors.html.

Note:

  • Be sure to download an appropriate driver for the Impala version that you are connecting to. The installation guide will list the supported Impala versions.
  • Use the latest JDBC API version available (4.1 as of driver version 2.5.37).

###2. Go to the Looker home directory and configure Looker to use the external driver:

  • mkdir custom_jdbc_drivers
  • mkdir custom_jdbc_drivers/impala
  • Copy the driver jars into the custom_jdbc_drivers/impala directory. As of the writing of this article, those jars are:
commons-codec-1.3.jar
commons-logging-1.1.1.jar
hive_metastore.jar
hive_service.jar
httpclient-4.1.3.jar
httpcore-4.1.3.jar
ImpalaJDBC41.jar
libfb303-0.9.0.jar
libthrift-0.9.0.jar
log4j-1.2.14.jar
ql.jar
slf4j-api-1.5.11.jar
slf4j-log4j12-1.5.11.jar
TCLIServiceClient.jar
zookeeper-3.4.6.jar
  • Add this to (or create) lookerstart.cfg:
    LOOKERARGS=--use-custom-jdbc-config
  • Create the file custom_jdbc_config.yml with this content:
- name: impala
  dir_name: impala
  module_path: com.cloudera.impala.jdbc41.Driver
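
The bullets in step 2 can be collected into one shell function. This is only a sketch under a few assumptions: the function name and its two arguments (the Looker home directory and a directory holding the unpacked driver jars) are placeholders, custom_jdbc_config.yml is assumed to live in the Looker home directory, and an existing LOOKERARGS line is left alone rather than edited.

```shell
#!/bin/sh
# Sketch of step 2. setup_impala_driver and its arguments are illustrative
# names, not part of Looker.
setup_impala_driver() {
    looker_home=$1   # Looker home directory (assumed to hold lookerstart.cfg)
    jar_src=$2       # directory containing the unpacked Cloudera driver jars

    mkdir -p "$looker_home/custom_jdbc_drivers/impala" || return 1
    cp "$jar_src"/*.jar "$looker_home/custom_jdbc_drivers/impala/" || return 1

    cfg="$looker_home/lookerstart.cfg"
    if [ -f "$cfg" ] && grep -q -- '--use-custom-jdbc-config' "$cfg"; then
        : # flag already present, nothing to do
    elif [ -f "$cfg" ] && grep -q '^LOOKERARGS=' "$cfg"; then
        # Do not guess how to merge with existing arguments.
        echo "LOOKERARGS already set in $cfg; add --use-custom-jdbc-config by hand" >&2
    else
        echo 'LOOKERARGS=--use-custom-jdbc-config' >> "$cfg"
    fi

    # Same content as the custom_jdbc_config.yml shown above.
    cat > "$looker_home/custom_jdbc_config.yml" <<'EOF'
- name: impala
  dir_name: impala
  module_path: com.cloudera.impala.jdbc41.Driver
EOF
}
```

It would be called as, for example, setup_impala_driver /path/to/looker /path/to/unpacked/jars, with both paths replaced by your own.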

###3. Restart Looker

./looker restart

###4. Configure the connection to Impala

  • Create a new connection by going to the Admin screens and choosing Connections. Then choose “New Connection”.
  • Under Dialect, choose “Cloudera Impala with Native Driver”. This option is only available when the driver is installed properly. If it does not appear, check that everything in step 2 is configured correctly and that Looker was restarted.
  • (Updated) In Looker 4.14.9 and prior, password-based authentication is not supported. You can connect anonymously, with a username, or via Kerberos. User/password authentication is supported starting with 4.14.10.
  • You may need to refer to the Cloudera JDBC Driver documentation to find the proper parameters to use. Looker automatically sets the parameter UseNativeQuery=1 for all connections, and also sets AuthMech=?. If the SSL checkbox is checked, Looker adds SSL=1 to the connection string; it also adds CAIssuedCertNamesMismatch=1 unless Verify SSL Cert is checked as well.
  • For an “on-prem” install using Kerberos, you will need to add the following to Additional Params:
    KrbRealm=EXAMPLE.COM;KrbHostFQDN=node1.example.com;KrbServiceName=impala
    In addition, make sure that the account under which Looker is running has an active Kerberos ticket and that Looker has been configured for Kerberos authentication. The steps in the article Connecting Looker to Hive With Kerberos Authentication (up to the point where the Hive connection is actually configured) apply to Impala as well, with one modification: the gss-jaas.conf file should be configured like this:
Client {
    com.sun.security.auth.module.Krb5LoginModule required
    useTicketCache=true
    doNotPrompt=true;
};

If you are using Kerberos with both the Impala JDBC connector and another connector, you can put both configurations in this file:

com.sun.security.jgss.initiate {
    com.sun.security.auth.module.Krb5LoginModule required
    useTicketCache=true
    doNotPrompt=true;
};
Client {
    com.sun.security.auth.module.Krb5LoginModule required
    useTicketCache=true
    doNotPrompt=true;
};
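
To make the parameter handling in step 4 concrete, here is a small shell sketch that assembles a connection string the way the bullets describe. It is not Looker's code: build_impala_url and its argument order are hypothetical, the AuthMech value is passed in by the caller rather than assumed, and where SSL=1 sits relative to the Additional Params is a guess.

```shell
#!/bin/sh
# Illustrative only: mimics the parameters described in step 4.
# Args: host port database auth_mech ssl(0|1) verify_cert(0|1) [additional_params]
build_impala_url() {
    host=$1
    port=$2
    db=$3
    auth_mech=$4
    ssl=$5
    verify_cert=$6
    extra=${7:-}

    # UseNativeQuery=1 is always set; AuthMech is supplied by the caller here.
    url="jdbc:impala://$host:$port/$db;UseNativeQuery=1;AuthMech=$auth_mech"

    if [ "$ssl" = 1 ]; then
        url="$url;SSL=1"
        # Added unless certificate verification is also requested.
        if [ "$verify_cert" != 1 ]; then
            url="$url;CAIssuedCertNamesMismatch=1"
        fi
    fi

    # Additional Params are appended as typed (placement assumed).
    if [ -n "$extra" ]; then
        url="$url;$extra"
    fi

    printf '%s\n' "$url"
}
```

For example, build_impala_url node1.example.com 21050 default 1 1 0 'KrbRealm=EXAMPLE.COM;KrbHostFQDN=node1.example.com;KrbServiceName=impala' prints a string containing SSL=1, CAIssuedCertNamesMismatch=1, and the three Kerberos parameters. The port and the database name "default" are placeholders.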

###5. Test the connection
Use the “Test These Settings” button to test the connection.

In Looker 4.14.10 and later, username/password authentication is supported.
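
For Kerberos setups, it can help to confirm that the account running Looker holds an active ticket before pressing the test button. The sketch below relies on klist -s, which runs silently and reports the state of the credential cache through its exit status. The function names and the KLIST override variable are hypothetical, and the script has to be run as the same account that runs Looker.

```shell
#!/bin/sh
# KLIST is a hypothetical override (handy for testing); it defaults to klist.
has_kerberos_ticket() {
    # "klist -s" prints nothing and exits 0 only when the credential cache
    # can be read and has not expired.
    "${KLIST:-klist}" -s
}

report_kerberos_ticket() {
    if has_kerberos_ticket; then
        echo "active Kerberos ticket found"
    else
        echo "no active Kerberos ticket; run kinit as the Looker account"
        return 1
    fi
}
```

Calling report_kerberos_ticket prints one of the two messages and returns non-zero when no valid ticket is found.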

I have set everything up exactly as stated here, but I cannot connect to my Kerberized Impala running in Cloudera CDH.
Somehow Looker puts AuthMech=0 into the connection string even though I explicitly set AuthMech=1 (Kerberos).

2018-02-19 19:50:50.386 +0000 [INFO|00058|db::4] :: jdbc connect using: jdbc:impala://ip-10-197-28-99.eu-west-1.compute.internal:21050/work;UseNativeQuery=1;AuthMech=0;AuthMech=1;principal=impala/ip-10-197-28-99.eu-west-1.compute.internal@MYREALM.LOCAL
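
A repeated parameter such as the AuthMech pair in the log line above can be spotted with a short shell sketch. The function name duplicate_params is hypothetical. It only lists keys that occur more than once after the first semicolon, and says nothing about which value the driver ends up using.

```shell
#!/bin/sh
# Print every parameter name that appears more than once in a JDBC URL.
duplicate_params() {
    # Drop everything up to the first ';', split the rest on ';',
    # keep the key before '=', and print the keys seen more than once.
    printf '%s\n' "${1#*;}" | tr ';' '\n' | cut -d= -f1 | sort | uniq -d
}
```

Run against the "jdbc connect using:" string from the log, it prints AuthMech.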

Hi @TomasF,

Is this behavior still persisting for you? Also, if you try not explicitly setting AuthMech in additional parameters, are you still seeing this incorrect setting and failure to connect? Please let me know.

Thanks,

Quinn

I was able to set up the connection, but I don't remember exactly what the root cause was.

This is my working additional params for impersonation:

KrbHostFQDN=.eu-west-1.compute.internal;KrbRealm=;KrbServiceName=impala;DelegationUID={{_user_attributes['principal']}}

Tomas