[RETIRED] General notes for connecting Looker to Spark SQL


(Todd Nemet) #1

The content of this article has been updated and moved to Looker’s technical documentation here.


(Michael Erasmus) #2

Thanks so much for this! I got it working on our Looker instance just now, running on EMR, which is pretty smooth with their Spark support. One thing to mention (I wasn’t quite sure about this) is that I had to use the name ‘default’ for the database. I also had to open port 10000 on my EC2 security group.


(Todd Nemet) #3

Thanks for the notes, and I’m glad that you found it useful. What version of Spark are you connecting to?


#4

Thanks for the article. I have a custom UDF jar; how do I add it to the Thrift Server classpath? I did something like:

sudo -u spark /path/to/spark/sbin/start-thriftserver.sh --queue interactive.thrift --jars /opt/lib/custom-udfs.jar

Is this correct?

Then I tried to make a JDBC connection with beeline:

beeline -u jdbc:hive2://localhost:10000/default

I am able to connect, but how do I verify that the custom UDFs are indeed working?

Thanks


(Max Corbin) #5

Hey there @Christan!

I think the best way to test whether the custom UDF jar is working is to try running those commands via beeline. If you’re able to call those user-defined functions in a beeline session, that means the Thrift Server has access to them and you are doing this correctly.
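For example, assuming the jar contains a Hive-style UDF class (the function name `my_upper` and class name `com.example.udfs.MyUpper` below are placeholders, not from the original post), you could register and call it in a beeline session:

```sql
-- Register the UDF from the jar passed to start-thriftserver.sh via --jars
-- (substitute your UDF's actual fully qualified class name)
CREATE TEMPORARY FUNCTION my_upper AS 'com.example.udfs.MyUpper';

-- If this returns a result instead of a ClassNotFoundException,
-- the Thrift Server can see the jar on its classpath
SELECT my_upper('hello');
```

If the `CREATE TEMPORARY FUNCTION` statement fails with a class-not-found error, the jar was likely not picked up; in that case, check the `--jars` path or set `spark.jars` in the Thrift Server configuration instead.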


(kenneth.vinson) #6

Just added the “Feature Support” section to this article.


(kenneth.vinson) #7

I retired this article. The content can now be found in Looker’s documentation here.