Documentation

Documentation

  • Home
  • Blog
  • API
  • Contact

›User Guide

Overview

  • Lentiq introduction
  • Lentiq architecture
  • What is a data pool?
  • What is a project?
  • Migrating from Hadoop

Getting started

  • Deploying applications and processing clusters
  • Connecting to Spark from a notebook
  • Uploading data to Lentiq
  • Creating a data pool
  • Deploying on GCP
  • Deploying on AWS

User Guide

    Managing applications

    • Working with applications
    • Managing compute resources

    Managing data

    • Working with data and metadata
    • Sharing data between data pools
    • Querying data with SQL (DataGrip)
    • Connecting Tableau to Lentiq

    Managing models

    • Working with models
    • Publishing notebooks
    • Training and serializing a model
    • Managing model servers

    Managing workflows

    • Working with workflows
    • Creating a reusable code block from a notebook
    • Creating a docker image based reusable code block
  • Glossary
  • API

Tutorials

  • End-to-end Machine Learning Tutorial

Connecting Tableau to Lentiq

Tableau is a very popular data exploration tool that we love and that works nicely with Lentiq. The query engine is SparkSQL which uses Spark's in-memory mechanisms and query planner to execute SQL queries on data.

Prerequisites

  1. Have Tableau Desktop installed. If you don't already have one installed, a trial can be downloaded from the tableau.com website.
  2. Have a Lentiq account, a data pool provisioned and at least one table registered showing up in the Table Browser. For more information follow the working with data guide

1. Deploy the Spark SQL Application

From the Lentiq's left-hand application panel click on the SparkSQL icon.

sparksql jdbc connector

Note that the resources allocated can always be increased. You will want to increase these resources if multiple users use the same application or create separate applications for each user.

Make sure that the IP of the client connecting with Tableau is white-listed in the Firewall tab either at the application level or the data pool or project level.

Click Create Spark SQL and wait for it to be provisioned.

2. Download & install driver (optional)

Tableau uses database drivers to connect to the various data sources. We will be using a driver called Simba Spark ODBC Driver to connect to our SparkSQL query engine, for both Mac and Windows. This driver is now shipped with Tableau starting 10.2 - 2019.1.3 version. If you use an older version follow the steps below. Otherwise, skip to step 3.

To download and install the driver:

  1. Navigate to Tableau Drivers page
  2. Search for Spark SQL
  3. Select your operating system and download the driver
  4. Follow on-screen instructions

sparksql jdbc connector

3. Configure Connection details

In the Dashboard click on the SparkSQL api's EDIT button and copy the IP to your clipboard or take a note of it. The port is always 10000.

sparksql get host

3.1. Create ODBC DSN (Windows)

To specify connection details ODBC drivers use what's called a DSN which is a collection of settings.

  1. Open ODBC Data Source Administrator (64 bit). You will need Administrator rights. sparksql get host

  2. On the SystemDSN tab double click the Simba Spark DSN and configure the host field with the IP collected above. tableau configure ODBC DSN

  3. Click OK.

3.2 Create ODBC DSN (macOS)

On Mac edit the ~/Library/ODBC/odbc.ini file using vim or some other editor (you will need administrator rights):

sudo vim ~/Library/ODBC/odbc.ini

And append the following DSN configuration:

[ODBC Data Sources]
Lentiq=Simba Spark ODBC Driver

[Lentiq]
Driver=/Library/simba/spark/lib/libsparkodbc_sbu.dylib
Host=35.202.105.247
Port=10000

4. Configure Tableau data source

Now you need to tell Tableau to use our newly created DSN.

4.1 Configure Tableau data source (Windows)

  1. In Tableau go to the Connect to data page

tableau configure ODBC DSN

  1. Select other ODBC Databases

tableau configure ODBC DSN

  1. Select the previously defined DSN from the dropdown (Lentiq).

tableau configure ODBC DSN

Click Connect and then click Sign in.

An executing query dialog will appear and it will take a while to execute. That is normal.

  1. In the Data Source view select the SPARK database and the default schema. If the schema does not appear in the dropdown click on the magnifying glass icon or type in the name of the schema.

tableau configure ODBC DSN

  1. Select tables of interest

Add tables of interest to your tableau sheets. If the tables do not appear in the list below click on the magnifying glass icon or type in the name of the table as it appears in Lentiq's table browser.

tableau configure ODBC DSN

4.2 Configure Tableau data source (macOS)

  1. In Tableau go to the Connect to data page

tableau configure ODBC DSN

  1. Select other ODBC Databases

tableau configure ODBC DSN

  1. Select the previously defined DSN from the dropdown (Lentiq).

tableau configure ODBC DSN

Click Sign in.

An executing query dialog will appear and it will take a while to execute. That is normal.

  1. In the Data Source view select the SPARK database and the default schema. If the schema does not appear in the dropdown click on the magnifing glass icon or type in the name of the schema.

tableau configure ODBC DSN

  1. Select tables of interest

Add tables of interest to your tableau sheets.

tableau configure ODBC DSN

You can now browse and work with data in all tables in the data lake. The performance of the entire system is heavily dependent on the amount of resources you allocate to your SparkSQL query engine. Since spark caches data in RAM the very first interactions will be slower but subsequent ones should be fast.

← Querying data with SQL (DataGrip)Working with models →
  • Prerequisites
  • 1. Deploy the Spark SQL Application
  • 2. Download & install driver (optional)
  • 3. Configure Connection details
    • 3.1. Create ODBC DSN (Windows)
    • 3.2 Create ODBC DSN (macOS)
  • 4. Configure Tableau data source
    • 4.1 Configure Tableau data source (Windows)
    • 4.2 Configure Tableau data source (macOS)
Copyright © 2019 Lentiq