Connecting to Spark from a notebook

In Lentiq, applications can be interconnected seamlessly. This guide explores how to connect a notebook to a Spark cluster when you need to scale your data science tasks.

Prerequisites

Before you can connect, the following must be up and running:

  1. A data pool
  2. A project
  3. A Jupyter Notebook instance
  4. A Spark cluster

How to connect a notebook to a Spark cluster

Once all the prerequisites are in place, follow the steps below.

  1. Connect to the Jupyter Notebook instance using the URL and password provided in the interface.

  2. Create a new notebook or open the Getting Started Guide notebook.

  3. Add a new cell where you will configure the connection to the Spark cluster, or locate the Spark connection cell in the Getting Started Guide notebook. A minimal example cell is shown after these steps.

  4. Copy the Spark Master connection URL. You can find it in the Application Management view, on the Spark cluster application card.

  5. Paste the Spark Master connection URL into the newly created cell, or into the Spark connection cell in the Getting Started Guide notebook.

  6. Run the cell by pressing Shift+Enter.

  7. Wait for the cell to finish running, then check the Spark Master Web UI to confirm that the application is registered.
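
The exact contents of the Spark connection cell depend on the notebook image you are using, but a minimal PySpark sketch along the following lines illustrates the idea. The master URL, application name, and test job are placeholders for illustration only; substitute the Spark Master connection URL copied in step 4.

    # A minimal sketch of a Spark connection cell. PySpark is assumed to be
    # available in the notebook environment.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("spark://<spark-master-host>:7077")  # placeholder; use the URL from the Spark cluster application card
        .appName("notebook-spark-connection")        # placeholder application name
        .getOrCreate()
    )

    # Run a trivial distributed job; if it succeeds, the application should also
    # appear as registered in the Spark Master Web UI.
    print(spark.sparkContext.parallelize(range(1000)).count())  # expected output: 1000

Running this cell submits the application to the cluster; it should then show up as a registered application in the Spark Master Web UI.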
