Documentation

    Creating a data pool

    Lentiq acts as a management layer on top of the cloud providers you choose for your data lake. If you haven't set up your cloud provider account yet, see the Deploying Lentiq on GCP or Deploying Lentiq on AWS guides for the prerequisite steps to complete before starting your first Lentiq data pool.

    Creating a data pool

    A data pool is part of a data lake and consists of a Kubernetes cluster that can host one or more projects.

    The first data pool and project are easily created using our wizard:

    1. Type in a name for your data pool.
    2. From the dropdown menu, select the data lake on which the data pool will be created.
    3. Click Next step.

    Configuring your project

    1. Type in a relevant name for the project.
    2. Modify the project owner email address if you are creating the project on behalf of someone else.
    3. Click Next step.

    Configuring the project applications

    Multiple application clusters can be set up from the start. The default application running on each data pool is a Spark cluster.

    1. Type in a relevant name for the application.
    2. Configure the number of workers that will be created for the application and the number of cores and amount of RAM used by each worker.

    Some clusters use multiple applications that can be configured independently.

    3. Click Next step.

    Configuring your cloud provider credentials

    Lentiq currently supports Amazon Web Services and Google Cloud Platform.

    1. Select your cloud provider from the dropdown menu.
    2. Select the zone where the cloud resources will be provisioned.
    3. Select the cloud provider's credentials that will be used by projects created on the data pool.

    If you don't already have credentials set up, click Create credential.

    1. Type in a name for the credentials to serve as an identifier for later use.
    2. Paste in the access key ID from your cloud account.
    3. Paste in the entire secret access key from your cloud account.

    If you don't have a secret access key set up, you can create a new one from your cloud provider's dashboard.

    4. Click Next step.

    Configuring the hardware resources

    The hardware resources a project requires depend on its application configuration.

    1. Select the type of instance to be used by the underlying infrastructure.

    If your desired instance type is not available and you do not want to select another type, wait for a few minutes and try again.

    2. Increase the number of instances until the required amount of hardware resources is met.
    3. Click Next step.
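
    To estimate how many instances the application configuration calls for, the arithmetic can be sketched as follows. This is a rough lower bound only, not how Lentiq itself sizes a data pool: the function name, its parameters, and the example instance size are hypothetical, and the calculation ignores any capacity reserved by Kubernetes or by the platform.

```python
import math


def instances_needed(workers, cores_per_worker, ram_gb_per_worker,
                     instance_cores, instance_ram_gb):
    """Estimate the minimum number of instances for a set of workers.

    All names here are illustrative assumptions; the result is a lower
    bound because system overhead and worker placement are ignored.
    """
    if instance_cores <= 0 or instance_ram_gb <= 0:
        raise ValueError("instance capacity must be positive")

    # Total demand across all workers of the application.
    total_cores = workers * cores_per_worker
    total_ram_gb = workers * ram_gb_per_worker

    # Instances required to satisfy each resource on its own.
    by_cores = math.ceil(total_cores / instance_cores)
    by_ram = math.ceil(total_ram_gb / instance_ram_gb)

    # The scarcer resource decides how many instances are needed.
    return max(by_cores, by_ram)


# Example: 6 workers with 2 cores and 8 GB RAM each,
# on a hypothetical instance type with 4 cores and 32 GB RAM.
print(instances_needed(6, 2, 8, instance_cores=4, instance_ram_gb=32))
```

    In this example the workers need 12 cores and 48 GB of RAM in total, so cores are the limiting resource. Treat the result as a starting point and add headroom if the wizard reports that resources are still insufficient.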

    Configuring project connectivity

    1. Add firewall rules for every IP address (or range) that needs to connect to the data pool.

    2. Click Next step.
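
    Before entering addresses into the firewall rules, it can help to check that each one is well formed. The sketch below uses Python's standard ipaddress module and assumes ranges are written in CIDR notation; the exact format the wizard accepts is not stated here, and the function name is hypothetical.

```python
import ipaddress


def normalize_firewall_sources(entries):
    """Split entries into normalized CIDR strings and rejected inputs.

    Assumption: single addresses and CIDR ranges are both acceptable,
    and a single address is treated as a one-host network.
    """
    valid, invalid = [], []
    for entry in entries:
        try:
            # strict=False tolerates host bits set in a range,
            # e.g. 198.51.100.17/24 becomes 198.51.100.0/24.
            network = ipaddress.ip_network(entry.strip(), strict=False)
        except ValueError:
            invalid.append(entry)
            continue
        valid.append(str(network))
    return valid, invalid


# Documentation-reserved example addresses; the last entry is malformed.
valid, invalid = normalize_firewall_sources(
    ["203.0.113.7", "198.51.100.17/24", "10.0.0.300"]
)
print(valid)
print(invalid)
```

    Keeping the ranges as narrow as possible limits who can reach the data pool, so prefer individual addresses or small ranges over broad ones.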

    Provisioning the data pool

    Once you have completed all the configuration steps, click Provision now to deploy the hardware and to install and configure all the applications.

    The provisioning step may take a few minutes to finish, depending on the complexity of the configuration.

    Once the provisioning step is completed, you will be redirected to the project dashboard.

    Copyright © 2019 Lentiq