Documentation

Managing compute resources

Managing the resources associated with data pools and projects is a fundamental task. You will do it at the start of a project and again whenever your data and your team's needs grow.

[Figure: application, project and data pool resources]

Resources are provisioned (nodes are deployed) on a data pool, allocated to projects and then consumed by applications and workflow tasks.

  1. What is a data pool?
  2. What is a project?

Increasing data pool resources

Compute resources, that is, the nodes of the underlying Kubernetes cluster, are provisioned at the data pool level. These resources are then allocated to projects. This will change in the future with autoscaling.

To add more resources to the cluster, add additional nodes in the data pool:

  1. From the Data Pool dropdown at the bottom of the UI, select a data pool that you can manage (one where you are a Manager or an Owner).
  2. Click the Data Pool settings button at the bottom of the page.
  3. Increase the node count.

[Figure: Data pool settings]

Adding nodes increases the data pool's available resources, but it does not automatically increase the resources allocated to projects or applications. You need to allocate the newly available capacity manually.
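The relationship between pool capacity and project allocations can be sketched with a small toy model. This is only an illustration, not the Lentiq API: the `DataPool` class, its fields and the node sizes are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DataPool:
    """Toy model of a data pool (hypothetical, not a Lentiq class)."""
    node_count: int
    cores_per_node: int
    ram_gb_per_node: int
    # project name -> (cores, ram_gb) currently allocated to that project
    projects: dict = field(default_factory=dict)

    def total(self):
        """Total (cores, ram_gb) provided by all nodes in the pool."""
        return (self.node_count * self.cores_per_node,
                self.node_count * self.ram_gb_per_node)

    def unallocated(self):
        """Capacity that is provisioned but not yet given to any project."""
        total_cores, total_ram = self.total()
        used_cores = sum(c for c, _ in self.projects.values())
        used_ram = sum(r for _, r in self.projects.values())
        return (total_cores - used_cores, total_ram - used_ram)

    def allocate(self, project, cores, ram_gb):
        """Grow a project's allocation; fail if the pool lacks capacity."""
        free_cores, free_ram = self.unallocated()
        if cores > free_cores or ram_gb > free_ram:
            raise ValueError("not enough unallocated resources in the data pool")
        cur_cores, cur_ram = self.projects.get(project, (0, 0))
        self.projects[project] = (cur_cores + cores, cur_ram + ram_gb)


pool = DataPool(node_count=3, cores_per_node=4, ram_gb_per_node=16)
pool.allocate("analytics", cores=12, ram_gb=48)  # project takes the whole pool

pool.node_count += 1                  # add one node to the data pool
print(pool.projects["analytics"])     # (12, 48): the project did not grow
print(pool.unallocated())             # (4, 16): new capacity is unallocated

pool.allocate("analytics", cores=4, ram_gb=16)  # the manual allocation step
print(pool.projects["analytics"])     # (16, 64)
```

In this sketch the extra node only changes `unallocated()`. The project's share grows only after the explicit `allocate` call, which mirrors the manual step described above.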

Note: You cannot change the node type after the initial deployment of a data pool, so before creating a data pool think about the CPU-to-RAM ratio that you will most likely need. A good ratio is 1 core for every 4 GB of RAM.
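As a quick way to compare node types against the 1 core per 4 GB guideline, the arithmetic can be written out as below. The node shapes and their names are made up for illustration; substitute the core and RAM figures of the node types your cloud provider actually offers.

```python
TARGET_GB_PER_CORE = 4.0  # guideline from the note above: 1 core per 4 GB of RAM


def ram_per_core(cores: int, ram_gb: float) -> float:
    """Return how many GB of RAM a node offers per CPU core."""
    return ram_gb / cores


# Hypothetical node shapes: name -> (cores, ram_gb)
candidates = {
    "shape-a": (4, 16),
    "shape-b": (8, 16),
    "shape-c": (4, 32),
}

ratios = {name: ram_per_core(c, r) for name, (c, r) in candidates.items()}

# Pick the shape whose ratio is closest to the target
best = min(ratios, key=lambda name: abs(ratios[name] - TARGET_GB_PER_CORE))

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} GB per core")
print("closest to target:", best)
```

A shape well below the target is CPU-heavy and one well above it is memory-heavy. Which side to lean toward depends on your workloads, so treat the target as a starting point rather than a rule.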

Allocating resources to projects

Each project is allocated a portion of the data pool's total resources. To allocate more resources to a project:

  1. Select the project whose resources you want to increase from the dropdown at the top of the UI.
  2. Click the Project settings button at the top of the UI.
  3. Increase the resources by clicking the "+" sign for both RAM and CPU.

[Figure: Data pool settings]

Note: Workflows consume resources from the project, so keep enough free resources in the project at all times for workflows to execute. This will change in the future with autoscaling.

Allocating resources to applications or workflow tasks

Resources are allocated from the project to an application through the application's configuration tab. To open it, click the Edit button for that application in the Applications view.

[Figure: spark configuration]

Tasks have an equivalent control for allocating resources. Note that these resources must be available in the project at execution time; otherwise the task waits in the PENDING state until resources become available.

[Figure: spark configuration]
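The waiting behaviour described above can be pictured with a minimal sketch. It is not how Lentiq or Kubernetes implements scheduling: the function name and the tuple layout are hypothetical, and `RUNNING` is an assumed name for the state a task enters once it fits (only `PENDING` is named in this guide).

```python
from enum import Enum


class TaskState(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"  # assumed state name, used here for illustration only


def next_state(requested, project_free):
    """Decide whether a task can start.

    Both arguments are (cores, ram_gb) tuples: what the task requests and
    what is currently free in the project.
    """
    cores_ok = requested[0] <= project_free[0]
    ram_ok = requested[1] <= project_free[1]
    return TaskState.RUNNING if cores_ok and ram_ok else TaskState.PENDING


task_request = (2, 8)  # the task asks for 2 cores and 8 GB of RAM

# The project has enough cores but only 4 GB of RAM free: the task waits.
print(next_state(task_request, project_free=(4, 4)))   # TaskState.PENDING

# After more RAM is freed or allocated to the project, the task can start.
print(next_state(task_request, project_free=(4, 16)))  # TaskState.RUNNING
```

Note that both CPU and RAM must fit in the sketch. A shortfall in either one keeps the task waiting.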

For more details on working with applications and workflows, see:

  1. Working with applications
  2. Working with workflows
Copyright © 2019 Lentiq