Training and serializing a model for serving with the Model Server

To train a model in a format supported by the Model Server, build and fit a prediction pipeline using either pySpark or scikit-learn. The trained model must then be uploaded to the data pool so that the Model Server can reach it.
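As an illustration of the "build and fit a prediction pipeline" step, here is a minimal scikit-learn sketch using the Iris dataset that the example notebooks also use. It covers training only; serialization and upload to the data pool are handled as in the example notebooks described below.

    # Minimal sketch: build and fit a scikit-learn prediction pipeline on Iris.
    # Serialization and upload to the data pool are not shown here.
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # A simple prediction pipeline: scale the features, then classify.
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        ("classify", LogisticRegression(max_iter=200)),
    ])
    pipeline.fit(X_train, y_train)

    print("test accuracy:", pipeline.score(X_test, y_test))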

We provide two example notebooks:

  1. Example scikit-learn train and serve notebook
  2. Example pySpark train and serve notebook

These notebooks are pre-loaded into your Jupyter environment, but you may need to adapt them to your needs.

Both notebooks train a very simple classification model on the Iris dataset. They train and serialize the models using the MLeap library and then upload the models to the data pool using the bdl utility. In production you will want a separate Reusable Code Block for each step.
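For reference, a sketch of the pySpark flow the notebooks follow: train a pipeline on Iris, then serialize it to an MLeap bundle. The column names, file paths, and Spark session setup below are illustrative assumptions; the exact bdl command for uploading the bundle to the data pool is shown in the example notebooks and is not reproduced here.

    # Sketch of the pySpark train-and-serialize flow (column names, paths and
    # Spark session setup are illustrative assumptions).
    import mleap.pyspark  # noqa: F401 -- adds serializeToBundle to PipelineModel
    from mleap.pyspark.spark_support import SimpleSparkSerializer  # noqa: F401

    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import StringIndexer, VectorAssembler

    spark = SparkSession.builder.appName("iris-train").getOrCreate()

    # Assumed Iris CSV layout: four numeric feature columns plus a species label.
    df = spark.read.csv("iris.csv", header=True, inferSchema=True)

    assembler = VectorAssembler(
        inputCols=["sepal_length", "sepal_width", "petal_length", "petal_width"],
        outputCol="features")
    indexer = StringIndexer(inputCol="species", outputCol="label")
    lr = LogisticRegression(featuresCol="features", labelCol="label")

    pipeline = Pipeline(stages=[assembler, indexer, lr])
    model = pipeline.fit(df)

    # Serialize the fitted pipeline to an MLeap bundle; serializeToBundle takes a
    # transformed sample DataFrame so the bundle can capture the schema.
    model.serializeToBundle("jar:file:/tmp/iris_pipeline.zip", model.transform(df))

    # The resulting bundle is then uploaded to the data pool with the bdl utility
    # (see the example notebooks for the exact command).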
