Training and serializing a model for serving with the Model Server

To train a model in a format supported by the Model Server, build and fit a prediction pipeline using either pySpark or scikit-learn. The trained model must then be uploaded to the data pool so that the Model Server can reach it.
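As an illustration of the "build and fit a prediction pipeline" step, here is a minimal scikit-learn sketch using the Iris dataset that the example notebooks also use. It covers training only; serialization and upload to the data pool are handled as in the example notebooks described below.

    # Minimal sketch: build and fit a scikit-learn prediction pipeline on Iris.
    # Serialization and upload to the data pool are not shown here.
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # A simple prediction pipeline: scale the features, then classify.
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        ("classify", LogisticRegression(max_iter=200)),
    ])
    pipeline.fit(X_train, y_train)

    print("test accuracy:", pipeline.score(X_test, y_test))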

We provide two example notebooks:

  1. Example scikit-learn train and serve notebook
  2. Example pySpark train and serve notebook

These notebooks are pre-loaded into your Jupyter environment, but you may need to adapt them to your needs.

Both notebooks train a very simple classification model on the Iris dataset. They train and serialize the models using the MLeap library and then upload the models to the data pool using the bdl utility. In production you will want a separate Reusable Code Block for each step.
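For reference, a sketch of the pySpark flow the notebooks follow: train a pipeline on Iris, then serialize it to an MLeap bundle. The column names, file paths, and Spark session setup below are illustrative assumptions; the exact bdl command for uploading the bundle to the data pool is shown in the example notebooks and is not reproduced here.

    # Sketch of the pySpark train-and-serialize flow (column names, paths and
    # Spark session setup are illustrative assumptions).
    import mleap.pyspark  # noqa: F401 -- adds serializeToBundle to PipelineModel
    from mleap.pyspark.spark_support import SimpleSparkSerializer  # noqa: F401

    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import StringIndexer, VectorAssembler

    spark = SparkSession.builder.appName("iris-train").getOrCreate()

    # Assumed Iris CSV layout: four numeric feature columns plus a species label.
    df = spark.read.csv("iris.csv", header=True, inferSchema=True)

    assembler = VectorAssembler(
        inputCols=["sepal_length", "sepal_width", "petal_length", "petal_width"],
        outputCol="features")
    indexer = StringIndexer(inputCol="species", outputCol="label")
    lr = LogisticRegression(featuresCol="features", labelCol="label")

    pipeline = Pipeline(stages=[assembler, indexer, lr])
    model = pipeline.fit(df)

    # Serialize the fitted pipeline to an MLeap bundle; serializeToBundle takes a
    # transformed sample DataFrame so the bundle can capture the schema.
    model.serializeToBundle("jar:file:/tmp/iris_pipeline.zip", model.transform(df))

    # The resulting bundle is then uploaded to the data pool with the bdl utility
    # (see the example notebooks for the exact command).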
