Training and serializing a model for serving with the Model Server
To train a model in a format supported by the Model Server, build and fit a prediction pipeline using either pySpark or scikit-learn. The model then needs to be uploaded to the data pool so that the Model Server can reach it.
We provide two example notebooks. They are pre-loaded into your Jupyter environment, but you may need to adapt some of them to your needs.
Both notebooks train a simple classification model on the Iris dataset. They train and serialize the models using the MLeap library and then upload the models to the data pool using the bdl utility. In production you will want a separate Reusable Code Block for each of these steps.
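The train-and-serialize step can be sketched as follows. This is a minimal illustration, not the content of the notebooks: it assumes scikit-learn is available, and it serializes with pickle as a stand-in, whereas the actual notebooks produce an MLeap bundle and upload it with the bdl utility (neither of which is shown here).

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load the Iris dataset and build a simple prediction pipeline.
X, y = load_iris(return_X_y=True)
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=200)),
])
pipeline.fit(X, y)

# Serialize the fitted pipeline to disk. The example notebooks instead use
# MLeap to produce a bundle the Model Server can load; plain pickle is used
# here only for illustration.
with open("iris_pipeline.pkl", "wb") as f:
    pickle.dump(pipeline, f)
```

After serialization, the resulting artifact is what gets uploaded to the data pool so the Model Server can load it.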