Pricing

What's included

Get all the features you need to perform data science and analytics at scale in
a distributed data lake.

Unified Management

Manage everything in one place, regardless of the underlying infrastructure provider.

Self-Service

Choose the right stack for your team and business use case.

Lower Maintenance Costs

No ops or specialized skills are required.

End-to-End Data Science

Put data science in production through automation and reusable code blocks.

Key Features

Data and Knowledge Sharing

✓ Share curated datasets with the rest of the organization

✓ Share curated notebooks with the rest of the organization

✓ Connect with other data teams

Open Source App Management

✓ Code in Python

✓ Jupyter Notebooks as a Service

✓ Apache Spark as a Service

✓ Apache Kafka as a Service

✓ PostgreSQL as a Service

✓ Top Python libraries: Pandas, Ray, NumPy, Dask, Seaborn, XGBoost, Matplotlib, Scikit-learn, Spark ML

✓ Provision clusters and scale them as needed

Data Science and AI at Scale

✓ Data science at scale through Dask, Spark and Ray

✓ Run Spark jobs on independent clusters

✓ Model management and deployment

✓ Provision use-case-specific projects with their own budget and resources

Data & Metadata Management

✓ Annotate files and tables before sharing

✓ Create tables from files through Spark and explore them in the table browser

✓ Document table columns before sharing to improve dataset explainability and adoption

Our service is free