Spark as a Service On Lentiq EdgeLake

Perform data analytics and machine learning at scale effortlessly

From data engineers to data scientists, Lentiq EdgeLake lets you run, optimize, and scale any batch, real-time, data processing, or machine learning workload. Access finely tuned Spark clusters without specialized infrastructure knowledge, and scale with a click to get results from your analysis faster.

Get Early Access

Scalable without limits

Spark as a Service running on Lentiq EdgeLake is quick and easy for the entire data team to set up, offering a frictionless, fully optimized environment.

  • Dedicated Spark clusters for specific workloads within the same project
  • Highly available and resilient by design, built on containers managed by Kubernetes
  • Highly elastic: scale with one click, no technical knowledge required
  • Zero ops requirements, since the entire setup is fully managed by EdgeLake
  • Decoupled storage and compute: the Spark cluster runs in containers while your data lives in object storage (see the sketch after this list)
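Because compute and storage are independent, a notebook or job can attach to any Spark cluster and read data straight from the data pool's object store. A minimal sketch, assuming the cluster's s3a connector is configured for the object store; the master URL matches the example further down, and the bucket and path are hypothetical placeholders:

from pyspark.sql import SparkSession

# Attach to the containerized Spark cluster (compute)
spark = SparkSession.builder \
    .master("spark://35.228.151.102:7077") \
    .getOrCreate()

# Read a Parquet dataset directly from object storage (storage);
# the bucket and path are hypothetical placeholders
df = spark.read.parquet("s3a://my-data-pool/events/")
df.printSchema()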

Make notebooks production ready

Connect notebooks to an existing Spark cluster with a one-liner and accelerate your data science, machine learning, and data processing tasks.

Connecting to a Spark cluster pre-created in the data pool.

from pyspark.sql import SparkSession

# Connect to the Spark master provisioned in the data pool
spark = SparkSession.builder \
    .master("spark://35.228.151.102:7077") \
    .getOrCreate()

An existing Spark cluster is required: click the Spark icon in the Lentiq interface or provision one. If Spark will not be used for heavy processing, a minimal resource configuration is enough. Copy the Spark master URL from the widget and pass it as the argument to the master() function shown above.
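Once connected, the notebook submits work to the remote cluster like any Spark application. A minimal check, assuming the session created above:

# Confirm the session points at the remote master, then run a trivial
# job that executes on the cluster rather than in the notebook kernel
print(spark.sparkContext.master)   # spark://35.228.151.102:7077
print(spark.range(1000).count())   # 1000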


Notebooks from prototype to production with no ops involvement

By running Spark under the hood of Jupyter Notebooks, you keep the IDE you love while benefiting from large-scale processing when you need it, scaling easily as your dataset grows.

Package your notebooks as executable code ready for production through our unique "Reusable Code Blocks" technology, helping you automate your data science workflow.

  • Data ingestion
  • Data exploration
  • Baseline modelling
  • Training & parameter tuning
  • Disseminating results

Fully integrated metadata management

Lentiq's Spark as a Service is seamlessly integrated with the rest of the platform. Tables created in Spark are immediately available for data documentation in the Table Browser page and to standard BI tools through the JDBC connector.
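For example, a table registered from a notebook becomes visible to the rest of the platform. A minimal sketch, assuming the Spark session created earlier; the table name and sample rows are hypothetical:

# saveAsTable persists the table to the metastore, making it visible in
# the Table Browser and queryable by BI tools over the JDBC connector
df = spark.createDataFrame([(1, "signup"), (2, "login")], ["user_id", "event"])
df.write.mode("overwrite").saveAsTable("events")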

Try EdgeLake with your team for free

Get Early Access