Why did we build on Kubernetes?

We set out to build a solution that is both multi-cloud and capable of offering multiple tools to our users, and Kubernetes is the backbone on which we built our stack of applications at Lentiq. We use Kubernetes-as-a-Service offerings from cloud providers to get compute environments quickly and reliably.

So what does Kubernetes bring to the table that led us to build Lentiq on it?

Existing support for provisioning inside cloud providers

To run stateful applications that can be reached from both inside and outside the cluster, we rely on Kubernetes' built-in support for the major cloud providers, which we configure for the chosen provider. This means less direct interaction with the provider on our part. It also means that a new cloud provider can be added to Lentiq very quickly, as long as Kubernetes already implements support for it.

The following features are the ones we use to simplify our interactions with the providers we support:

  • LoadBalancer services - To make applications reachable from outside the cluster, certain building blocks have to be created dynamically inside the provider when the application is created. A Service of type LoadBalancer takes care of this: it knows what to create and which resources to request from the provider, so that each exposed application port can be reached through a simple URI. If needed, it can also balance the load across multiple replicas of the same application to provide resiliency.

  • Firewall - Alongside load balancing, access security requires restricting access to certain users by whitelisting the IPs from which they can reach the applications. Adding or removing IPs from the list also has to work without restarting the application. LoadBalancer services provide this through the loadBalancerSourceRanges list, which defines the only sources allowed to access the exposed ports. The whitelisting is enforced at both the Kubernetes layer and the cloud provider layer, which further increases security.

  • Dynamic volume provisioning - PersistentVolumes (through PersistentVolumeClaims) need to be attached to the containers to keep data persistent across application restarts. These volumes have to be backed by actual storage devices, and this is where the dynamic provisioning system built into Kubernetes helps us. It lets us specify storage classes, which cause the actual storage devices to be created inside the cloud provider on request.
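The three features above can be sketched as two Kubernetes manifests. This is a minimal illustration, not Lentiq's actual configuration: the application name "notebook", port 8888, the CIDR 203.0.113.0/24 and the storage class name "standard" are all assumptions chosen for the example. The manifests are built as Python dictionaries and printed as JSON, which kubectl accepts alongside YAML.

```python
import json


def load_balancer_service(name, app_label, port, allowed_cidrs):
    """Build a Service of type LoadBalancer that only admits allowed_cidrs."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            # Asks the cloud provider to create an external load balancer.
            "type": "LoadBalancer",
            # Traffic is spread across all pods carrying this label.
            "selector": {"app": app_label},
            "ports": [{"port": port, "targetPort": port, "protocol": "TCP"}],
            # Only these source ranges may reach the exposed port.
            "loadBalancerSourceRanges": list(allowed_cidrs),
        },
    }


def volume_claim(name, storage_class, size):
    """Build a PersistentVolumeClaim that triggers dynamic provisioning."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            # The storage class decides which kind of provider disk is created.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical values, used for illustration only.
service = load_balancer_service("notebook", "notebook", 8888, ["203.0.113.0/24"])
claim = volume_claim("notebook-data", "standard", "10Gi")

print(json.dumps(service, indent=2))
print(json.dumps(claim, indent=2))
```

Changing the whitelist then amounts to updating the loadBalancerSourceRanges list on the Service, which leaves the application pods untouched.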

Shared execution environment for different types of applications

Each application has its own CPU and RAM requirements, so one of our main goals was an environment in which all types of applications can run at the same time and the allocated compute power is used optimally. Kubernetes offers this by design: each container runs in isolation in its own environment, so multiple applications can share a cluster without influencing each other. Applications can also be grouped into namespaces, which isolates them inside the cluster from applications in other namespaces. This is the basis for projects in Lentiq, which are independent environments inside the same Kubernetes cluster. Within a project, applications can work together over the internal network, while multiple teams can use the same cluster without knowing about or interfering with each other. Finally, quotas let us reserve resources for each project, so that each team can use only the resources allocated to it.
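As a rough sketch of the namespace-plus-quota idea, the snippet below builds a Namespace and a matching ResourceQuota for one project. The project name "team-a" and the CPU and memory figures are assumptions made up for the example, and the mapping of one project to one namespace is inferred from the description above rather than taken from Lentiq's code.

```python
import json


def project_manifests(project, cpu, memory):
    """Return a Namespace and a ResourceQuota that caps its total resources."""
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": project},
    }
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{project}-quota", "namespace": project},
        "spec": {
            # Upper bounds on the sum of requests and limits of all pods
            # running in this namespace.
            "hard": {
                "requests.cpu": cpu,
                "requests.memory": memory,
                "limits.cpu": cpu,
                "limits.memory": memory,
            }
        },
    }
    return [namespace, quota]


# Hypothetical project with 8 CPUs and 32 GiB of memory.
manifests = project_manifests("team-a", "8", "32Gi")
for manifest in manifests:
    print(json.dumps(manifest, indent=2))
```

Once such a quota exists, Kubernetes rejects new pods in that namespace that would push the totals past the configured values.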

Extensibility: building on top of and together with Kubernetes

Kubernetes is a very good container orchestrator, but running an application in it means working with multiple primitives to cover the application's computing, communication and storage needs. Kubernetes doesn't manage all of this directly, but it can be extended with third-party controllers. These controllers run as containers inside the system and perform the required application orchestration on top of the basic primitives. They can also expose new APIs and custom resource definitions, which let users talk to Kubernetes directly when working with the extensions.
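To make the custom resource definition idea concrete, here is a minimal sketch of a CRD manifest. The group "example.com", the kind "ManagedApp" and its "chart" and "replicas" fields are hypothetical and are not Lentiq's API; they only show the shape of what a controller could register with the Kubernetes API server.

```python
import json


def custom_resource_definition(group, kind, plural):
    """Build a namespaced CRD with a single v1alpha1 version."""
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        # The CRD name must be "<plural>.<group>".
        "metadata": {"name": f"{plural}.{group}"},
        "spec": {
            "group": group,
            "scope": "Namespaced",
            "names": {"kind": kind, "plural": plural, "singular": kind.lower()},
            "versions": [
                {
                    "name": "v1alpha1",
                    "served": True,
                    "storage": True,
                    # Schema of the objects users will create.
                    "schema": {
                        "openAPIV3Schema": {
                            "type": "object",
                            "properties": {
                                "spec": {
                                    "type": "object",
                                    "properties": {
                                        "chart": {"type": "string"},
                                        "replicas": {"type": "integer"},
                                    },
                                }
                            },
                        }
                    },
                }
            ],
        },
    }


# Hypothetical resource type, used for illustration only.
crd = custom_resource_definition("example.com", "ManagedApp", "managedapps")
print(json.dumps(crd, indent=2))
```

After such a definition is registered, users can create ManagedApp objects with the usual Kubernetes tooling, and a controller watching them decides which primitives to create in response.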

At Lentiq, we use our own controller, built on top of Helm, to easily install the applications we offer to our users. Beyond application management, entirely different features can be built with the controller system. For example, we're building our workflow engine on one of these controllers to simplify workflow definition and task execution.

By putting all these elements together and defining an abstraction for object storage on top of the providers' implementations, in the form of our own filesystem, we deployed a true multi-cloud solution that comes with multiple tools to help anybody start and maintain a data processing project.

You can test it yourself by creating a new account at datalake.lentiq.com and choosing a demo datapool - you’ll get a 14-day trial with the computing resources provided by us! For more information about containers and Kubernetes, see our comprehensive introduction to the subject - Introduction to Kubernetes.

Alex Sirbu, R&D Team Lead at Lentiq, has 7 years of experience in architecting and building reliable distributed systems.
