Self-Service Data Platform

Challenge

Enabling decentralized domain ownership

There is an industry trend towards moving from a centralized data team to decentralized domain ownership, where domains own (develop, operate, and support) their data products. The rationale is that domains, owning the data sources and possessing the requisite domain expertise, are better positioned to interpret the data. Conversely, a centralized data team often becomes a bottleneck in integrating new data sources and transforming data for consumption.

While the concept is logically sound, it presents several challenges. A significant one is the organizational transformation required to shift ownership. Domains may lack the willingness or incentives to take on ownership, as well as the knowledge and skills this added responsibility requires.

From a technological standpoint, there’s a risk that each domain might “reinvent the wheel,” opting for divergent design and technology solutions, thereby complicating the technology landscape. If every domain addresses the same challenges in different ways, this approach might not represent an advancement over a centralized, highly skilled team.

The pivotal question is: How can an organization enable domains to assume data ownership without escalating the requisite skills and technology demands? And importantly, how can this empowerment be structured to ensure adherence to data and security governance standards?

Solution

We’ve developed a multitenant data platform that provides self-service capabilities, enabling domains to develop and publish their data products. This platform supports the secure exchange of data while ensuring stringent security and governance, through:

  1. A multitenant Kubernetes developer platform.
  2. Common data services (like a lakehouse, messaging, and databases) with strictly enforced data access management policies.
  3. A “Paved Road” approach for lifecycle management and data product technologies.
  4. A data catalog for easy data product discovery.
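
To make the "Paved Road" and catalog ideas more concrete, here is a minimal Python sketch of how a platform might check a data product descriptor against governance rules before listing it in a catalog. It is an illustration only, not the platform's actual implementation: the field names (`name`, `owner_domain`, `description`, `classification`), the allowed classification values, and the in-memory `catalog` dictionary are all assumptions made for this example.

```python
# Hypothetical governance rules; the real platform's rules are not described in the text.
REQUIRED_FIELDS = ("name", "owner_domain", "description", "classification")
ALLOWED_CLASSIFICATIONS = {"public", "internal", "confidential"}


def validate_descriptor(descriptor: dict) -> list[str]:
    """Return a list of governance violations; an empty list means the descriptor passes."""
    problems = [
        f"missing field: {field}"
        for field in REQUIRED_FIELDS
        if not descriptor.get(field)
    ]
    classification = descriptor.get("classification")
    if classification and classification not in ALLOWED_CLASSIFICATIONS:
        problems.append(f"unknown classification: {classification}")
    return problems


def publish(catalog: dict, descriptor: dict) -> list[str]:
    """Add the descriptor to the catalog only if it passes validation."""
    problems = validate_descriptor(descriptor)
    if not problems:
        catalog[descriptor["name"]] = descriptor
    return problems


# Usage: a domain team publishes a descriptor for its own data product.
catalog = {}
orders = {
    "name": "orders.events",
    "owner_domain": "sales",
    "description": "Order lifecycle events",
    "classification": "internal",
}
violations = publish(catalog, orders)
```

The point of the sketch is the division of labor: the domain only fills in a descriptor, while the platform decides whether it may be published.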

Results

A broader community of data owners can autonomously make their data accessible to other domains. The self-service data platform conceals the complexities of infrastructure, deployment, security, and multitenancy, offering “paved road” best practices for the creation and publication of data products.
This leads to:

  1. Quicker development times for data products and applications.
  2. Enhanced ownership and data sharing within the organization.
  3. Strengthened security and governance, as the platform verifies and enforces policies.

 

Behind the scenes

We developed a secure and multitenant Kubernetes developer platform, offering comprehensive self-service capabilities. This allows each tenant to independently develop and deploy their applications or data product services.

The platform provides a range of secured, multitenant data services, including a data catalog, a data lakehouse, messaging solutions (like Kafka), and databases, all integrated with robust data access management policy enforcement. This integration facilitates the secure exchange of data and ensures adherence to global data governance policies.
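
As a rough illustration of what data access policy enforcement can look like, the following Python sketch makes a default-deny access decision for a shared data service. It is a simplified stand-in, not the platform's code and not Open Policy Agent itself: the `AccessGrant` and `AccessRequest` types, the `is_allowed` function, and the tenant and data product names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessGrant:
    """Hypothetical rule: a consumer tenant may perform one action on a data product."""
    data_product: str
    consumer_tenant: str
    action: str  # e.g. "read" or "write"


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical request a data service would submit before serving a tenant."""
    tenant: str
    data_product: str
    action: str


def is_allowed(request: AccessRequest, owners: dict, grants: set) -> bool:
    """Default deny: allow only the owning tenant or an explicitly granted consumer."""
    if owners.get(request.data_product) == request.tenant:
        return True
    wanted = AccessGrant(request.data_product, request.tenant, request.action)
    return wanted in grants


# Usage: the "sales" tenant owns a data product and grants "finance" read access.
owners = {"orders.events": "sales"}
grants = {AccessGrant("orders.events", "finance", "read")}

finance_read = is_allowed(AccessRequest("finance", "orders.events", "read"), owners, grants)
finance_write = is_allowed(AccessRequest("finance", "orders.events", "write"), owners, grants)
marketing_read = is_allowed(AccessRequest("marketing", "orders.events", "read"), owners, grants)
```

Because the same decision function can sit in front of every data service, a grant is defined once and applied consistently, which is the property the paragraph above describes.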

The Technology

  • Kubernetes multitenancy utilizing Capsule
  • CloudNativePG-managed PostgreSQL database
  • Apache Kafka
  • Open Policy Agent

The Expertise

  • Kubernetes
  • Kubernetes security and multitenancy
  • Data access management

The Expert

No organization desires to rely on individuals remembering data policies, attempting to comprehend them, and correctly applying them. Thus, we embarked on constructing a data platform to address this challenge. Initially, our aim was to establish contracts or policies in a non-technical format to define data governance within the organization. This alone posed a fascinating organizational obstacle.
However, the true test arose when we sought to technically enforce these policies across all common data services offered by the platform. This often necessitated integrating multiple open source tools, such as pairing data streaming solutions with policy enforcement frameworks like Open Policy Agent.
As we delved deeper into this endeavor, we soon realized that tackling data issues on a case-by-case basis is relatively straightforward. Yet, the real complexity emerged when we endeavored to resolve issues across multiple use cases simultaneously. Creating a shared language through which we could articulate these technical policies for vastly different use cases proved to be exceptionally challenging.

Roel Van Nyen, R&D Manager
