Confluent's Strategic Move Continues: Kafka Summit 2024

data architecture, data engineering, solutions

To drive successful digital transformation in today's fast-paced business landscape, organisations need robust streaming platforms that can handle the scale, speed, and complexity of data in real time. In this respect, Apache Kafka positions itself as one of the premier components of modern data architectures. Our CTO, Sabri Skhiri, recently travelled to London for Kafka Summit 2024. With attendees representing a wide range of organisations, from tech giants such as Meta, AWS, and BlaBlaCar to financial institutions like ING, BNP, and the Bank of England, the event was an excellent opportunity to observe where the market is heading, see concrete customer cases with their corresponding architectures, and learn what is coming next from Confluent. Let's dive into the future of streaming technologies with Sabri!

 

Before starting, let me mention that I will not go into deep technical details in this article. As always, feel free to consult my longer blog post on our research website, where I go deeper into Jay's keynote, practical applications, tech discoveries, and my favourite talks.

 

Strong Kafka Adoption and Thriving Ecosystem

Firstly, the Kafka Summit 2024 highlighted Kafka's position as a widely adopted technology across various industries. With over 150,000 organisations relying on it in their production environments, it holds a critical role in modern data infrastructure. 

This widespread adoption is further supported by a thriving Kafka ecosystem. For example, solutions like Conduktor offer cluster management and security functionalities. Frameworks like Restate and NStream simplify the development of event-driven microservices. QuestDB and Apache Druid offer storage options for real-time scenarios. Moreover, with over 1,000 Kafka Improvement Proposals (KIPs) submitted, the community's commitment to enhancing Kafka's speed, reliability, and functionality is evident.

 

Confluent’s Strategic Advancements

Secondly, the keynote speech presented several pivotal updates and strategic shifts for Confluent and the broader Kafka community. 

The universal data product
This represents a major shift for Confluent and its mission to redefine data management. Confluent's focus has evolved from merely processing data swiftly and in real time to tackling the pervasive "data mess". This complexity, a source of widespread developer frustration, is being addressed through a paradigm shift towards "Data Products", a concept that emphasises developer empowerment and efficiency.

Confluent's vision seeks to blur the lines between operational and analytical data silos within organisations, fostering a unified approach to data management. Universal data products encapsulate Kafka topics, schemas, and owners into discoverable, consumable, and governed entities, so that a single data product can serve both operational and analytical needs.
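To make the concept concrete, here is a minimal, purely illustrative Java sketch of the metadata a universal data product bundles together. The DataProduct type, its fields, and all the sample values are my own illustration, not a Confluent API:

```java
// Hypothetical sketch of what a "universal data product" bundles together.
// The DataProduct type and its fields are illustrative, not a Confluent API.
import java.util.List;

public record DataProduct(
        String name,            // discoverable product name, e.g. "orders"
        String kafkaTopic,      // the Kafka topic carrying the data
        String schemaSubject,   // Schema Registry subject governing the contract
        String owner,           // accountable team or person
        List<String> tags) {    // classification tags, e.g. "PII", "gold"

    public static void main(String[] args) {
        // A single governed entity, discoverable by both operational
        // and analytical consumers.
        DataProduct orders = new DataProduct(
                "orders",
                "prod.orders.v1",
                "prod.orders.v1-value",
                "sales-platform-team",
                List.of("operational", "analytical"));
        System.out.println(orders);
    }
}
```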

Confluent Cloud Platform: A Game Changer 
Jay Kreps introduced a groundbreaking development: the complete integration of Kafka Connect, Kafka, Flink, and Iceberg within the Confluent Cloud platform. This integration creates a unique data lakehouse solution that supports both real-time and batch data processing.
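To give a feel for what this integration manages on your behalf, here is a minimal sketch of the equivalent pipeline written by hand against open-source Flink, using the standard Kafka SQL connector and Iceberg's Flink catalog. The topic name, bootstrap servers, schema, and warehouse path are all placeholder assumptions:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class KafkaToIcebergSketch {
    public static void main(String[] args) {
        TableEnvironment env = TableEnvironment.create(
                EnvironmentSettings.inStreamingMode());

        // Kafka-backed source table (standard Flink Kafka SQL connector).
        env.executeSql(
            "CREATE TABLE orders (" +
            "  order_id STRING, amount DOUBLE, ts TIMESTAMP(3)" +
            ") WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'orders'," +
            "  'properties.bootstrap.servers' = 'localhost:9092'," +
            "  'scan.startup.mode' = 'earliest-offset'," +
            "  'format' = 'json')");

        // Iceberg catalog pointing at a local warehouse
        // (Iceberg's open-source Flink integration).
        env.executeSql(
            "CREATE CATALOG lakehouse WITH (" +
            "  'type' = 'iceberg'," +
            "  'catalog-type' = 'hadoop'," +
            "  'warehouse' = 'file:///tmp/warehouse')");
        env.executeSql("CREATE DATABASE IF NOT EXISTS lakehouse.bronze");
        env.executeSql(
            "CREATE TABLE IF NOT EXISTS lakehouse.bronze.orders " +
            "(order_id STRING, amount DOUBLE, ts TIMESTAMP(3))");

        // Continuously land the Kafka stream into the Iceberg table.
        env.executeSql("INSERT INTO lakehouse.bronze.orders SELECT * FROM orders");
    }
}
```

In Confluent Cloud, this is precisely the wiring the managed platform abstracts away.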

This positions Confluent Cloud as a strong competitor to established solutions like Databricks Delta, Apache Hudi, Watsonx.data, and even Azure Fabric. Notably, Confluent Cloud is one of the few platforms to offer a data lakehouse deployable in both operational and analytical environments, and the only one that truly supports real-time processing. It makes it easy to integrate data and to organise the data architecture into data products and a medallion architecture, without coding and without deploying anything.

To bolster this ambitious vision, Confluent introduced significant advancements in the governance layer (Flink SQL integration, UDF support, and more) that further enhance the platform's capabilities and user experience.

Kafka Summit 24

Data governance - a mandatory feature for any data platform
Data governance received considerable attention, with new features aimed at establishing Kafka as a central data management platform, recognising that robust data governance is critical for achieving this goal. Kreps’ presentation included a detailed demonstration of the newly introduced data governance features: 

  • Stream Quality focuses on data validation and the establishment of data quality rules (a minimal illustration follows this list).
  • Stream Catalogue manages technical and business metadata, including sensitive information classification, thereby enhancing data security and compliance. 
  • Stream Lineage provides comprehensive visibility into the data flow within the ecosystem. It displays detailed information about each topic, including its metadata and ownership, facilitating traceability and accountability.
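To illustrate the kind of check Stream Quality automates, here is a small, hand-rolled Java sketch of a data quality rule. Confluent expresses such rules declaratively alongside the schema, so this standalone predicate is only an illustration of the idea; the Order record and its rule are invented for the example:

```java
// Illustrative only: a hand-rolled data-quality rule, to show the idea behind
// Stream Quality. Confluent attaches such rules declaratively to the schema;
// this standalone predicate is not its API.
import java.util.List;
import java.util.function.Predicate;

public class QualityRuleSketch {

    record Order(String orderId, double amount, String countryCode) {}

    public static void main(String[] args) {
        // Rule: every order must have a non-empty id, a positive amount,
        // and a two-letter country code.
        Predicate<Order> rule = o ->
                o.orderId() != null && !o.orderId().isBlank()
                && o.amount() > 0
                && o.countryCode() != null && o.countryCode().matches("[A-Z]{2}");

        Order good = new Order("o-42", 99.5, "BE");
        Order bad = new Order("", -1.0, "Belgium");

        // Valid records continue downstream; invalid ones would typically be
        // routed to a dead-letter topic instead of entering the data product.
        for (Order o : List.of(good, bad)) {
            System.out.println(o + " -> " + (rule.test(o) ? "pass" : "dead-letter"));
        }
    }
}
```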

Finally, Confluent places a strong emphasis on defining data ownership and integrating data access request processes within its governance framework. This ensures efficient and policy-compliant data access, reinforcing data security and operational integrity.

Real-time data products - enter Apache Flink
The emphasis on real-time data processing remains paramount, driven above all by the need for fresh, continuously available data. In line with this, Apache Flink plays a crucial role in the Kafka ecosystem, especially for processing and distributing data products at scale.

Flink's integration into Confluent's platform underscores a commitment to real-time data availability, with user-friendly features like "one-click Flink actions" and the introduction of User-Defined Functions (UDFs) enhancing data processing capabilities. Confluent's cloud service for Apache Flink has now reached General Availability (GA) across the major cloud providers: Azure, GCP, and AWS.
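As a flavour of what a UDF looks like in this context, here is a minimal sketch using open-source Flink's Table API. The MASK function, its masking logic, and the sample query are illustrative assumptions, not Confluent's managed UDF workflow:

```java
// A minimal sketch of a Flink Table API user-defined function (UDF), the
// mechanism Confluent now exposes in its managed Flink service.
// Function name, logic, and query are illustrative.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.functions.ScalarFunction;

public class MaskUdfSketch {

    // Scalar UDF: mask all but the last four characters of a value,
    // e.g. for exposing card numbers in a governed data product.
    public static class Mask extends ScalarFunction {
        public String eval(String value) {
            if (value == null || value.length() <= 4) return value;
            return "*".repeat(value.length() - 4)
                    + value.substring(value.length() - 4);
        }
    }

    public static void main(String[] args) {
        TableEnvironment env = TableEnvironment.create(
                EnvironmentSettings.inStreamingMode());
        env.createTemporarySystemFunction("MASK", Mask.class);

        // Once registered, the function is usable from any Flink SQL statement,
        // e.g. SELECT MASK(card_number) FROM payments;
        env.executeSql("SELECT MASK('4111111111111111') AS masked").print();
    }
}
```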

 

Confluent's Strategic Shift

With the integrated Confluent Cloud platform and its focus on Flink and Iceberg, Confluent aims to serve as a one-stop shop for all data management needs, extending its reach beyond real-time analytics into the broader arena of data lakehouses and governance. Fittingly, the next iteration of the conference will be renamed "Current", signalling a wider focus that encompasses data management, lake architecture, governance, and related technologies, and reflecting Confluent's strategic repositioning as a comprehensive data management solutions provider.
 
