Flink Forward 2024: Key Takeaways and Trends in Stream Processing
The Flink Forward 2024 conference, held in Berlin, marked the 10th anniversary of Apache Flink. The event brought together a focused audience of 500 to 600 stream processing experts and enthusiasts to discuss the latest advancements, challenges, and future trends. Here are the key takeaways from our CTO, Sabri Skhiri:
Before starting, let me mention that, as always, this blog post provides a high-level overview of the conference. For a deeper dive into specific topics, refer to the detailed conference report on our research website.
Democratisation and the Rise of Real-Time AI
One of the central themes, both in talks and at booths, was the need to make Flink more accessible and user-friendly. Unlike players such as Databricks, whose messaging focuses on accessibility, Flink still seems to be reserved for experts. Flink is undeniably powerful, but simplifying its deployment and usage is paramount for wider adoption. The recent alliance between Confluent and Apache Flink promises to break down these barriers, anchoring Flink in mainstream data infrastructures via the Kafka ecosystem.
AI also acts as a significant catalyst for the broader adoption of real-time data streaming technologies. Flink’s adoption is particularly crucial as AI and Generative AI increasingly depend on real-time data products for contextual awareness and agile decision-making. Its strength in handling stateful streaming applications positions it as the ideal foundation for building agentic architectures that intelligently respond to dynamic situations.
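To make "stateful" concrete, here is a minimal sketch in plain Python. It is not the Flink API: the function name `flag_bursts`, the event shape, and the threshold are all hypothetical and chosen only for illustration. It keeps a running total per key and reacts as soon as a key crosses a limit, which is the basic pattern behind use cases such as fraud detection.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def flag_bursts(
    events: Iterable[Tuple[str, float]],  # (card_id, amount), in arrival order
    limit: float,
) -> Iterator[Tuple[str, float]]:
    """Yield (card_id, running_total) each time a card's total exceeds limit."""
    totals: Dict[str, float] = defaultdict(float)  # per-key state (hypothetical)
    for card_id, amount in events:
        totals[card_id] += amount  # update the state for this key only
        if totals[card_id] > limit:
            yield card_id, totals[card_id]


stream = [("A", 40.0), ("B", 10.0), ("A", 70.0), ("B", 20.0), ("A", 5.0)]
alerts = list(flag_bursts(stream, limit=100.0))
print(alerts)
```

In a real Flink job, the dictionary would be replaced by keyed state that the engine partitions across the cluster and checkpoints for fault tolerance. This sketch keeps everything in memory on one machine.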
The Streaming Lakehouse: A Unified Approach
The emergence of the "Streamhouse" architecture, which combines the strengths of streaming architectures and data lakehouses, was another key takeaway. An example of this approach is Apache Paimon, a cloud-native lake storage format that integrates with Flink and can be fed from sources such as Kafka or Pulsar. It enables real-time data processing within a lakehouse environment, offering a unified solution for managing both real-time and historical data.
Flink 2.0: A Major Leap Forward
The conference saw the announcement of Flink 2.0, a significant release promising improved performance, scalability, and ease of use. The release focuses on cloud-native state management, stream-batch unification (exemplified through Apache Paimon), and enhanced support for streaming lakehouses.
In particular, the introduction of FLUSS (Flink Unified Streaming Storage), a new architecture for streaming lakehouses, further solidifies Flink's position as a leader in this domain. Built on Apache Arrow, FLUSS promises to deliver sub-second latency and high throughput for real-time analytics. This means faster processing, improved scalability, and greater efficiency for your data pipelines.
Flink CDC: Simplifying Data Integration
Another noteworthy development is Flink CDC, which Alibaba donated to the ASF in 2024. Change data capture (CDC) captures changes from databases and propagates them to downstream consumers. By building on it, Flink CDC reduces the need for complex data integration pipelines and helps keep your data synchronised across different systems. It also avoids the complexity of configuring and operating separate Kafka connectors, and it has established itself as a key player in real-time data integration. This is a boon for businesses looking to simplify their data infrastructure and improve data consistency.
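The capture-and-propagate idea behind CDC can be sketched in a few lines of plain Python. This is an illustration only, not the Flink CDC API or its wire format: the event shape (`op`, `key`, `after`) and the function `apply_changes` are assumptions made for this example, loosely inspired by the create/update/delete operations that CDC tools typically emit.

```python
from typing import Any, Dict, Iterable

# Hypothetical change-event shape (not the Flink CDC wire format):
# {"op": "c" | "u" | "d", "key": <primary key>, "after": <row dict or None>}
ChangeEvent = Dict[str, Any]


def apply_changes(
    replica: Dict[Any, dict], events: Iterable[ChangeEvent]
) -> Dict[Any, dict]:
    """Apply an ordered stream of change events to an in-memory replica."""
    for event in events:
        op, key = event["op"], event["key"]
        if op in ("c", "u"):
            # Creates and updates are both upserts on the primary key.
            replica[key] = event["after"]
        elif op == "d":
            # Deleting a missing key is a no-op, so replayed deletes are harmless.
            replica.pop(key, None)
        else:
            raise ValueError(f"unknown op: {op!r}")
    return replica


events = [
    {"op": "c", "key": 1, "after": {"id": 1, "status": "new"}},
    {"op": "c", "key": 2, "after": {"id": 2, "status": "new"}},
    {"op": "u", "key": 1, "after": {"id": 1, "status": "paid"}},
    {"op": "d", "key": 2, "after": None},
]
replica = apply_changes({}, events)
print(replica)
```

Because every operation is keyed on the primary key and applied in order, the downstream copy converges to the state of the source table. A production pipeline would additionally have to handle schema changes, initial snapshots, and delivery guarantees, which this sketch leaves out.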
Other Notable Trends
- Cloud-Native Adoption: The increasing adoption of cloud-native architectures and managed services for stream processing was evident, with platforms like VERA and StreamNative offering simplified deployment and management of Flink in the cloud.
- Privacy and Security: While not a major focus, the conference touched upon the importance of privacy and security in real-time data streaming, highlighting the need for further discussion and development in this area.
- Use Cases: Various use cases were presented, showcasing Flink's versatility in industries like energy trading, fraud detection, predictive maintenance, and fleet management.
Conclusion
Flink Forward 2024 offered a glimpse into the exciting future of stream processing. Flink's open-source nature and flexibility provide a compelling alternative to proprietary platforms, especially for businesses seeking greater control and customisation. With the release of Flink 2.0, the growing adoption of cloud-native architectures, and the increasing demand for real-time data in AI applications, Flink is well-positioned to play an essential role in the evolving data landscape.
At Euranova, we're passionate about helping businesses navigate the complexities of real-time data processing. If you need support designing your solution, we can help you identify the right use cases for Flink, design and implement robust data pipelines, and ensure your solutions are scalable, reliable, and secure.