Snowflake with Apache Kafka and Iceberg Connector
Read More

Snowflake Data Integration Options for Apache Kafka (including Iceberg)

The integration between Apache Kafka and Snowflake is often cumbersome. Options include near real-time ingestion with a Kafka Connect connector, batch ingestion from large files, or leveraging a standard table format like Apache Iceberg. This blog post explores the alternatives and discusses its trade-offs. The end shows how data streaming helps with hybrid architectures where data needs to be ingested from the private data center into Snowflake in the public cloud.
Read More
Snowflake and Apache Kafka Data Integration Anti Patterns Zero Reverse ETL
Read More

Snowflake Integration Patterns: Zero ETL and Reverse ETL vs. Apache Kafka

Snowflake is a leading cloud-native data warehouse. Integration patterns include batch data integration, Zero ETL and near real-time data ingestion with Apache Kafka. This blog post explores the different approaches and discovers its trade-offs. Following industry recommendations, it is suggested to avoid anti-patterns like Reverse ETL and instead use data streaming to enhance the flexibility, scalability, and maintainability of enterprise architecture.
Read More
Google Apache Kafka for BigQuery GCP Cloud Service
Read More

When (Not) to Choose Google Apache Kafka for BigQuery?

Google announced its Apache Kafka for BigQuery cloud service at its conference Google Cloud Next 2024 in Las Vegas. Welcome to the data streaming club joining Amazon, Microsoft, IBM, Oracle, Confluent, and others. This blog post explores this new managed Kafka offering for GCP, reviews the current status of the data streaming landscape, and shares some criteria to evaluate when Kafka in general and Google Apache Kafka in particular should (not) be used.
Read More
Streaming Analytics SQL API with Apache Kafka Confluent ClickHouse Tinybird
Read More

Apache Kafka and Tinybird (ClickHouse) for Streaming Analytics HTTP APIs

Apache Kafka became the de facto standard for data streaming. However, the combination of an event-driven architecture with request-response APIs is crucial for most enterprise architectures. This blog post explores how Tinybird innovates with a REST/HTTP layer on top of the open source analytics database ClickHouse in the cloud. Integrating Kafka with Tinybird, the benefits of fully managed services like Confluent Cloud, and customer stories from Factorial and FanDuel show why Kafka and analytics databases complement each other for more innovation and faster time-to-market.
Read More
When NOT to use Apache Kafka
Read More

When NOT to Use Apache Kafka? (Lightboard Video)

Apache Kafka is the de facto standard for data streaming to process data in motion. With its significant adoption growth across all industries, I get a very valid question every week: When NOT to use Apache Kafka? What limitations does the event streaming platform have? When does Kafka simply not provide the needed capabilities? How to qualify Kafka out as it is not the right tool for the job? This blog post contains a lightboard video that gives you a twenty-minute explanation of the DOs and DONTs.
Read More
The Past Present and Future of Stream Processing
Read More

The Past, Present and Future of Stream Processing

Stream processing has existed for decades. The adoption grows with open source frameworks like Apache Kafka and Flink in combination with fully managed cloud services. This blog post explores the past, present and future of stream processing, including the relation of machine learning and GenAI, streaming databases, and the integration between data streaming and data lakes with Apache Iceberg.
Read More
JavaScript Node JS Apache Kafka for Full Stack Data Streaming in Event Driven Architecture
Read More

JavaScript, Node.js and Apache Kafka for Full-Stack Data Streaming

JavaScript is a pivotal technology for web applications. With the emergence of Node.js, JavaScript became relevant for both client-side and server-side development, enabling a full-stack development approach with a single programming language. Both Node.js and Apache Kafka are built around event-driven architectures, making them naturally compatible for real-time data streaming. This blog post explores open-source JavaScript Clients for Apache Kafka and discusses the trade-offs and limitations of JavaScript Kafka producers and consumers compared to stream processing technologies such as Kafka Streams or Apache Flink.
Read More
Data Streaming with Apache Kafka and ARM CPU at the Edge and in the Cloud
Read More

ARM CPU for Cost-Effective Apache Kafka at the Edge and Cloud

ARM CPUs often outperform x86 CPUs in scenarios requiring high energy efficiency and lower power consumption. These characteristics make ARM preferred for edge and cloud environments. This blog post discusses the benefits of using Apache Kafka alongside ARM CPUs for real-time data processing in edge and hybrid cloud setups, highlighting energy-efficiency, cost-effectiveness, and versatility. A wide range of use cases are explored across industries, including manufacturing, retail, smart cities and telco.
Read More
ESG and Sustainability powered by Data Streaming with Apache Kafka and Flink
Read More

Green Data, Clean Insights: How Kafka and Flink Power ESG Transformations

This blog post explores the synergy between Environmental, Social, and Governance (ESG) principles and Kafka and Flink’s real-time data processing capabilities, unveiling a powerful alliance that transforms intentions into impactful change. Beyond just buzzwords, real-world deployments architectures across industries show the value of data streaming for better ESG ratings.
Read More
GenAI Demo with Kafka, Flink, LangChain and OpenAI
Read More

GenAI Demo with Kafka, Flink, LangChain and OpenAI

Generative AI (GenAI) enables automation and innovation across industries. This blog post explores a simple but powerful architecture and demo for the combination of Python, and LangChain with OpenAI LLM, Apache Kafka for event streaming and data integration, and Apache Flink for stream processing. The use case shows how data streaming and GenAI help to correlate data from Salesforce CRM, searching for lead information in public datasets like Google and LinkedIn, and recommending ice-breaker conversations for sales reps.
Read More