Typically, an enterprise service bus (ESB) or other integration solutions like extract-transform-load (ETL) tools have been used to try to decouple systems. However, the sheer number of connectors, as well as the requirement that applications publish and subscribe to the data at the same time, mean that systems are always intertwined. As a result, development projects have lots of dependencies on other systems and nothing can be truly decoupled.

The need for integration: a never-ending story

No matter in which enterprise you work, no matter when your company was founded, you will have the requirement to integrate your applications with each other to implement your business processes.

This includes many different factors:

- Technologies (standards like SOAP, REST, JMS, MQTT, data formats like JSON, XML, Apache Avro or Protocol Buffers, open frameworks like Nginx or Kubernetes and proprietary interfaces like EDIFACT or SAP BAPI)
- Programming languages and platforms like Cobol, Java, .NET, Go or Python
- Application architectures like Monolith, Client Server, Service-oriented Architecture (SOA), Microservices or Serverless
- Communication paradigms like batch processing, (near) real time, request-response, fire-and-forget, publish subscribe, continuous queries and rewinding

Many enterprise architectures are a bit messy—something like this:


Every company needs to solve these spaghetti architectures. Depending on the decade, you either bought something like an ETL tool to build batch pipelines or an ESB to design a SOA. Some products also changed their names. Today, you are offered things like middleware messaging, an integration platform, microservice gateway or API management. The branding and product name do not matter. You always see the same picture as a solution to move away from your spaghetti architecture to a central integral box in the middle, like this:


This rarely worked well in practice, unfortunately. Most SOA projects in the last two decades failed. Instead of using an ETL tool or ESB for this, enterprises are now moving on to a streaming platform to solve this issue. Is this the next bubble on the market? Just a new term? Or, did something really change to allow successful integration across an enterprise—whether you integrate legacy mainframes, standard applications like CRM and ERPs, modern microservices built with any programming platform, or public cloud services? Why are companies now migrating to Apache Kafka to build this streaming platform? Why is everybody happy and talking about this at conferences, tech talks and blog posts? How does it compare to an ESB or ETL tool?

The next sections answer these questions and explain the reasons for, and the differences between, the Apache Kafka ecosystem and other existing integration solutions.


Event-driven processing and streaming as a key concept in the enterprise architecture

An event streaming platform (you can also enter another buzzword here) leverages events as a core principle. You think in data flows of events and process the data while it is in motion.

Many concepts, such as event sourcing, or design patterns such as Enterprise Integration Patterns (EIPs), are based on event-driven architecture. The following are some characteristics of a streaming platform:

- Event-based data flows as a foundation for (near) real-time and batch processing. In the past, everything was built on data stores (data at rest), making it impossible to build flexible, agile services to act on data while it is relevant.
- Scalable central nervous system for events between any number of sources and sinks. Central does not mean one or two big boxes in the middle but a scalable, distributed infrastructure, built by design for zero downtime, handling the failure of nodes and networks and rolling upgrades. Different versions of infrastructure (like Kafka) and applications (business services) can be deployed and managed in an agile, dynamic way.
- Integrability of any kind of applications and systems. Technology does not matter. Connect anything: programming language, APIs like REST, open standards, proprietary tools and legacy applications. Speed does not matter. Read once. Read several times. Read again from the beginning (e.g., add new application, train different machine learning models with the same data).
- Distributed storage for decoupling applications. Don’t try to build your own streaming platform using your favorite traditional messaging system and in-memory cache/data grid. There is a lot of complexity behind this and a streaming platform simply has it built-in. This allows you to store the state of a microservice instead of requiring a separate database, for example.
- Stateless service and stateful business processes. Business processes typically are stateful processes. They often need to be implemented with events and state changes, not with remote procedure calls and request-response style. Patterns like event sourcing and CQRS help implement this in an event-driven streaming architecture.
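To make the event sourcing idea from the last characteristic more concrete, here is a minimal sketch in plain Python. It is not tied to Kafka or any library. The `Event` type, the "deposited"/"withdrawn" event kinds and the account example are hypothetical names chosen only for illustration. The point is that state is not stored directly but derived by replaying an ordered list of events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # hypothetical event kinds: "deposited" or "withdrawn"
    amount: int


def apply(balance: int, event: Event) -> int:
    """Apply a single event to the current state and return the new state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events: list[Event], initial: int = 0) -> int:
    """Rebuild state by folding over the events in order."""
    balance = initial
    for event in events:
        balance = apply(balance, event)
    return balance


# The ordered event list stands in for the stored stream of state changes.
log = [Event("deposited", 100), Event("withdrawn", 30), Event("deposited", 5)]

current_balance = replay(log)        # state after all events
balance_after_two = replay(log[:2])  # state as it was after the first two events
```

Because the events are kept, any earlier state can be reconstructed by replaying a prefix of the list, which is what makes the business process stateful without a remote procedure call between services.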

Benefits of a streaming platform in the enterprise architecture

A streaming platform brings several benefits to your enterprise architecture:

- Large and elastic scalability regarding nodes, volume and throughput, all on commodity hardware, in any public cloud environment, or via hybrid deployments.
- Flexibility of architecture. Build small services, big services, sometimes still even monoliths.
- Event-driven microservices. Asynchronously connected microservices model complex business flows and move data to where it is needed.
- Openness without commitment to a unique technology or data format. The next new standard, protocol, programming language or framework is coming for sure. The central streaming platform is open even if some sources or sinks use a proprietary data format or technology.
- Independent and decoupled business services, managed as products, with their own lifecycle regarding development, testing, deployment and monitoring. Loose coupling allows for independent speed of processing between different producers and consumers, on/offline modes and handling backpressure.
- Multi-tenancy to ensure that only the right users can create, write to and read from different data streams in a single cluster.
- Industrialized deployment using containers, DevOps, etc., deployed where needed, whether on premise, in the public cloud or in a hybrid environment.

These characteristics build the foundation of a streaming platform, the beginning of your successful digital transformation. With services implementing a limited set of functions, and services being developed, deployed and scaled independently, you get shorter time to results and increased flexibility. This is only possible with a streaming platform having the above characteristics.

Use cases for a streaming platform

Here are some generic scenarios for how you can leverage a streaming platform with the characteristics discussed above:

- Event-driven processing of big data sets (e.g., logs, IoT sensors, social feeds)
- Mission-critical, real-time applications (e.g., payments, fraud detection, customer experience)
- Decoupled integration between different legacy applications and modern applications
- Microservices architecture
- Analytics (e.g., for data science, machine learning)

Producers and consumers of different applications are truly decoupled. They scale independently, each at its own speed and according to its own requirements. You can add new applications over time, both on the producer and consumer side. Often, one event has to be consumed by many independent applications to complete the business process. For example, a hotel room reservation needs immediate payment fraud detection in real time, the ability to process the booking through all backend systems in near real time, and overnight batch analytics to improve customer 360, aftersales, hotel logistics and other business processes.
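The hotel reservation scenario can be sketched with a toy in-memory log. This is an illustration only, not a Kafka client: `EventLog`, `Consumer` and the event fields are hypothetical names. Each consumer tracks its own read position, so a real-time consumer and a batch consumer read the same events at different speeds without affecting each other.

```python
class EventLog:
    """Toy append-only log; a stand-in for a topic, not a real Kafka API."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def read_from(self, offset):
        """Return all events at or after the given position."""
        return self._events[offset:]


class Consumer:
    """Keeps its own offset, independent of every other consumer."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.offset = 0
        self.seen = []

    def poll(self):
        new_events = self.log.read_from(self.offset)
        self.offset += len(new_events)
        self.seen.extend(new_events)
        return new_events


log = EventLog()
fraud = Consumer("fraud-detection", log)        # reacts to every reservation
analytics = Consumer("nightly-analytics", log)  # catches up once, later

log.append({"reservation": 1, "amount": 120})
first_fraud_batch = fraud.poll()
log.append({"reservation": 2, "amount": 95})
second_fraud_batch = fraud.poll()

# The batch consumer has not read anything yet and now receives both events.
nightly_batch = analytics.poll()
```

The producer never learns who reads the events or when, which is the decoupling the paragraph above describes.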

While some processes need real-time processing, you also need to be capable of supporting batch processes. You even need re-consumption of data more often than you would think in the beginning, such as in cases of an application being down for some time, A/B testing with different versions of an application, adding a new application that needs to consume the data from scratch, or building different analytic models via machine learning based on the same data sets.
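Re-consumption comes down to where a reader starts in the stored sequence of events. The following sketch assumes a plain Python list as the log and a hypothetical helper `consume`. It shows an application resuming after downtime from its last committed position, and a new application reading everything from scratch.

```python
# Hypothetical sensor events already stored in the log.
events = [f"reading-{i}" for i in range(5)]


def consume(log, offset, limit=None):
    """Return (events, next_offset), reading from offset up to an optional limit."""
    end = len(log) if limit is None else min(len(log), offset + limit)
    return log[offset:end], end


# An application processes two events and is then down for some time.
_, committed = consume(events, 0, limit=2)

# After the restart it continues from its committed position and misses nothing.
missed, committed = consume(events, committed)

# A new application added later starts at position 0 and reads the full history.
backfill, _ = consume(events, 0)
```

The same approach covers A/B testing or training different models on the same data: each reader simply starts from the position it needs.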

Think about some more use cases that you can build easily with a real decoupled system that is still a well-integrated and scalable streaming platform:

- Selling before the customer left the store
- Aborting a transaction before the fraud happened
- Replacing a part of a manufacturing machine before it breaks
- Informing customers if a flight or train is late (plus sending updates, rebooking or a voucher)

You name it: the list goes on.

Big bang from batch to real time?

Now you understand the added value of a truly decoupled, scalable streaming platform. So, do you have to introduce it as a central data platform for all of your applications?

The following shows the streaming maturity model that we use to identify the current situation and planning in large enterprises: