This post’s goal is to provide you with a deeper intuition behind two of the most common software messaging patterns—queues vs. pub/sub. This is a deep topic which deserves thoughtful attention when designing software. This article is structured in three parts:
- Develop an intuition for messaging, queues, and pub/sub through analogies.
- Share a short heuristic to help choose between queues and pub/sub.
- Discuss software implications for each paradigm.
Software Messaging — Thinking Abstractly Before Concretely
Software messaging is a topic where discussions weave in and out of the abstract and the concrete. Logical paradigms like queues and pub/sub are discussed in the same breath as physical technologies like Kafka and RabbitMQ.
For technical topics, carefully delineating between the abstract and the concrete provides you with mental clarity. An interface is implemented with a class, a stack is implemented with a linked list, an instruction set architecture is implemented with a processor. Virtual, logical, abstract. Implementation, physical, concrete. There will be times when you need to draw boxes on a whiteboard; there will be times when you need to tune obscure Kafka consumer settings.
Let’s begin by thinking abstractly.
What is the essence of communication? Communication is the sharing of information between multiple entities for a shared purpose.
What is the essence of messaging? Messaging facilitates communication. A message is any information passed between multiple entities.
Queues vs. Publish/Subscribe
Two of the most common software messaging patterns are queues and pub/sub. In the context of software development, the word “pattern” can be substituted with “paradigm” or “architecture.”
Analogies can be an effective way to build intuition.
Queue <> Restaurant
A restaurant is built around a queue. Waiters produce orders. The kitchen consumes orders and produces meals. The orders are kept in a queue of messy hand-written tickets. First order in is the first meal out. The restaurant queue is purposeful—we want orders to be processed by the kitchen and only the kitchen. There is no need to prepare an order twice. If orders get backed up, adding a few chefs may help. The restaurant (waiters, tickets, chefs) all work together as a unit.
In computer science, a queue is a collection of entities that are maintained in a sequence and can be modified by the addition of entities at one end of the sequence and the removal of entities from the other end of the sequence. (Wikipedia)
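To make the restaurant concrete, here is a minimal in-process sketch using Python's standard `queue` and `threading` modules. The names `run_kitchen`, `chef`, and `tickets` are made up for this illustration, and a real system would put a broker between the waiters and the kitchen. The point is only that each order is picked up by exactly one chef, and that adding chefs is how you work through a backlog.

```python
import queue
import threading


def run_kitchen(orders, num_chefs=2):
    """Process every order exactly once using `num_chefs` worker threads."""
    tickets = queue.Queue()        # FIFO: first order in is the first picked up
    meals = []                     # (chef, order) pairs, in completion order
    meals_lock = threading.Lock()  # protects `meals` across chef threads

    def chef(name):
        while True:
            order = tickets.get()
            if order is None:      # sentinel: no more orders for this chef
                tickets.task_done()
                return
            with meals_lock:
                meals.append((name, order))
            tickets.task_done()

    chefs = [
        threading.Thread(target=chef, args=(f"chef-{i}",))
        for i in range(num_chefs)
    ]
    for c in chefs:
        c.start()
    for order in orders:           # the waiters produce orders
        tickets.put(order)
    for _ in chefs:                # one sentinel per chef so each one stops
        tickets.put(None)
    tickets.join()                 # wait until every ticket has been handled
    for c in chefs:
        c.join()
    return meals
```

Calling `run_kitchen(["pizza", "pasta", "salad"], num_chefs=2)` returns three meals. Which chef cooks which order can vary between runs, but no order is ever cooked twice.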
Publish/Subscribe <> Bulletin Board
A bulletin board is built around pub/sub. Looking to make some extra pocket money, you put up an ad for private programming lessons on the town bulletin board. You have no idea who will read it. You’re hoping a few students contact you, but you keep your expectations low—it’s OK even if no one reaches out. By default, the town takes down ads after 7 days. They tell you that if you pay more, they’ll build a bigger board to keep your ad up longer.
In software architecture, publish–subscribe is a messaging pattern where senders of messages, called publishers, do not program the messages to be sent directly to specific receivers, called subscribers, but instead categorize published messages into classes without knowledge of which subscribers, if any, there may be. (Wikipedia)
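The bulletin board can be sketched in a few lines as well. This is a toy, in-memory illustration, not how Kafka or any real broker works, and the `BulletinBoard` class and its methods are hypothetical names. What it shows is that the publisher only names a topic, every current subscriber gets the message, and publishing to a topic nobody reads is perfectly fine.

```python
from collections import defaultdict


class BulletinBoard:
    """Toy in-memory pub/sub: publishers know topics, not subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Register a callback that is invoked for each message on `topic`."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Hand `message` to every subscriber of `topic`.

        Returns how many subscribers were reached, which may be zero.
        """
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)
```

Unlike the queue, the same message reaches every subscriber. Adding a second reader does not take anything away from the first.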
Which One Do You Need?
Based on your requirements, does your design need to be more like a restaurant or more like a bulletin board? Decide on the pattern (paradigm, architecture, etc.) before you decide on the implementing technology.
There is a shortcut to help you decide. What are you trying to communicate? There are two common message types—events vs. commands.
Events are customerNameChanged, orderPlaced, or jobPosted.
Commands are sendEmail, processLogs or cookFood.
Command-style messages lean towards queueing.
Event-style messages lean towards pub/sub.
This is only a heuristic.
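If it helps to see the heuristic written down, here is a small sketch. The `Event` and `Command` classes and the `suggested_pattern` function are invented for this example. They encode nothing more than the rule of thumb above, and real designs will have exceptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Something that already happened, e.g. orderPlaced."""
    name: str


@dataclass(frozen=True)
class Command:
    """A request for someone to do something, e.g. sendEmail."""
    name: str


def suggested_pattern(message):
    """Return the pattern the heuristic leans towards for this message."""
    if isinstance(message, Command):
        return "queue"      # one intended handler, do the work once
    if isinstance(message, Event):
        return "pub/sub"    # any number of interested readers, maybe none
    raise TypeError(f"unknown message type: {type(message).__name__}")
```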
People often prefix “queueing” with “message.” You hear the phrase “message queueing,” but you rarely hear the phrase “message pub/sub.”
Messaging applies to both of these software paradigms. There is nothing wrong with either “message pub/sub” or “pub/sub messaging,” other than the fact that it may sound a little awkward.
Messages and messaging are implicit when discussing queueing and pub/sub. Don’t let extra words confuse you.
After arriving at your chosen messaging paradigm, it’s time to choose an implementing technology. Any single technology can achieve a variety of different messaging patterns. You can achieve queueing with either RabbitMQ or Kafka. You can achieve pub/sub with either RabbitMQ or Kafka. Give a programmer a database and a cache, and many, many software patterns can be achieved.
Despite the flexibility of various technologies, they are still built with specific use-cases in mind. Kafka is optimized for pub/sub while RabbitMQ is optimized for queueing. Both these technologies come with their own pros and cons. It’s impossible for any single messaging software to be optimized for every single messaging pattern.
Deeper discussions arise when evaluating a technology to be used for a paradigm that it is not optimized for. If your design calls for queueing, you may be debating whether or not Kafka is a sound solution. The answer may well be yes, and that is perfectly OK. Perhaps Kafka is a preferred technology based on your organization’s operational expertise. Perhaps the expected write throughput is so high that Kafka is the only feasible solution. Perhaps nothing off the shelf works and you’ll need to build it yourself! Implementation choice is always situational.
Software Design Implications
A choice in software paradigm will quickly be followed by a variety of software implications.
Extensibility & Decoupling
With pub/sub, producers and consumers are naturally decoupled, making separation of concerns easier to achieve. Introducing a new consumer is an isolated activity which does not affect message production.
With queueing, a new consumer represents a change to the overall system: the producer side, the consumer side, and everything in between. In order to “activate” new consumer functionality, a new queue must be established with connections to a corresponding producer and consumer. Since a variety of components must be updated, the queueing paradigm is considered less extensible than the pub/sub pattern (in the context of new consumer behavior).
In the realm of queueing, protocols like AMQP can make your life easier. When setting up a new consumer, a few changes to exchange and binding configurations may get most of the job done. However, whether you use AMQP or a home-grown queueing solution, adding a new consumer fundamentally requires a holistic change to the system.
With pub/sub, the ease of adding new consumers comes with a potential cost: a tight contract for messages. The convenience of introducing consumers assumes messages are produced under the same contract (payload, format, etc.). However, what happens if a consumer wants to consume an existing message type, but needs a little bit of additional data? Is this new functionality worth it for us to change how we fundamentally publish messages? How would this affect other consumers of the topic? Perhaps pub/sub leads to more coupling than we initially expected.
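One way to picture the contract question is with plain dictionaries. In this sketch, which assumes a made-up `customerNameChanged` payload and two hypothetical consumers, the publisher adds a `region` field for a new consumer. The change is only safe because the existing consumer ignores fields it does not know about. If it validated the payload strictly, the same change would break it.

```python
def billing_consumer(message):
    """Existing consumer: reads only the fields it was written against."""
    return f"invoice for customer {message['customer_id']}"


def analytics_consumer(message):
    """Newer consumer: uses the extra field if present, else falls back."""
    region = message.get("region", "unknown")
    return f"customer {message['customer_id']} in {region}"


# The original contract, and the same message after the publisher adds a field.
old_message = {"type": "customerNameChanged", "customer_id": 42}
new_message = {**old_message, "region": "EU"}
```

Both consumers can handle both versions here, but only because each was written to tolerate the other's needs. That tolerance is part of the contract, even though nobody wrote it down.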
For queueing, message contracts are flexible. Restaurants and queueing systems behave as a unit—we know exactly what we are producing and exactly how it’s going to be consumed. We can change how we write tickets at our pizza shop without affecting the shop across town. Message contract flexibility also helps optimize consumers; we provide them exactly what they need.
Security must be top of mind when designing a pub/sub system. If a thief is snooping through the town bulletin board, you might think twice before putting your phone number on your ad. In a highly distributed setting, it may be difficult to keep track of all consumers for any given Kafka topic. Even when everything is internal, a consumer may be misbehaving somewhere.
Observability & Monitoring
One of the benefits of queueing is that you know exactly how your messages will be consumed. There is clear, expected behavior for both production and consumption. These expectations yield strong observability and monitoring. Systems which lean into pub/sub patterns require additional effort to properly observe and monitor.
Popular pub/sub solutions like Kafka allow extremely high throughput for message production. Though I have never seen it myself, Kafka advertises that you can write up to 1M messages/sec to a topic! RabbitMQ gets nowhere close to this. If your system requirements call for this level of scale, there are only a few technologies that are viable.
More Responsibility For The Consumers
The pub/sub pattern puts extra onus on the consumer. A producer’s job is easy, but that doesn’t mean the overall work is easy—it just gets transferred to a different location.
Imagine a software team that is responsible for emitting important event-style messages when customer data changes. If that team is optimizing for the least amount of work possible, they may propose a pub/sub strategy. Fire and forget to the “CustomerEvents” topic is low effort and appears extremely extensible.
However, the customer analytics team isn’t a big fan of this approach. They are left with the burden of consuming a firehose of customer events, many of which are meaningless for them. Furthermore, important data is missing in the message payloads, forcing the analytics team to jump through hoops to resolve additional data, putting burden on other systems.
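Here is roughly what that burden looks like from the analytics side. This is a sketch with hypothetical names throughout: `consume_customer_events`, the event shape, and the `lookup_customer` callback all stand in for whatever the real systems would be. The consumer has to throw away most of the firehose, then call another system to fill in data the payload lacks.

```python
def consume_customer_events(events, relevant_types, lookup_customer):
    """Yield only the relevant events, enriched with data the payload lacks.

    `events` is any iterable of dicts with "type" and "customer_id" keys.
    `lookup_customer` is a callable that fetches extra fields from another
    system; every call is load this consumer adds elsewhere.
    """
    for event in events:
        if event["type"] not in relevant_types:
            continue                       # most of the firehose is noise
        details = lookup_customer(event["customer_id"])
        yield {**event, **details}         # merge the missing data in
```

None of this work exists on the producer's side. It did not go away; it moved.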
The customer analytics team makes a case for an architecture which promotes a tighter collaboration utilizing queueing paradigms rather than pub/sub. For large distributed organizations, it can be a risk when so much responsibility falls upon one side of the equation. Consuming a Kafka topic is easier said than done.
Software messaging is a huge topic. We’ve only scratched the surface here, but I hope this post has left you with a deeper intuition for two of the most popular messaging patterns.
Queueing is like a restaurant.
Pub/sub is like a bulletin board.
When making technical decisions, merge your thinking between abstract paradigms and concrete technologies. It’s not helpful to memorize how AMQP/Kafka/etc. work if you’re not aware of the software patterns they are designed to implement.