I recently answered a question on Stack Overflow that revolved around this argument:
“Should I scale out my AMQP consumers, tweak the QOS values, or both?”
This is a broad question. The first thing to consider is the actual business process that is facilitated by AMQP. Is your business process time-sensitive? For example, let’s say that you have a process which persists data to a database. How important, from a business perspective, is it that this data is saved immediately?
Quality of Service (QOS)
AMQP Channels contain a property called QOS, set through the basic.qos method; the value discussed here is its prefetch count. The value of this property determines the manner in which AMQP Consumers read messages from Queues. A QOS value of “1” ensures that only a single message at a time will be de-queued. The next message will not be delivered until the current message has been handled and acknowledged. Consider the implications of this. Given a Queue that contains 5 messages, and a Consumer with QOS set to “1” that reads from that Queue, message #5 will not be processed until messages 1 – 4 have been de-queued and processed.
If it takes 200ms to process each message, then M5 will not be completely processed for 1 second. This figure rises linearly as the Queue grows in length. If a Publisher, or set of Publishers, suddenly loads 1,000 messages onto the Queue, the last message will wait 200 seconds, so our Consumer’s processing time is now measured in minutes rather than seconds.
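The arithmetic behind these figures can be sketched in a few lines. The function name is hypothetical, and the fixed 200ms cost is simply the figure from the example, not a measurement.

```python
def completion_time_ms(position, per_message_ms=200):
    """Time until the message at `position` (1-based) has been fully
    processed by a single Consumer that handles one message at a time."""
    if position < 1:
        raise ValueError("position is 1-based")
    return position * per_message_ms


# Message #5 in the example finishes after 1 second.
print(completion_time_ms(5))            # 1000
# The last of 1,000 messages finishes after 200 seconds (about 3.3 minutes).
print(completion_time_ms(1000) / 1000)  # 200.0
```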
So let’s increase our QOS value to “5”. Now our Consumer reads from the Queue at a rate of 5 messages at a time. Our Queue is now empty. But what’s happening at the Consumer? The Consumer manages its own Shared Queue in memory. This Shared Queue is a local representation of an AMQP Queue. When a Consumer de-queues a message, that message is cached in the Consumer’s Shared Queue, and processed accordingly.
Based on the above example, our Consumer’s Shared Queue now contains messages 1 – 5, and our actual AMQP Queue is empty. This offers a win; remember that our goal with RabbitMQ, and AMQP in general, is to keep our Queues empty, or as near to empty as possible, at any given time to reduce resource consumption. In this case, our Queue is empty.
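For reference, this is roughly how a QOS value is applied in client code. The sketch assumes Python and the pika client library, neither of which appears in the discussion above; the helper names make_handler and consume, the default host, and any queue name you pass in are hypothetical.

```python
def make_handler(process):
    """Build a pika-style delivery callback that runs `process` on the
    message body and then acknowledges the message."""
    def on_message(channel, method, properties, body):
        process(body)
        # The ack frees one slot in the prefetch window, so the broker
        # may deliver the next message.
        channel.basic_ack(delivery_tag=method.delivery_tag)
    return on_message


def consume(queue_name, prefetch, process, host="localhost"):
    """Consume `queue_name` with at most `prefetch` unacknowledged messages."""
    import pika  # imported here so the sketch loads without the client installed

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    channel = connection.channel()
    channel.basic_qos(prefetch_count=prefetch)  # the QOS value
    channel.basic_consume(queue=queue_name,
                          on_message_callback=make_handler(process))
    channel.start_consuming()  # blocks until the consumer is stopped
```

Calling consume("work", prefetch=5, process=print) against a local broker would correspond to the QOS value of “5” used in the example, with "work" standing in for a real queue name.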
We’ve reduced RabbitMQ’s resource-footprint, but what’s the catch? We’re still faced with the same problem, which is that our messages are processed sequentially, so we haven’t reduced the overall time that it takes to process the messages – we’ve just moved the problem from one context to another.
If we could process each message in parallel, then we could reduce our processing time. We can achieve this by adding multiple Consumers. Each message takes 200ms for a single Consumer to process, so 5 Consumers can process 5 messages in 200ms.
Actually, this won’t happen, or at least is not likely to happen, because we’ve left our QOS value at “5”. Let’s follow the process flow, assuming that we’ve started each Consumer in order of 1 through 5. This is important because in situations where multiple Consumers are listening to a Queue, RabbitMQ will prioritise delivery of messages in the same order that the Consumers started listening. If Consumer #1 was the first Consumer to start, it will be the first to receive messages.
This is the problem, in our case. Remember, a QOS value of “5” means that the Consumer will read up to 5 messages at a time. So in this case, assuming that our Queue already contains 5 messages when Consumer #1 starts listening, Consumer #1 will read all 5 messages, and process them sequentially. Consumers 2 – 5 will remain idle and wasted.
Our problem is now that our Consumers are not adequately balanced. To facilitate accurate round-robin style balancing among Consumers, we simply set the QOS value of each Consumer to “1”. This results in the following behaviour:
- Each Consumer reads 1 message at a time
- Older Consumers will receive messages first, assuming that they are not busy
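The difference between the two QOS settings can be illustrated with a toy model. This is a simplification of the scenario described above, not RabbitMQ's actual dispatch algorithm: it only models the initial hand-out of messages that are already waiting when the Consumers subscribe in order, and the function names are hypothetical.

```python
from collections import deque


def dispatch(message_count, consumer_count, prefetch):
    """Toy model: Consumers subscribe in order to a Queue that already
    holds `message_count` messages, and each takes up to `prefetch`.
    Only the initial hand-out is modelled, not refills after acks."""
    queue = deque(range(1, message_count + 1))
    assignments = {}
    for consumer in range(1, consumer_count + 1):
        taken = []
        while queue and len(taken) < prefetch:
            taken.append(queue.popleft())
        assignments[consumer] = taken
    return assignments


def drain_time_ms(assignments, per_message_ms=200):
    """Consumers work in parallel, so the busiest one sets the total time."""
    return max(len(msgs) for msgs in assignments.values()) * per_message_ms


greedy = dispatch(5, 5, prefetch=5)
print(greedy)                 # {1: [1, 2, 3, 4, 5], 2: [], 3: [], 4: [], 5: []}
print(drain_time_ms(greedy))  # 1000

fair = dispatch(5, 5, prefetch=1)
print(fair)                   # {1: [1], 2: [2], 3: [3], 4: [4], 5: [5]}
print(drain_time_ms(fair))    # 200
```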
We’ve reduced our overall processing time to 200ms. We can now process 5 messages in 200ms. So why not do this all the time? There are 2 answers to this. The first concerns the technical aspect of the design. Consider that running multiple Consumers costs resources, especially memory. Just remember this before you think about creating thousands of Consumers. The above problem is simplistic, but this design may not suit your infrastructure for real-world problems.
The second answer concerns the business needs of the actual process that we are managing. Let’s say that our process is a data-logging mechanism, and that our messages contain logging metadata. In this case, we may not care whether or not it takes several minutes to save this data. Logging data is rarely referenced until something goes wrong, so our initial single Consumer solution may be adequate. Now consider a business process that persists time-sensitive metadata to a database. In this case, our multiple Consumer solution may better suit our needs.
What if 1 second is an acceptable processing time? Then we can use a combination of both solutions. We can introduce multiple Consumers, and set their QOS values to “5”. The result is that no Consumer will ever hold more than 5 messages at a time, and therefore won’t take more than 1 second to process them. Assuming that we introduce 5 Consumers, we can process 25 messages per second. Why not just use 5 Consumers with a QOS value of “1”? Won’t this achieve the same result – 25 messages per second? Yes, or at least close enough, but likely not exactly, because there is more waiting on the network involved. With a QOS value of “1”, each Consumer must acknowledge its current message and wait for the next one to arrive before it can continue, whereas with a QOS value of “5” the next messages are already buffered locally.
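The throughput comparison can be sketched as simple arithmetic. The function name is hypothetical, and the 10ms idle time is a made-up figure standing in for the wait between acknowledging one message and receiving the next; it is not a measurement.

```python
def throughput_per_second(consumers, per_message_ms=200, idle_ms=0):
    """Messages per second for `consumers` parallel Consumers, where each
    spends `idle_ms` waiting on the network between messages."""
    return consumers * 1000 / (per_message_ms + idle_ms)


# QOS 5: the next message is already buffered locally, so no idle wait.
print(throughput_per_second(5))              # 25.0
# QOS 1: assume a hypothetical 10ms wait per message (about 23.8 per second).
print(throughput_per_second(5, idle_ms=10))
```

The gap between the two figures grows with network latency and shrinks as per-message processing time increases, which is why the two configurations are usually close but rarely identical.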
It is critical to take into account the underlying business process when considering an AMQP solution. Generally, a one-size-fits-all approach is too broad to suit every business process.
This is a really useful overview of the QOS behaviour. You mention “The Consumer manages its own Shared Queue in memory.” – I was wondering how this would work in a Round Robin scenario?
Example: 1 queue with 2 consumers – no QOS value, so dispatching with round-robin. If there are 1000 messages added to the queue, each one will be dispatched, in order, to the 2 consumers. If the consumer is only capable of processing 1 message a second, would that mean each consumer would have 499 messages backed-up in its Shared Queue – potentially leading to load/memory issues on each consumer?
Theoretically, yes. However, without a working example it’s just a hypothetical. But, yes, each consumer should end up with 499 backed-up messages in its Shared Queue. I wouldn’t worry too much about memory consumption, unless the consumer is particularly slow, which is another issue entirely.
The general rule of thumb with pub/sub systems is that consumers should return as quickly as possible, and long-running processes should be segmented out-of-band, where long processing-time does not affect day-to-day operation of the rest of the application.