Monitoring RabbitMQ: Native Nodes vs. Independent Agents
by Ayanda Dube
Before you go any further, you should know that you can try WombatOAM today with a 45-day free trial of WombatOAM 3.0.0beta.
RabbitMQ users are accustomed to monitoring and keeping track of the status of their RabbitMQ installations using the native RabbitMQ Management plugin or, alternatively, third-party monitoring tools such as Sensu, which internally use the RabbitMQ Management API to acquire metrics.
Regardless of which tool users prefer, a common aspect which cuts across most of these off-the-shelf tools is full dependency on the RabbitMQ Management plugin. In other words, for you to monitor and manage your RabbitMQ installation via a web interface, the RabbitMQ Management plugin has to be enabled at all times on the RabbitMQ nodes.
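To make that dependency concrete, here is a minimal Python sketch of what a third-party tool typically does: it polls the Management plugin's HTTP API and reads a few totals from the response. The host, the default port 15672, the guest credentials and the particular fields picked out of `/api/overview` are assumptions for illustration, and the helper names are hypothetical. None of this works unless the Management plugin is enabled on the node being queried.

```python
import base64
import json
import urllib.request


def build_overview_request(host="localhost", port=15672,
                           user="guest", password="guest"):
    """Build an authenticated GET request for the Management API overview.

    Host, port and credentials are illustrative defaults, not a recommendation.
    """
    url = f"http://{host}:{port}/api/overview"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def fetch_overview(request, timeout=5):
    """Perform the HTTP call; this fails if the Management plugin is not enabled."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def summarise_overview(payload):
    """Pick a few headline numbers out of a decoded overview document.

    Missing keys default to 0 so the sketch tolerates partial payloads.
    """
    queue_totals = payload.get("queue_totals", {})
    object_totals = payload.get("object_totals", {})
    return {
        "messages_ready": queue_totals.get("messages_ready", 0),
        "messages_unacknowledged": queue_totals.get("messages_unacknowledged", 0),
        "queues": object_totals.get("queues", 0),
        "connections": object_totals.get("connections", 0),
    }
```

A monitoring loop would call `summarise_overview(fetch_overview(build_overview_request("rabbit1")))` on a timer. Every such call is served by the Management plugin running inside the broker node itself.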
The downside of this approach lies mainly in the overhead the RabbitMQ Management plugin introduces on each node on which it is enabled. The following image depicts the accompanying applications that are required, and therefore introduced on a node, when the RabbitMQ Management plugin is enabled:
Fig. 1: Applications introduced per node when the RabbitMQ Management plugin is enabled
A total of 13 additional applications are required by the RabbitMQ Management plugin, none of which is related to, or required for, any of the AMQP operations.
Internally, the RabbitMQ Management plugin creates multiple Erlang ETS tables, which are RAM based, in order to store, aggregate and compute statistics and various RabbitMQ specific node metrics. Unless the hardware has been dimensioned to take this into account, it can place a huge demand on the node’s memory, and could potentially contribute to a number of unknown side effects when traffic load patterns vary from peak to peak.
In an ideal world, RabbitMQ nodes should be dedicated to delivering the AMQP protocol, i.e. the queueing and message interchange logic between connected clients. All potentially burdensome operations, such as UI monitoring and management, should ideally be handled on a completely independent node, deployable on a separate physical machine from the RabbitMQ nodes and responsible for all OAM functions. This is how telecoms systems have addressed monitoring and operations for decades.
This inflexibility of the RabbitMQ Management plugin, and of the other monitoring tools dependent on its API, brings to light the main strengths and advantages of using WombatOAM. Illustrated below is the application overhead WombatOAM introduces on a RabbitMQ node:
Fig. 2: Application overhead WombatOAM introduces on a RabbitMQ node
The WombatOAM approach to monitoring RabbitMQ
So what is different about the WombatOAM approach to monitoring RabbitMQ?
Firstly, only one additional application is introduced, which is the wombat_plugin, unlike the 13 additional applications which are introduced by the RabbitMQ Management plugin.
The reason behind this is the fact that WombatOAM only requires a single lightweight and highly optimised application, the wombat_plugin application, on the node it’s monitoring, to relay all metrics and events to a separate and independent node, responsible for carrying out all heavy computations and UI related operations.
This implies a much lower, if not negligible, application overhead on the RabbitMQ node compared with that introduced by the RabbitMQ Management plugin, which carries out all of its computations and operations on the RabbitMQ node itself. Those computations compete for the same resources as the traffic which RabbitMQ is processing.
WombatOAM thus fully leverages distributed Erlang by carrying out its major operations and maintenance functions in a loosely coupled manner, on an independent and separate node. This leaves the RabbitMQ nodes free to focus predominantly on AMQP functions, with very little chance, if any, of experiencing problems caused by UI operations and maintenance functions.
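The split described above can be sketched in a few lines of Python. This is not how wombat_plugin is implemented (WombatOAM relies on distributed Erlang); it only illustrates the division of labour, and the class names, message format and `published` counter are hypothetical. The part that would run on the broker node only samples raw counters and forwards them, while history and rate calculations live on the separate monitoring node.

```python
import json
import time
from collections import defaultdict


class NodeAgent:
    """Runs on the monitored node: samples raw counters and forwards them.

    It keeps no history and does no aggregation, so its footprint stays small.
    """

    def __init__(self, node_name, read_counters, send):
        self.node_name = node_name
        self.read_counters = read_counters  # callable returning {metric: value}
        self.send = send                    # callable taking bytes, e.g. a socket write

    def tick(self, now=None):
        sample = {
            "node": self.node_name,
            "ts": time.time() if now is None else now,
            "counters": self.read_counters(),
        }
        self.send(json.dumps(sample).encode("utf-8"))


class MonitoringNode:
    """Runs on a separate machine: stores samples and derives rates from them."""

    def __init__(self):
        self.history = defaultdict(list)

    def receive(self, raw):
        sample = json.loads(raw.decode("utf-8"))
        self.history[sample["node"]].append(sample)

    def rate(self, node, metric):
        """Per-second rate between the two most recent samples, or None."""
        samples = self.history[node]
        if len(samples) < 2:
            return None
        previous, latest = samples[-2], samples[-1]
        elapsed = latest["ts"] - previous["ts"]
        if elapsed <= 0:
            return None
        delta = latest["counters"][metric] - previous["counters"][metric]
        return delta / elapsed
```

In this sketch `send` is wired directly to `MonitoringNode.receive`; in a real deployment it would be a network call, which is the point at which the expensive work leaves the broker's machine.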
NOTE: For releases prior to 3.6.7, the RabbitMQ Management plugin attempts to reduce the amount of overhead when used in a cluster setup. Only one node in a cluster needs the full RabbitMQ Management plugin enabled. The rest of the nodes only need the RabbitMQ Management Agent plugin, which is a single lightweight application with no additional dependencies. This implies that any problems arising from the heavy resource footprint of the RabbitMQ Management plugin would only be experienced on one out of N cluster nodes.
However, as of 3.6.7, the RabbitMQ Management plugin's statistics collection is distributed across the cluster, meaning that the statistics database and metrics aggregation are no longer carried out on a single node. This lessens the memory footprint which the RabbitMQ Management plugin imposed on the single node it was enabled on, and is thus projected to alleviate most of the known and common problems to which the legacy architecture was prone. The architectural change to the Management plugin is, however, at an early stage (at the time of writing), and the community has yet to fully establish whether it solves the major problems it is intended to solve.
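For the pre-3.6.7 cluster layout described in the note, the plugin assignment can be sketched as a small Python helper that prints which plugin to enable where. The node names and the `plugin_plan` function are hypothetical; the plugin names `rabbitmq_management` and `rabbitmq_management_agent` and the `rabbitmq-plugins enable` command are the standard ones, and each command is assumed to be run locally on the node it is listed against.

```python
def plugin_plan(nodes, stats_node):
    """Map each cluster node to the rabbitmq-plugins command to run on it.

    Only `stats_node` gets the full Management plugin; every other node gets
    the lightweight Management Agent plugin.
    """
    if stats_node not in nodes:
        raise ValueError(f"{stats_node!r} is not one of the cluster nodes")
    plan = {}
    for node in nodes:
        if node == stats_node:
            plugin = "rabbitmq_management"
        else:
            plugin = "rabbitmq_management_agent"
        plan[node] = f"rabbitmq-plugins enable {plugin}"
    return plan


if __name__ == "__main__":
    cluster = ["rabbit@host1", "rabbit@host2", "rabbit@host3"]
    for node, command in plugin_plan(cluster, "rabbit@host1").items():
        print(f"{node}: {command}")
```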