
How data drives MongooseIM

by Jan Cieśla

To make decisions about how to steer your open source project, you need to know whether you’re on the right path or need to course-correct. That’s why you should equip your project with a proper compass and barometer to help you navigate. Gathering metrics is one tool that can give you more insight into how your project is being used. When used wisely, data helps open source maintainers understand how users respond to new functionality, prioritise their work, and identify less-used features that might not need as much support. Most importantly, data can turn suspicions and opinions into facts that help improve your project, leading to more satisfied users.

In this blog post, we take a deep dive into a new MongooseIM feature introduced in version 3.6: system metrics. They are gathered to analyse the trends and needs of our users, improve MongooseIM, and show us where to focus our efforts. We explain what the new metrics are, what they help us achieve, why we gather them, and how you can manage and customise them at your end.

Why do we do it?

Knowing how our product is used is critical for us to identify the core value it brings to users. It points us in the direction in which to expand it and shows us how to target our further development efforts. The collected data only has statistical relevance and is automatically anonymised before it is processed any further. Each MongooseIM cluster randomly generates a Cluster ID that is attached to the reports. A sample report showing mod_vcard backends usage from our CI builds can be found below.

[Figure: mod_vcard backends usage reported by our CI builds]

Such reports show us which configuration scenarios our tests cover. This can be contrasted with the metrics gathered from real-world installations.

[Figure: mod_vcard backends usage reported by real-world installations]

Based on these reports, we can see how often different backends are used (or not used) with mod_vcard. For example, the comparison tells us that the LDAP backend was not used during the past month: no user installation reported this configuration.
It is important to note that such reporting does not yet paint a full picture of the MongooseIM ecosystem. The metrics feature has just been introduced and mostly shows fresh installations and upgrades. It might take some time before decisive conclusions can be drawn from long-running deployments.

Where can you see the information gathered?

You can view all the shared information in two ways. The log file system_metrics_report.json contains the most recent report that was sent. Additionally, you can configure a Tracking ID for your own Google Analytics account and view your MongooseIM status in that dashboard.

How can you configure this service?

To ensure full transparency, a log message is generated on every MongooseIM node start (unless the metrics service is configured with the report option) to show that the functionality is enabled. We want you to know that the metrics are gathered and that you have the right to withdraw consent at any time without limiting the functionality of the product. This feature is provided as a “service”. To be operational, it needs to be added to the list of services as shown below:

Example configuration

{service_mongoose_system_metrics, [
                                   report,
                                   {initial_report, 300000},
                                   {periodic_report, 10800000}
                                  ]
}

The metrics are first reported shortly after system startup and then at regular intervals. These timers are configurable using the initial_report and periodic_report parameters, both given in milliseconds. The default values are 5 minutes for the initial report and 3 hours for the periodic one. Removing the service_mongoose_system_metrics entry from the list of services means the service is not started, so metrics are neither collected nor shared. MongooseIM then generates a notification that the feature is not being used. The notification can be silenced by setting the no_report option explicitly. For more details regarding service configuration, please see the Services section in our documentation.
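
As a rough sketch of the options described above, the two alternative entries below show how the service might be configured with custom intervals, or with reporting switched off. The interval values are purely illustrative (10 minutes and 6 hours, expressed in milliseconds), and writing no_report as the only option in the list is an assumption about how the opt-out is spelled out; please check the Services documentation for your MongooseIM version.

```erlang
%% Alternative A (illustrative values): keep reporting, but with custom timers.
%% Both timers are given in milliseconds.
{service_mongoose_system_metrics, [
    report,
    {initial_report, 600000},     % 10 minutes = 10 * 60 * 1000 ms
    {periodic_report, 21600000}   % 6 hours = 6 * 60 * 60 * 1000 ms
]}

%% Alternative B (assumed form): opt out explicitly, which should also
%% silence the "feature is not being used" notification.
{service_mongoose_system_metrics, [no_report]}
```

Only one of the two entries would appear in the list of services at a time.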

What information are we gathering?

When introducing this feature, it is crucial for us to be fully transparent about what information is being gathered. In general, we capture data on how MongooseIM is being used, its version and the chosen feature set. We only report the names of known modules and APIs that are part of the open source product. All additional customisations are simply counted without disclosing any specific details. The full list of gathered information is below:

  1. MongooseIM node uptime.
  2. MongooseIM version.
  3. The number of nodes that are part of the MongooseIM cluster.
  4. Generic modules that are part of the open source project and are in use. Some modules report what database they use as a backend.
  5. Number of custom modules - without disclosing any details, we are just curious to see if there are any.
  6. Number of connected external XMPP components.
  7. List of configured REST APIs that are part of the open source project.
  8. XMPP transport mechanisms, such as TCP/TLS, WebSockets or BOSH.
  9. Geographical Data - Google Analytics provides several geographical dimensions, such as City, Country and Continent. These values are derived from the IP address the data was sent from. You can learn more in Google’s Geographical Data documentation.

How do I configure additional and private Tracking IDs in Google Analytics?

The data is gathered and forwarded to Google Analytics. You can add a custom Google Analytics Tracking ID in the MongooseIM configuration and see all incoming events that are related to your own system metrics. For more details on how to create or sign in to a Google Analytics account, please see Get Started with Analytics.
The Tracking ID is a property identification code that all collected data is associated with. It determines the destination where the collected data is sent. To create a new Tracking ID, please follow the steps below:

  1. Go to the Admin tab of your user dashboard.
  2. Create a new account with + Create Account.
  3. Add a new property with + Create Property.
  4. Within the new property, go to Tracking Info > Tracking Code.
  5. The Tracking ID can be found in the top left corner of the section and has the following format: UA-XXXX-Y.

Example configuration

A new Tracking ID can be added to the list of options as follows:

{service_mongoose_system_metrics, [
                                   report,
                                   {initial_report, 300000},
                                   {periodic_report, 10800000},
                                   {tracking_id, "UA-XXXX-Y"}
                                  ]
}

Summary

Equipping yourself with knowledge for the journey of your open source project can be crucial to developing a successful product. Collecting metrics about your software can help you plan your work, measure quality and gain more insight into how your project is being used. If you’d like to learn more about how MongooseIM makes scalable, customisable instant messaging easy, head to our MongooseIM page, or get in touch.

For more information and the MongooseIM System Metrics Privacy Policy, please see our documentation page.
