Performance Monitor for Drupal 7. Part 1. Motivation. Why Reinvent the Wheel?

When you are a small company trying to establish long-term relationships with its customers, you want to stay as involved with them after the launch as possible. This may include hosting or hosting reselling, a support and update plan, and a continuous growth/integration plan. It may also mean that you provide performance support. It is this last aspect that I have been thinking about recently, while writing a proof-of-concept mechanism to implement it. I would like to share some thoughts with the wider community for discussion.

When providing ongoing support to multiple customers, you need to centralize the monitoring of their web sites. Performance is one of the things you want to keep track of. Sites that take too long to load or exhaust the memory on their reseller accounts are a situation we want to handle before it gets so bad that sites have to be taken down for days while we search for a solution. This calls for a centralized monitoring system: one that keeps track of the current performance status of each web site and displays these statuses in one place, where a manager can easily follow them. At the same time, tracking and aggregating performance data can itself be quite taxing on the very thing we are monitoring. Thus, performance tracking needs to be as lightweight as possible.

There are a number of tools for performance monitoring, which I will cover briefly here.

On the Drupal modules side, there is Performance Logging and Monitoring. This module did not suit me for three reasons. First, it has no centralized integration - you can’t specify a URL to which the performance stats should be sent. Second, and this is the bigger problem, the performance statistics are handled locally: once the data is gathered, it is written to a local log. Third, it does not analyze or aggregate the data for you. You simply have a log, and you have to study it and find the bottlenecks yourself.

Then, there are some really good paid tools. I have tried New Relic and I absolutely came to love it. New Relic has a Drupal module, but its main work is done at the server level. You install their PHP extension, and it gathers all kinds of data, from network activity to hard drive throughput. Thus, it’s very extensive, and it’s also low-level. It then feeds the data to the New Relic servers, where it gets analyzed and aggregated. So, it’s both powerful and gentle on your own site’s performance. However, it’s very expensive, and unless you really want to study the performance of your PHP application and invest in it, it is overkill. Its cost makes it unusable as part of a support plan for a group of small and medium-sized web sites.

There are also some cheaper alternatives to New Relic, which I reviewed only far enough to decide not to pursue them: some had poor or no integration with Drupal, and others were still too expensive to be useful. I have also reviewed some free tools for IT performance management. There are some powerful tools around. However, these tools do not integrate with Drupal, and they usually do not work well with multiple hosts.

One common problem of all these solutions, free or not, is that some extensions need to be installed on the server. This is a no-go for most shared hosting plans.

So, this brought me to the realization that I would have to write a custom solution consisting of two Drupal modules. The first module would be installed on the production servers and would act as the client. It would gather the page statistics in the most unobtrusive way possible, collecting only the basic parameters initially, and send the stats to a remote host via a socket for analysis.
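To make the client idea a bit more concrete, here is a minimal sketch in Python (not Drupal/PHP code, and not the actual module). It assumes the "basic parameters" are page load time and peak memory, that the message is JSON, and that it is sent as a single UDP datagram so that a slow collector never delays the page. The function names, the field names and the choice of UDP are all illustrative assumptions.

```python
import json
import socket


def build_payload(app_id, path, started_at, finished_at, peak_memory_bytes):
    """Pack the basic per-page parameters into a compact JSON message.

    All field names here are hypothetical; the post does not define a format.
    """
    message = {
        "app": app_id,                # which monitored site sent this
        "path": path,                 # the page that was served
        "load_ms": round((finished_at - started_at) * 1000, 1),
        "peak_memory": peak_memory_bytes,
        "sent_at": int(finished_at),  # Unix timestamp, in seconds
    }
    return json.dumps(message, separators=(",", ":")).encode("utf-8")


def send_stats(payload, host, port):
    """Send one datagram and return immediately (assumption: UDP transport)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

The client would call build_payload() once at the end of a request and hand the result to send_stats(). Since UDP gives no delivery guarantee, an occasional lost sample is the price paid for keeping the monitored site fast.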

The second module would be the bigger one: the server. The server would gather the statistics, store them for a week, and analyze them. It would also accept connections from the client, so that when the client pings it for status via an AJAX request, it responds with a short status notice and a short message if needed: “All is ok, you run well.” Or, “Performance is low, memory usage.” The web site admin would check for warnings and refer to their account on the server for more analytics and graphs. The server would have to handle multiple ‘apps’, each app referring to a single Drupal client, and organize these apps per user.
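The server's status check could look roughly like the following Python sketch. It assumes each stored sample is a dictionary with a timestamp, a load time in milliseconds and a peak memory figure. The field names and both thresholds are made up for illustration; only the one-week retention and the two status messages come from the description above.

```python
RETENTION_SECONDS = 7 * 24 * 60 * 60   # statistics are kept for a week
SLOW_LOAD_MS = 2000                    # hypothetical threshold
HIGH_MEMORY_BYTES = 96 * 1024 * 1024   # hypothetical threshold


def prune(samples, now):
    """Drop samples older than the retention window."""
    return [s for s in samples if now - s["sent_at"] <= RETENTION_SECONDS]


def status_for(samples):
    """Reduce one app's recent samples to a short status and message."""
    if not samples:
        return {"status": "unknown", "message": "No data yet."}
    avg_load = sum(s["load_ms"] for s in samples) / len(samples)
    max_memory = max(s["peak_memory"] for s in samples)
    if max_memory > HIGH_MEMORY_BYTES:
        return {"status": "warning",
                "message": "Performance is low, memory usage."}
    if avg_load > SLOW_LOAD_MS:
        return {"status": "warning",
                "message": "Performance is low, page load time."}
    return {"status": "ok", "message": "All is ok, you run well."}
```

The AJAX endpoint would then only need to return the result of status_for(prune(samples, now)) for the requesting app, which keeps the response small and leaves the heavier analytics and graphs on the server side.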

This task seemed clear and easy to implement, yet now that I am in the final stages, I realize that it has been less easy than it seemed. In the next blog posts, I will bring up the issues I have run into and the decisions I have made in the process, and hopefully there will be some discussion as well.