From Palo Alto’s BigPanda comes a recent IT management report, The State of Monitoring, presenting the results of a survey of 1,500 IT professionals from various industries. Not surprisingly, the large majority of respondents (80%) named the usual corporate suspects as their top operational and monitoring challenges: budget, resources (staff), schedules, and responsiveness to service disruptions.
Also nestled in the list of the top five monitoring concerns, noted by 78% of respondents: “Reducing alert noise from the organization’s monitoring tools.” This happens to be a key business focus of BigPanda, but it also highlights a problem — and an opportunity — for equipment vendors striving to better meet customer needs.
Development Phase Review
IT product developers, working with their customer support counterparts, can greatly improve the customer experience by inserting three simple review steps into the release development cycle:
- Jointly scrub the system message catalog and SNMP MIB during the development phase. Is each message informative and valuable to the user, or does it state a vague internal condition of unknown significance?
- Assess the severity and frequency of each alert or message using a consistent rubric. Is the operational impact (or risk) of a problem correctly conveyed? Is terminology applied consistently, particularly compared to the UI? Can multiple related log notices be consolidated, or prioritized and selectively muted?
- For important alerts that do get passed on to the customer, ensure they are specific and actionable so the condition can be corrected without delay.
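The three review steps above could be partly automated as a catalog check run during development. The following is a minimal sketch, not a vendor's actual tooling: the `CatalogEntry` fields, the four-level severity rubric, and the "too terse" word-count threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical severity rubric; a real product would define its own levels.
ALLOWED_SEVERITIES = {"critical", "major", "minor", "info"}
# Severities assumed to require a documented corrective action.
ACTIONABLE_SEVERITIES = {"critical", "major"}
# Assumed heuristic: messages shorter than this many words are flagged as vague.
MIN_WORDS = 4


@dataclass
class CatalogEntry:
    """One entry in a (hypothetical) system message catalog."""
    message_id: str
    severity: str
    text: str
    action: str = ""


def review_entry(entry):
    """Return a list of review findings for a single catalog entry."""
    findings = []
    if entry.severity not in ALLOWED_SEVERITIES:
        findings.append("severity not in rubric")
    if entry.severity in ACTIONABLE_SEVERITIES and not entry.action.strip():
        findings.append("no corrective action given")
    if len(entry.text.split()) < MIN_WORDS:
        findings.append("text may be too terse to be informative")
    return findings


def review_catalog(entries):
    """Map message_id -> findings, listing only entries that need attention."""
    report = {}
    for entry in entries:
        findings = review_entry(entry)
        if findings:
            report[entry.message_id] = findings
    return report


catalog = [
    CatalogEntry("FAN-001", "major", "Fan 2 in tray A has failed",
                 "Replace the fan module in tray A"),
    CatalogEntry("SYS-042", "warn", "State 7"),
    CatalogEntry("PSU-003", "critical", "Power supply 1 lost input power"),
]
report = review_catalog(catalog)
```

A check like this only flags candidates for discussion. Whether a message is genuinely informative still needs the joint judgment of developers and support staff.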
For UI developers, it is equally important to have a solid alert review process underneath. A steady stream of low-value alerts and notices that pushes important messages out of view defeats the purpose of the UI. High-impact issues should either remain visible or be easily recalled, and recurring unacknowledged alerts can be consolidated into a single entry rather than repeated.
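One way to picture that consolidation is to group unacknowledged alerts by source and message, keep a repeat count, and sort so that the most severe groups stay at the top. This is a sketch under stated assumptions: the `Alert` fields, the severity ranking, and the grouping key are hypothetical and do not describe any particular product's behavior.

```python
from dataclasses import dataclass

# Hypothetical ranking: lower number sorts first (stays most visible).
SEVERITY_RANK = {"critical": 0, "major": 1, "minor": 2, "info": 3}


@dataclass
class Alert:
    source: str
    message_id: str
    severity: str
    timestamp: float
    acknowledged: bool = False


@dataclass
class ConsolidatedAlert:
    source: str
    message_id: str
    severity: str
    count: int
    first_seen: float
    last_seen: float


def consolidate(alerts):
    """Collapse repeated unacknowledged alerts into one entry per source/message."""
    groups = {}
    for alert in alerts:
        if alert.acknowledged:
            continue  # acknowledged alerts no longer need to occupy the view
        key = (alert.source, alert.message_id)
        group = groups.get(key)
        if group is None:
            groups[key] = ConsolidatedAlert(
                alert.source, alert.message_id, alert.severity,
                1, alert.timestamp, alert.timestamp,
            )
        else:
            group.count += 1
            group.first_seen = min(group.first_seen, alert.timestamp)
            group.last_seen = max(group.last_seen, alert.timestamp)
    # Most severe first; within a severity, most recently seen first.
    return sorted(
        groups.values(),
        key=lambda g: (SEVERITY_RANK.get(g.severity, len(SEVERITY_RANK)), -g.last_seen),
    )


stream = [
    Alert("switch-1", "LINK-FLAP", "minor", 100.0),
    Alert("switch-1", "LINK-FLAP", "minor", 130.0),
    Alert("array-2", "DISK-FAIL", "critical", 110.0),
    Alert("switch-1", "LINK-FLAP", "minor", 160.0),
    Alert("array-2", "TEMP-HIGH", "major", 120.0, acknowledged=True),
]
view = consolidate(stream)
```

In this example the three link-flap notices collapse into a single row with a count of 3, and the disk failure sorts above it even though it arrived earlier.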
Sooner is Better
One consequence of severe over-reporting of system messages is that customers sometimes have little choice but to ignore them all. Another is that poorly designed alerts and notices can become ‘case generators’ for the vendor’s support organization, and with a large installed base the cost of those cases adds up. The product development team can help minimize such calls, and the sooner the better: even once a fix ships, it’s not uncommon for OS updates to take many months to permeate the installed base.
Incident management tools such as BigPanda’s help IT administrators deal with an ever-increasing volume of alerts received from their data centers, particularly by providing multi-vendor support and event correlation in a single tool. But it’s in the equipment manufacturers’ own interest to step into their customers’ shoes and recognize that there is competitive advantage in improving their products’ alert behavior. A streamlined and more actionable alert system can reduce calls to Support, improving customer satisfaction and reducing support costs. With better-quality data input, analytic monitoring tools such as BigPanda’s further enhance the sysadmin’s ability to cut through noise and distraction in day-to-day IT management and focus on the best service delivery possible.