Applying collective intelligence to IT outage notifications

Unless you’ve lived through the pain of trying to use an essential but unavailable application at work, this will not be the most interesting topic. It’s a difficult problem that has been bugging me lately, and it has got me interested in applying Enterprise 2.0 ‘harnessing collective intelligence’ to the challenge of communicating effectively about IT application availability.

There are times of the day when I’m looking to use email to respond to questions, handle requests for assistance or sort out issues. Losing access to email has made me realise how frustrating it is to lose an essential application: you can’t really do your job for the duration of the outage. The frustration is amplified when there’s no notice that the outage is being worked on, let alone a possible resolution time. This has brought home the challenges my internal customers experience. I work in a team responsible for an application that our users rely on to meet their team targets, so when our application is unavailable they start missing targets: it’s quite an expensive event for them.

We have an excellent notification system for serious issues; it sends SMS and email alerts to the nominated representatives, and we have a skilled team who manage such incidents. The challenges are that this is a one-way communication channel and, in the case of email, a seriously overused one.

Are we limiting ourselves to an outdated idea of one-way communication?

Can we apply the principles of Enterprise 2.0 to communicate with our users more effectively? Could something more unstructured, more informal and more open provide a complementary channel for communication from the IT teams to their users?

Applying the principles outlined by Andrew McAfee, in his Enterprise 2.0 paper, suggests we would need these features:

  • Search – Notifications need to be targeted, that is, I can see availability communication about the applications I’m interested in. No spam, nothing that doesn’t matter to me. This will come from being able to search both the details and the tags.
  • Authoring – Anyone can contribute, not just the official owner team (incident management or application team). This should mean earlier detection of incidents as well as richer feedback to the application team.
  • Links – If everyone can contribute information about an application’s availability and performance there will be a clearer view of what is important: 1000 users complaining about our HR system tells us something different from 2 users having problems.
  • Tags – Categorisation of the communication by anyone will lead to the community of users developing categories that are meaningful to them. An IT team that then adopts these categories will become good communicators. We may end up with categorisations like #out, #slowdown, #scheduled; or we may not, in which case we’ll learn to speak the language of our users.
  • Extensions – Increasing the intelligence of the channel so that if we are finding that most of the users who use the finance system are also using spreadsheets (and not vice versa) then when someone shows an interest in the finance application, they are prompted to also register their interest in Excel.
  • Signals – Selective alerting through RSS and effective aggregation will have benefits beyond the need for notifications. Organisational functions that need to manage at an aggregated level, e.g. identifying patterns across teams, will have better data to act on.
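
To make these features a little more concrete, here is a minimal Python sketch of how such a channel might hang together. It is only an illustration under my own assumptions: the names `Report`, `OutageBoard`, `register_interest`, `feed_for` and `suggest` are all hypothetical, the data lives in memory, and a per-user feed stands in for the RSS-style alerting mentioned under Signals.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Report:
    """One user-contributed note about an application (hypothetical shape)."""
    user: str
    application: str
    text: str
    tags: Set[str] = field(default_factory=set)


class OutageBoard:
    """In-memory sketch of an open, two-way availability channel."""

    def __init__(self) -> None:
        self.reports: List[Report] = []
        self.interests: Dict[str, Set[str]] = {}  # user -> applications

    def register_interest(self, user: str, application: str) -> None:
        self.interests.setdefault(user, set()).add(application)

    def post(self, report: Report) -> None:
        # Authoring: anyone can contribute, not only the owning team.
        self.reports.append(report)

    def search(self, application: Optional[str] = None,
               tag: Optional[str] = None) -> List[Report]:
        # Search and Tags: filter by application and/or a user-chosen tag.
        return [r for r in self.reports
                if (application is None or r.application == application)
                and (tag is None or tag in r.tags)]

    def reporters_per_application(self) -> Counter:
        # Links: count distinct users reporting on each application.
        seen = {(r.user, r.application) for r in self.reports}
        return Counter(app for _, app in seen)

    def feed_for(self, user: str) -> List[Report]:
        # Signals: only reports about applications the user registered for.
        wanted = self.interests.get(user, set())
        return [r for r in self.reports if r.application in wanted]

    def suggest(self, application: str) -> List[str]:
        # Extensions: other applications registered alongside this one.
        counts: Counter = Counter()
        for apps in self.interests.values():
            if application in apps:
                counts.update(apps - {application})
        return [app for app, _ in counts.most_common()]


board = OutageBoard()
board.register_interest("ana", "finance")
board.register_interest("ana", "excel")
board.register_interest("ben", "finance")
board.post(Report("ben", "finance", "Month-end report times out", {"slowdown"}))
board.post(Report("cat", "hr", "Cannot log in", {"out"}))
board.post(Report("dan", "hr", "Login page is blank", {"out"}))
board.post(Report("dan", "hr", "Still down", {"out"}))

print(board.reporters_per_application())
print([r.text for r in board.feed_for("ana")])
print(board.suggest("finance"))
```

Counting distinct reporters rather than raw posts is a deliberate choice in this sketch: one user posting repeatedly should not look like a widespread outage.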

The seismic shift I believe we need to make is to open up our communication channels so that they become more effective. It’s a challenging proposition in an organisation that by necessity needs strong governance and control over its IT systems. Are we willing to let go of control of our communications? And is there a place for a collective intelligence tool that can complement more official channels within a large corporate?