When one of your IT services is on fire, there is no time to waste, especially if that fire is blocking your users from getting work done. Rapid resolution tends to eclipse all else during an incident, often causing your team to ignore or forget pieces of the incident response process, like keeping people in the loop.
It's one of those little problems that compounds into a big one if not handled correctly. Pretty soon, you're stuck in an endless loop of shoulder-taps and email threads, trying to explain to the CEO why things went wrong. While there's no shortage of tools to help your team detect, alert on, swarm on, and resolve incidents, even the best tools can't replace clear communication with internal and external stakeholders.
And let's be real: the stakes can be very high. Reputation, customer attrition, and time spent on damage control are all on the line, just to name a few.
Luckily, downtime doesn't have to turn into a customer service nightmare. Informed users are happy users. But first you need to know who to communicate with, how to reach them, and how to do it with the least friction and fewest resources possible.
Communication during an incident is like ripples from a rock tossed into a pond. The circles closest to the incident get the biggest, most frequent, and most immediate feedback. The innermost circle is your core on-call team: the folks who need to identify and fix the problem. It's a small circle, but the ripples (communication) need to be big, immediate, and frequent.
As you move further from the core circle (to adjacent IT teams, managers, the organization as a whole, end users, and the general public), the audience gets bigger, but the ripples get smaller and less frequent.
While every organization is different, in general it helps to think of these audiences as five distinct groups that need to be communicated with:

Core on-call team: The first to know something is wrong, almost immediately upon impact (usually from monitoring and alerting tools).
Front-line support team: Those who will be directly answering questions and giving customers updates during the incident. It's an incredibly important role, so this team must get the right information to pass along to end users.
Managers and executive team: The core team needs to communicate with this group so they know what's going on, the potential impact on the following two groups, and ideally an estimate of how long it could last.
General employee population: Employees need to be kept informed as services they rely on go down and come back up. Proactively communicating with these users means fewer "what's the status of this?" questions, fewer duplicate IT support tickets, and more focus on fixing the problem at hand.
External customers: If the incident affects external customers, some communication must be sent out to explain the problem and when they can expect a fix, or at least an update at a regular interval. For issues that are still affecting your customers' ability to use your product, we recommend never going more than one hour without sending an update. You should also always indicate when to expect the next update.
If the incident is severe enough, especially one involving security or data loss, you will want to expedite external comms and pull in the other necessary teams (legal, HR, security, etc.).

xMatters and StatusPage are tools that sit at an interesting intersection: integrating solutions across your technology stack, then communicating status information out to drive workflow. With some of the biggest cloud companies as customers, we've seen how the highest-performing IT teams resolve incidents more efficiently while keeping users happier through a solid incident communication plan.
Creating your own incident communication plan:
Before an incident:

Define priority/severity levels (how many users are affected, how long the incident lasts, etc.)
Create incident templates for common issues to save time between detection and communication
Document defined roles during an incident (how to identify the incident commander, who owns the communication, etc.)
Determine how to communicate with affected users (what channels will be used for each priority level, etc.)

During an incident:

Communication with first responders: Alert those on call and make sure they know where to go for more information about the problem. A tool like xMatters can help drive resolution by relaying data between systems while engaging the right people. This way, you never have to worry about keeping your technology infrastructure aligned with key resolution processes.
Communication with affected users (both internal and external) and other stakeholders (e.g., executives): Use your predetermined channel(s) to tell users what's going on. This may be email, a blog, Twitter, or a status page where they can subscribe to notifications about the services they care about most. Whatever tool you choose, we recommend that you identify one as your primary communication vehicle and funnel everyone there from the other channels.
For example, we have a dedicated status page, but we also tweet out updates and display a notice in our web app during downtime. The tweets and in-app notices funnel users back to the status page for the full story.

After an incident:

Hold a retrospective on the incident and figure out what (if any) post-incident comms are necessary, as well as what you can do to prevent similar incidents from happening again.
If necessary, send out your postmortem to affected users. A good postmortem can actually generate a lot of goodwill with your customers. Ideally it will enable you to:

Apologize personally
Explain exactly what happened and how your team was able to fix it
Talk about your plan to avoid a similar situation in the future

Even 99.99% uptime means about 52 minutes of downtime a year. Every IT team should be prepared for those 52+ minutes. Providing legendary service isn't just about resolving incidents quickly; it's also about keeping users informed while you do. Learn more about using xMatters for IT alerting and StatusPage for IT incident communication and see how they can work together to increase transparency.
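
As a quick check on the uptime arithmetic above, here is a minimal sketch in Python. The function name `allowed_downtime_minutes` and the 365-day year are assumptions made for illustration; nothing here comes from xMatters or StatusPage.

```python
# Assumption: a 365-day (non-leap) year, i.e. 525,600 minutes.
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Return the yearly downtime, in minutes, implied by an uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    # The downtime share is whatever is left over after the uptime share.
    return MINUTES_PER_YEAR * (100 - uptime_percent) / 100


if __name__ == "__main__":
    # Hypothetical targets, chosen only to show how quickly the budget shrinks.
    for target in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime -> {minutes:.1f} minutes/year")
```

Under these assumptions, 99.99% uptime works out to roughly 52.6 minutes of downtime a year, which is the budget your communication plan needs to cover.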