Opinion, advice, and analysis by the TNW community

Handling a System Outage: 14 Things to Remember During Recovery

Story by Scott Gerber

Scott Gerber is the founder of Young Entrepreneur Council (YEC), an invite-only organization comprised of the world’s most successful young entrepreneurs. YEC members represent nearly every industry, generate billions of dollars in revenue each year and have created tens of thousands of jobs. Learn more at yec.co.

When dealing with complex systems, anything can happen, which means that no matter how well you’ve tested something, problems like system outages are inevitable — and are often caused by events out of your control. It’s how you respond during a crisis that matters.

If you find yourself in the middle of an outage, all of your effort needs to go toward identifying the issue and recovering as quickly as possible, while making sure the people affected have the information they need. What can you do to make sure things go as smoothly as possible? To find out, we asked a panel of Young Entrepreneur Council members the following question:

What is one crucial thing you need to do in order to troubleshoot or recover from a system outage as quickly as possible?

Here is what they advise:

1. Have A Plan In Place Prior To The Outage

If you go into a system outage with a plan in place, you’ll be much better prepared when things go wrong. During the outage, you and your team can reference the protocol and work through the necessary steps to get back up to speed. Outages are often out of our control, but having a plan ahead of time is a great way to minimize the pain. – Joel Mathew, Fortress Consulting

2. Assign A Manager To Handle The Outage

If multiple people try to come to the rescue, it’s counterproductive and wastes time. Having one leader in charge of the outage speeds up the process and makes communication faster and easier. – Jared Atchison, WPForms

3. Store Backup Data On An External Server

To minimize damage and recover quickly from a system outage, you should always have your data backed up on an external server. That way, if your system crashes, the data will still be accessible. Safety does not have to be expensive; there are plenty of good, inexpensive cloud storage options to choose from. – Matthew Podolsky, Florida Law Advisers, P.A.
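As a rough illustration of the backup advice above, here is a minimal Python sketch that copies a file to a separate location and confirms the copy is intact. The function names and the idea of treating the external server as a mounted directory are assumptions made for the example, not something the contributor described.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and verify the copy with a checksum.

    backup_dir is a hypothetical stand-in for external storage, for
    example a mounted network share or a synced cloud folder.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copies contents and file metadata
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} failed verification")
    return target
```

Verifying the checksum right after the copy is a design choice: a backup that was never checked may turn out to be unusable on the day it is needed.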

4. Use Collaboration Tools For Effective Communication

You should prepare for a potential outage before it occurs by having the right collaboration software and information readily available. You need to know how to get in touch with stakeholders and responders quickly for the best results. Scrambling for this information after the outage strikes causes chaos, stress and potentially lost revenue. – Thomas Griffin, OptinMonster

5. Notify Clients

If a system outage directly affects any clients, it’s important to let them know as soon as possible. Acknowledge the problem, make sure they know you’re working to resolve it as quickly as possible and apologize for the inconvenience. – Stephanie Wells, Formidable Forms

6. Diagnose The Issue Quickly

More likely than not, you and your team will be able to diagnose the issue immediately or in a short span of time. It’s still important to make this your No. 1 priority when you experience an outage so you can solve it right away. Make sure you have a solid knowledge base and diagnostic manual on hand to assess the issue. – Chris Christoff, MonsterInsights

7. Practice Your Recovery Plan Regularly

The key to fast recovery from outages is having a plan in place that is tested on a regular basis. Don’t make a plan and then put it on a shelf: It will be outdated! Have a disaster recovery plan, and test it at least twice a year to make sure the plan actually works. Plan solutions for the loss of electricity, phones and internet, the physical building, key people and access to computers. – Jeremy Brandt, WeBuyHouses.com

8. Document Everything

When a system outage occurs, it’s important that you document everything. A thorough record makes it easier to identify the problem and reveals warning signs that can help you prevent the issue in the future. Write down what you did before the incident and what happened during and after it, on paper or in a computer file, so that you have a complete record. – John Turner, SeedProd LLC
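For teams that keep this record in a computer file, a timestamped log could look something like the Python sketch below. The `IncidentLog` class and its phase labels are hypothetical, chosen only to mirror the before, during and after structure described above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple

PHASES = ("before", "during", "after")  # assumed labels, not a standard


@dataclass
class IncidentLog:
    title: str
    entries: List[Tuple[datetime, str, str]] = field(default_factory=list)

    def record(self, phase: str, note: str, when: Optional[datetime] = None) -> None:
        """Add a note; the timestamp defaults to the current UTC time."""
        if phase not in PHASES:
            raise ValueError(f"phase must be one of {PHASES}")
        self.entries.append((when or datetime.now(timezone.utc), phase, note))

    def to_text(self) -> str:
        """Render the log in chronological order, one entry per line."""
        lines = [f"Incident: {self.title}"]
        # Assumes every timestamp is timezone-aware so entries can be compared.
        for when, phase, note in sorted(self.entries, key=lambda entry: entry[0]):
            lines.append(f"{when.isoformat()} [{phase}] {note}")
        return "\n".join(lines)
```

Calling `record()` as events happen and saving the output of `to_text()` afterward gives you a timeline to review once the system is back up.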

9. Utilize Continuity Software

Having a good continuity system that manages potential outages at the network level will allow you to not only safeguard yourself but also get the right people contacted as quickly as possible. This way, it doesn’t matter when the outage actually occurs because the continuity software will reach them right away. – Nicole Munoz, Nicole Munoz Consulting, Inc.

10. Get All Hands On Deck

During an outage, everyone capable should stop what they are doing and focus on recovery. You’re going to want to have a well-trained, knowledgeable team that can handle these types of situations. If only a few people are working on a fix, you may find that the troubleshooting and repair process takes much longer. – Syed Balkhi, WPBeginner

11. Prioritize Recovery Tasks

During a system outage, it’ll get hectic, stressful, and confusing. Stakeholders will be phoning in asking why the system is down, even while you’re still troubleshooting. It’s important to prioritize your tasks to bring the system back up. Ask what the most important things to do are, then focus your time and energy on those instead of spreading yourself thin trying to put out all the fires. – Frederik Bussler, bitgrit Inc.

12. Gather Data And Start Eliminating Variables

One best practice for troubleshooting and recovering from a system outage as quickly as possible is to gather data and eliminate as many variables as you can until you pinpoint the cause. While this is going on, be sure to tell your team or customers if the outage has caused delays or damage. The sooner you respond, the better people will feel about the resolution. – Jared Weitz, United Capital Source Inc.
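One way to picture the elimination process is as a set of named checks, each of which either rules a component out or leaves it on the suspect list. The Python sketch below is only an illustration; the `eliminate_variables` function and the component names in the usage note are assumptions, not tools the contributor mentioned.

```python
from typing import Callable, Dict, List, Tuple


def eliminate_variables(
    checks: Dict[str, Callable[[], bool]]
) -> Tuple[List[str], List[str]]:
    """Run each named check and return (ruled_out, suspects).

    A check returns True when its component looks healthy. A check that
    raises an exception cannot rule its component out, so that component
    stays on the suspect list.
    """
    ruled_out: List[str] = []
    suspects: List[str] = []
    for name, check in checks.items():
        try:
            healthy = bool(check())
        except Exception:
            healthy = False
        (ruled_out if healthy else suspects).append(name)
    return ruled_out, suspects
```

In practice each check might ping a server, query a database or confirm free disk space. Passing `{"dns": check_dns, "database": check_database}`, where both functions are hypothetical, would return the components that passed and those still worth investigating.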

13. Identify The Severity

Identify the level of severity so you know how many staffers to assign to fixing the outage. Then, focus on data recovery as quickly as possible so that as little data as possible is actually lost. Make sure the assigned team gets to the root of the problem (and doesn’t just patch it) so you can put measures in place to keep it from happening again. – Andrew Schrage, Money Crashers Personal Finance
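To make the link between severity and staffing concrete, here is a small Python sketch. Every threshold, level name and head count in it is a made-up assumption for illustration; your own team would set these values based on its size and what an outage costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeverityLevel:
    name: str
    responders: int  # how many staffers to assign


# Hypothetical levels and head counts; adjust to your own organization.
SEV1 = SeverityLevel("SEV1", 6)
SEV2 = SeverityLevel("SEV2", 3)
SEV3 = SeverityLevel("SEV3", 1)


def classify_outage(percent_users_affected: float, data_at_risk: bool) -> SeverityLevel:
    """Map the impact of an outage to a severity level (assumed thresholds)."""
    if not 0 <= percent_users_affected <= 100:
        raise ValueError("percent_users_affected must be between 0 and 100")
    # Possible data loss is treated as the most severe case regardless of reach.
    if data_at_risk or percent_users_affected >= 50:
        return SEV1
    if percent_users_affected >= 10:
        return SEV2
    return SEV3
```

Agreeing on a rule like this before an outage means nobody has to debate staffing while the system is down.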

14. Stay Calm

Outages happen. It is Murphy’s law. The important thing is to plan for it and accept it. Contact your server administrator and your development team and convey the urgency. Most hosting companies keep backups for at least one day, so you have recourse. The important thing is not to freak out and make things worse. – Brian Greenberg, True Blue Life Insurance

Published October 4, 2019 — 09:00 UTC