Last month, a faulty CrowdStrike software update affected 8.5 million Microsoft Windows devices running the CrowdStrike Falcon security product. It was one of the largest IT outages on record and caused global disruption, including travel delays, interrupted media broadcasts, cancelled medical appointments, and halted payments. A fix was deployed rapidly, but the incident underscores how vulnerable organisations are to a single faulty update from a supplier.
Since the COVID-19 pandemic, more companies have come to rely on cloud-hosted services and on cybersecurity solutions to protect their digital assets. This latest outage highlights the impact a single point of failure can have on businesses. It demonstrates the wider operational risks that come with third-party service dependencies.
When a routine software update can cause this much disruption, it is a wake-up call to IT teams. It reminds CIOs that even the most robust systems have vulnerabilities. IT infrastructures are now complex, with extensive dependencies and risks that are sometimes unknown to those managing them. CIOs and business leaders must protect their businesses from future IT outages and the downtime they cause.
Here are key recommendations for IT departments:
Assess exposure
Understand your reliance on individual providers so that you can mitigate the risks. Crashes, cyber attacks, and data breaches are a growing threat as businesses become more digitised and interconnected. CrowdStrike and Microsoft are both reputable. Yet there is a risk whenever an organisation is too reliant on one provider.
Software and hardware provision is concentrated in the hands of a few providers. Amazon, Microsoft and Google account for two-thirds of the cloud provider market. CrowdStrike has close to a fifth of the cybersecurity market.
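One way to start assessing exposure is to list your critical systems against the vendor each one depends on and see where the dependencies cluster. The short Python sketch below illustrates the idea. The inventory, the vendor names, and the 50% threshold are all hypothetical and only there to show the calculation; a real assessment would draw on your own asset register and risk appetite.

```python
from collections import Counter

# Hypothetical inventory: each critical system mapped to the vendor it depends on.
inventory = {
    "endpoint-security": "VendorA",
    "email": "VendorB",
    "payroll": "VendorB",
    "file-storage": "VendorB",
    "crm": "VendorC",
}

def vendor_concentration(inventory, threshold=0.5):
    """Return {vendor: share of systems} for vendors at or above the threshold."""
    counts = Counter(inventory.values())
    total = len(inventory)
    return {
        vendor: count / total
        for vendor, count in counts.items()
        if count / total >= threshold
    }

flagged = vendor_concentration(inventory)
print(flagged)  # {'VendorB': 0.6}
```

In this made-up example, three of the five critical systems depend on one vendor, which is the kind of concentration worth reviewing.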
Vendor risk management
Check the risk management and disaster recovery capabilities of your main vendors. Do they have plans to manage outages and provide timely support during incidents? Protect your position by including clauses in vendor contracts that address:
- service-level agreements (SLAs)
- response times
- penalties for prolonged outages.
This ensures accountability and can help mitigate risks.
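When reviewing SLA clauses, it helps to translate an uptime percentage into the downtime it actually permits. The Python sketch below does that arithmetic and applies a simple penalty rule. The flat 10% service credit, the 30-day period, and the function names are assumptions made for illustration; real contracts define their own measurement periods and credit schedules.

```python
def allowed_downtime_minutes(sla_percent, period_days=30):
    """Minutes of downtime an uptime SLA permits over the period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)

def service_credit(actual_downtime_minutes, sla_percent, monthly_fee,
                   credit_rate=0.10, period_days=30):
    """Hypothetical penalty rule: a flat credit if the SLA allowance is exceeded."""
    allowed = allowed_downtime_minutes(sla_percent, period_days)
    if actual_downtime_minutes > allowed:
        return monthly_fee * credit_rate
    return 0.0

# A 99.9% SLA over 30 days permits roughly 43 minutes of downtime.
print(round(allowed_downtime_minutes(99.9), 1))

# A two-hour outage breaches that allowance, so the credit applies.
print(service_credit(120, 99.9, monthly_fee=1000))
```

Running the numbers this way makes it easier to judge whether the penalties in a contract are proportionate to the disruption a prolonged outage would cause.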
Build redundancy
Build redundancy into your IT systems to minimise downtime. This includes:
- backup servers
- data replication
- alternative communication systems.
Diversify your IT infrastructure so that you do not depend on a single cybersecurity, operating system, or cloud provider. IT departments may consider air gapping, with large interconnected IT systems backed up by smaller, separate systems on isolated networks. System updates could also be rolled out in phases rather than applied company-wide at once.
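A phased rollout can be pictured as updating a small group of machines first and only continuing if they stay healthy. The Python sketch below is a minimal illustration of that idea, not a description of how any particular vendor or deployment tool works. The ring names, host names, `apply_update` and `is_healthy` callbacks, and the 2% failure threshold are all hypothetical.

```python
def staged_rollout(rings, apply_update, is_healthy, max_failure_rate=0.02):
    """Update hosts ring by ring; halt if too many hosts in a ring fail."""
    updated = []
    for name, hosts in rings:
        if not hosts:
            continue
        failures = 0
        for host in hosts:
            apply_update(host)
            updated.append(host)
            if not is_healthy(host):
                failures += 1
        # Stop before the next (larger) ring if this ring looks unhealthy.
        if failures / len(hosts) > max_failure_rate:
            return {"status": "halted", "ring": name, "updated": updated}
    return {"status": "complete", "ring": None, "updated": updated}

# Hypothetical rings: a small canary group, a pilot group, then everyone else.
rings = [
    ("canary", ["pc-01", "pc-02"]),
    ("pilot", ["pc-03", "pc-04", "pc-05"]),
    ("broad", [f"pc-{i:02d}" for i in range(6, 11)]),
]

# Simulate a faulty update: every updated host fails its health check.
result = staged_rollout(rings, apply_update=lambda host: None,
                        is_healthy=lambda host: False)
print(result["status"], result["ring"], len(result["updated"]))
```

In this simulation the rollout halts after the two canary machines, leaving the other eight untouched, which is the benefit of phasing updates rather than applying them everywhere at once.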
Skilled IT resources
Maintain a skilled in-house tech team with contractors and consultants on call for support. IT skills gaps have long been an issue for tech teams, so use this opportunity to bolster your in-house IT team with cloud and cybersecurity professionals. NU Concept Solutions is a provider of skilled IT professionals in these areas to help businesses build their in-house tech teams.
Collaborate with experts
Maintain partnerships with cybersecurity specialists and organisations to stay updated on the latest threats and defence technology.
Invest in company-wide staff training
Many of the risks businesses face originate with employees, so employee vigilance and adherence to best practices are key. Offer cybersecurity training for all employees. Emphasise the importance of recognising phishing attempts and other cyber threats. Malicious actors rely on staff not knowing the risks at key moments, and malware often gets in when employees do not follow rules such as password protection or robust authentication methods.
Robust disaster recovery
Ensure you have a comprehensive incident response strategy. This includes clear roles and communication channels to manage outages. A robust disaster recovery and business continuity plan is key to managing downtime.
- Maintain backup systems.
- Perform regular data backups.
- Define clear procedures for responding to IT issues.
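Regular backups only help if they are actually happening. As a small illustration, the Python sketch below flags systems whose most recent backup is older than an agreed limit. The system names, timestamps, and 24-hour limit are hypothetical; in practice the timestamps would come from your backup tooling.

```python
from datetime import datetime, timedelta

def stale_backups(last_backup_times, now, max_age=timedelta(hours=24)):
    """Return the names of systems whose last backup is older than max_age."""
    return sorted(
        name for name, taken_at in last_backup_times.items()
        if now - taken_at > max_age
    )

# Hypothetical data: when each system was last backed up.
now = datetime(2024, 8, 1, 12, 0)
last_backup_times = {
    "finance-db": now - timedelta(hours=2),
    "file-server": now - timedelta(hours=30),
    "email-archive": now - timedelta(days=3),
}

print(stale_backups(last_backup_times, now))  # ['email-archive', 'file-server']
```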
Communication during an outage
Have a communication plan so that internal and external stakeholders stay informed during outages and know what to expect. Effective incident response needs communication between IT, security, and business teams. If employees and clients are kept up-to-date with events, they know you are working in their best interests.
Ten simple steps companies can take to protect themselves
Besides the key areas above, IT departments can:
- Utilise multi-factor authentication (MFA): Install MFA across critical systems. This adds an extra security layer against unauthorised access.
- Update software and hardware: Ensure all software and hardware is up-to-date with the latest security patches and updates. This protects against known vulnerabilities.
- Implement regular system audits: Conduct regular audits and assessments of your IT infrastructure. This helps you identify vulnerabilities and confirm compliance with security protocols.
- Avoid single points of failure: For vital systems, build redundancy at the component level. In the CrowdStrike incident, Linux and Mac endpoints were not affected, so an organisation running a mix of platforms would have kept some endpoints working.
- Implement redundant systems: Establishing redundant systems and data backups maintains continuity.
- Review and adjust third-party rollouts: The CrowdStrike outage revealed operational risks with third-party dependencies. Review how and when third-party updates are rolled out to your systems.
- Carry out penetration testing: Test your systems for vulnerabilities. Penetration testing can identify security gaps so that they can be fixed.
- Test and conduct drills: Test incident response plans. Update them to address shortcomings in testing scenarios.
- Monitor systems: Use advanced monitoring tools to watch your systems 24/7. This allows for rapid detection of, and response to, issues.
- Develop contingency plans: This ensures essential functions can still work in the worst-case scenario.
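To illustrate the monitoring step in the list above, the Python sketch below shows one simple detection rule: raise an alert only after several health checks in a row have failed, so that a single blip does not trigger a false alarm. The threshold of three consecutive failures and the sample results are assumptions for illustration; dedicated monitoring tools offer far richer rules.

```python
def first_alert_index(check_results, consecutive_failures=3):
    """Return the index of the check that triggers an alert, or None.

    check_results is a sequence of booleans: True means the check passed.
    """
    streak = 0
    for index, passed in enumerate(check_results):
        streak = 0 if passed else streak + 1
        if streak >= consecutive_failures:
            return index
    return None

# Hypothetical results from periodic health checks on one service.
checks = [True, False, True, False, False, False, True]
print(first_alert_index(checks))  # 5
```

Here the isolated failure at the second check is ignored, and the alert fires on the sixth check, once three failures have occurred back to back.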
Building IT safeguards
Working IT systems are a prerequisite for modern companies. This outage highlights that firms must keep a close watch on the cloud infrastructures and systems they depend on. It reminds us that building resilience into our IT systems is essential.
Prevention is better than cure
By adopting the above measures, IT departments can better safeguard against potential IT disruptions. While this may come at a cost, it makes firms more resilient and better protected against costlier threats.
In light of incidents like the CrowdStrike outage, NU Concept Solutions can support IT departments by sourcing talent to strengthen infrastructure resilience and manage risks. Call us on 0330 058 3400.