OVH, the largest hosting provider in Europe and the third-largest cloud computing company worldwide, suffered a global outage due to human error during scheduled maintenance.
OVH provides web hosting, cloud computing services, and dedicated servers. It operates 32 data centers with over 300,000 servers on four continents and has 20 Tbit/s of global network capacity. It serves 1,300,000 enterprise customers worldwide.
The provider announced the planned maintenance on its status page, writing:
We will do maintenance on our routers on VIN DC to improve our routing. Maintenance is planned for 13/Oct/21, 9:00 AM to 10:30 AM (UTC+2). No impact expected; devices will be isolated before the change.
However, services went down shortly after the maintenance began, and the company's leadership was quick to respond.
Octave Klaba, the founder and chairman of OVH, had this to say:
Following a human error during the reconfiguration of the network on our DC at VH (US-EST), we have a problem with the whole backbone. We are going to isolate the DC VH then fix the conf.
The OVH Status Feed (@ovh_status) posted the following on October 13, 2021:
[ALL] Manager: Start time: 13/10/2021 07:20 UTC. Impact: Between 07:20 UTC and 08:22 UTC, the entire OVH network was unavailable. We were confronted with a network incident located in the United States. We are still experiencing disr https://t.co/4vWALCqk57 #ovh
Visitors who tried to reach the OVH site saw “Error 503: Backend unavailable.” The outage also took down OVH’s status page, where browsers reported: “The connection has timed out. An error occurred during a connection to status.ovh.com.”
The outage affected all OVH clients, including Rust, Lichess.org, and VeraCrypt, and lasted about an hour. According to OVH, servers are now gradually coming back online.
OVH also plans to replace its older infrastructure with newer technology soon, to protect its clients against future outages.
The company announced the following:
On the morning of October 13 at 9:12 am (CET / Paris time), we carried out interventions on a router at our Vint Hill Datacentre in the United States, which caused disruptions on our entire network. These interventions were aimed at strengthening our anti-DDoS protection, attacks which have been particularly intense in recent weeks. OVHcloud teams quickly intervened to isolate the equipment at 10:15 am. Services have been restored since this intervention. We are currently running a verification campaign with our clients to confirm the restoration of all their services. We offer our sincere apologies to all of our impacted customers and will be as transparent as possible about the causes and consequences of this incident.