Shortly after OVH came to Canada I got a modest dedicated server with them (32 GB, SSD). Availability has been phenomenal (99.996% measured with Pingdom over the course of a year). When it came time to host a client in isolation I decided to stay with OVH and saw that they had a new line of VPS services. Great, I thought, that’ll be perfect. After looking at both offerings I decided that the one with an SLA of 99.99% was better than the one with a 99.95% SLA. Paying more is a no-brainer: the site needs to be up because it’s used to do work on and is critical to the daily operation of the business (without it, it’s really hard to place calls, or do anything).
OVH VPS Cloud is a piece of meaningless jargon, so for those who don’t want to look it up, what it’s running is VMware® ESXi 64-bit with a customizable kernel. I’ve heard good things about ESXi, and I didn’t second-guess OVH’s abilities OR do my due diligence of research and testing before deployment.
So far for the month of May the server is sitting at 98.05% availability, mostly as a result of 1 outage of ~5 hours.
Continued instability after that.
No follow-up on what will be done to prevent this (the major incident) from happening in the future.
Now, the SLA doesn’t cover any refund (eh, oh well, not bothered by that…). What bothers me is the event timeline that transpired (irritation starts at #4). Part of the irritation was at myself for not reacting faster.
1. I was notified by Pingdom that the server was down.
2. My investigation started and I narrowed it down to something on status.ovh.com.
3. There wasn’t much else I could do; it was in their hands. The task page was open and being refreshed every 30 seconds for updates. After 5 minutes I started composing an email to the client. At 10 minutes the email was sent.
4. No updates. Nothing for 15 minutes, 30 minutes, 1 hour. The only information I had was that things were “Malfunctioning”; that was the summary of the ticket. I DO want staff to focus on fixing the issue, but I also want to know at a high level what it is (DDoS? Disk malfunctioning? System corruption?) to help me.
5. At 3 hours I phoned, and I really should have phoned earlier or started mitigation earlier. The person couldn’t tell me what was happening either, which was my signal to switch my DNS to backup.
POST incident also annoyed me. I wanted some kind of assurance. You know, when something like this happens for that length of time with that kind of SLA, I want to know what they’re going to implement so that it doesn’t happen again. The reply?
“…Our SLA 99.99% is our way of saying that we put all of our energy on keeping the service up and running and fixing any issues as fast as possible.
Please take note that we learn from those situations and therefore adapt our infrastructures and/or networks in consequence…”
When I asked if I could be updated on adaptations for the specific task I was told it wasn’t possible.
Evidence of continued downtime suggests there have been no adaptations. I’m not quite sure what to do right now aside from learning from my mistakes. This is a different experience from the Dedicated Server I have with them, which had 130 days of uninterrupted availability!
I got VPS Cloud because at 99.99% it was being sold as more reliable, not simply because of the word “cloud”. Even without the major incident it’s looking like an average of 5 minutes of unavailability per day… which is nowhere near 99.99% and closer to 99.8% in the first place.
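To put those percentages in minutes, here is a minimal Python sketch of the arithmetic. The function names are my own and the 30-day month is an assumption, not anything taken from OVH’s SLA terms. By this calculation a 99.99% SLA allows about 4.3 minutes of downtime over 30 days and a 99.95% SLA about 21.6 minutes, while 5 minutes of downtime every day works out to roughly 99.65% availability, so the 99.8% figure above is, if anything, generous.

```python
# Sketch: convert between availability percentages and minutes of downtime.
# Assumes a 30-day month by default; names here are illustrative only.
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Minutes of downtime an SLA percentage permits over `days` days."""
    return (1 - sla_percent / 100) * days * MINUTES_PER_DAY


def availability_percent(downtime_minutes: float, days: int = 1) -> float:
    """Availability implied by `downtime_minutes` of downtime over `days` days."""
    total_minutes = days * MINUTES_PER_DAY
    return 100 * (1 - downtime_minutes / total_minutes)


if __name__ == "__main__":
    for sla in (99.99, 99.95):
        budget = allowed_downtime_minutes(sla)
        print(f"{sla}% SLA over 30 days allows {budget:.1f} minutes of downtime")
    print(f"5 minutes down per day is {availability_percent(5):.2f}% availability")
```

The same helper can be pointed at a single incident: `availability_percent(300, days=31)` gives the availability a 5-hour outage alone would leave over a full 31-day month.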
Are there things I could have done better? Of COURSE! Not having proper redundancy is pretty damn idiotic. End of story.
Edit: Formatting and rewording a little bit.
I’ve been talking to OVH, which is good considering the price of the VPS. I got escalated to the “OVH SWAT Team” and tried to isolate where it is happening through traceroutes provided by Pingdom, and learned about mtr (how… did I not know about this tool!). I’m still shaken by the downtime, but I haven’t seen any since.
In the meantime I have data being copied over daily (considering every 3 hours) to one of their $2.99 VPSs as read-write so that at least their offices can be open in case of a catastrophic failure. I’ve learned something, perhaps. They are still a client, thankfully, although I have discounted hosting based on availability over the last year (I’m that crazy).