Initial D World - Discussion Board / Forums > Announcement Board > Explanation on Long Downtime (2012-1-29)


Posted by: Perry Jan 29 2012, 11:36 PM
As many of you have noticed, the forum was down for almost all of Sunday. It first went down at 3:30am Pacific Time on January 29th, 2012. I did not know until I woke up at 8:00am. I quickly sent several notices to http://facebook.com/groups/idforums to spread awareness of the problem. I then submitted a ticket to our host's tech support. They responded with the following within 15 minutes:

QUOTE (Reply from DreamHost (Jan 29 @ 2012 - 09:13:20 / #5393xxxx))
Subject: Re: Site Down
First, I'd like to apologize for the use of a canned response to this support issue. We have identified and are actively working on a fix for the problems you are experiencing with FTP and Web services. This issue is affecting a large subset of our customers and as such our system administration, data center operations, and development teams are all working on resolving these issues as quickly as possible.

We are making this information available, along with updates, on the http://www.dreamhoststatus.com/ website. Please follow this url to keep up to date with any future updates.

http://www.dreamhoststatus.com/2012/01/29/web-ssh-and-ftp-services-for-a-subset-of-vps-shared-and-dedicated-machines-down/

Again, we do apologize for the lack of a personal response to this support request and if you have any further questions please let us know.

Thank you for your understanding in this matter,
JJ Galvez
DreamHost Technical Support


Basically, they were aware of the issue. As the day went on, many customers like me were getting angry at the long downtime. Finally, by 11:00pm Pacific Time, the site was back up. The entire downtime lasted 19.5 hours. Here is a message from the CEO of the host:

QUOTE
Update Jan 29th, 9:40pm PST:

From Simon Anderson, CEO, DreamHost: My sincere apologies for the downtime experienced today by many of our dedicated and VPS customers, plus some shared customers. I know that this has been a poor customer experience for you. Almost all services are back up after an intense effort from the DreamHost dev, admin, data center and support teams. I was involved in the coordination of our efforts today and now am able to share what happened, and what we’re going to do to reduce the risk that it happens again.

We run Debian OS and have used autoupdates to ensure security packages are installed as soon as they are available. We’ve had some breakage in the past from this approach, but nothing major. However last night’s autoupdate went badly wrong, removing essential packages from dedicated, VPS and some shared servers. Our monitoring and support team flagged the issue fast, and we scrambled our admin, dev and NOC teams to reinstall the packages that had been removed by autoupdate, reboot servers, fix package dependencies, and test that individual services were live. Given the number of services affected, this took a long time to complete. Rest assured we had all hands working on the issue, but I know it was still a frustrating experience for customers.

To mitigate the risk of anything like this happening again, we’re immediately switching off autoupdates, and moving to a manual process where we’ll only push out Debian updates after significant testing. There’s always a balance to be struck between speed, efficiency, security and issue prevention, but this event has shown us that we need to take a different approach. Again, my apologies for the downtime experienced today. We’re acutely focused on adjusting our processes and systems to ensure we do a better job going forward. – Simon


It was caused by a botched automatic update that removed some essential packages from the servers, which took the service down shortly after 3:30am. I am currently trying to get some sort of compensation for such a long downtime. If they refuse, I might be forced to switch to another host. This sort of downtime is simply unacceptable.

Posted by: razorsuKe Jan 30 2012, 12:40 PM
This is a good lesson, but something that could have been totally avoided.
Surely they could have foreseen a problem like this arising with automatic updates.
The method they are now suggesting is something I would have expected them to have in place from day 1.

I don't think they need "significant" testing before rolling out an update, but they should at least have a secondary system they can install it on first before applying it to the main servers.
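
To make that idea concrete, here is a rough sketch of the kind of pre-check a secondary box could run before an update goes anywhere else. It is not what DreamHost actually uses. It assumes a Debian-style machine where `apt-get --simulate dist-upgrade` prints one `Remv <package>` line per package it would remove. The function names and the "essential" package list are made up for illustration.

```python
import subprocess
from typing import Iterable, List


def packages_to_remove(simulation_output: str) -> List[str]:
    """Return the package names a simulated apt-get run says it would remove."""
    removed = []
    for line in simulation_output.splitlines():
        parts = line.split()
        # apt-get's simulation output reports removals as "Remv <package> [version]".
        if len(parts) >= 2 and parts[0] == "Remv":
            removed.append(parts[1])
    return removed


def safe_to_apply(simulation_output: str, essential: Iterable[str]) -> bool:
    """True if the simulated upgrade removes none of the packages we care about.

    The `essential` list is an assumption: each site would maintain its own.
    """
    return not (set(packages_to_remove(simulation_output)) & set(essential))


def simulate_upgrade() -> str:
    """Dry-run an upgrade on the test machine; nothing is actually installed."""
    result = subprocess.run(
        ["apt-get", "--simulate", "dist-upgrade"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On the test box you would call `safe_to_apply(simulate_upgrade(), ["apache2", "openssh-server"])` (example package names only) and hold the update back for a human to look at whenever it returns False.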

Posted by: Spaz Jan 30 2012, 03:26 PM
A good friend of mine works for Atomic Data, I can get quotes for you if necessary.

Posted by: Perry Jan 30 2012, 03:46 PM
QUOTE (Spaz @ 19 minutes, 42 seconds ago)
A good friend of mine works for Atomic Data, I can get quotes for you if necessary.

I am looking for a managed or unmanaged VPS with around 25GB storage, 1TB/mo. bandwidth, 300MB RAM, and my budget is around $15.00/mo. Thanks Spaz!

Posted by: Newmanator Jan 31 2012, 11:26 AM
QUOTE (Perry @ Yesterday, 3:46 PM)
I am looking for a managed or unmanaged VPS with around 25GB storage, 1TB/mo. bandwidth, 300MB RAM, and my budget is around $15.00/mo. Thanks Spaz!

1 TB bandwidth a month? lol wat Crazy!!

Posted by: Spaz Jan 31 2012, 08:25 PM
QUOTE (Perry @ Yesterday, 6:46 PM)
I am looking for a managed or unmanaged VPS with around 25GB storage, 1TB/mo. bandwidth, 300MB RAM, and my budget is around $15.00/mo. Thanks Spaz!

Not gonna happen at $15/mo. Not even close, according to him.

He said they host servers for KickassVPS, who provide hosting on the cheap, but even they won't do that kind of bandwidth for that price.


Posted by: Nerubian Feb 1 2012, 01:37 PM
My first thought was that the site got the same end as MegaUpload.

Posted by: Nomake Wan Feb 1 2012, 06:52 PM
QUOTE (Revolta @ 5 hours, 15 minutes ago)
My first thought was that the site got the same end as MegaUpload.

A quick visit to Slashdot would have alleviated those fears.

Posted by: Perry Feb 29 2012, 02:17 AM
I am still in search of a more reliable VPS with a similar pricing structure. So far I haven't been successful. It seems DreamHost has the best price for an unmetered space + unmetered bandwidth VPS out there. Let's just hope they get their act together.

Posted by: kiwi baby Mar 16 2012, 09:18 PM
QUOTE (Perry @ Feb 29 2012, 02:17 AM)
I am still in search of a more reliable VPS with a similar pricing structure. So far I haven't been successful. It seems DreamHost has the best price for an unmetered space + unmetered bandwidth VPS out there. Let's just hope they get their act together.

Yeah, they seem to be having a lot of issues lately.
