Tag Archives: server

Server Outage

So I promised I’d tell you why my websites were offline for a couple days.

My server is leased and lives in a data centre in Houston, Texas, owned by The Planet. On Saturday evening there was an EXPLOSION in the underground conduit that brings all the power into the data centre, where 9,000 servers are housed. The explosion blew out 3 interior walls around the electrical room. Thankfully no one was injured. I chose this data centre because it has security, fire suppression systems, and back-up power generators. The fire department on the scene refused to authorize the start-up of any back-up power generators because the explosion and fire were electrical in nature.

No one plans for a situation like this one. Who would have thought of an explosion!?! I do have my own back-up plan for emergencies. Every night I zip up all the accounts on my server and back them up to a server in another state. I knew that even if the servers in Houston were all damaged, I would be okay, because I had all my data up to the night before and I could purchase another server and restore it.
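A nightly routine like the one described above could look roughly like this. It is only a sketch: the function name `zip_accounts`, the one-archive-per-account layout, and the dated file names are assumptions for illustration, not the actual script. Copying the archives to the server in another state (with something like rsync or scp) is left out.

```python
import zipfile
from datetime import date
from pathlib import Path
from typing import List


def zip_accounts(accounts_root: Path, backup_dir: Path, day: date) -> List[Path]:
    """Create one dated zip archive per account directory under accounts_root.

    Hypothetical helper: backup_dir is assumed to live outside accounts_root.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    for account in sorted(p for p in accounts_root.iterdir() if p.is_dir()):
        archive = backup_dir / f"{account.name}-{day.isoformat()}.zip"
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            for item in sorted(account.rglob("*")):
                if item.is_file():
                    # Store paths relative to accounts_root so a restore
                    # recreates the account/... directory layout.
                    zf.write(item, item.relative_to(accounts_root).as_posix())
        archives.append(archive)
    return archives
```

Keeping one archive per account means a single site can be restored without unpacking everything else.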

It was not clear during the beginning of the outage just how long we would be without power to our servers. We did find out early on that all the servers appeared to be fine and hadn’t incurred any physical damage.

I debated restoring my sites on a new server, but if the outage turned out to be short, then by the time I had everything set up and the DNS changes submitted, I might have caused myself more downtime by switching to another server. There is also the problem of having 2 versions of a database. Many of my websites run with a database back-end. Having 2 versions of the same database on 2 servers is a horrible data management scenario.

I soon found out that my server was in Phase 2 of the data centre, and that it was actually on the second floor. This meant that my server, and the other 6,000 on the top floor, would be the first back in business.

In a data centre, you can’t just switch on 6,000 computers as soon as the power kicks in. Because the electrical infrastructure coming into the building was destroyed, we would be relying on their built-in generators. First that power has to be tested to make sure it is stable and clean. Surges would damage more equipment. Then they have to restore the air conditioning systems. There is a lot of heat generated by 6,000 servers! By 4am on Monday my server showed signs of life. It was up and down a bit as they restored power to the network, but my server started up fine with no issues. About 10% of the second floor had some trouble restarting. Computers don’t always restart well when their power has been abruptly cut off.

My server is still running on back-up power from their generators. The generators are being refuelled twice a day, I hear. The 3,000 servers on the first floor are taking longer to get started up due to the damage to the infrastructure. They had to use an external generator that turned out to be faulty, so a new one had to be brought on site. Now the electrical conduit and electrical room have to be rebuilt before we can go back on regular power.

Some people are ranting and raving and making real fools of themselves. This wasn’t something predictable or scheduled. The world won’t end if my sites and my clients’ sites are down for a couple days over a weekend. If my sites were more important, then I would have to have a full back-up server in another facility that didn’t just store my nightly zip copies. It would have to be running a carbon copy that was synced to the first one so any changes and updates were replicated to the back-up server. It might be something I consider down the road, but not right now.

The company that monitors my server for me, HostGator, is rumoured to be providing an entire month free because of the downtime incurred. I am passing this along and giving my few hosting clients a free month for any inconvenience.

Many people are ready to jump ship to another company. I think the opposite. What data centre would be more prepared for an emergency after this than this one? I think I’ll stay.

One by one, squishing the bugs from the server move

Getting everything back up to speed!
The actual server migration itself went smoothly – just some big scares along the way!

First, why did I get a new server?

I’ve had this server for a few years. Everything needed upgrading, but I did only some things – afraid of causing any downtime. When the spam situation got out of control, I finally went to ConfigServer.com to get them to install their MailScanner software on my server. They were unable to complete the request – because I was still running Red Hat 9 as an operating system.

I looked into getting the company I lease my dedicated server from – HostGator.com – to upgrade my operating system. They said it would cost $25 and I would incur 6-9 hours of downtime on my server. That was a deal breaker! I can’t have all my sites down for that many hours.

I’ve also had some space problems on my server – so I looked at the dedicated server package that was one up from the level I was leasing. For only $45 more a month, I could have a HUGE jump in server guts. Plus if I upgraded, then I could have the new one all configured first – with CentOS, the latest PHP and MySQL and Perl and then I could migrate my sites. I was sold so I ordered the new server. I think the deal was complete in less than an hour!

Instead of just having the ConfigServer people install their MailScanner program, I also had them lock down my entire server. For $149 they work their magic to seal the holes and install their spam-fighting wisdom. People with Verkley.com email addresses will be grateful to see FAR less spam in their inboxes this week. Plus I am about to provide more email addresses to a couple of recent clients and I just didn’t want them to deal with the level of spam that was coming through, so the timing was right.

After ConfigServer was finished with my server, I followed these directions to migrate all my domains from the old server to the new one: Guide for transferring all accounts to a new server – post #69

It involved putting my servers on a DNS cluster – basically teaching them to trust each other.

Then I shut down the mail services on my old server so I wouldn’t lose anyone’s mail by having it delivered to the old server after I moved their mailbox.

Then I used cPanel’s migration tool in my WHM control panel to move my sites. I moved the little ones first, and then moved the bigger database-driven ones. It took probably 3 hours or so? I panicked at first – my satellite connection died 10 minutes into the migration and I thought everything would be completely messed up, but it wasn’t – when my connection was restored I just skipped that domain, carried on, and then re-migrated it afterwards.

As the sites moved over, their DNS records were updated. Everything seemed wonderful.

Until I started looking at the domains. Every site I have that is programmed in PHP was showing an error – with a message that included a path for a DIFFERENT domain and the default email address of ANOTHER domain! What a mess! The subdomains for one site were showing a completely different website of mine. I started to point the DNS back to the old server, fearing everything was messed up.

I tracked the error down to the open_basedir setting. I fiddled with the php.ini file, where it seemed to be shut off. Hours later I found it was also set in a cPanel security tweak! Once I disabled it there, all my PHP sites were back in order.
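For anyone who hasn't met open_basedir: it tells PHP which directory trees a script is allowed to open files from. The sketch below only illustrates that idea, assuming the restriction had ended up pointing at a different account's directory, which the symptoms suggest but which is a guess. The function name and the paths are made up, and this is not PHP's own matching logic, which does more (for example, it resolves symbolic links).

```python
import posixpath
from typing import List


def allowed_by_basedir(path: str, allowed_dirs: List[str]) -> bool:
    """Return True if an absolute path lies inside one of the allowed trees.

    Simplified illustration of an open_basedir-style check, not PHP's code.
    """
    target = posixpath.normpath(path)
    for base in allowed_dirs:
        base = posixpath.normpath(base)
        # The path is inside the tree when the shared leading directories
        # are exactly the base directory itself.
        if posixpath.commonpath([target, base]) == base:
            return True
    return False


# Hypothetical accounts: siteb's script is checked against sitea's directory.
wrong_scope = ["/home/sitea", "/tmp"]
print(allowed_by_basedir("/home/siteb/public_html/index.php", wrong_scope))
print(allowed_by_basedir("/home/sitea/public_html/index.php", wrong_scope))
```

With the restriction scoped to the wrong account, the first check fails and the second passes, which is the kind of mismatch that makes every PHP page on the other domains throw an error.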

I had a few other things lingering to fix up. Since MySQL was upgraded, some of my custom database scripts were showing some odd behaviour. My timestamp displays were way off – then I realized that the newest version of MySQL saves timestamps in a different format, so I had to alter my code a wee bit, and then find and fix all the other scripts where I manipulated timestamps. I couldn’t log into any of my websites’ individual control panels. It turned out the server was configured to allow only SSL connections to my cPanels – so I fixed that up. Comments disappeared on my blog – but that was from the blog software upgrade I performed at the same time – one of the blog scripts didn’t fully upload. That took a couple days to figure out. My brother Brian had some IMAP connection problems, but we seem to have those figured out too. Bit by bit, everything is being restored, and this new server is a ton faster!
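The timestamp problem mentioned above is a common one after a MySQL upgrade. The post doesn't say which versions were involved, so this sketch assumes the well-known change where older MySQL returned TIMESTAMP values as 14 digits (YYYYMMDDHHMMSS) and newer versions return them as 'YYYY-MM-DD HH:MM:SS'. The helper name is made up, and the real scripts on this server may well have been written in another language. The idea is just to accept both forms in one place instead of patching every script separately.

```python
from datetime import datetime


def parse_mysql_timestamp(value: str) -> datetime:
    """Parse a TIMESTAMP string in either the old or the new display format.

    Hypothetical helper for illustration only.
    """
    text = value.strip()
    if len(text) == 14 and text.isdigit():
        # Old style: 14 digits with no separators.
        return datetime.strptime(text, "%Y%m%d%H%M%S")
    # New style: dashes, a space, and colons.
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


old_style = parse_mysql_timestamp("20070304150607")
new_style = parse_mysql_timestamp("2007-03-04 15:06:07")
print(old_style == new_style)
```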

By the end of the week I will have decommissioned my old server.

Comments aren’t showing

I know, comments aren’t showing below the posts right now. I’ve been trying to fix it, but I really need to sleep on it. All the comments still show up on the sidebar – Latest Comments. Something went wrong between upgrading b2evolution and my server upgrade. Hopefully tomorrow I can find a fix.

I will edit this post tomorrow with all the fun details of my latest server migration. What an adventure!


Update! Comment display fixed!