08/21/2013

I’ve owned BitRelay, my web-hosting company, for about twelve years now. For most of that time, I’ve sort of considered it a “side business”. My main source of income has always been web programming, not web hosting. Hosting is a notoriously low-margin business; in general, web providers are in a race to the bottom. My cheapest accounts run $20/month for simple static sites, and that same $20 would probably buy several years’ service at one of dozens of low-cost hosting providers out there.

In short, it hasn’t been about making money; it’s been about having access to dedicated servers and an environment that I know and control. Back in 2001 when I got started in the hosting business, there was no “cloud”, nor were there even virtual servers. $20/month was par for the course, and dedicated hardware was much more expensive than it is today. So it made a lot of sense for me to set up my own operation to support my main business of building web sites.

Fast forward to today, when I have four full racks with over a hundred servers, and probably well over $100,000 worth of hardware. I’m not quite sure how I ended up here, but this isn’t a garage operation any more. Even with all the growth I’ve seen on the technology side, I still feel like having a hosting company is a great thing. It gives me a lot of capability and flexibility as I work with clients who have very specific (and often quite complicated) needs. Perhaps the best thing about hosting is that the servers just hum along, running 24/7 and earning money while I sleep.

That is, until something goes wrong. Then everyone is calling and emailing, asking why their web site is broken (or down entirely), or why they can’t get email, or what’s going on with the business-critical system we built for them. Those are tough conversations, and they cause a lot of stress all around. Fortunately, major incidents are very rare; I’ve had only a handful in twelve years.

One of them happened a couple of weeks ago, which unfortunately was when I was on Trek and completely out of communication with the technical world for four days. My team at Zing did an amazing job of working through the challenges, reassuring clients, and even considering alternative solutions for an apocalypse scenario. It didn’t come to that, but they were sweating. When I found out what had happened, I felt pretty bad about it. I hadn’t given them the tools or access they needed to properly troubleshoot, leaving them with some difficult options (one of which was to actually drive up to the middle of Wyoming to find me on the trail somewhere!).

In any case, I learned a hard lesson and realized that I’ve been treating all of this like a side business for over a decade, when in fact it needs to have much more of my attention. My clients absolutely depend on me. Sure, some of them have a little brochure web site that could be down for a day without a lot of impact… but others have mission-critical stuff that would easily cost them thousands or even tens of thousands of dollars if it was offline for a day. Ouch.

So I’ve been working for the past two weeks on planning all sorts of redundancies, failovers, and even alternate hosting solutions. I need to write documentation, set up access, and train my team so that the next time I’m incommunicado (ahem, fall backpacking trip with Thom) they have what they need to battle the emergency. It’s been interesting to consider worst-case scenarios and figure out how to bring things back to life.

I’m crossing my fingers that nothing goes so wrong that it requires these drastic measures, but I hope that within a month or so I’ll be in a much better position to offer these services to clients and feel confident they’ll be rock-solid. I guess we’ll see.