Even More Tales of Technology Terror: Personal Stories of Tech Disaster

From earthquakes to worldwide email disruption to business processes that won't stay dead, we round up personal tales of IT terror.

By CIO Staff
Sat, October 27, 2007

The Day the E-Mail Stood Still, and the Man Who Caught the Blame…

Back in the mid-90s, Brad Knowles was senior Internet mail administrator at America Online, at the time the largest online service provider in the world. But with great power comes great responsibility…

It’s known as Black Wednesday, August 7, 1996, the day all of AOL's routers went down, and no one could get any packets to our systems; they all just got thrown away. But computers could still contact our backup name servers at ANS (a subsidiary of AOL that ran all of our external WAN connections), so they knew who all of our mail servers were and how many IP addresses we had listed.

Now, it's important to know that the Internet RFCs require that you wait at least two minutes when you start to set up a standard TCP/IP connection before you finally declare the other end to be dead. The standard practice is also to attempt to connect to each of the IP addresses you know for a given name, usually in the sequence in which you received them. At the time, the standard practice for mail servers was that you contacted all listed mail servers for a given domain before you gave up.

Now, step back and do the math for seven names with seven IP addresses each, and two minutes per IP address:

7 x 7 x 2 = 49 x 2 = 98

So, just making one delivery attempt to a single user at AOL was taking 98 minutes to time out. Then another 98 minutes to time out for the next user or the next message for a user at AOL.
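For readers who want to check the arithmetic, here is a minimal Python sketch of it. The counts (seven names, seven addresses each, two minutes per address) come from the story; the function and constant names are made up for illustration, and the sketch assumes the worst case in which every connection attempt waits out the full timeout.

```python
# Figures from the story; the names below are illustrative only.
MAIL_SERVER_NAMES = 7          # mail server names listed for the domain
ADDRESSES_PER_NAME = 7         # IP addresses listed under each name
CONNECT_TIMEOUT_MINUTES = 2    # wait before declaring one address dead


def minutes_to_give_up(names, addresses_per_name, timeout_minutes):
    """Worst case: every address of every listed server is tried in turn,
    and each attempt waits out the full connection timeout."""
    return names * addresses_per_name * timeout_minutes


per_message = minutes_to_give_up(
    MAIL_SERVER_NAMES, ADDRESSES_PER_NAME, CONNECT_TIMEOUT_MINUTES
)
three_messages = 3 * per_message  # messages are tried one after another

print(per_message)     # 98
print(three_messages)  # 294
```

Three queued messages, handled one after another, would tie up a single process for nearly five hours.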

At the time, most sites were running Sendmail. They were set to rerun their queue once an hour, and many sites would typically have just the one queue runner process. Each time you'd start up a queue runner, if you had even a single message queued up to a single person at AOL, that process would sit there and spin its wheels for at least 98 minutes trying to talk to the AOL mail servers before giving up—and it would block and not do anything else while it was spinning its wheels. But less than 60 minutes after that happened, another queue runner would get fired up—and would almost certainly hang on the same message going to AOL, or on another message going to AOL.

Do that often enough, and you get enough queue runners hung up to AOL that your queue is clogged and you're not getting mail through to anywhere else in the world. Do that long enough, and you've got so many queue runners hung up to AOL that you run out of RAM and swap space and your mail servers crash.
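The pile-up can be sketched as a toy model. This is not how Sendmail itself is implemented. It assumes a new queue runner starts every 60 minutes, that each runner works through its AOL-bound messages one at a time, and that each message costs the full 98 minutes described above. All names are hypothetical, and memory use is not modeled.

```python
QUEUE_INTERVAL_MINUTES = 60    # a new queue runner starts once an hour
TIMEOUT_PER_MESSAGE = 98       # minutes lost per AOL-bound message (7 x 7 x 2)


def hung_runners(at_minute, aol_messages):
    """Count the queue runners still blocked on AOL at a given minute.

    Each runner started at minute s stays blocked until
    s + aol_messages * TIMEOUT_PER_MESSAGE.
    """
    stall = aol_messages * TIMEOUT_PER_MESSAGE
    start_times = range(0, at_minute + 1, QUEUE_INTERVAL_MINUTES)
    return sum(1 for start in start_times if start + stall > at_minute)


# One AOL-bound message: runners overlap, but the count levels off.
print(hung_runners(120, aol_messages=1))   # 2

# Five AOL-bound messages: after ten hours, nine runners are stuck.
print(hung_runners(600, aol_messages=5))   # 9

# Ten AOL-bound messages: every runner started so far is still stuck.
print(hung_runners(600, aol_messages=10))  # 11
```

Under these assumptions, the number of blocked processes grows with the amount of mail queued for the unreachable domain, which matches the clogged queues the story describes.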

Well, that’s what happened, and I was personally blamed for taking out all Internet e-mail across the entire world. As a result, angry spammers publicly handed out my private telephone numbers and people were asked to complain directly to me. I was also told about at least one business that went bankrupt because it was waiting on a time-critical RFP to come in and it didn't get its bid into the system in time, so it lost the contract.

Once we finally did come back up, it literally took days for us to recover and to catch up to all the backlog that was created for us on the Internet—and it took the rest of the world a few more days beyond that to recover from the rest of their backlog.
