by Joe Liversedge
Virginia Tech.

Tool of death or missing semicolon?

Although computer science as a field concentrates on the formal theory behind hardware, software, and design, the responsibilities that come with applying that theory in safety-critical environments may be overlooked in the commercial world. Despite the horrific chain of events and the lives lost, the story of the Therac-25 has largely escaped mass attention. An outgrowth of a joint venture between Atomic Energy of Canada, Ltd. (AECL) and the French company CGR, this ill-fated machine was a computer-controlled device designed to deliver calculated bursts of accelerated electrons as high-energy beams, an effective means of radiation therapy for cancerous tumors. However, the promise of the Therac-25 was largely destroyed when, between the summer of 1985 and early 1987, the machines delivered massive, in some cases fatal, overdoses of radiation to patients who had looked to the technology for salvation, not a quick end to their already tragic lives.

The history of the Therac machines
There were several variants of the Therac machine before the model 25. These earlier models relied mostly on manual operation and never resulted in fatal or damaging accidents. Through their partnership, AECL and CGR produced two "linacs" [linear accelerators]. The numbering scheme of the Therac series is based on the beam energy each device produces, in MeV: the Therac-6 could produce 6 million electron volts, and so on. The Therac-6 and the Therac-20 were based on machines developed by CGR under the names "Neptune" and "Sagittaire"; they were, however, augmented with a moderate amount of computer control. Operating these devices was much like operating a flashlight: first switch it on and, when done, switch it off.

The first two designs were simply intended to accelerate electrons to a certain energy level and unleash the fury of the resulting beam onto the affected area. The beauty of the Therac-25 concept, however, was the notion that one could use the same machine to bombard the body with electrons AND X-ray photons. This was accomplished by tossing a piece of tungsten into the fire, so to speak: when the accelerated electrons strike the tungsten target, they produce X-ray photons that travel on toward the patient. The conversion is inefficient, however, so X-ray mode calls for a far more intense electron beam to deliver the same typical therapeutic dose of about 200 rads. With the added component and seeming versatility, the engineers responsible for the Therac-25 decided that the device was far too complex to be effective without far more computer control.

The Therac-25 was an upgrade to the somewhat successful (i.e., non-fatal) Therac-20, which was 5 million electron volts less powerful and featured independent hardware safeguards and interlocks designed to not kill patients. The Therac-25, however, was designed with more attention to software interaction with the operator, and with software, not hardware, providing the crucial safety precautions.
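
To make the idea of an independent safeguard concrete, here is a minimal sketch in Python. It is purely illustrative and is not AECL's design: the names `Mode`, `Turntable`, `REQUIRED_POSITION`, and `beam_permitted` are hypothetical, and a real hardware interlock would be circuitry rather than code. The sketch assumes a turntable whose position decides whether the tungsten target sits in the beam path, and it permits the beam only when the sensed position matches the requested mode.

```python
from enum import Enum


class Mode(Enum):
    """Treatment mode requested by the operator."""
    ELECTRON = "electron"
    XRAY = "xray"


class Turntable(Enum):
    """Hypothetical turntable readings (names are assumptions for illustration)."""
    ELECTRON_POSITION = "electron_position"  # tungsten target out of the beam path
    XRAY_POSITION = "xray_position"          # tungsten target in the beam path
    UNKNOWN = "unknown"                      # sensor fault or turntable in transit


# Which turntable position each mode requires (illustrative assumption).
REQUIRED_POSITION = {
    Mode.ELECTRON: Turntable.ELECTRON_POSITION,
    Mode.XRAY: Turntable.XRAY_POSITION,
}


def beam_permitted(requested_mode: Mode, sensed_position: Turntable) -> bool:
    """Allow the beam only when the sensed position matches the requested mode.

    Any mismatch, including an UNKNOWN reading, fails safe by returning False.
    """
    return REQUIRED_POSITION[requested_mode] is sensed_position


if __name__ == "__main__":
    # A mode/position mismatch is refused instead of being left to the operator.
    print(beam_permitted(Mode.XRAY, Turntable.ELECTRON_POSITION))
```

The design choice worth noting is that the check consults the sensed position rather than the controlling program's own belief about it, and that an unreadable sensor counts as a refusal rather than as permission.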

In the end, the massive design flaws resulted in the death or injury of six people receiving treatment for cancer. The cost of independent hardware safeguards, it seems, was considered far too high for them to be included in the final product.

Therac-25 software development process
Both the Therac-20 and the Therac-25 were based on the prototype Therac-6. The theory was to build around a successful product, thereby "assuring" a tried-and-true method of implementation, or so it was thought. Once it was decided that some of the code from this machine would be reused, many problems arose. Since the earlier Therac-6 was, in turn, based on a machine from CGR [a French company], much of the documentation was in French. The aging code could therefore have been glossed over in the rush to deliver the product to market, without testing to ensure its safety. After the Therac-20 project, relations between the two companies were strained, and they did not agree to work together further. How this affected the documentation dilemma remains unclear. Since at least one major software bug was found in the Therac-20 as well as the Therac-25, one may assume that some code reuse was taking place between the two allegedly separate designs. In the Therac-20, however, the hardware safety interlocks ensured that no injuries resulted.

The Therac-25 accidents
The Therac-25 accidentally delivered fatal doses of radiation to several patients. Eleven Therac-25s were installed and in operation throughout the United States and Canada before the 1987 recall. Between 1985 and 1987, six patients were reported to have been injured by massive radiation overdoses from rampaging Therac-25s.

After the July 26, 1985 incident at the Ontario Cancer Foundation in Hamilton, the manufacturer could not reproduce the problem and ultimately blamed the hardware rather than the software. The overdoses have generally been attributed to flaws in the software that allowed operators to override the errors that arose, in many cases with fatal results for the patients being treated. The overdose was, more often than not, many times greater than the recommended therapeutic dose, and it eventually culminated in severe trauma or death.

Fallout from the Therac-25 incidents
Largely because of the variations introduced by human interaction with the system, the manufacturer was never able to replicate the reported malfunctions of the Therac-25, and therefore no real solution was put forth. In the Hamilton case, AECL could not recreate the problem and instead assumed that the fault lay with a transient failure in the microswitch used to determine turntable position. The events at Hamilton and at Yakima, Washington, tend to show that the overdose was due to errant code rather than a "microswitch failure".

Conclusion
Although technology has progressed to the point where many tasks may be handled by our silicon-based friends, too much faith in the infallibility of software will always result in disaster. The simple fact remains that software engineering principles have yet to evolve to the point where we, as computer scientists, may sign a piece of paper certifying the functionality of a piece of code, much as the civil engineers of our time can certify the strength and security of a bridge. If a simple chore such as writing easy-to-understand error messages could have prevented confusion, then by all means it should have been done. Should the Therac-25 have had independent hardware checks? Of course it should. The lives of those six people were devastated (some more than others) by the audacity of engineers who did not question their own craftsmanship. The tragedy could indeed have been avoided.


Last updated 98/06/17
Developed as part of a class assignment, CS 3604, Fall 1997, by Joe Liversedge