We all know Murphy’s famous law, “If anything can go wrong, it will.” I have to believe that Murphy was an automation engineer because I have encountered his law in action on every project I have ever worked on.
I have sat in HAZOPs where the group wanted to discount a scenario because it involved two simultaneous failures. I have also worked in a chemical plant that encountered FIVE simultaneous failures, blew up a vent line, and narrowly missed injuring an operator. Equipment breaks, and people make mistakes. Anticipate it, and design for it.
Concept: Simple systems work reliably. Complicated systems find new and interesting ways to fail. Whenever possible, go for the simplest, most robust solution. As an automation engineer, you should make the KISS concept (Keep It Simple, Stupid) your mantra.
Details: Automation engineers love to create gloriously complex solutions. With so many computers and gadgets available, it is hard NOT to want to incorporate the latest and greatest into a design. However, the true purpose of automation is to control the process. Sometimes it takes a multivariable predictive control model to do that, but many times it can be done with a float switch and a solenoid. Try not to complicate a solution any more than necessary. When you are designing an emergency system to dump a quench chemical into a reactor, consider using gravity rather than special pumps and other equipment. Gravity always works (at least on planet Earth), while pumps and/or electricity can fail—especially under emergency conditions.
Anticipating every failure is difficult, but you must make every effort. What happens if the operator presses the wrong button? What happens if no button is pressed at all? If power is lost, might the instrument air and cooling water systems fail as well? What about steam and nitrogen? What are the ramifications of these multiple failures?
When you are designing a control panel, consider using dual 24VDC power supplies. Feed one with a UPS circuit and the other with a non-UPS circuit. Despite what the name might imply, an Uninterruptible Power Supply becomes an Interruptible Power Supply more often than not. Having dual feeds can allow a control panel to continue operating despite the failure of either feed.
Software design is particularly tricky because there are so many paths that the logic can traverse. Operators are forever using the equipment in ways that were never intended, and if the software is not designed to handle it, the program can hang in unexpected places. During testing, try hitting the wrong buttons and forcing the program to step through the sequence in a different order to see what happens. While this will drive the programmers crazy, the resulting system will be much more robust. Finding and resolving problems in testing is always better than discovering them on start-up!
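The "hit the wrong buttons" idea can also be automated. Here is a minimal sketch in Python, assuming a made-up four-step batch sequence; the state names, button names, and functions are all hypothetical and only illustrate the technique. It hammers the sequence with random button presses and checks that the logic never lands in an undefined state or a dead end.

```python
import random

# Hypothetical batch sequence (illustration only): state -> {button: next state}
TRANSITIONS = {
    "IDLE":     {"START": "FILLING"},
    "FILLING":  {"HOLD": "HELD", "DONE": "REACTING", "ABORT": "IDLE"},
    "HELD":     {"RESUME": "FILLING", "ABORT": "IDLE"},
    "REACTING": {"DONE": "IDLE", "ABORT": "IDLE"},
}
BUTTONS = ["START", "HOLD", "RESUME", "DONE", "ABORT"]


def press(state, button):
    """Return the next state; an unexpected button leaves the state unchanged."""
    return TRANSITIONS[state].get(button, state)


def random_button_test(presses=10_000, seed=0):
    """Press random buttons and check the sequence never gets stuck."""
    rng = random.Random(seed)  # fixed seed so a failing run can be repeated
    state = "IDLE"
    for _ in range(presses):
        state = press(state, rng.choice(BUTTONS))
        # The logic must always be in a state the program knows about...
        assert state in TRANSITIONS, f"undefined state: {state}"
        # ...and every state must offer at least one way out.
        assert TRANSITIONS[state], f"dead end at {state}"
    return state


if __name__ == "__main__":
    print("final state:", random_button_test())
```

A test like this does not replace a second person at the keyboard, but it is a cheap way to exercise paths the programmer never thought to try.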
Watch-Outs: Never allow the final software quality control testing to be performed by the same person who programmed it. A different person is much more likely to hit the sequences in a different way or throw the system a curve that the programmer had not anticipated. Avoid the temptation to use exotic controls and programming to patch a poorly designed process. You can program around poor mechanical designs, but the project will be more stable if the fundamental problems are resolved.
Exceptions: Sometimes a HAZOP group can lump a series of totally improbable scenarios together and reach outlandish conclusions. However, there ARE certain scenarios that can create a cascade effect. (A loss of power might trip the steam system and take out the cooling water supplies as well.)
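One way to reason about a cascade like this is to write down what each utility depends on and trace the failure through. The sketch below is in Python and the dependency map is entirely assumed for illustration; a real plant's dependencies have to come from its own drawings and utility studies.

```python
# Hypothetical dependency map (illustration only):
# each system -> the utilities it needs in order to keep running
DEPENDS_ON = {
    "instrument_air": ["power"],
    "cooling_water": ["power"],
    "steam": ["power", "instrument_air"],
    "nitrogen": [],
    "reactor_cooling": ["cooling_water"],
}


def cascade(initial_failure):
    """Return the set of everything lost once `initial_failure` goes down."""
    failed = {initial_failure}
    changed = True
    while changed:  # keep sweeping until no new failures appear
        changed = False
        for system, needs in DEPENDS_ON.items():
            if system not in failed and any(n in failed for n in needs):
                failed.add(system)
                changed = True
    return failed


if __name__ == "__main__":
    print(sorted(cascade("power")))
```

With this assumed map, a single loss of power takes out instrument air, cooling water, steam, and reactor cooling together, which is exactly the kind of "multiple simultaneous failure" that should not be discounted.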
Insight: Safety interlock calculations include a testing interval and incorporate the failure modes into the calculations for a very good reason. Untested interlocks have caused hundreds (and probably thousands) of accidents when they failed to perform their function. Be particularly wary of interlocks that involve multiple instruments and/or devices to sense a failure. When every device in the chain must work for the interlock to act, each one adds to the probability of failure on demand, which can become very high.
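To see how the testing interval and the device count drive the numbers, here is a rough sketch in Python. It uses the common simplified approximation for a single device, average PFD = (dangerous undetected failure rate) x (test interval) / 2, and it assumes a simple chain in which every device must work, so the individual PFDs are added. The failure rates are made-up illustrative values, not data for any real device, and the function names are hypothetical. The approximation only holds while the result is small, and redundant voting arrangements are calculated differently.

```python
HOURS_PER_YEAR = 8760


def pfd_avg(dangerous_undetected_rate_per_hour, test_interval_hours):
    """Simplified average PFD for one device: lambda_DU * TI / 2."""
    return dangerous_undetected_rate_per_hour * test_interval_hours / 2


def interlock_pfd(device_rates_per_hour, test_interval_hours):
    """Approximate PFD when every device must work: sum of the device PFDs."""
    return sum(pfd_avg(rate, test_interval_hours) for rate in device_rates_per_hour)


# Made-up dangerous undetected failure rates (per hour), for illustration only
SENSOR, LOGIC_SOLVER, VALVE = 2e-6, 1e-7, 5e-6
DEVICES = [SENSOR, LOGIC_SOLVER, VALVE]

if __name__ == "__main__":
    tested_yearly = interlock_pfd(DEVICES, HOURS_PER_YEAR)
    tested_every_5_years = interlock_pfd(DEVICES, 5 * HOURS_PER_YEAR)
    print(f"tested yearly:        {tested_yearly:.4f}")
    print(f"tested every 5 years: {tested_every_5_years:.4f}")
```

Under these assumptions, stretching the test interval from one year to five multiplies the chance that the interlock fails when called upon by five, and every extra device in the chain pushes the total higher still.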
Rule of Thumb: If you are given an option, always choose the simpler solution. When you are designing a system, do not consider operator error and equipment failure to be isolated and unlikely events. They will occur … and usually at the worst time possible.