Infrastructure

After the storm

Expanding the concept of emergency power reliability

Jan 1, 2013

As mission-critical equipment, hospital emergency power systems are expected to provide power consistently to what they must, when they must and for as long as they must. This is a tall order, and the impact of an emergency power system failure when normal utility power also has failed is potentially severe for patient care.

The failure of some facility emergency power systems during and after last fall's superstorm Sandy already has spawned investigations, which ultimately will result in lessons learned and more knowledge upon which health facilities professionals can base best practices to reduce vulnerabilities.

Caregiver communication

Some clinical personnel believe emergency power is or should be uninterruptible. They believe it should never fail. Unfortunately, nothing is guaranteed under all circumstances, including uninterrupted power. Despite best efforts, emergency power systems sometimes fail, even when they are needed.

Part of this may stem from a misunderstanding about the operational aspects of emergency power equipment and related transfer devices. For instance, one medical journal article mentioned a "usually less than 1-second" interruption upon the loss of commercial power, which is substantially shorter than the 10-second maximum interruption allowed by codes and standards.

In fact, because monthly emergency power load tests do not involve the loss of normal utility power, one wonders if the very short hot-to-hot power source transfer times during monthly tests themselves may lead to unrealistic expectations by some affected clinical personnel.

Although facilities professionals know that there are different types of electrical system failures that can occur, it appears that there has not been enough discussion of these issues with clinical professionals. Facilities professionals should take the time to educate physicians, nurses and others about the different potential failure modes of their electrical power systems.

The education should be comprehensive enough so that clinicians understand the three or four different potential types of electrical failure in any critical care clinical space, each with its own response, including:

• Failure of the normal power supply to the space or equipment with the emergency power system — also called the essential electrical system (EES) — still online;

• Failure of one or more of the EES branches (life safety branch, critical branch or equipment system) serving the space or powering equipment serving the space with the normal power system still online;

• Failure of one of the two critical branches with the other critical branch source still online in critical care spaces served by two separate critical branch sources as permitted by the National Fire Protection Association's NFPA 99, Health Care Facilities Code;

• Total electrical failure to the space, either simultaneously or as the result of cascading failures over a period of time, similar to some of the hospital emergency power failures in the Northeast after superstorm Sandy.

With each of these scenarios, the impact on procedures and required actions by caregivers may be different. For example, the expected response in most hospitals to the most commonly discussed potential electrical failure — failure of normal utility power — is to ensure that critical equipment is plugged into emergency power (red) outlets. However, if the failure is the critical branch serving a space instead of normal power, the appropriate response would be just the opposite — to ensure that critical equipment is plugged into the normal power (gray, white and brown) outlets. Quickly differentiating between the scenarios and their necessary responses can improve patient care and safety.
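
The differentiation described above can be sketched as a small lookup, shown here in Python purely as an illustration. The names `FailureMode`, `classify` and `RESPONSE` are hypothetical, the sketch models only normal power and the critical branch (not the life safety branch or equipment system), and the responses for the last two modes are placeholders rather than guidance from this article.

```python
from enum import Enum
from typing import Optional, Sequence


class FailureMode(Enum):
    NORMAL_LOST = "normal power lost, critical branch still online"
    CRITICAL_LOST = "critical branch lost, normal power still online"
    ONE_OF_TWO_CRITICAL_LOST = "one of two critical branches lost"
    TOTAL_LOSS = "total electrical failure to the space"


# The first two responses paraphrase the article. The last two are
# assumptions: placeholders for the facility's own failure procedure.
RESPONSE = {
    FailureMode.NORMAL_LOST: "plug critical equipment into emergency (red) outlets",
    FailureMode.CRITICAL_LOST: "plug critical equipment into normal power outlets",
    FailureMode.ONE_OF_TWO_CRITICAL_LOST: "follow facility procedure (placeholder)",
    FailureMode.TOTAL_LOSS: "follow facility procedure (placeholder)",
}


def classify(normal_ok: bool, critical_ok: Sequence[bool]) -> Optional[FailureMode]:
    """Classify the state of a space from which sources are energized.

    critical_ok holds one flag per critical branch source serving the space.
    Returns None when every source is energized.
    """
    any_critical = any(critical_ok)
    all_critical = all(critical_ok)
    if normal_ok and all_critical:
        return None
    if not normal_ok and not any_critical:
        return FailureMode.TOTAL_LOSS
    if not normal_ok:
        return FailureMode.NORMAL_LOST
    if not any_critical:
        return FailureMode.CRITICAL_LOST
    return FailureMode.ONE_OF_TWO_CRITICAL_LOST


mode = classify(normal_ok=True, critical_ok=[False])
print(mode.value, "->", RESPONSE[mode])
```

The point of the sketch is that the correct action depends on which source failed, so a procedure keyed only to "power failure" is incomplete.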

Many hospitals already have basic electrical utility failure procedures that were prepared many years ago. However, the ongoing enhancement of computerization in modern operating room (OR) suites may require regular reviews and updates to these earlier versions of utility failure procedure manuals. Regardless of the type of event, appropriate responses for each foreseeable failure mode should be covered fully in these manuals and the manuals should be accessible to those who need them. These responses for different types of failures also should be covered in training and regular exercises.

Expanding the concept

Bypass isolation transfer switches can be maintained without turning off their loads, improving operational reliability.

Recent events may cause facilities professionals to expand the concept of emergency power to the issues of availability and dependability. Maintenance management, production scheduling and data center infrastructure experts often consider these additional attributes as well.

In systems engineering, dependability is a way to measure a system's availability, reliability and maintenance support. Reliability can be considered the probability that a system operates and gives the same result on successive trials. Availability, on the other hand, can be considered the probability that a system will be able to function at any instant required, including within the next instant and for as long as required from that point.

Because no facilities system can guarantee 100 percent reliability, no system can assure 100 percent availability. The most commonly accepted metric of major commercial data center availability — a function of its power system design, among other factors — is a facility availability of "four nines" or 99.99 percent, and data center power systems tend to be more robust than many hospital systems.
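
To put "four nines" in perspective, an availability percentage can be converted into expected downtime per year. The short Python sketch below does that arithmetic; the function name is hypothetical and the calculation assumes a 365-day year and treats availability as a simple long-run fraction of time.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_minutes(availability_percent: float) -> float:
    """Expected minutes of unavailability per year at a given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * HOURS_PER_YEAR * 60


for label, pct in [("two nines", 99.0), ("three nines", 99.9), ("four nines", 99.99)]:
    print(f"{label} ({pct}%): {annual_downtime_minutes(pct):.1f} minutes per year")
```

Under these assumptions, 99.99 percent availability still corresponds to roughly 53 minutes of downtime per year.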

The Joint Commission's Sentinel Event Alert Issue 37, titled "Preventing adverse events caused by emergency electrical power system failures," was published Sept. 6, 2006. The Joint Commission addressed the topic again in its Environment of Care News a year later. Most hospitals addressed elements of Sentinel Event Alert Issue 37 at the time. Recent events indicate that it may be time to address at least one of those recommendations — the power system vulnerability analysis — again and perhaps more comprehensively this time.

Though not addressed in Sentinel Event Alert Issue 37, an updated review of vulnerabilities also might include an analysis of the potential for common-mode failures, which are failures of two or more systems or components due to a single event or cause. A safety engineering concept states that once a failure mode is identified, it usually can be mitigated by adding extra or redundant equipment to the system. However, the existence of an uncorrected common-mode failure potentially removes the advantage of other redundancies.

One example of a potential source of common-mode failures is a single fuel oil storage tank containing fuel oil that serves multiple generators, including redundant ones. Fuel oil contamination could adversely affect all generators served by that system.
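
The effect of a shared fuel tank on redundancy can be illustrated with a simplified probability sketch in Python. The model and every number in it are assumptions chosen for illustration, not data from any study: each generator is assumed to fail independently with probability `p_independent` on demand, and a shared cause such as contaminated fuel is assumed to disable all units at once with probability `p_common`.

```python
def p_all_fail(p_independent: float, n_units: int, p_common: float = 0.0) -> float:
    """Probability that every redundant unit is unavailable on demand.

    Either the shared cause occurs and takes out all units, or it does not
    occur and each unit happens to fail on its own.
    """
    return p_common + (1 - p_common) * p_independent ** n_units


# Hypothetical figures, for illustration only.
independent_only = p_all_fail(0.01, n_units=2)
with_shared_tank = p_all_fail(0.01, n_units=2, p_common=0.001)

print(f"two generators, no shared cause:   {independent_only:.6f}")
print(f"two generators, shared fuel cause: {with_shared_tank:.6f}")
```

With these illustrative numbers, a shared cause that is 10 times less likely than a single generator failure still dominates the result, which is why an uncorrected common-mode failure can erase most of the benefit of a redundant generator.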

The 2013 revisions to NFPA 110, Standard for Emergency and Standby Power Systems, include numerous improvements and new recommendations that provide more ammunition to fight against the damaging effects of fuel oil contamination on emergency power reliability. The 2013 edition of the standard should be reviewed to ensure that both its new requirements and its recommendations, many of them best practices, are given due consideration in existing facilities.

An emergency power fuel oil storage tank located on the same building level as normal power equipment also may leave these two systems — both normal power and emergency power — subject to the same common-mode failure potential, whether it is flooding or any other cause that renders the systems or components unusable.

Examples of other fuel system common-mode failure causes are the fuel transfer system components such as pumps, controls and their power sources. The failure of fuel oil transfer pump power or controls can bring down an entire emergency power system unless the design, vulnerability analysis, inspection, testing, maintenance, operation and failure procedures all work together to prevent that occurrence.

Whether a duplex fuel pump skid has a single source of power or is located in an area where it is subject to the same event that takes out the utility power source, the result is potentially a full power outage. Pieces of equipment on an upper floor can be rendered unusable if their power feeders are located in an area that is subject to flooding or damage from other common-mode causes.

One effective approach to take when analyzing these and other potential vulnerabilities is to:

• Consider each component that must operate;

• Determine what scenarios will cause it to fail, including all "What if?" scenarios that could damage the power sources or feeders that keep it running;

• Compare those scenarios with others that may take out other redundant components, redundant power sources or redundant feeders;

• Investigate all the possible causes of those scenarios, including commonalities in power sources, feeders or controls;

• Address the resulting common-mode failure modes that have been identified.
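
Part of the approach above, finding commonalities among components, lends itself to a simple inventory check. The Python sketch below is one possible way to do it; the function name, the component names and the dependency labels are all hypothetical, and a real analysis would rest on the facility's actual one-line diagrams and site survey.

```python
from collections import defaultdict
from typing import Dict, List, Set


def shared_dependencies(components: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each dependency used by more than one component to its users."""
    users = defaultdict(list)
    for name, deps in components.items():
        for dep in deps:
            users[dep].append(name)
    return {dep: sorted(names) for dep, names in users.items() if len(names) > 1}


# Hypothetical inventory: each component lists what it needs in order to
# operate, including its physical location as a shared exposure.
inventory = {
    "generator_1": {"fuel_tank_A", "fuel_pump_skid", "basement_level"},
    "generator_2": {"fuel_tank_A", "fuel_pump_skid", "roof_level"},
    "utility_service": {"basement_level"},
}

for dep, names in sorted(shared_dependencies(inventory).items()):
    print(f"{dep}: shared by {', '.join(names)}")
```

Each dependency the check reports is a candidate common-mode failure cause that still has to be investigated and addressed by people who know the facility.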

Automatic transfer switches are major components of most hospital emergency power systems. Normal power flows through them to critical equipment when it is available, they tell generators when to start if they sense a loss of normal power and then they switch to generator power when it becomes available. However, automatic transfer switches themselves may be a point of common-mode failure because both normal power and emergency power flow through the same point to the critical equipment.
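
The behavior described above can be sketched as a very small state model in Python. This is a simplification under stated assumptions, not any manufacturer's control logic: the class and field names are hypothetical, and time delays, retransfer timers and exercise modes found on real switches are omitted.

```python
from dataclasses import dataclass


@dataclass
class TransferSwitch:
    source: str = "normal"        # which source the load is connected to
    start_signal: bool = False    # request for the generator to start
    failed: bool = False          # the switch itself has failed

    def update(self, normal_available: bool, generator_available: bool) -> None:
        """React to the sensed state of the two sources."""
        if normal_available:
            self.source = "normal"
            self.start_signal = False
        else:
            self.start_signal = True
            if generator_available:
                self.source = "emergency"

    def load_energized(self, normal_available: bool, generator_available: bool) -> bool:
        """True if power reaches the load through the switch."""
        if self.failed:
            return False  # both sources pass through this one device
        if self.source == "normal":
            return normal_available
        return generator_available


ats = TransferSwitch()
ats.update(normal_available=False, generator_available=False)
print(ats.start_signal, ats.source)    # generator requested, not yet transferred
ats.update(normal_available=False, generator_available=True)
print(ats.source)                      # load now on generator power
ats.failed = True
print(ats.load_energized(True, True))  # no power even with both sources healthy
```

The last line shows the common-mode concern in miniature: once the switch fails, the health of the normal and emergency sources no longer matters to the load.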

A transfer switch failure likely will cause an outage of the critical equipment that it feeds. The good news with smaller transfer switches is that the impact of any single common-mode failure of this type probably will be limited to a smaller area or smaller grouping of equipment. Larger transfer switches, however, feed more equipment and larger spaces, so the impact of a common-mode failure will be correspondingly greater.

Hospitals should be cognizant of requirements from NFPA 110 such as "8.1.1 — The routine maintenance and operational testing program shall be based on all of the following: (1) manufacturer's recommendations, (2) instruction manuals, (3) minimum requirements of this chapter, and (4) the authority having jurisdiction." Maintainability is therefore essential. The potential for transfer switch failure increases when maintenance is neglected, so maintenance should be performed to minimize wear-out failures caused by component aging and use.

Many hospitals still are burdened by being limited to automatic transfer switches that do not have a bypass isolation feature. This feature is not typically a requirement for hospitals, but it is a best practice, because it enables transfer switch maintenance to occur safely without shutting off the critical equipment that the transfer switch feeds.

A review of the recommended maintenance in consensus industry standards and manufacturers' operating and maintenance manuals indicates that certain tasks typically included in recommended annual maintenance should be performed only with the transfer switch removed from service or in the bypass mode.

If a facilities professional seeks to perform maintenance on a transfer switch that does not have the bypass isolation feature, it probably will be necessary to take the transfer switch out of service and turn off its loads, which is unacceptable in many hospitals. As a result, many hospitals do not perform full maintenance on these devices because they are unwilling to turn off the critical equipment. Testing company reports may show that certain recommended annual maintenance tasks simply were not performed if the owner did not allow the power to be removed from the transfer switch.

Another potential source of common-mode failure in facilities with multiple paralleled generators is the paralleling switchgear, the complex equipment that ties those generators together. As unlikely as it may be, this equipment can fail. If a hospital has paralleling switchgear and that switchgear fails, how will it get generator power to the loads? It is important to analyze this scenario ahead of time and create a failure procedure that addresses the necessary steps to take if it occurs.

As stated, NFPA 110 requires that the paralleling switchgear manufacturer's recommended maintenance be incorporated into the ongoing emergency power management process. NFPA 110 also includes the requirement that "8.3.6 — Paralleling gear shall be subject to an inspection, testing, and maintenance program that includes all of the following operations: (1) checking of connections, (2) inspection or testing for evidence of overheating and excessive contact erosion, (3) removal of dust and dirt, and (4) replacement of contacts when required."

Lessons learned

Health facilities professionals should be ready to learn from the lessons of recent failures, including those occasioned by superstorm Sandy and the several years of events that preceded it.

The health care industry undoubtedly will be exposed to new recommendations, may be asked to consider an array of new best practices and may even be subject to new regulations in the not-too-distant future.

David Stymiest, P.E., CHFM, CHSP, FASHE, is a senior consultant for compliance and facilities management at Smith Seckman Reid Inc., Nashville, Tenn. He can be reached at DStymiest@SSR-inc.com. Although Stymiest is chairman of the NFPA technical committee on emergency power supplies, which is responsible for NFPA 110 and 111, the views and opinions expressed in this article are purely his own and shall not be considered the official position of NFPA or any of its technical committees, and shall not be considered to be, nor be relied upon as, a formal interpretation of the discussed standards.

Sidebar - Resources on the Web

Need more information? These resources are among those used by the author in preparing this article.

» "Response to a Partial Power Failure in the Operating Room," Tammy Carpenter, M.D., and Stephen T. Robinson, M.D., Anesthesia & Analgesia, vol. 110, no. 6 (June 2010) 1644–46

» "Electrical Power Failure in the Operating Room: A Neglected Topic in Anesthesia Safety," John H. Eichhorn, M.D., and Eugene A. Hessel II, M.D., Anesthesia & Analgesia, vol. 110, no. 6 (June 2010) 1519–21

» "Preventing adverse events caused by emergency electrical power system failures," The Joint Commission Sentinel Event Alert, Issue 37, Sept. 6, 2006

» "Sounding a Sentinel Event Alert on Emergency Electrical Power Systems," Environment of Care News, September 2007

» "Averting Common Causes of Generator Failure (Part 1)," Darren Dembski and Sarah Escalante, Facilities Engineering Journal, September/October 2009

» "Averting Common Causes of Generator Failure (Part 2)," Darren Dembski and Sarah Escalante, Facilities Engineering Journal, November/December 2009

» "Generator Fan Failure Triggered AWS Outage," Rich Miller, Data Center Knowledge blog, June 21, 2012

» "Multiple Generator Failures Caused Amazon Outage," Rich Miller, Data Center Knowledge blog, July 3, 2012

» "Managing Hospital Emergency Power Systems – Testing, Operation, Maintenance and Power Failure Planning," David Stymiest, ASHE management monograph, 2006 (accessible only by ASHE members)

» NFPA 110-2013, Standard for Emergency and Standby Power Systems, Quincy, Mass.