10 years later: BP Texas City and the inevitable cost of an incident
To anyone working in process industries, it is a familiar narrative: On March 23, 2005, at the Texas City refinery, a geyser of flammable hydrocarbon liquid and vapor erupted from a blowdown stack and ignited, creating a huge fire. The result: 15 fatalities and 180 injuries. All the individuals who died were in or around a group of office trailers clustered near the blowdown drum. At the time, Texas City was operated by BP and was its largest refinery and the third largest in the U.S.
With 10 years of hindsight, what have we learned? This particular process safety incident was no momentary lapse in a facility that otherwise had a clean history with strong safety compliance and longstanding safety culture. On the contrary, this refinery had experienced 23 fatalities over the previous 30 years, including three in 2004 alone. This wasn’t even the first time that this specific safety compliance lapse had occurred with this very equipment. There had been eight serious releases of flammable material out of that stack in the years prior to 2005, but nobody had ever investigated the causes. Moreover, such overfilling incidents in distillation columns had happened in other refineries, and investigations cited in industry literature stressed the importance of accurate level measurements as part of overall safety risk management programs.
After the March 23rd fire, there were two more major incidents, resulting in more than $30 million in damage (although without fatalities) within a few months. After the March incident, OSHA investigators fined the company $21 million, the largest fine to that time in the agency’s history, for 301 willful violations. Ironically, OSHA did not undertake a comprehensive inspection of the other 29 process units at the site.
Following the March incident, accident analysis pointed to problems with human resources along with hardware. Follow-up investigation reports cited dozens of instances of deteriorating equipment, inadequate process control systems, and inadequate safety systems, along with poorly trained and inexperienced people. Those reports also pointed out that the company had cut operating budgets substantially in 1999 and again in 2005. Staffing at the site had also been reduced.
When the Chemical Safety and Hazard Investigation Board published its investigation report of this specific incident two years after the event, it cited a very insightful comment issued by the Columbia Accident Investigation Board after the space shuttle had been lost. It said, "Many accident investigations make the same mistake in defining causes. They identify the widget that broke or malfunctioned, then locate the person most closely connected with the technical failure: the engineer who miscalculated an analysis, the operator who missed signals or pulled the wrong switches, the supervisor who failed to listen, or the manager who made bad decisions. When causal chains are limited to technical flaws and individual failures, the ensuing responses aimed at preventing a similar event in the future are equally limited: they aim to fix the technical problem and replace or retrain the individual responsible. Such corrections lead to a misguided and potentially disastrous belief that the underlying problem has been solved." (CAIB, 2003)
In the case of this incident, it would be easy to write it off as a failure of level instrumentation and the SIS (safety instrumented system) in the raffinate splitter tower that caused inexperienced operators to continue pumping flammable feedstock into the tower even when they could not account for where all that volume was going. Pumping feedstock into the tower was contrary to normal startup procedures. Still, no alarms warned the control room of what was happening, and no SIS shut off the pump.
Focusing on the wrong safety metrics
When the corporate culture ignores safety, what do you think about going to work each day?
Looking at the 2005 Texas City fire with the benefit of a decade of reflection and much discussion of what was going on at the facility at the time, it is difficult to imagine how anybody working there could have had any sense of personal security. Those individuals who spend most of their waking hours surrounded by flammable and explosive products in enormous quantities must become desensitized to the ever-present danger or they would not be able to walk through the gate to begin a shift. Hopefully, rational individuals find a way to let their knowledge of the danger inform their actions and motivate them to work safely. BP had created a safety culture that stressed behavioral issues but largely ignored process safety. Perhaps to many of the 1,800 employees and 800 contractors on the site, the notion of behavioral safety was enough to keep them coming to work every day.
Investigations after the fire on March 23, 2005, offered numerous citations of safety culture failures at the company’s corporate level, along with safety management failures at its refineries in general and Texas City in particular. (Calling it the 2005 incident would imply that there was only one that year; in fact, there were several other incidents before and after.) Quoting from the Chemical Safety and Hazard Investigation Board report:
- 1.6.2: The Board of Directors did not provide effective oversight of BP’s safety culture and major accident prevention programs. The Board did not have a member responsible for assessing and verifying the performance of BP’s major accident hazard prevention programs.
- 1.6.3: Reliance on the low personal injury rate at Texas City as a safety indicator failed to provide a true picture of process safety performance and the health of the safety culture.
- 1.6.7: Safety campaigns, goals, and rewards focused on improving personal safety metrics and worker behaviors rather than on process safety and management safety systems. While compliance with many safety policies and procedures was deficient at all levels of the refinery, Texas City managers did not lead by example regarding safety.
That’s not to say there was no discussion of safety at the facility, but the stress was on occupational and behavioral elements: wear your safety glasses, don’t leave trip hazards, and so on. In that regard, the environment was relatively safe and plant management encouraged participation at every level. A program launched in 2001, called the "Texas City Refinery Safety Challenge," informed employees that unsafe acts were responsible for 90% of injuries at the site. In general, employees got behind the program and turned in 48,000 safety observations in 2004. Unfortunately, as cited in 1.6.3 above, the program effectively ignored process safety systems and process safety related activities.
So where did that leave individual employees? On one hand, they were pushed to report unsafe acts of individuals; on the other, they ignored the fact that the sight glass on the bottom of the raffinate splitter tower was so dirty nobody had been able to read it for years.
How about the operator who was doing the unit startup for the first time in his career, and had to perform this critical procedure with nobody experienced around the control room? Some years before, Amoco had calculated that safety incidents were 10 times more likely to happen during startups. Similarly, basic concepts of safety risk management have made it clear that the best and most experienced operators should have been on the boards when the raffinate splitter was going back online after a turnaround. If, after pumping feedstock into the tower for three hours, the operator had told a supervisor, "I really can’t figure out where all this liquid is going. I can’t account for it from any of the instrumentation readings. Something is wrong. We must be filling the tower," it might have produced a different result. In addition, pumping feedstock into the tower was contrary to established startup procedures.
If a plant is in bad condition, such as this one was, the only way to compensate is to have very clever people who can create their own work-arounds for malfunctioning or missing equipment. A more experienced operator would never keep pumping feedstock into the tower if he or she couldn’t verify that an equal amount is going out. An experienced operator who understands the process can look at the instruments and say, "That can’t be right." Of course, that approach only goes so far. Eventually, even the most gifted people can’t make up the difference. Operators who are struggling don’t stand a chance.
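The judgment an experienced operator applies here is a material balance: whatever goes into the tower and does not come out must be accumulating inside it. A minimal sketch of that reasoning follows. The feed rate, duration, vessel cross-section, and function name are hypothetical values chosen only for illustration; none of them come from the investigation reports. Only the 8.4 ft indicated level is taken from the account of the startup.

```python
def accumulated_height_ft(feed_gpm, outflow_gpm, minutes, area_sqft):
    """Estimate level rise from a simple in-minus-out material balance.

    All inputs are illustrative. A real tower has trays, vapor traffic,
    and temperature effects that this sketch deliberately ignores.
    """
    GALLONS_PER_CUBIC_FOOT = 7.48052
    net_gallons = (feed_gpm - outflow_gpm) * minutes
    net_cubic_feet = net_gallons / GALLONS_PER_CUBIC_FOOT
    return net_cubic_feet / area_sqft


# Hypothetical scenario: 500 gal/min in, nothing out, for three hours,
# into a vessel with 120 sq ft of cross-section.
rise = accumulated_height_ft(feed_gpm=500, outflow_gpm=0, minutes=180, area_sqft=120)

# Level shown on the DCS late in the startup, as reported in the article.
indicated_level_ft = 8.4

if rise > indicated_level_ft:
    print(f"Balance implies about {rise:.0f} ft of rise; "
          "the indicated level cannot be right.")
```

Even with rough inputs, a balance like this gives a number that can be compared against the instrument. When the two disagree by an order of magnitude, the instrument is the thing to doubt.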
Mechanical failures and misunderstandings
The raffinate splitter tower had no functioning level sensors; overflow was the hazardous result. But what were the specific events that led to the disaster?
On March 23, 2005, the raffinate splitter process was being restarted after a turnaround. The unit had been shut down on February 21 so that the tower could be drained, purged, and steam cleaned to remove residues. Apparently nothing had been done to check level sensors installed in the tower, and a sight glass mounted at the bottom that had been too dirty to use for some years was untouched. Operators preparing for the restart had reported that a pressure vent valve on the reflux drum could not be operated from the control room, but this was not fixed either. According to BP procedures, that single valve malfunction alone should have prevented restarting the unit until it could be resolved.
Established safety management procedures also called for a thorough pre-startup safety review (PSSR), but this was skipped along with other requirements for appropriate staffing during the restart. A PSSR would have checked and verified all safety instrumentation and other related equipment, but that would have most certainly delayed getting the unit running again. The process was set into motion without the sign-offs normally required. Basic process safety and environmental protection procedures were ignored.
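The logic of a pre-startup review is that of a permissive gate: every item must be verified, and a single open item blocks the restart. The sketch below illustrates that idea only. The class, function, and checklist wording are hypothetical and are not taken from BP procedures or any real PSSR form; the item statuses simply mirror the conditions described in this article.

```python
from dataclasses import dataclass


@dataclass
class PermissiveItem:
    description: str
    verified: bool


def startup_blockers(items):
    """Return the description of every item that has not been verified."""
    return [item.description for item in items if not item.verified]


# Hypothetical checklist entries reflecting conditions described in the article.
checklist = [
    PermissiveItem("Tower drained, purged, and steam cleaned", True),
    PermissiveItem("Tower level transmitter checked", False),
    PermissiveItem("Sight glass readable", False),
    PermissiveItem("High-level alarm tested", False),
    PermissiveItem("Reflux drum vent valve operable from control room", False),
    PermissiveItem("Staffing requirements for restart met", False),
]

blockers = startup_blockers(checklist)
if blockers:
    print("Startup NOT authorized. Open items:")
    for description in blockers:
        print(" -", description)
else:
    print("All permissives verified; startup may proceed.")
```

The design point is that the gate has no override path in the code: the only way to clear a blocker is to verify the item.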
In daily operation, the raffinate splitter tower would have between 7 and 9 ft of liquid with reflux raining down from the trays above. Operators on the night shift began to fill the tower to a normal depth, although the level indicator was not functioning properly. The night lead operator stopped filling the tower when the DCS indicated the level was 8.95 ft (99%), although it was actually 13.3 ft. The inexperienced day shift operator started the process and began pumping in feedstock, although there was no indication that anything was going downstream. By 11:50 am, the level in the tower had reached 98 ft, but the DCS still said 8.4 ft (88%). In less than an hour, it was up to 140 ft, and at 1:14 pm it began to flow out of the overhead piping to the blowdown drum. At 1:20 pm, liquid was released out of the stack, formed a vapor cloud, ignited, and exploded.
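The two DCS readings quoted above are enough to estimate the range the level transmitter could report at all. The sketch below assumes the percentage scale is linear, which is an assumption of this example rather than something stated in the reports. It fits a line through the two quoted points and extrapolates to 0% and 100%; the function names are illustrative.

```python
def fit_linear_scale(point_a, point_b):
    """Fit level_ft = slope * percent + offset through two (percent, ft) points."""
    (pct_a, ft_a), (pct_b, ft_b) = point_a, point_b
    slope = (ft_a - ft_b) / (pct_a - pct_b)
    offset = ft_a - slope * pct_a
    return slope, offset


# The two DCS readings quoted in the article: 99% = 8.95 ft and 88% = 8.4 ft.
slope, offset = fit_linear_scale((99.0, 8.95), (88.0, 8.4))

span_bottom_ft = offset                 # level that would read 0%
span_top_ft = slope * 100.0 + offset    # level that would read 100%
print(f"Implied measuring range: {span_bottom_ft:.2f} ft to {span_top_ft:.2f} ft")


def indicated_level_ft(actual_ft):
    """Best case for an ideal transmitter: the reading saturates at the range limits."""
    return min(max(actual_ft, span_bottom_ft), span_top_ft)


# Actual levels reported in the article at successive points in the startup.
for actual in (13.3, 98.0, 140.0):
    print(f"actual {actual:6.1f} ft -> indicated at most "
          f"{indicated_level_ft(actual):.2f} ft")
```

Under that linearity assumption, the implied range is roughly 4 ft to 9 ft, so even a perfectly healthy transmitter could not have distinguished 13 ft from 140 ft. The actual instrument did worse than saturate: its reading fell to 8.4 ft while the true level climbed to 98 ft.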
Four level sensors were inoperative during this incident, which represented all the level sensors in the immediate system. Since the tower was a distillation column, it was not intended to be filled like a storage tank. The maximum level under normal operation was less than 10 ft. The basic process control level sensor was sending a reading to the DCS, but it ended up giving a terribly inaccurate reading. The normal way to perform a verification check on this sensor was the sight glass, but for years it had been too dirty to use. The high-level alarm sensor was worn out, causing the mechanism to lock up so it could not respond.
There was no other level sensor on the tower. Even if they had all been functioning normally, once liquid got above the high-level alarm sensor, the operator had no way to determine if the depth was 12, 50, or 150 ft.
The fourth level sensor that failed was on the blowdown drum. It had a damaged float and could not warn that the drum was going to overflow. This would not have prevented the incident, but it could have at least sounded an alarm to get people to evacuate the area.
Allowing the condition of the equipment to deteriorate to the point where none of the level sensors were functional, combined with the decision to restart the unit without adequate safety checks, suggests an environment where production concerns were placed ahead of safety.
Solving problems at Texas City would have likely required shutting down the facility for some extended period of time to make major renovations and updates, but nobody was going to force that to happen. The U.S. energy supply chain would have been hard pressed to accommodate such a loss of production (10 million gallons of gasoline daily), and shareholders would not have tolerated it. Such is reality.
Luis Durán is product marketing manager, safety systems, for ABB Process Automation Control Technologies.
- 10 years ago a major oil refinery fire caused 15 fatalities and 180 injuries.
- Lessons from this incident point to the importance of an integrated process safety culture.
A webcast with Luis Durán and Greg Hale: Anatomy of an incident—Safety best practices from lessons learned, is now available for on-demand viewing at:
Additional analysis on this and other safety incidents is available at Anatomy of an Incident