
Securing control room culture
when its population already feels the pressure of rising prices and unsteady availability.
OT systems
The challenge is that while electric infrastructure operators have had nearly two decades of practice with NERC CIP, pipeliners are playing catch-up when considering the threats to the world’s energy infrastructure today. Security is not just a concern for the major transcontinental pipelines and metropolitan utilities; bad actors will gladly exploit any vulnerability they can find. Some of the juiciest targets are the OT systems that operators, big and small, use to monitor and control the physical assets. OT systems include a myriad of hardware, software and communications networks, and require specific expertise to be secured. Herein lie two challenges for pipeline operators (amongst others) that I would categorise as ‘domain’ and ‘resources’.
Domain
Let’s start with domain. To successfully design and implement a risk mitigation strategy that meets TSA’s expectations, the responsible party needs both the technical expertise and the ownership of the asset(s) or department(s) which must implement the mitigations. The expertise needs to be in that party’s knowledge domain, and the assets that need protection must be in its responsibility domain. Likely, your company doesn’t yet have a Director of OT Security or similar, so there may not be a perfect ‘owner’ for operational cybersecurity preparedness.
Large pipeline operators will likely need the collaboration of departments such as corporate security, human resources, pipeline operations, control rooms, integrity management, and of course, IT, to address the breadth of the TSA guidance. Each of these departments will have its own plans, policies, procedures, and budgets, much of which may need to be adjusted to accommodate the cross-cutting cybersecurity mitigations. No single entity satisfies either the expertise or responsibility domain question.
For smaller operators, the opposite problem appears: the likely owner is the owner of every other pipeline programme, the Manager/Superintendent/Director of Gas Operations, who likely does everything from writing procedures to updating the NPMS. The responsibility domain is almost guaranteed in these situations, but a lack of domain expertise (OT security specifically) is likely a hindrance.
Resources
On to resources. While the largest operators are typically better resourced than the smallest operators, they also typically have the largest installed asset bases and more complex operations, policies, and processes. Frankly, there is more to secure, and it is often more difficult to secure than for smaller operators. Security, like most cross-cutting risks, affects the business in a myriad of ways, from how we conduct employee background checks, to how we secure mobile devices, to how we configure PLCs. The more complex an operator, the more analysis, planning, and investment is likely required. At the smaller end of the spectrum, assessing an operator’s attack surface is likely a simpler task; however, the mitigations may carry an outsized price tag. The replacement of hardware, communications devices, or the procurement of services could present a significant cost. Because the ‘guidelines’ are not currently mandated for most operators, it may be difficult to convince ratepayers, interveners, or boards to bear the costs.
Explaining cybersecurity
I believe that to lay out and execute a successful cybersecurity defence strategy, you should first solve the domain issue. Ensure that the team(s) focusing on this topic possess adequate security expertise, know pipeline operations as they are done specifically at your company, and are granted the authority (not just influence) to implement mitigations where needed.
Resource constraints will always be present, for operators of all sizes, from investor-owned, to private, co-ops, and municipalities. The key here is that Rome wasn’t built in a day, and neither will your cybersecurity defences be; however, they should be built in a risk-prioritised manner. Not all mitigations need to be expensive, either; often, there are security hygiene and process improvement opportunities that cost very little (Hint: ditch the thumb drives and airport Wi-Fi).
Once the internal team is in place and equipped for success, a third-party gap analysis against regulatory guidelines, such as the National Institute of Standards and Technology’s (NIST) ‘Cybersecurity Framework’ (CSF), NIST SP 800-82: ‘Guide to Industrial Control Systems (ICS) Security’, and industry-standard IEC 62443 will identify the areas in need of attention. Documentation is key; not all gaps need to be fixed simultaneously, but it is likely better to demonstrate due diligence in finding and road-mapping the gaps than to have nothing to show an auditor, regulator, or investigator should you need to.
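As a rough illustration of road-mapping gaps in a risk-prioritised way, the sketch below scores each finding and sorts the list. It is only a minimal example under stated assumptions: the `Gap` record, the 1 to 5 scales, the likelihood-times-impact score, and every finding and cost figure are hypothetical placeholders, not values taken from TSA, NIST, or IEC guidance.

```python
from dataclasses import dataclass


@dataclass
class Gap:
    """One finding from a gap analysis; every field here is illustrative."""
    description: str
    likelihood: int   # assumed scale: 1 (rare) to 5 (frequent)
    impact: int       # assumed scale: 1 (minor) to 5 (severe)
    cost_usd: float   # rough cost to mitigate, placeholder figures only

    @property
    def risk(self) -> int:
        # Simple likelihood x impact score, a common but not universal convention.
        return self.likelihood * self.impact


def roadmap(gaps: list[Gap]) -> list[Gap]:
    """Order gaps by descending risk, breaking ties in favour of cheaper fixes."""
    return sorted(gaps, key=lambda g: (-g.risk, g.cost_usd))


# Hypothetical findings, for illustration only.
gaps = [
    Gap("Unsupported firmware on field PLCs", likelihood=3, impact=5, cost_usd=250_000),
    Gap("Removable media allowed on SCADA workstations", likelihood=4, impact=4, cost_usd=2_000),
    Gap("No documented manual operations procedure", likelihood=3, impact=5, cost_usd=15_000),
]

for gap in roadmap(gaps):
    print(f"risk={gap.risk:2d}  cost=${gap.cost_usd:>9,.0f}  {gap.description}")
```

Sorting on risk first and cost second surfaces inexpensive hygiene fixes early, while the written list itself serves as the documented roadmap.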
The inclusion of cybersecurity threats on your risk registers and in your 49 CFR-driven emergency management plans and transmission integrity management plans is a must; PHMSA has partnered with DHS/CISA to include specific questions in their integrated audit protocol related to these programmes. The next time you see PHMSA or your state programme for a programmatic audit, they will also ask you how you intend to operate your pipeline ‘manually’ should your IT or OT systems be compromised.
Conclusion
Put simply, the threat of hacktivists or terrorists prodding our infrastructure, looking to cause damage, is significant. It spares no operator, and most are at least somewhat unprepared to tackle it alone. The best approaches, regardless of scale, require installing the right programme leadership, engaging expertise in both security and pipeline operations, and equipping the team with the authority and resources needed to implement.
Ross Adams, General Manager, EnerSys Corporation, USA, talks optimising the human element of alarm management to support control room performance.
One of the most important aspects of control room management in pipeline operations is effective alarm management. Every control room should focus on proactively preventing escalation to abnormal operating conditions (AOCs) or emergency situations.
To achieve this goal, operators should look for ways to optimise human performance in alarm management. This includes building a system where control room operators can achieve situational awareness during all operating conditions. This way, your control room can operate safely and effectively.
What goes into an alarm management system?
Pipeline operators need to strike the balance of accomplishing business objectives while doing no harm to people, property, or the environment. By implementing a system that equips human operators to respond to alarms, your operation can improve alarm response to any threats.
Ideally, an alarm management system should be designed to ensure that all alarms are received at the pipeline operator’s console in the appropriate time frame, and with the most relevant information. This way, the operator is equipped to act before an unplanned situation or shutdown occurs.
The ideal flow is that an alarm comes into the system, a pipeline operator performs activity to address the alarm, support personnel are mobilised, and a situation is prevented before an operator’s system needs to be shut down.
The goal of alarm management is not to never trip safety shutdowns. Rather, the idea is to prevent escalation to a shutdown by effectively responding to alarms. Alarm management focuses on how to equip the operator with notifications, graphics, and action plans to see and understand the abnormal operating condition, and provide them with the proper action steps to avoid tripping safety shutdowns.
1. Rationalisation
The starting point to effectively operate without shutdown is ‘rationalisation’. This includes defining the severity of each alarm, the time available to respond, the possible root causes of the alarm, the methods of diagnosis, and the recommended actions. In many existing alarm management systems, it is common to find that most alarms aren’t designed to achieve this operating goal.
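The rationalisation attributes listed above can be captured as a simple record. The following is a minimal sketch, not a definitive implementation: the `RationalisedAlarm` class, its field names, the constant worst-case rate of change, the 10-minute minimum, and the example tag and values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RationalisedAlarm:
    """One rationalised alarm; all field names are hypothetical."""
    tag: str
    severity: str               # e.g. "low", "medium", "high"
    setpoint: float             # alarm setpoint, in process units
    trip_point: float           # safety shutdown trip, same units
    max_rate_per_min: float     # assumed worst-case rate of change per minute
    root_causes: list[str] = field(default_factory=list)
    diagnosis: list[str] = field(default_factory=list)
    actions: list[str] = field(default_factory=list)

    def minutes_to_respond(self) -> float:
        """Estimate time from alarm to trip, assuming a constant worst-case rate."""
        if self.max_rate_per_min <= 0:
            raise ValueError("max_rate_per_min must be positive")
        return abs(self.trip_point - self.setpoint) / self.max_rate_per_min

    def gaps(self, min_minutes: float = 10.0) -> list[str]:
        """List missing rationalisation items; min_minutes is an assumed target."""
        issues = []
        if not self.root_causes:
            issues.append("no root causes documented")
        if not self.diagnosis:
            issues.append("no diagnosis method documented")
        if not self.actions:
            issues.append("no recommended actions documented")
        if self.minutes_to_respond() < min_minutes:
            issues.append("response time below target")
        return issues


# Hypothetical example: the setpoint sits close to the trip point.
alarm = RationalisedAlarm(
    tag="PT-101 discharge pressure high",
    severity="high",
    setpoint=950.0,
    trip_point=1000.0,
    max_rate_per_min=25.0,
    root_causes=["downstream valve closure"],
    actions=["reduce pump speed", "notify field technician"],
)
print(alarm.minutes_to_respond())
print(alarm.gaps())
```

Running a check like `gaps()` across an alarm database is one way to see which alarms still lack a documented diagnosis or leave too little time for a human response.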
Often, the most challenging constraint is that there is very limited time available for human response. For example, when the alarm setpoint is at or near the safety trip, the operator typically has very little time to respond and avoid a shutdown. Providing adequate response time often requires redesigning the facility, the automation, and/or the process. This can place significant strain on operations. However, operators must find ways to improve the performance of their system so that human operators and support personnel have ample time to identify, analyse, and respond to each alarm. What’s the solution?
2. Analysis
Pipeline operators should engage in a regular and intensive alarm analysis process in order to identify bad actors and prevent alarm floods that inhibit the pipeline operator’s ability to process the alarms they are seeing.
A key aspect of this analysis is defining effective alarms that will provide operators with adequate time to recognise and respond to the threat so that you can maintain safe operations without shutting down. It’s important to place operators in a position of strength instead of constantly reacting to alarms.
Operators need to translate the desire to accomplish their business objectives into operational instructions for the operator to be able to identify alarms and manage situations before they escalate.
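The bad-actor and alarm-flood checks described in this section could be sketched as follows. This is an illustrative example only: the event format (timestamp, tag), the function names, and the sample data are assumptions, and the flood threshold of more than 10 alarms in a 10-minute window is a commonly cited rule of thumb that should be replaced with whatever your own alarm philosophy defines.

```python
from collections import Counter
from datetime import datetime, timedelta


def bad_actors(events, top_n=3):
    """Return the top_n most frequently annunciated tags as (tag, count) pairs."""
    return Counter(tag for _, tag in events).most_common(top_n)


def flood_starts(events, window=timedelta(minutes=10), threshold=10):
    """Return timestamps where the window starting there holds more than threshold alarms."""
    times = sorted(ts for ts, _ in events)
    starts = []
    end = 0
    for start, t0 in enumerate(times):
        # Advance the right edge to the first alarm outside the window.
        while end < len(times) and times[end] < t0 + window:
            end += 1
        if end - start > threshold:
            starts.append(t0)
    return starts


# Hypothetical alarm log: one chattering tag plus two isolated alarms.
base = datetime(2024, 1, 1, 8, 0)
events = [(base + timedelta(seconds=30 * i), "PT-101") for i in range(12)]
events += [
    (base + timedelta(hours=1), "LT-200"),
    (base + timedelta(hours=1, minutes=30), "LT-200"),
]

print(bad_actors(events, top_n=1))
print(flood_starts(events))
```

In this sample the chattering tag is both the top bad actor and the source of the flood, which is the kind of finding that feeds back into re-rationalising that alarm.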
3. The plan-do-check-act cycle
Operational instruction is critical to provide control room operators and support personnel with the necessary information and guidance so that they can perform the plan-do-check-act cycle (PDCA). Following the PDCA cycle will ensure that personnel can achieve situational awareness in all operating conditions.
• Plan: Do your personnel understand the alarm management plans, processes, and procedures that you have implemented?
• Do: Are personnel following the proper sequence of steps in alarm response to formulate the proper response to each alarm?
• Check: Are you checking to see whether personnel followed the steps when responding to each alarm so that you can drive continuous improvement?
• Act: Are you acting on the check by holding personnel accountable for following the steps to help reduce the risk of safety incidents?
Implementing the full PDCA cycle in the control room carries additional benefits of strengthening your safety culture. It will positively influence how operators act, behave, and think during each operating condition.
The importance of culture to support alarm management
If the purpose of alarm management is to enable you to operate without shutdown, then each alarm presented to a pipeline operator must be meaningful. If not, there is a risk that the operator will ignore or de-prioritise certain alarms.
Over time, a pipeline operator may become conditioned to mentally separate or filter alarms. Instead of treating each alarm as a threat that needs to be addressed, they might ignore alarms that do not appear to be a threat. But what about when there is a threat that needs to be contained?
This culture of complacency was a contributing factor to the high-profile Enbridge Pipeline Oil Spill in Marshall, Michigan, USA, in July 2010. According to the official incident report, the control room operators did not take the threat seriously because they ‘misinterpreted’ the rupture and ignored warning signs.
This scenario captures the importance of control room culture. Pipeline operators must reinforce the value of treating each alarm displayed on the console with healthy concern. It’s not up to the operator to pre-determine whether an alarm is actually critical or not. Operators need to remain vigilant when analysing the alarm to formulate the right response.
There are many ways to support a healthy safety culture. Consider actions that should be included in your refinement of the alarm management system:
• Rationalise and re-rationalise the alarms.
• Perform monthly AOC response reviews.
• Utilise monthly AOC findings in lessons learned training.
• Maintain communication involving controllers in setpoint and descriptor reviews.