Planting seeds of recovery
01 April 2006

There is a part of all of us that agrees with the notion 'if it ain't broke, don't fix it'. Equally, the purpose of plant maintenance and operations is to minimise the risk to the business of expensive downtime or outage, and ideally to prevent it happening in the first place. Managing and mitigating risk, and the probability of unwanted events occurring, is key to successful business continuity.

In December 2005, an event occurred which has brought into sharper focus the need to implement effective emergency response planning - the disaster that took place at the Hertfordshire Oil Storage Limited (HOSL) site at Buncefield. According to the HSE: "The investigation on-site has been a complex operation given the devastation, but a wide range of specialists are being used to gather and analyse evidence systematically. In turn, this will ensure a full understanding of what happened, establish root causes and identify any lessons to be learned."

The investigating board have published a progress report, although the cause of the disaster has yet to be determined. While the investigation is ongoing, it is inevitable its findings will impact site- and business-specific disaster recovery, emergency planning, asset management and maintenance practices.

Among others, regulations such as the Dangerous Substances and Explosive Atmospheres Regulations (DSEAR) and Control of Major Accident Hazards Regulations (COMAH) apply to industry and affect the operation and management of plant. There are various best practice guides on maintenance regimes and - with support from manufacturers - techniques for disaster recovery, based around integration with good maintenance practices, are gaining ground.

Cost implications
Disaster recovery can be defined in many ways, but a disaster always has an impact on revenue. The important point is to reduce that impact so that the plant and the business can continue to operate. One definition, from Alstom Power Service, is 'the loss mitigation from an unplanned outage following a major or minor component failure resulting in significant revenue loss for a customer'.

Important to all plant managers and engineers, whatever the sector, is the availability of power, be it a small generating set or a large-scale installation, delivering power to a network of consumers or a manufacturing process. Having effective asset management of plant, monitoring the operational characteristics, symptoms and performance of components, from the turbo-alternator to the boiler feed pumps or flow meters, is key to both maintenance regimes and disaster recovery.

Similarly, integration of actions from third-party maintainers, on-site and key support staff is essential to minimise the business impact and cost in lost production of an unplanned outage. Time taken to carry out essential tasks at the start of an unplanned outage can run into days if not weeks, especially if it involves identifying what needs to be done, in what order, who should do what, where the spares and suppliers are, and other key factors.

With a plan in place, this lead time is cut, recovery is faster and, of course, the losses incurred because of the event are reduced.

Fast recovery
One approach, taken by Alstom Power Service, is demonstrating that effective emergency response planning (ERP) can reduce the direct costs of unplanned service outages by as much as 50% and recovery times by an equally impressive amount. This is a significant development. For generating businesses, the cost saved may be in the order of tens of thousands per day, and the principles scale equally to plant installations of between 50 and 500MW.
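A rough calculation shows how those two reductions combine. The sketch below is not Alstom's costing model: the function name and every figure in it are hypothetical, in arbitrary currency units, and it assumes the best case quoted above, with direct costs and recovery time both halved.

```python
def outage_cost(direct_cost, lost_revenue_per_day, outage_days):
    """Total cost of an unplanned outage: direct costs plus lost revenue."""
    return direct_cost + lost_revenue_per_day * outage_days


# Hypothetical figures, chosen only to illustrate the arithmetic.
direct_cost = 400_000
lost_revenue_per_day = 30_000
outage_days = 20

without_plan = outage_cost(direct_cost, lost_revenue_per_day, outage_days)

# Best case quoted in the article: direct costs and recovery time both cut by 50%.
with_plan = outage_cost(direct_cost * 0.5, lost_revenue_per_day, outage_days * 0.5)

saving = without_plan - with_plan
print(f"Without plan: {without_plan:,.0f}")
print(f"With plan:    {with_plan:,.0f}")
print(f"Saving:       {saving:,.0f}")
```

Because lost revenue accrues daily, the shorter recovery time contributes to the saving alongside the lower direct cost.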

How do emergency or disaster recovery procedures affect plant maintenance practices?

When a failure occurs, high unplanned costs are incurred. With the ERP in place, risk analysis has already been undertaken, with all major risk scenarios and typical failure scenarios identified and planned for in advance. This means that all resources - materials, consumables, tooling and people - are in place to respond in a specific way and to implement recovery.
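One simple way to decide which failure scenarios to plan for first is to rank them by expected outage days per year. The sketch below is only an illustration of that idea, not a method described by Alstom; the scenario names, failure rates and outage durations are hypothetical.

```python
# Each entry: (scenario, estimated failures per year, estimated outage days per failure).
# All values are hypothetical and for illustration only.
scenarios = [
    ("boiler feed pump failure", 0.20, 5),
    ("turbo-alternator bearing damage", 0.05, 30),
    ("flow meter fault", 0.50, 1),
]


def expected_outage_days(scenario):
    """Expected outage days per year: likelihood multiplied by consequence."""
    _, failures_per_year, days_per_failure = scenario
    return failures_per_year * days_per_failure


# Highest expected loss first, so planning effort goes where it matters most.
ranked = sorted(scenarios, key=expected_outage_days, reverse=True)

for name, *_ in ranked:
    print(name)
```

A rare failure with a long outage can outrank a frequent but minor one, which is why likelihood and consequence are considered together.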

Planning for an emergency, when combined with real-time equipment monitoring, can have significant beneficial impacts on the duration of planned and unplanned outages. Critical to the success of the ERP are four main principles:
- Committed, communicated and agreed plan, supported by resources and identified spares.
- Road map of what to do and who is responsible to make it happen.
- Pre-prepared parts lists, QA and repair procedures for each failure condition.
- Identified need for specialist equipment and resources.

In Alstom's ERP, this approach "simply extends routine maintenance best practice into unplanned outages".

The basic ERP uses flow charts, with key decision points, actions to be undertaken in priority order, personnel identified, organisation details, communication channels and any additional arrangements. These provide such benefits as the ability to make decisions quickly and efficiently, with predefined arrangements in place to support the response to a failure.
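The content of such a plan can be pictured as a simple record for each failure scenario. The sketch below is an assumption about how that might be structured, not Alstom's format; the class names, fields, roles and example entries are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Action:
    priority: int      # 1 = carry out first
    description: str
    responsible: str   # a role rather than a named person


@dataclass
class FailureScenario:
    name: str
    actions: List[Action] = field(default_factory=list)
    parts_list: List[str] = field(default_factory=list)
    specialist_equipment: List[str] = field(default_factory=list)

    def road_map(self) -> List[Action]:
        """Return the actions in priority order, whatever order they were entered in."""
        return sorted(self.actions, key=lambda action: action.priority)


# Hypothetical example entry.
scenario = FailureScenario(
    name="boiler feed pump failure",
    actions=[
        Action(3, "Order replacement bearing set", "stores controller"),
        Action(1, "Isolate pump and make safe", "shift supervisor"),
        Action(2, "Notify third-party maintainer", "maintenance manager"),
    ],
    parts_list=["bearing set", "mechanical seal"],
    specialist_equipment=["lifting beam"],
)

for action in scenario.road_map():
    print(action.priority, action.description, "-", action.responsible)
```

Holding the actions, responsibilities, parts and specialist equipment together in one record means nobody has to work out the order of tasks at the start of an outage.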

In many industries, emergency and disaster recovery plans are routinely tested, with simulated scenarios. In essence, if ERP is an extension of proactive maintenance regimes into unplanned outages, then the testing of these plans must become embedded within planned outages and routine maintenance.

If the aim of a maintenance strategy is to preserve the availability of a physical asset, two common approaches pursue this through long-term continuous improvement rather than a quick fix: Reliability Centred Maintenance (RCM) and Total Productive Maintenance (TPM). RCM determines what maintenance needs to be done, based on the performance of the machine, while TPM targets maintenance in relation to the business processes.

What Alstom's ERP strategy does is blend the objectives of preventative maintenance programmes, including the definition of potential failure types and scenarios, with planned recovery options for the components that are, or would be, impacted.

As equipment ages, maintenance costs clearly increase, along with the probability of a failure that affects both the operation of the plant and the decisions that need to be made about retaining, refurbishing or replacing equipment. These decisions may be influenced by the high value of the plant or the spares stocks retained as part of the asset management strategy.

ERP payback
The decision to shut down and maintain, repair or replace plant equipment - especially generating equipment - is costly, and the ERP aims to reduce these costs. The condition and performance of individual assets are particularly important to the process and these are subject to a range of plant condition monitoring techniques, including on- and off-line periodic monitoring, as well as continuous real-time monitoring.

The ERP should complement existing asset management and maintenance regimes, identifying components and risks, together with detailed failure scenarios, integrating real-time condition monitoring with risk mitigation techniques. In turn, these are used to plan maintenance strategies that reduce the effects of unplanned outages.
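As an illustration of how condition monitoring might feed into the plan, the sketch below flags the pre-planned scenarios whose trigger limits are exceeded by current readings. It is a minimal sketch under stated assumptions: the monitored quantities, alarm limits and scenario names are hypothetical and are not taken from any real monitoring system.

```python
# Hypothetical alarm limits, each linked to a pre-planned failure scenario.
SCENARIO_TRIGGERS = {
    "feed_pump_bearing_temp_c": (95.0, "boiler feed pump failure"),
    "alternator_vibration_mm_s": (7.0, "turbo-alternator bearing damage"),
}


def scenarios_to_review(readings):
    """Return the scenarios whose trigger limit is exceeded by the readings."""
    flagged = []
    for quantity, value in readings.items():
        if quantity in SCENARIO_TRIGGERS:
            limit, scenario = SCENARIO_TRIGGERS[quantity]
            if value > limit:
                flagged.append(scenario)
    return flagged


# Hypothetical readings: only the feed pump bearing temperature is over its limit.
readings = {"feed_pump_bearing_temp_c": 101.2, "alternator_vibration_mm_s": 3.1}
print(scenarios_to_review(readings))
```

A flagged scenario points the maintenance team to the matching recovery plan before the component actually fails.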


Key to best practice
There is already a range of public standards and guides to help anyone who has to prepare a disaster recovery or emergency plan. These include:

- BSI PAS 56 Guide to Business Continuity Management - this publicly available specification was devised to provide guidance on business continuity, focusing on best practice
- ISO17799 Information Security Management - provides the basis for information security management
- The Civil Contingencies Act (2004) - provides a framework for dealing with emergencies locally and nationally
- Risk Management Standard - the Institute of Risk Management's standard is probably one of the key external best practice guides, with a wide ranging scope for reference


