Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/31895
Title: | Improving resilience in chemical plant under cyberattack by adversarial reinforcement learning |
Other Titles: | Operational resilience of chemical plant |
Authors: | Smith, Martyn |
Advisors: | Coletti, F; Louvieris, P |
Keywords: | cybersecurity;operational technology;industrial control systems;wargaming;computational control |
Issue Date: | 2025 |
Publisher: | Brunel University London |
Abstract: | With chemical plants contributing $5.4 trillion to the global economy annually, frequently representing Major Accident Hazards, and their control systems having an average age of 20 years, the prospect of a cyberattack resulting in full or partial breach by threat actors is of great concern to owners, customers and wider stakeholders (the latter frequently including anyone downwind). Many frameworks exist to deal with this threat, including several that use Machine Learning (ML). However, these only go as far as the detection of cyberattacks, leaving a research gap in automated response and in the use of ML agents both to generate novel threat vectors and to respond to them. In this thesis, we review historical attacks and incidents involving chemical plants and their impacts, and the existing laws and frameworks that govern how plant safety and cybersecurity should be considered. We then investigate the use of adversarial Reinforcement Learning (RL) for generating and identifying threat vectors, and for detecting and responding to intrusions into chemical plants. To address this gap, we have developed a customised version of the Tennessee Eastman process as an example, suitable for use with existing interfaces for training RL agents. We then trialled this with a benchmark suite of test scenarios using different agents, one defending the plant (“Blue Team”) and one attacking the plant (“Red Team”), mimicking a common operational cybersecurity challenge. These agents were implemented with Deep Q-learning or Deep Deterministic Policy Gradient (DDPG) algorithms, depending on the scenario, in order to model different Blue Team and Red Team capabilities; the DDPG agents could manipulate plant variables continuously.
We additionally implemented a variant in which the Blue Team was assisted by a digital twin in order to make enhanced predictions of future plant state, and scenarios which varied the Red Team’s goals, with the default being plant shutdown. The Blue Team learnt passive policies during periods of normal plant operation, so that it would not disrupt the plant. The Red Team was largely effective at achieving its aims, most commonly producing a plant shutdown in slightly under three minutes by inducing an oscillation in reactor pressure, usually by manipulating both setpoints and sensor readings. Deep Q-learning variants of the Red Team could also disrupt the plant, sometimes in as little as 25 seconds. This behaviour was, however, independent of the Red Team’s goals. When the two agents were run together, the Red Team consistently performed well, with the Blue Team only occasionally extending plant uptime. The testbed developed is extensible for further testing against future Blue Team or Red Team agent implementations; however, it is likely to require additional theoretical verification before it can be employed on a real plant. Observed behaviour such as the reactor pressure oscillation could be treated as an Indicator of Compromise, as could the divergence between the digital twin and the physical plant observed before failure in the digital-twin-assisted variant. These indicators would be of immense value to Security Operations Centre operatives looking to safeguard a real plant. |
Description: | This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London |
URI: | https://bura.brunel.ac.uk/handle/2438/31895 |
Appears in Collections: | Dept of Chemical Engineering Theses; Chemical Sciences |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
FulltextThesis.pdf | | 16.7 MB | Adobe PDF | View/Open |
Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.