PhD Defence | A data-centric approach towards identification and prediction of anomalies in industrial cyber-physical systems
As the complexity of computing machinery, or any design for that matter, grows, the risk of
unwanted operational circumstances increases. This could mean that the design will not
function as intended, or the design will not function as efficiently as expected. In
extreme cases, the design will stop functioning altogether. In other words, the design
will demonstrate anomalous behaviour, with normal behaviour being what the designer had
intended to achieve. As complexity grows, it is harder for the designer to consider every
possible operational corner case, especially when the design is interacting with the
physical realm.
One of the main sources of complexity growth is the computerisation of digital systems
that control machinery. The term computerisation refers to the dominance of software in
providing and diversifying new operational capabilities. As an example that we can all
relate to, think of the evolution of cell phones into modern smart phones. As software is
a cyber entity and not directly bound by physical limitations, its growth has been
exponential through the years.
This thesis takes a step towards the detection and identification of anomalous behaviour
within a specific subset of industrial machinery, namely, industrial Cyber-Physical
Systems (CPS). In this endeavour, CPS are considered a high-value target, as their
applications in high-tech industry and infrastructure are numerous.
As a prelude, techniques for generating a high-level view of the system, using
communication-centric monitoring and modelling, are elaborated. This approach
intends to cut through the system’s complexity and capture the essence of its behaviour,
as well as to generate a simplified digital twin. The composed solution takes advantage
of fingerprinting and Machine Learning (ML) techniques and algorithms in tandem, blending
them in a single data-centric pipeline. Here, sensory data revealing Extra-Functional
Behaviour (EFB) is considered the sole source of information on the ongoing
behavioural patterns of the system under scrutiny. Such behavioural patterns are
efficiently represented in what we call behavioural signatures, constructed using
data transformations and regression techniques. In the same context, we refer to
reference behavioural signatures as behavioural passports. With such constructs in
place, one can perform statistical tests and quantitatively detect deviations.
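As a rough illustration of the signature and passport idea, the following Python sketch may help. It is not the method used in the thesis: the moving-average transformation, the polynomial regression, the per-coefficient z-score test, and all names (`signature`, `build_passport`, `deviates`, `z_threshold`) are assumptions chosen for brevity, and the sensor traces are synthetic.

```python
import numpy as np


def signature(trace, degree=3, window=5):
    """Hypothetical behavioural signature of one sensor trace.

    Assumed transformation: moving-average smoothing.
    Assumed regression: least-squares polynomial fit over a time axis
    rescaled to [-1, 1]. The coefficients form the signature.
    """
    trace = np.asarray(trace, dtype=float)
    smoothed = np.convolve(trace, np.ones(window) / window, mode="valid")
    t = np.linspace(-1.0, 1.0, smoothed.size)
    return np.polynomial.polynomial.polyfit(t, smoothed, degree)


def build_passport(reference_traces, degree=3, window=5):
    """Hypothetical behavioural passport: mean and sample standard
    deviation of the signatures of known-normal reference runs."""
    sigs = np.array([signature(tr, degree, window) for tr in reference_traces])
    return sigs.mean(axis=0), sigs.std(axis=0, ddof=1)


def deviates(trace, passport, degree=3, window=5, z_threshold=5.0):
    """Assumed statistical test: flag the trace if any signature
    coefficient lies more than z_threshold reference standard
    deviations away from the passport mean."""
    mean, std = passport
    z = np.abs(signature(trace, degree, window) - mean) / np.maximum(std, 1e-12)
    return bool(z.max() > z_threshold), z


# Synthetic example: a periodic power-like trace with measurement noise.
rng = np.random.default_rng(0)
u = np.linspace(0.0, 1.0, 200)
normal_shape = 1.0 + 0.5 * np.sin(2 * np.pi * u)
reference_runs = [normal_shape + rng.normal(0.0, 0.05, u.size) for _ in range(30)]
passport = build_passport(reference_runs)

# A run with a slow upward drift superimposed on the normal shape.
drifting_run = normal_shape + 0.5 * u + rng.normal(0.0, 0.05, u.size)
flagged, z_scores = deviates(drifting_run, passport)
print("drifting run flagged:", flagged)
print("per-coefficient z-scores:", np.round(z_scores, 1))
```

The point of the sketch is only the division of labour: a compact signature per run, a passport summarising reference runs, and a quantitative test comparing the two. Any concrete choice of transformation, regression model, or test statistic would depend on the system and the data at hand.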
The role of Artificial Intelligence (AI) is twofold here. On the one hand, traditional ML
algorithms were deployed with the intention of achieving the highest possible accuracy. On
the other hand, deep learning through Convolutional Neural Networks (CNN) has been
utilised as an alternative to achieve identification from a poor information position,
that is, when scarce information about the design’s specifics, or restricted access to
the system’s internals, has to be negotiated. The thesis also
delves into the qualitative differences between the two AI approaches.