Università Politecnica delle Marche
Scuola di Dottorato di Ricerca in Scienze dell’Ingegneria
Curriculum in Ingegneria Informatica Gestionale e dell’Automazione
Modeling and Diagnosis of Complex
Systems Dynamics by Data-Driven
Approaches
Ph.D. Dissertation of:
Francesco Ferracuti
Advisor:
Prof. Giuseppe Orlando
Coadvisor:
Prof. Gianluca Ippoliti
Curriculum Supervisor:
Prof. Sauro Longhi
XII edition - new series
Facoltà di Ingegneria
Via Brecce Bianche – 60131 Ancona (AN), Italy
To Arianna, for her support and understanding
Acknowledgments
I would like to thank Prof. Longhi for giving me the chance to investigate different aspects of research while granting me freedom of choice, for his significant contribution throughout my Ph.D. course, and for the time and effort he devoted to my research activity.
I also wish to thank Dr. Gianluca Ippoliti, whose support has been decisive for the results obtained in these years.
My thanks go to Prof. Orlando for his cordiality and availability.
Finally, special thanks go to all my colleagues and friends, whose company has made academic work enjoyable.
Ancona, January 2013
Francesco Ferracuti
Abstract
Complex systems are found in almost all fields of contemporary science and are associated with a wide variety of financial, physical, biological, information and social systems. They consist of a large number of nonlinearly interacting components that display collective behaviour which does not follow trivially from the behaviours of the individual parts. Although complex systems have many different properties, the most important are dimensionality, uncertainty, nonlinearity and coupling between components. The procedures to obtain analytical models are usually classified into physical modeling and identification. These procedures cannot be applied easily to complex systems, because the properties listed above make modeling difficult. Since a mathematical model is a description of system behaviour, accurate modeling of a complex system is very difficult to achieve in practice. Sometimes it may even be impossible to describe the system by analytical equations.
The present dissertation addresses two issues regarding the modeling and diagnosis of complex systems. The first is modeling a complex system when an analytical model is not obtainable. The second is diagnosing the system behaviour: diagnosis should detect whether the complex system is operating normally or a change is occurring due to abnormal events and, in addition, should identify the probable causes of those events.
Modeling of complex systems is addressed by developing data-driven procedures, which learn the complex system dynamics from data provided by the sensors installed on the system to monitor its physical variables.
Diagnosis of complex systems is addressed by developing machine learning procedures that classify the probable causes of deviations from normal system events.
Considerable attention is paid to the modeling and diagnosis of real complex systems; for this reason, several applications are discussed as case studies. The first application deals with modeling and diagnosing defects and faults in a Quality Control scenario for electric motors. The second deals with modeling and diagnosing a complex industrial system, namely a paper mill plant. The third deals with estimating the residual useful life of a turbofan engine, and the last deals with modeling Electroencephalography signals by data-driven algorithms in order to diagnose the user's intentions; this solution addresses the modeling problem in the Brain Computer Interface context. Since the modeling and diagnosis problem is tackled by data-driven procedures, the developed algorithms can be applied to a wide class of rotating electrical machines and complex industrial systems, not only to those mentioned.
Summary
Complex systems can be found in almost all fields of contemporary science and can be of different natures: financial, physical, biological, informational, social, and so on. They consist of a large number of components that interact nonlinearly with one another and show a collective behaviour that does not derive simply from the behaviour of the individual parts. Although such systems have many properties, the most important are dimensionality, uncertainty, nonlinearity and coupling between components. The procedures for obtaining analytical models of given systems are usually classified into physical modeling and identification. These procedures can be hard to implement when applied to complex systems, because the peculiar characteristics of such systems make modeling difficult. Since a mathematical model is a description of the behaviour of a system, accurate modeling of complex systems is very difficult to obtain in practice. Moreover, it may sometimes even be impossible to describe the system through analytical equations.
In light of the above, this dissertation sets out to address two problems concerning the modeling and diagnosis of complex systems: the first concerns specifically the modeling of a complex system when the analytical model cannot be obtained; the second refers to the diagnosis of the system behaviour. The latter should detect whether the complex system is normal or whether a change due to abnormal events is taking place, as well as the probable causes of such events.
The modeling of complex systems is addressed by developing data-driven methods, which are able to learn the dynamics of the complex system directly from the data provided by sensors installed on the system to monitor its physical variables.
The diagnosis of complex systems is instead addressed by developing machine learning methods in order to classify the probable causes of deviation from normal system events.
In the dissertation, ample attention is given to the problem of modeling and diagnosis of complex systems with reference to real systems, presenting several practical applications and case studies. The first case study concerns the modeling and diagnosis of defects and faults of electric motors in a quality control scenario, while the second refers to a complex industrial system, namely a paper mill. The third case addresses the estimation of the remaining useful life of a turbofan engine, and the last deals with the problem of modeling electroencephalographic signals through data-driven algorithms. Since the modeling and diagnosis problem is addressed through data-driven procedures, the developed algorithms can be applied to a wide class of rotating electrical machines and complex industrial systems, not only to those reported.
Contents
1 Introduction
  1.1 Data-Driven approach to Modeling and Diagnosis of Complex Dynamic Systems
    1.1.1 Data-Driven Modeling of Complex Systems Dynamics
    1.1.2 Data-Driven Diagnosis of Complex Systems Dynamics
  1.2 Data-Driven Modeling and Diagnosis of Complex Systems Dynamics: Applications
    1.2.1 Rotating Machines
    1.2.2 Industrial Systems
    1.2.3 Brain Computer Interface
  1.3 Thesis Outline
2 Modeling and Diagnosis of Electric Motors in a Quality Control Scenario
  2.1 Introduction
  2.2 Quality Control
  2.3 Data-Driven Modeling and Diagnosis of Induction Motor by Current Signals
    2.3.1 Recalled Results
    2.3.2 Developed Algorithm
    2.3.3 Case Study
    2.3.4 Results
  2.4 Data-Driven Modeling and Diagnosis by Vibration Signals
    2.4.2 MSPCA Formulation
    2.4.3 Developed Experimental Setup
    2.4.4 Results
3 Modeling of Complex Systems with FDI and Prognosis Applications
  3.1 Introduction
  3.2 Modeling and Diagnosis of a Paper Mill Plant: FDI Application
    3.2.2 Case Study
    3.2.3 Results
  3.3 Modeling and diagnosis of a Turbofan Engine: Prognosis Application
    3.3.1 Problem Definition and Process Model
    3.3.2 Hidden Markov Model and Prognosis Procedure
    3.3.3 Features Extraction
    3.3.4 Implementation and Results
4 Modeling and diagnosis of EEG signals in BCI Applications
  4.1 Introduction
  4.2 Auditory BCI
  4.3 Spatial Audio
  4.4 Testing Methodologies
    4.4.1 Participants
    4.4.2 Data Acquisition
    4.4.3 EEG Signals Modeling
    4.4.4 Information Transfer Rate
  4.5 Results
    4.5.1 Classification Performance
    4.5.2 ITR Performance
5 Concluding Remarks
  5.1 Modeling and Diagnosis of Electric Motor in a Quality Control Scenario
  5.2 Modeling of Complex Systems with FDI and Prognosis Applications
  5.3 Modeling and diagnosis of EEG signals in BCI Applications
List of Figures
2.1 Efficiency characterization of tested induction motors
2.2 Interpolated PDFs of a finite element healthy motor in the two-dimensional principal components space estimated by KDE
2.3 Interpolated PDFs of a finite element motor with one broken bar in the two-dimensional principal components space estimated by KDE
2.4 Interpolated PDFs of a finite element motor with one broken connector in the two-dimensional principal components space estimated by KDE
2.5 K-L divergence in the case of a finite element healthy motor
2.6 K-L divergence in the case of a finite element motor with one broken bar
2.7 K-L divergence in the case of a finite element motor with one broken connector
2.8 Interpolated PDFs of real healthy motors in the two-dimensional principal components space estimated by KDE
2.9 Interpolated PDFs of real motors with cracked rotor in the two-dimensional principal components space estimated by KDE
2.10 Interpolated PDFs of real motors with wrong rotor in the two-dimensional principal components space estimated by KDE
2.11 K-L divergence in the case of real healthy motors
2.12 K-L divergence in the case of real motors with cracked rotor
2.13 K-L divergence in the case of real motors with wrong rotor
2.14 Single-phase 25W motor for kitchen hoods mounted on pallet; PCB accelerometers are installed on pallet
2.15-2.19
  Impeller with backlash: contribution weights of approximation matrix A3
  Misaligned impeller: contribution weights of approximation matrix A3
  Misaligned impeller: contribution weights of D3 scale matrix
2.20 Unbalanced impeller: contribution weights of approximation matrix A3
2.21 Contribution weights of A7 scale matrix for the first accelerometer in the case of unbalanced impeller
2.22 Contribution weights of A7 scale matrix for the first accelerometer in the case of unbalanced impeller
3.1 Simulated abrupt fault. (a) SPE of reconstructed PCA; (b) SPE of approximation matrix A3
3.2 Simulated abrupt fault. (a) SPE contribution of approximation matrix A3; (b) SPE contribution of reconstructed PCA
3.3 Simulated harmonic fault. SPE of detail matrix D1
3.4 Simulated harmonic fault. SPE contribution of detail matrix D1
3.5 Signals of broke handling with fault. (a) Sensor 5, inlet pressure; (b) Sensor 13, motor broke deflaking current
3.6 Real fault. (a) SPE of reconstructed PCA in logarithmic scale; (b) SPE of approximation matrix A3 in logarithmic scale
3.7 Real fault. SPE contribution of reconstructed PCA
3.8 Real fault. (a) SPE contribution of approximation matrix A3; (b) SPE contribution of detail matrix D3
3.9 Real fault. (a) SPE contribution of detail matrix D2; (b) SPE contribution of detail matrix D1
3.10 Simplified diagram of turbofan engine
3.11 Fault progression process described by a HMM
3.12 Faultless features extracted from ANN training data
3.13 Bayesian information criterion of fan engine 1
3.14 Cluster of training data
3.15 (a) Turbofan estimated RUL in presence of FAN fault; (b) Turbofan health states sequence in presence of FAN fault
3.16 (a) Turbofan estimated RUL in presence of HPT fault; (b) Turbofan health states sequence in presence of HPT fault
3.17 (a) Turbofan estimated RUL in presence of HPC fault; (b) Turbofan health states sequence in presence of HPC fault
3.18 (a) Turbofan estimated RUL in presence of LPT fault; (b) Turbofan health states sequence in presence of LPT fault
4.1 Spatial hearing
4.2 Electrode set for recording and analysis
4.3 Selection scores, for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), plotted as a function of the number of iterations for the users 1, 5, 6, 8, 9 and 13
4.4 Mean selection accuracy, for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), plotted as a function of the number of iterations for fourteen participants
4.5 Selection accuracy boxplot, for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), of all participants. Boxplot is evaluated with all iterations
4.6 ITR for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), plotted as a function of the number of iterations for the subjects 1, 5, 6, 8, 9 and 13
4.7 Mean ITR, for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), plotted as a function of the number of iterations for fourteen participants
4.8 ITR boxplot for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), of all participants. Boxplot is evaluated for all iterations
List of Tables
2.1 Classification accuracy in the case of finite element motor
2.2 Classification accuracy in the case of real motors
2.3 TEP results with improved detection index
2.4 FDD results for motors installed in kitchen hoods
3.1 Stock preparation sensors
3.2 Fault diagnosis with simulated abrupt fault
3.3 Fault diagnosis with simulated harmonic fault
3.4 Fault diagnosis with real fault
3.5 C-MAPSS inputs
3.6 C-MAPSS outputs
4.1 Classification accuracy, target accuracy and non-target accuracy for auditory stimuli
Chapter 1
Introduction
Complex systems are found in almost all fields of contemporary science. They belong to different classes of systems, such as natural (e.g., biological evolution), financial (e.g., stock markets and economies), social and industrial systems. Complex systems consist of a large number of nonlinearly interacting components, often called agents, which display collective behaviour that does not follow trivially from the behaviours of the individual parts. These systems are open: they exchange information with the environment and constantly modify their internal structure and patterns of activity in a process of self-organization [1]. Complex systems have several characteristic properties of structure and behaviour. In complex industrial systems, one of these properties is the hierarchical structure, which is related to the multilevel organization of the system. Another is the strong coupling between the agents of the complex system. Computational problems arising from high dimensionality, and uncertainty, are other typical characteristics of complex systems.
Technological advances in the process industries in recent years have resulted in increasingly complicated systems, processes and products, which pose considerable challenges in their modeling, analysis, design, manufacturing and management for successful operation and use over their life cycles. This increasing complexity calls for more efficient mathematical frameworks able to model these systems. A model is a mathematical representation of a physical, biological or information system. Models of real systems are of fundamental importance in virtually all disciplines. They are useful for different purposes, such as analysis (i.e., gaining a better understanding of the system), prediction or simulation of a system's behaviour, and control. In engineering, models are required for the design of new processes and for the analysis of existing processes [2].
The procedures to obtain analytical models are usually classified into physical modeling and identification. However, in many contexts (e.g., industrial, natural, financial) developing a mathematical model of the process is a difficult task because of the complexity of the processes involved. Modeling may demand considerable engineering effort and thus becomes impractical for complex processes. Modeling complex systems with physical models can be very difficult or even impossible; moreover, when mathematical models describing the dynamics of a complex system do exist, they may not be easy to manipulate.
Complex systems for which an analytical model is not available can be modeled by data-driven approaches. Simply speaking, data-driven approaches are based on data matrices, which usually contain measurements of physical process variables, and on computational intelligence and machine learning algorithms. Machine learning is a form of automatic computing that learns models from data. In particular, it is the study of computational methods and algorithms that improve the performance of machines by automating the acquisition of information and knowledge from experience or data [3].
Machine learning paradigms include artificial neural networks (e.g., multilayer perceptrons, self-organizing maps, radial basis function neural networks); instance-based learning (e.g., case-based reasoning, nearest neighbour methods); rule induction; genetic algorithms, where knowledge is typically represented by Boolean features, sometimes as the conditions and actions of rules; statistics; and analytical learning. Machine learning algorithms are of fundamental importance in virtually all disciplines, with many applications in engineering. In this field, the modeling objective for complex systems is often to monitor the processes in order to evaluate and analyze the performance of these systems and diagnose their behaviour.
The strong coupling between processes is the cause of cascading failures in complex systems such as industrial processes. Because of the strong coupling between agents, a failure in one or more components can lead to cascading failures, which could have catastrophic consequences for the functioning of the system. Likewise, a fault in one or more agents can lead to cascading faults, which may drive the system to failure. A basic diagnosis should detect whether the complex system is operating normally or a change is occurring due to abnormal events. This objective is related to Fault Detection and Isolation (FDI) applications, in which the systems are modeled by model-based or data-driven approaches in order to detect the location and time of a fault. In addition, the diagnosis should identify the probable causes of the fault, so that appropriate supervisory control decisions and actions can be taken to bring the process back to a normal, safe operating state.
1.1 Data-Driven approach to Modeling and
Diagnosis of Complex Dynamic Systems
In this section a brief description of the modeling and diagnosis issues is presented, with some details and considerations concerning the described case studies.
1.1.1 Data-Driven Modeling of Complex Systems Dynamics
Data-driven modeling does not require a mathematical description of the system, but it needs data matrices that are representative of the system dynamics. The data-driven construction of models for complex systems can be cast in a general framework consisting of a number of different mathematical units. The first is a data matrix representing the process or system being monitored. The data are then processed to extract a relevant feature matrix from the data matrix. Finally, machine learning algorithms are applied to the feature matrix to build a data-driven model that describes the dynamics of the considered complex system. In particular, the model must be able to simulate and predict the system dynamics and to generalize the dynamics learnt from the feature matrix used for the modeling. The data matrix usually contains measurements of physical process variables, which can be of different kinds: discrete, logic, qualitative, or acquired at different sampling frequencies.
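The three units of this framework (data matrix, feature matrix, learned model) can be sketched in a few lines of Python. The example is purely illustrative: the synthetic two-sensor signal, the choice of RMS and peak-to-peak features, and the linear one-step predictor fitted by least squares are assumptions made here, not the algorithms used in the case studies.

```python
import numpy as np

def feature_matrix(data, window):
    """Turn a (samples x sensors) data matrix into a (windows x features) matrix.

    Features (an illustrative choice): RMS and peak-to-peak value of each
    sensor over non-overlapping windows.
    """
    n_samples, n_sensors = data.shape
    n_windows = n_samples // window
    blocks = data[: n_windows * window].reshape(n_windows, window, n_sensors)
    rms = np.sqrt((blocks ** 2).mean(axis=1))
    peak_to_peak = blocks.max(axis=1) - blocks.min(axis=1)
    return np.hstack([rms, peak_to_peak])

def fit_one_step_model(features):
    """Least-squares fit of features[k + 1] ~ features[k] @ A."""
    A, *_ = np.linalg.lstsq(features[:-1], features[1:], rcond=None)
    return A

# Synthetic data matrix: 2000 samples from 2 hypothetical sensors.
rng = np.random.default_rng(0)
t = np.arange(2000)
data = np.column_stack([np.sin(0.01 * t), np.cos(0.013 * t)])
data = data + 0.05 * rng.standard_normal(data.shape)

F = feature_matrix(data, window=100)   # feature matrix, shape (20, 4)
A = fit_one_step_model(F)              # learned model, shape (4, 4)
predicted = F[:-1] @ A                 # one-step-ahead prediction of the features
print("mean absolute prediction error:", np.abs(predicted - F[1:]).mean())
```

Any other feature extractor or learning algorithm could replace the two functions without changing the overall structure, which is the point of casting the problem in this framework.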
Data-driven models are able to model the dynamics of a particular system or process if a considerable amount of data describing the dynamics is available and if the modeled system does not change considerably during the period covered by the model. The latter limitation can be overcome by using several data-driven models that together cover the whole dynamics of the system.
In this dissertation the modeling issue is tackled by data-driven approaches for different complex systems. The choice of the artificial intelligence algorithms and of the relevant features for the considered system is problem-dependent. For example, in the case of FDI for a paper mill plant, which is a critical process monitored by many sensors, the Principal Component Analysis algorithm is considered because it is computationally fast and can easily describe the correlation between process variables. In the case of FDI for electric motors, the Wavelet Transform is considered because the faults can be described by vibration signals in the time and frequency domains. Moreover, the solution based on Kernel Density Estimation and Kullback-Leibler divergence, described in section 2.4, maps the frequency features of electrical signals into a 2-D space.
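As a rough illustration of how Kernel Density Estimation and Kullback-Leibler divergence can work together, the sketch below estimates the density of 2-D feature points on a grid and compares two samples against a reference. The synthetic Gaussian features, the fixed bandwidth, the grid and the discretized form of the divergence are all assumptions made for this example; the actual algorithm is the one described in Chapter 2.

```python
import numpy as np

def kde_on_grid(points, grid, bandwidth):
    """Gaussian kernel density estimate of 2-D points on grid nodes.

    The result is normalised to sum to one, so it can be treated as a
    discrete probability distribution over the grid.
    """
    diff = grid[:, None, :] - points[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    density = np.exp(-0.5 * sq_dist / bandwidth ** 2).sum(axis=1)
    return density / density.sum()

def kl_divergence(p, q, eps=1e-12):
    """Discrete Kullback-Leibler divergence D(p || q) in nats."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)))

# Hypothetical 2-D features (e.g. two principal components per measurement).
rng = np.random.default_rng(1)
reference = rng.normal(0.0, 1.0, size=(300, 2))  # reference (healthy) machines
healthy = rng.normal(0.0, 1.0, size=(300, 2))    # another healthy sample
faulty = rng.normal(2.5, 1.0, size=(300, 2))     # sample with shifted features

axis = np.linspace(-5.0, 7.0, 40)
gx, gy = np.meshgrid(axis, axis)
grid = np.column_stack([gx.ravel(), gy.ravel()])

p_ref = kde_on_grid(reference, grid, bandwidth=0.5)
kl_healthy = kl_divergence(p_ref, kde_on_grid(healthy, grid, bandwidth=0.5))
kl_faulty = kl_divergence(p_ref, kde_on_grid(faulty, grid, bandwidth=0.5))
print(f"KL healthy vs reference: {kl_healthy:.3f}")
print(f"KL faulty vs reference:  {kl_faulty:.3f}")
```

A sample drawn from the same distribution as the reference yields a small divergence, while the shifted sample yields a much larger one, which is what makes the divergence usable as a diagnostic index.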
1.1.2 Data-Driven Diagnosis of Complex Systems Dynamics
In the context of complex systems, the objective of diagnosis is to determine the system conditions and behaviours from the measurements of physical variables and the model provided by data-driven modeling. Data-driven diagnosis is linked to the concept of classification. In machine learning theory a distinction is made between supervised and unsupervised classification. In unsupervised learning problems unlabelled data are used, and the labelling of the data is determined by minimizing a cost function. Supervised algorithms generally give better classification results, but they require more information than unsupervised classifiers. In supervised learning, the training data consist of a set of examples, each of which is a pair comprising an input vector and an output vector, so labelled data are required to train supervised classifiers. If the output is continuous, a regression function is learnt; if the output is discrete, a classification function is learnt. Both supervised and unsupervised algorithms provide a parameterized model, which outputs the labels of the input data.
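The difference between the two settings can be made concrete with a minimal sketch. A nearest-centroid classifier stands in for the supervised case and k-means (Lloyd's algorithm) for the unsupervised case; both algorithms, the synthetic "healthy" and "faulty" clusters and all parameter values are illustrative assumptions, not the classifiers used in this dissertation.

```python
import numpy as np

def fit_nearest_centroid(X, y):
    """Supervised: learn one centroid per class from labelled data."""
    labels = np.unique(y)
    centroids = np.array([X[y == label].mean(axis=0) for label in labels])
    return labels, centroids

def predict_nearest_centroid(model, X):
    """Assign each sample the label of its closest class centroid."""
    labels, centroids = model
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return labels[dist.argmin(axis=1)]

def kmeans(X, k, n_iter=50, seed=0):
    """Unsupervised: Lloyd's algorithm, which reduces the within-cluster
    squared distance (the cost function) without using any labels."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        assign = dist.argmin(axis=1)
        centroids = np.array([
            X[assign == j].mean(axis=0) if np.any(assign == j) else centroids[j]
            for j in range(k)
        ])
    return assign  # cluster index of each sample from the final iteration

# Synthetic features: class 0 = healthy, class 1 = faulty (hypothetical data).
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 0.5, size=(100, 2)),
               rng.normal(4.0, 0.5, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

# Supervised: train on half of the labelled data, test on the other half.
train = np.arange(len(X)) % 2 == 0
model = fit_nearest_centroid(X[train], y[train])
accuracy = (predict_nearest_centroid(model, X[~train]) == y[~train]).mean()

# Unsupervised: cluster indices are arbitrary, so compare up to a relabelling.
clusters = kmeans(X, k=2)
agreement = max((clusters == y).mean(), (clusters != y).mean())
print("supervised accuracy:", accuracy)
print("cluster/label agreement:", agreement)
```

Note that the unsupervised result only groups the data: deciding which cluster corresponds to the faulty condition still requires external knowledge, which is the extra information a supervised classifier receives through the labels.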
As with modeling, the diagnosis issue is also problem-dependent. For example, in the Quality Control application a supervised classification is considered because knowledge of the defects is available. In the case of the paper mill plant a classifier is not used for diagnosis because knowledge of the faults is not available a priori, so the monitoring algorithm is only able to detect and isolate the faults.
1.2 Data-Driven Modeling and Diagnosis of
Complex Systems Dynamics: Applications
Several complex systems are described in this dissertation. The modeling and diagnosis issues are tackled by data-driven approaches for industrial complex systems such as electric motors, a paper mill plant and a turbofan engine. The complex system addressed in chapter 4, which belongs to the class of biological systems, concerns the Brain Computer Interface.
1.2.1 Rotating Machines
Rotating machines are among the most important devices in many industrial applications and are frequently integrated in commercially available equipment and industrial processes. Rotating machines are well-known dynamical systems with accurate analytical models and extensive results in the literature. Although these models, which are mainly nonlinear, describe the dynamics accurately, they do not capture all the intrinsic dynamics. Unmodeled dynamics include vibrations and thermal drift, which significantly change the electrical and mechanical parameters of the machines, and faults. Mechanical faults in rotating machines, such as misalignment, broken bars, and gear and bearing defects, are not simple to model analytically; for this reason modeling and diagnosis, with Fault Detection and Isolation applications, are dealt with by data-driven approaches. Another reason to prefer data-driven methods is that the use of models requires knowledge of their parameters. In manufacturing industries, where a wide variety of rotating machines may be produced with many different models and parameters, accurate knowledge of the model parameters is not available. Data-driven approaches, on the other hand, require monitoring a sample of faultless reference machines in order to tune the necessary parameters. In this dissertation both electrical and mechanical machines are considered. Chapter 2 deals with modeling and diagnosing defects by Fault Detection and Isolation applications in a Quality Control scenario for electric motors. Data-driven modeling and diagnosis solutions are proposed in order to model and diagnose mechanical faults that could not be diagnosed by analytical models. Chapter 3 deals with the monitoring problem of a complex system such as a turbofan engine; in this case a prognosis application is proposed.
1.2.2 Industrial Systems
A paper mill plant is a classical example of a complex system, which consists of several coupled nonlinear systems (e.g., paper machine, stock preparation) and subprocesses (e.g., fan pump, tine unit, mixing unit, pope reel, pulping, broke handling machine, deflaking). A paper mill plant is often monitored by many sensors, which measure many different variables; these can be discrete or logic and can be acquired at different sampling frequencies. These variables may also describe the states of the systems qualitatively. One of the problems in complex industrial systems such as a paper mill plant is the high number of monitored variables, a classical example of the dimensionality problem in complex systems. Chapter 3 proposes a Fault Detection and Isolation application for monitoring processes in a paper mill plant, which helps the early detection and identification of faults. The modeling and diagnosis algorithm uses the correlation among sensors to transform the original multivariate variable space into a subspace that preserves the maximum information of the original space. In this way the main information is retained and the irrelevant information is discarded. However, this transformation does not exploit the correlation within each sensor along the time line, so a second data-driven algorithm is used together with the first to capture the correlation within a sensor as well.
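The subspace idea can be sketched with Principal Component Analysis and the squared prediction error (SPE) of the discarded subspace. This is a minimal example under stated assumptions: six hypothetical sensors driven by two latent factors, a simulated bias fault, and an empirical 99th-percentile detection limit. It covers only the correlation among sensors; the second, time-related algorithm mentioned above is not sketched here.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA on standardised data; return scaling parameters and loadings."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, std, Vt[:n_components].T

def spe_contributions(model, X):
    """Squared residual of each variable after projection onto the PCA subspace."""
    mean, std, P = model
    Z = (X - mean) / std
    residual = Z - Z @ P @ P.T
    return residual ** 2

def make_data(rng, n):
    """Six correlated sensors: sensors 0-2 follow one latent factor, 3-5 another."""
    latent = rng.standard_normal((n, 2))
    mixing = np.repeat(np.eye(2), 3, axis=1)   # shape (2, 6)
    return latent @ mixing + 0.1 * rng.standard_normal((n, 6))

rng = np.random.default_rng(3)
X_train = make_data(rng, 500)      # fault-free reference data
X_test = make_data(rng, 100)
X_test[:, 4] += 3.0                # simulated bias fault on sensor index 4

model = fit_pca(X_train, n_components=2)
spe_train = spe_contributions(model, X_train).sum(axis=1)
threshold = np.percentile(spe_train, 99)   # empirical limit (an assumption)

contrib = spe_contributions(model, X_test)
detected = contrib.sum(axis=1) > threshold            # fault detection
suspect = int(contrib.mean(axis=0).argmax())          # fault isolation
print("fraction of faulty samples detected:", detected.mean())
print("sensor with the largest SPE contribution:", suspect)
```

Detection compares the SPE of each new sample with the limit learned from fault-free data, while isolation looks at which variable contributes most to the residual.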
1.2.3 Brain Computer Interface
The brain is assumed to be a classical example of a complex, self-organized system. As such, it exhibits hallmarks of nonlinearity, multistability, and “nondiffusivity”. Brain oscillations are produced by the synchronized activity of large neuronal ensembles, and the resulting electrophysiological signals in the different frequency bands are associated with different functional states (e.g., sleep, wakefulness, perception and attention). Computational studies adopt a variety of abstractions in order to deal with complex dynamical systems like the brain. Brain-Computer Interfaces are devices which translate the brain activity of the user into specific signals, which may be used for communicating or controlling external devices [4, 5] without the use of peripheral nerves and muscles [6]. Brain-Computer Interfaces represent an interesting option for people affected by neuromuscular disorders whose brain activity is normal, such as patients affected by Amyotrophic Lateral Sclerosis. In this context Electroencephalography signals are monitored and modeled by data-driven algorithms in order to diagnose when the user focuses attention on auditory stimuli and when they do not.
1.3 Thesis Outline
The remainder of the thesis is organized in four chapters.
Chapter 2 - This chapter proposes two data-driven Fault Detection and Diagnosis
algorithms based on stator current and vibration signals in
a Quality Control scenario. Several experiments are carried out on real
electric motors in order to model the fault dynamics and diagnose the
probable causes of the faults.
Chapter 3 - This chapter proposes two solutions to model and diagnose two
different complex systems: a paper mill plant and a turbofan engine.
These solutions are applied in order to monitor these complex systems in Fault
Detection and Isolation and Prognostic contexts. A brief description of the two
complex systems is provided as well.
Chapter 4 - This chapter deals with the issue of modeling Electroencephalography
signals, which measure the electrical activity of the brain, and of diagnosing
the user intention from these signals. The proposed solution is an auditory
Brain Computer Interface paradigm for systems based on P300 signals, which
are generated by auditory stimuli characterized by different sound typologies
and locations.
Chapter 5 - The final chapter summarizes the obtained results, providing
a critical analysis of them and giving an insight into possible future work.
Chapter 2
Modeling and Diagnosis of
Electrical Motors in a Quality
Control Scenario
2.1 Introduction
Electric motors are among the most important electrical machines in many
industrial applications and are frequently integrated in commercially available
equipment and industrial processes. Electric motor equipment often provides
tools and solutions to monitor the machines in order to ensure reliability and
safety for equipment and personnel. In spite of these tools, many companies are
still faced with unexpected system failures and reduced machine lifetime. Environmental
conditions, duty, design errors and installation issues may combine to reduce
the residual useful life far below the designed lifetime of the electrical machine.
Advances in sensors, algorithms, and architectures should provide the necessary
technologies for effective incipient failure detection. Fault Detection and
Diagnosis (FDD) algorithms can be used not only for fault diagnosis
but also to improve the Quality Control (QC) of these machines. These
algorithms can be integrated in test benches at the end or in the middle of the
production line in order to test the quality of the machines. When an electric motor
reaches the test bench, the FDD procedure acquires sensor measurements
and detects whether the motor is normal or defective; in the latter case further
inspections can diagnose the defect type in order to improve the production efficiency,
the reliability of the machines and the customer satisfaction. In this context, FDD
algorithms have the advantage, compared to standard FDD algorithms, that they
can be implemented offline, because early detection of the defects may not be
necessary.
Particularly in an FDD scenario for electric motors, several solutions have been
proposed, which are mainly based on two kinds of measurements: stator current and
vibration signals. Electric motors are well known dynamical systems with accurate
analytical models and extensive results in the literature. Model-Based Fault
Detection and Isolation (FDI) algorithms for electric motors are fast, online and
reliable, and they allow the power system and the machine to be protected from
incipient faults, but in the literature there are few diagnosis contributions for electric
motors. Model-Based solutions are very limited because analytical models
that describe the dynamics of the faults occurring in electric motors do not exist
for vibration and current analysis; therefore FDD solutions are mainly based
on data-driven algorithms. Failure surveys report that faults in induction motors
are: stator related (38%), rotor related (10%), bearing related (40%) and
others (12%) [7]. Typical faults which occur in rotating electrical machines
are:
• Unbalance.
• Bent shaft.
• Eccentricity.
• Misalignment.
• Backlash.
• Gear defects.
• Bearing defects.
• Electrical faults.
• Shaft cracks.
These faults and defects are not modeled by analytical models; however, they
can be described by signal-based approaches. Another reason to support
signal-based methods is that the use of models requires the knowledge of parameters.
In manufacturing industries, where a wide variety of electric
motors is produced with many different models and parameters, accurate
knowledge of the model parameters is not available. Signal-based
approaches, on the other hand, require monitoring a sample of faultless reference
motors in order to tune the necessary parameters in a fast training stage.
This chapter proposes two data-driven FDD algorithms based on Motor Current
Signature Analysis (MCSA) and vibration signals in a QC scenario, in
order to develop a monitoring system and improve the reliability of electric
motors. These procedures allow defects and faults of electric motors to be diagnosed
at the end of the production line in a motor production plant. In the
first algorithm, the FDD procedure processes the three-phase stator current and
the vibration signals in order to model patterns related to healthy and faulty
motors, which can be used as typical features of each motor condition. After
the modeling stage, a distance is used as an index to identify the dissimilarity
between two patterns. Such an index allows the automatic identification of each
fault and defect. Several simulations and experiments are carried out in
order to verify the effectiveness of the proposed methodology: broken rotor
bars and connectors are simulated, while experiments on real induction
motors at the end of the production line are presented. In the case of vibration
signals, a probabilistic algorithm is used for fault diagnosis on the residual signals.
Several simulations and experiments are carried out on a test bench
of single-phase electric motors which are mounted on kitchen hoods.
This chapter is based on the problem and the results presented in [8, 9, 10].
2.2 Quality Control
In industry, QC is a collection of methods that improve quality and efficiency
in processes, in production and in many other aspects of industry.
In 1924, Walter Shewhart designed the first control chart and gave a
rationale for its use in process monitoring and control [11]. The main concept
of QC is “proactiveness”: in order to ensure the product quality, processes
and related signals are monitored to detect when they “go out of control”. In
recent years, manufacturing industries have been devoting much attention and effort
to the introduction of QC in their production lines, and those with large volumes of
low-tech products are concentrating many investigations on the efficient
introduction of QC in their production lines.
One of the major problems faced by these manufacturing industries
is customer satisfaction, because customers usually purchase lots of
products that contain some unwanted defective components. In order to satisfy
customers, manufacturing industries carry out some spot checks at the end of
the production lines. This method ensures neither the quality of the products nor the
removal of all defective products. If, for example, a company produces electric motors
for kitchen hoods in a −4σ defect level [12], the total number of defective
motors could be about 30 considering 10000 motors produced per day, each
one built by 30 assembling operations. The number of checks needed to find
all the possible defects is too high and too expensive when compared with
the product cost. A desirable QC solution for these manufacturing industries
should be minimally invasive, effective and with a low payback period. In addition,
the testing could be made systematic using a low-cost system based on
a reduced set of sensors integrated in the test bench.
When the electric motor reaches the end of the production line, the FDD system
acquires sensor measurements and detects whether the product is defective or not.
Moreover, by isolating and identifying the defect type, the FDD procedure
helps to estimate in which subprocess the defect is introduced, and allows
the defective products to be removed and the quality of the processes to be improved as
a proactive measure for the QC methodology. Statistics can be computed
in order to investigate the efficiency of the production subprocesses and then
to improve those subprocesses which introduce more defects in the products. In
this way, the tests performed at the end of the production lines make it possible to remove the
defective products, estimate the efficiency drop of the subprocesses which introduce
defects in the products, and improve the quality of the processes as a proactive
measure for the QC methodology.
2.3 Data-Driven Modeling and Diagnosis of
Induction Motor by Current Signals
In induction motors, faults and defects are often correlated to the three-phase
stator current signals, which can be processed to model the motor behaviour
by patterns that represent the normal and abnormal motor conditions. MCSA
is a noninvasive, on-line monitoring solution for the diagnosis of faults in induction
motors, based on current measurements which are available from the inverter [7,
13, 14]. MCSA techniques use data-driven procedures to model, from the
current signals, patterns which are indicative of normal and abnormal motor conditions.
Moreover, MCSA procedures can be used to detect and diagnose not only classic
motor faults (e.g. rotor eccentricity), but also gear faults (e.g. tooth spalls) [15].
In order to model the three-phase stator current signals, Principal Component
Analysis (PCA) and Kernel Density Estimation (KDE) are taken into
account. PCA is used in data pre-processing to reduce the current space to
two dimensions. The Probability Density Function (PDF) of the PCA-transformed
signals is estimated by KDE, which is a non-parametric method useful to assess
the data distribution [16]. The PDFs are the models that can be used to identify
each fault and defect. The advantage of non-parametric approaches, with respect
to parametric ones, is that they offer greater flexibility in modeling a given
dataset, and they are not affected by the problems stated in [16] (and references
therein). In the tests, KDE with a Gaussian kernel function is considered and the
plug-in bandwidth selection procedure is applied [17].
Diagnosis is carried out using the Kullback-Leibler (K-L) divergence,
which measures the difference between two probability distributions. This
divergence is used as a distance measure between the classified statistical signatures
obtained by KDE. K-L is an index that identifies the dissimilarity
between two given probability distributions (which can also be multidimensional):
one concerns the modeled signatures and the other concerns the
acquired data samples. The classification of each motor
condition is performed by means of the K-L divergence.
2.3.1 Recalled Results
Principal Component Analysis
PCA is a dimensionality reduction technique that produces a lower-dimensional
representation in a way that preserves the correlation structure between
the process variables, capturing the variability in the data [18]. By PCA, the
correlation among sensors is used to transform the multivariate space into a
subspace which preserves the maximum variance of the original space in the minimum
number of dimensions. In other words, PCA rotates the original coordinate
system along the directions of maximum variance. Considering a data matrix
$X \in \mathbb{R}^{N \times m}$ of $N$ sample rows and $m$ variable columns that are normalized to
zero mean, with mean values vector $\mu$, the matrix $X$ can be decomposed as
follows:
$$X = \hat{X} + \tilde{X}, \tag{2.1}$$
where $\hat{X}$ is the projection on the Principal Component Subspace (PCS) $S_p$,
and $\tilde{X}$, the residual matrix, is the projection on the Residual Subspace (RS)
$S_r$ [19]. Defining the loading matrix $P$, whose columns are the right singular
vectors of $X$, and selecting the columns of the loading matrix $P \in \mathbb{R}^{m \times p}$ which
correspond to the loading vectors associated with the first $p$ singular values, it
follows that:
$$\hat{X} = X P P^T \in S_p. \tag{2.2}$$
The residual matrix $\tilde{X}$ is the difference between the data matrix $X$ and its
projection onto the first $p$ principal components retained in the PCA model:
$$\tilde{X} = X (I - P P^T) \in S_r, \tag{2.3}$$
therefore the residual matrix captures the variations in the observation space
spanned by the loading vectors associated with the $r = m - p$ smallest singular
values. The projections of the observations in $X$ into the lower-dimensional
space are contained in the score matrix:
$$T = X P \in \mathbb{R}^{N \times p}. \tag{2.4}$$
PCA is here applied to the currents of a three-phase induction motor in order to
reduce the input space from the three original dimensions to two, because the
currents are highly correlated. Indeed, for a healthy three-phase motor
without neutral connection, under ideal conditions for the motor and a balanced
voltage supply, the stator currents are given by Eq. (2.5), where $i_a$, $i_b$ and $i_c$
denote the three stator currents, $I_{max}$ their maximum value, $f$ their frequency,
$\phi$ their phase angle and $t$ the time. It is then known that each stator current
is given by a combination of the others:
$$\begin{cases} i_a(t) = I_{max}\sin(2\pi f t - \phi) \\ i_b(t) = I_{max}\sin(2\pi f t - 2\pi/3 - \phi) \\ i_c(t) = I_{max}\sin(2\pi f t - 4\pi/3 - \phi). \end{cases} \tag{2.5}$$
The PCA transform (2.4), applied to the signals in Eq. (2.5), makes the smallest
singular value equal to zero. This implies that the information carried by the
last principal component, which is associated with the smallest singular value, is null;
the last principal component can therefore be deleted and the original space reduced from
three dimensions to two without losing information. This is justified by the fact that, in
Eq. (2.5), each stator current is perfectly correlated to the sum of the others.
When Gaussian white noise with standard deviation $\sigma$ is added to the stator current
signals (2.5), the smallest singular value is no longer equal to zero, but
depends on the ratio between $I_{max}$ and $\sigma$.
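As an illustration of the rank argument above, the following minimal Python sketch (not part of the original experiments) builds the ideal currents of Eq. (2.5), applies PCA through the singular value decomposition and prints the singular values with and without additive Gaussian noise. The amplitude, frequency, phase, sample rate and noise level, as well as the helper names three_phase_currents and pca_scores, are illustrative assumptions.

```python
import numpy as np

def three_phase_currents(i_max, f, phi, t):
    """Ideal balanced stator currents of Eq. (2.5); returns an (N, 3) array."""
    w = 2.0 * np.pi * f * t
    return np.column_stack([
        i_max * np.sin(w - phi),
        i_max * np.sin(w - 2.0 * np.pi / 3.0 - phi),
        i_max * np.sin(w - 4.0 * np.pi / 3.0 - phi),
    ])

def pca_scores(x, p):
    """Centre x, then return the mean vector, the loading matrix P (m x p),
    the score matrix T = X P of Eq. (2.4) and all singular values."""
    mu = x.mean(axis=0)
    xc = x - mu
    _, s, vt = np.linalg.svd(xc, full_matrices=False)
    loadings = vt[:p].T
    return mu, loadings, xc @ loadings, s

# Illustrative values only (assumptions, not taken from the experiments).
t = np.arange(4000) / 20_000.0                 # 0.2 s sampled at 20 kHz
clean = three_phase_currents(1.0, 50.0, 0.3, t)
rng = np.random.default_rng(0)
noisy = clean + rng.normal(0.0, 0.05, clean.shape)

_, P, T, s_clean = pca_scores(clean, p=2)
_, _, _, s_noisy = pca_scores(noisy, p=2)
print("singular values, ideal currents:", s_clean)
print("singular values, noisy currents:", s_noisy)
```

With the ideal currents the third singular value is zero up to rounding error, whereas with noise it grows with the noise standard deviation, in agreement with the remark on the ratio between the current amplitude and σ.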
Kernel Density Estimation
Consider $N$ independent and identically distributed (i.i.d.) random vectors $X =
[X_1, \ldots, X_N]$, where $X_i = [X_{i1}, \ldots, X_{id}]$, whose distribution function $F(x) =
P[X \le x]$ is absolutely continuous with unknown PDF $f(x)$. The estimated
density at $x$ is given by [20]:
$$f(x) = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{|H|^{d}}\,K\!\left(\frac{x - X_i}{|H|^{d}}\right). \tag{2.6}$$
In the present study a two-dimensional Gaussian kernel function is used, so $d$
is 2. A further simplification, which follows from restricting the kernel
bandwidth to $H = \{h^2 I : h > 0\}$, leads to the single-bandwidth estimator, so the
estimated density $f(x)$ becomes [21]:
$$f(x) = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{(2\pi h^2)^{1/2}}\,e^{-\frac{\|x - X_i\|^2}{2h^2}}, \tag{2.7}$$
where $x \in \mathbb{R}^d$ ranges over a grid whose size $n_{grid}$ is the number of points at which the
PDF is estimated. It is well known that the value of the bandwidth $h$ and the shape
of the kernel function are of critical importance [22]. In many computational-intelligence
methods that employ KDE, finding the appropriate bandwidth $h$ is the main
issue [22, 23, 24]. In the present work the plug-in bandwidth selection procedure,
based on the Asymptotic Mean Integrated Squared Error (AMISE), is used to choose
the bandwidth $h$ automatically [17]. In the proposed
algorithm, KDE is used to model a specific pattern for each motor condition:
the features of the current signals are mapped into the two-dimensional
principal components space, where they represent specific signatures of the motor conditions.
Kullback-Leibler Divergence
Given two continuous PDFs $f_1(x)$ and $f_2(x)$, a measure of “divergence” or
“distance” of $f_1(x)$ with respect to $f_2(x)$ is given by [25]:
$$I_{1:2}(X) = \int_{\mathbb{R}^d} f_1(x)\log\frac{f_1(x)}{f_2(x)}\,dx, \tag{2.8}$$
and that of $f_2(x)$ with respect to $f_1(x)$ is given by:
$$I_{2:1}(X) = \int_{\mathbb{R}^d} f_2(x)\log\frac{f_2(x)}{f_1(x)}\,dx. \tag{2.9}$$
Therefore the K-L divergence between $f_1(x)$ and $f_2(x)$ is:
$$J(f_1; f_2) = I_{1:2}(X) + I_{2:1}(X) = \int_{\mathbb{R}^d} \left(f_1(x) - f_2(x)\right)\log\frac{f_1(x)}{f_2(x)}\,dx. \tag{2.10}$$
The above equation is known as the symmetric K-L divergence, which represents
a non-negative measure between two PDFs. In the present work $d$ is 2
and a discrete form of the K-L divergence is adopted:
$$J(f_1; f_2) = \sum_{i=1}^{n_{grid}}\sum_{j=1}^{d} \left(f_1(x_{ij}) - f_2(x_{ij})\right)\log\frac{f_1(x_{ij})}{f_2(x_{ij})}. \tag{2.11}$$
The K-L divergence makes it possible to define a fault index: if $f_\Omega$ is the PDF,
estimated by KDE in the PCs space, of the oncoming current measurements, the present
motor condition is the one which minimizes the K-L divergence between $f_\Omega$ and
$f_i$, the $i$-th PDF related to each modeled motor condition:
$$c = \arg\min_{i} J(f_\Omega; f_i), \tag{2.12}$$
where $c$ is the classification output.
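The discrete divergence and the minimum-divergence rule can be illustrated with a short Python sketch. It is not the code used for the experiments: the PDFs are toy Gaussian bumps sampled on a common grid, the labels are invented, and the small constant eps that keeps the logarithm finite where a density vanishes is an assumption that does not appear in Eq. (2.11).

```python
import numpy as np

def symmetric_kl(f1, f2, eps=1e-12):
    """Discrete symmetric K-L divergence between two PDFs sampled on the same grid.

    eps is a hypothetical floor that keeps the logarithm finite where a
    density is numerically zero; it is not part of Eq. (2.11).
    """
    a = np.asarray(f1, dtype=float).ravel() + eps
    b = np.asarray(f2, dtype=float).ravel() + eps
    return float(np.sum((a - b) * np.log(a / b)))

def classify(f_new, models):
    """Return the label whose stored PDF minimises the divergence, as in Eq. (2.12)."""
    distances = {label: symmetric_kl(f_new, f) for label, f in models.items()}
    return min(distances, key=distances.get), distances

def gaussian_on_grid(cx, cy, n_side=32):
    """Toy isotropic Gaussian bump on a square grid, used only as test data."""
    g = np.linspace(-4.0, 4.0, n_side)
    xx, yy = np.meshgrid(g, g, indexing="ij")
    pdf = np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / 2.0)
    return pdf / pdf.sum()

# Invented motor conditions, each represented by a stored toy PDF.
models = {"healthy": gaussian_on_grid(0.0, 0.0),
          "fault_A": gaussian_on_grid(1.5, 0.0),
          "fault_B": gaussian_on_grid(0.0, -1.5)}
label, distances = classify(gaussian_on_grid(1.3, 0.1), models)
print(label, distances)
```

Each term of the sum is non-negative, because the difference of the two densities and the logarithm of their ratio always have the same sign, which is why the index can be used as a distance-like quantity.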
2.3.2 Developed Algorithm
The developed FDD procedure based on KDE consists of two stages: training
and FDD monitoring. In the first, one KDE model is computed from the feature
signals for each motor condition, in order to have one KDE model for the
healthy motor and one KDE model for each faulty case. The training steps
are summarized below:
T1. Stator current signals for each motor condition are acquired;
T2. Data are normalized;
T3. The PCA transform (2.4) is applied to the stator current signals, which are projected into the two-dimensional principal components space;
T4. The loading matrix P and the mean vector µ are stored;
T5. KDE is performed on the lower-dimensional principal components space
(2.4) using a grid of ngrid points and a bandwidth h for the Gaussian
kernel function (2.7);
T6. The PDFs estimated by KDE (2.7) are stored.
In the diagnosis stage, the previously obtained models are compared with the new
data and a statistical fault index is calculated. The diagnosis steps are summarized below:
D1. Stator current signals are acquired;
D2. Data are normalized;
D3. The matrix P and the vector µ, previously computed (T3), are applied to the signals;
D4. KDE is performed on the lower-dimensional principal components space
(2.4) using the same grid of ngrid points and the same bandwidth h used in the
training stage (T5);
D5. The symmetric K-L divergence (2.11) is computed between the PDF estimated
by KDE (2.7) from the acquired current signals and those stored
in the training stage (one for each condition) (T6);
D6. The diagnosis is evaluated using Eq. (2.12).
Faults are identified using Eq. (2.12), where fΩ is the PDF, estimated by
KDE in the PCs space, of the oncoming current measurements and fi is the
i-th PDF related to each modeled motor condition. The K-L divergence is used as
an input for the fault decision algorithm, which makes it possible to decide automatically on
the operating state and condition of the machine and to detect any abnormal
operating condition.
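The training steps T1-T6 and the diagnosis steps D1-D6 can be strung together as in the following Python sketch. It is a minimal illustration under several assumptions that the text does not specify: the PCA basis is computed on the pooled training data of all conditions, the normalization of step T2 is taken as zero mean and unit variance per phase, the bandwidth h is fixed by hand, the PDFs are normalized to unit sum on the grid, and the data are synthetic currents with an invented amplitude unbalance acting as the defect. All function names are hypothetical.

```python
import numpy as np

def kde_grid(scores, h, axis):
    """Single-bandwidth Gaussian KDE of 2-D scores on the square grid axis x axis."""
    gx, gy = np.meshgrid(axis, axis, indexing="ij")
    grid = np.column_stack([gx.ravel(), gy.ravel()])
    sq = ((grid[:, None, :] - scores[None, :, :]) ** 2).sum(axis=-1)
    pdf = np.exp(-sq / (2.0 * h ** 2)).sum(axis=1)
    return pdf / pdf.sum()      # sketch choice: normalise to unit sum on the grid

def symmetric_kl(f1, f2, eps=1e-12):
    """Discrete symmetric K-L divergence; eps (an assumption) avoids log(0)."""
    a, b = f1 + eps, f2 + eps
    return float(np.sum((a - b) * np.log(a / b)))

def train(signals, h, axis):
    """Steps T1-T6 for a dict {condition: (N, 3) current array}."""
    pooled = np.vstack(list(signals.values()))
    mu, scale = pooled.mean(axis=0), pooled.std(axis=0)              # T2
    _, _, vt = np.linalg.svd((pooled - mu) / scale, full_matrices=False)
    P = vt[:2].T                                                     # T3, T4
    pdfs = {c: kde_grid(((x - mu) / scale) @ P, h, axis)             # T5, T6
            for c, x in signals.items()}
    return {"mu": mu, "scale": scale, "P": P, "h": h, "axis": axis, "pdfs": pdfs}

def diagnose(model, x):
    """Steps D1-D6 for one new (N, 3) current array."""
    scores = ((x - model["mu"]) / model["scale"]) @ model["P"]       # D2, D3
    f_new = kde_grid(scores, model["h"], model["axis"])              # D4
    dist = {c: symmetric_kl(f_new, f) for c, f in model["pdfs"].items()}  # D5
    return min(dist, key=dist.get), dist                             # D6

def toy_currents(rng, unbalance=0.0, n=800):
    """Synthetic currents; `unbalance` rescales phase a (hypothetical defect)."""
    w = 2.0 * np.pi * 50.0 * np.arange(n) / 10_000.0
    x = np.column_stack([(1.0 + unbalance) * np.sin(w),
                         np.sin(w - 2.0 * np.pi / 3.0),
                         np.sin(w - 4.0 * np.pi / 3.0)])
    return x + rng.normal(0.0, 0.05, x.shape)

rng = np.random.default_rng(3)
axis = np.linspace(-3.0, 3.0, 32)
model = train({"healthy": toy_currents(rng),
               "unbalanced": toy_currents(rng, unbalance=0.5)}, h=0.2, axis=axis)
label_h, _ = diagnose(model, toy_currents(rng))
label_f, _ = diagnose(model, toy_currents(rng, unbalance=0.5))
print(label_h, label_f)
```

Storing the mean, the scale and the loading matrix in the trained model is what allows the diagnosis stage to project new signals into the same principal components space and to reuse the same grid and bandwidth, as required by steps D3 and D4.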
2.3.3 Case Study
In order to verify the effectiveness of the proposed methodology, several simulations
are carried out using one benchmark and some experiments are carried out using real
asynchronous motors. The benchmark uses a Time Stepping Coupled Finite
Element-State Space modeling approach to generate current signals for induction
motors, as described in [26]. The simulation dataset consists of twenty-one
different motor conditions: one healthy condition, ten broken bars
conditions and ten broken connectors conditions. Twenty time series are generated
for each motor condition. Each signal consists of 1500 samples. The
dataset can be downloaded from the UCR time series data mining archive [27]. The
characteristics of the three-phase induction motors are: input voltage 208 V,
frequency 60 Hz, number of rotor bars 34, pole number 2 and power 1.2 hp.
The sample rate is 33.3 kHz and the processed data, for each test, are related
to 0.5 s of acquisition. White noise with standard deviation σ = 0.2 is added to
the simulated current signals. The results are the average of 200 Monte Carlo
simulations in which the training and testing data sets are randomly changed.
The real tests are carried out using three-phase induction motors whose
parameters are: input voltage 380 V, frequency 50 Hz, power 0.75 kW, sample
rate 20 kHz. Two different faults are tested: wrong rotor and cracked rotor.
Wrong rotor refers to a non-compliant rotor; in particular, a single-phase rotor
is assembled instead of a three-phase rotor. Ten motors are tested for each faulty
case and for the healthy case. The acquisition time is 14 s. The processed
data, for each test, are related to 0.7 s of acquisition. In this case study the
results are the average of 2000 Monte Carlo simulations in which the training and
testing data sets are randomly changed. At the operating point of 2800 RPM, the
motors with a defective rotor installed show an efficiency drop of about 3%,
as shown in Fig. 2.1. It is therefore important to detect this defect in the
context of energy efficiency and QC.
2.3.4 Results
The proposed approach processes the three-phase stator currents in order to perform
defect detection and diagnosis as described in Section 2.3.2. The following
two subsections show the results related to the two cases described previously.
The simulation and experimental results are shown in Figs. 2.2 to 2.13.
The classification accuracy is considered as an index to evaluate the performance
of the proposed algorithm, as shown in Tables 2.1 and 2.2. This index
is obtained using the probability distributions of the K-L distances of each class,
approximated as normal distributions and estimated by Monte Carlo trials. The
simulations are carried out changing ngrid, the number of points at which the
PDF is estimated, and the acquisition time of the current signals in steady-state.
[Figure 2.1 plot: efficiency (%), from 20 to 80, versus speed (RPM), from 2000 to 3000.]
Figure 2.1: Efficiency characterization of the tested induction motors. The blue solid
line refers to the healthy motor, the red dashed line refers to the motor with
defective rotor.
Figs. 2.5, 2.6, 2.7, 2.11, 2.12 and 2.13 show the K-L distances for all Monte
Carlo trials. On each vertical line, the central dot is the mean and the horizontal
edges mark four times the standard deviation. The figures show the results
with ngrid = 64 × 64 points and an acquisition time, for the benchmark and the
real motors, equal to 0.4 s and 0.7 s respectively. This setting of the algorithm
parameters gives the best trade-off between classification accuracy and processing
time for these cases. For the real motor the algorithm
takes about 2.5 s to produce the classification output (Eq. 2.12): about 1 s to acquire
the current signals, of which 0.25 s in transient state and 0.7 s in steady-state,
and about 1.45 s to evaluate the PDF and the classification output (Eq. 2.12).
Setting ngrid = 32 × 32 points reduces the processing time to 1.5 s, but
decreases the classification accuracy, as shown in Tables 2.1 and 2.2. The tests
are also performed for both cases using the asymmetric K-L divergence (Eq. 2.9).
The results related to the symmetric K-L divergence, described in the
next subsections, are comparable to those achieved with the asymmetric K-L
divergence.
Broken Rotor Bars and Connectors Diagnosis
In the following, Figs. 2.2, 2.3 and 2.4 represent the patterns of the healthy
motor, the one broken bar and the one broken connector conditions; these figures show
how the PDFs, estimated by KDE in the principal components space, are used
as the specific patterns for the motor conditions. The simulation results, given
in Figs. 2.5, 2.6 and 2.7, show the fault diagnosis for broken rotor bars and
connectors, setting ngrid = 64 × 64 and the acquisition time of the current signals
in steady-state equal to 0.4 s. Fig. 2.5 shows the K-L divergence between the
PDFs, estimated by KDE, of all motor conditions (i.e. healthy, from one to
ten broken rotor bars and from one to ten broken connectors) and the PDF
estimated by KDE from the stator current signals of a healthy motor. The results
show that the minimum K-L distance corresponds exactly to the healthy condition. Fig. 2.6
shows the K-L divergence between all PDFs and the PDF estimated from stator
current signals affected by one broken rotor bar. In this case the graph shows
that the minimum K-L distance corresponds exactly to the broken bar condition. The last
graph, Fig. 2.7, shows the diagnosis of one broken connector. Even in this case the
K-L divergence detects and identifies the fault, that is, one broken connector.
In the Monte Carlo simulations, all fault types are diagnosed with 100% accuracy,
hence the K-L divergence figures for the other fault types are not reported.
Moreover, the classification accuracy is 100% with an acquisition time above 0.4 s
for each fault type, while below 0.4 s the classification accuracy decreases, as
shown in Table 2.1.
Figure 2.2: Interpolated PDFs of a finite element healthy motor in the two-dimensional principal components space estimated by KDE.
Real Induction Motors Diagnosis
Figs. 2.8, 2.9 and 2.10 represent the patterns of three real motor conditions:
healthy, cracked rotor and wrong rotor; these figures show that the PDFs,
estimated by KDE in the principal components space, are distinct and can therefore
be used as a specific pattern for each motor condition. The simulation
results given in Figs. 2.11, 2.12 and 2.13 show the fault diagnosis for cracked
and wrong rotors, setting ngrid = 64 × 64 and the acquisition time of the current
signals in steady-state equal to 0.7 s. Fig. 2.11 shows the K-L divergence between
the PDFs, estimated by KDE, of all motor conditions (i.e. healthy, cracked
and wrong rotors) and the PDF estimated by KDE from the stator current signals
of a healthy motor. The results show that the minimum K-L distance corresponds exactly to
the healthy condition. Fig. 2.12 shows the K-L divergence between all PDFs and
the PDF estimated from stator current signals where cracked rotors are diagnosed.
In this case the graph shows that the minimum K-L distance corresponds exactly to
the cracked rotor condition. The last graph, Fig. 2.13, shows the wrong rotor
diagnosis. Even in this case the K-L divergence detects and identifies the fault.
In the Monte Carlo simulations, all fault types are diagnosed with the accuracy shown
in Table 2.2. It can be noticed that the classification accuracy in the case of the
healthy motor is always 100%, therefore the algorithm is able to detect whether the
motors are healthy or whether there are some faults or defects. In Figs. 2.12 and 2.13
the blue lines of the motors with cracked and wrong rotors never overlap
the blue lines of the healthy motors so, in these tests, the algorithm never confuses
healthy motors with unhealthy ones.
Figure 2.3: Interpolated PDFs of a finite element motor with one broken bar
in the two-dimensional principal components space estimated by
KDE.
Figure 2.4: Interpolated PDFs of a finite element motor with one broken connector in the two-dimensional principal components space estimated by KDE.
[Figure 2.5 plot: Kullback-Leibler divergence (vertical axis) for each motor condition H, 1B to 10B and 1C to 10C (horizontal axis).]
Figure 2.5: K-L divergence in the case of a finite element healthy motor. The
blue dots are the mean, the blue bars are four times the standard
deviation and the red asterisks are the classification output. Label
H means healthy motor, labels 1-10B mean broken bars with the
relative number, labels 1-10C mean broken connectors with the relative
number.
[Figure 2.6 plot: Kullback-Leibler divergence (vertical axis) for each motor condition H, 1B to 10B and 1C to 10C (horizontal axis).]
Figure 2.6: K-L divergence in the case of a finite element motor with one broken bar. The blue dots are the mean, the blue bars are four
times the standard deviation and the red asterisks are the classification
output. Label H means healthy motor, labels 1-10B mean broken
bars with the relative number, labels 1-10C mean broken connectors
with the relative number.
2.4 Data-Driven Modeling and Diagnosis by
Vibration Signals
In electric motors, faults and defects are often correlated to the vibration signals,
which can be processed to model the motor behaviour by patterns that
represent the normal and abnormal motor conditions. Vibration analysis is
widely accepted as a tool to detect faults of a rotating machine as it is not
destructive, is reliable and permits continuous monitoring without stopping
the machine [28, 29, 30, 31, 32, 33]. In particular, it is possible to detect the
different faults that can arise in rotating machines by analyzing the vibration
power spectrum. The most common defects in these machines are unbalance and
misalignment. Unbalance mainly generates a radial component at the rotation
frequency and an axial component at the same frequency. Unbalance may be
caused by poor balancing, shaft inflexion (e.g. due to thermal expansion) and rotor
distortion by magnetic forces (a well known problem in high power electrical
machines). Misalignment generates a radial component at twice the rotation
frequency and an axial component at the rotation frequency. Misalignment may
be caused by misaligned couplings, misaligned bearings or a crooked shaft. High
misalignment can produce the sub-synchronous instability phenomenon. This effect
is due to the oil whirl and a decrease in the bearing load.
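The spectral signatures described above can be checked on a vibration record by reading the amplitude of the spectrum at the rotation frequency and at its second harmonic. The following Python sketch is only an illustration on synthetic radial signals: the sample rate, the rotation frequency, the amplitudes and the noise level are assumptions, and harmonic_amplitudes is a hypothetical helper rather than a routine taken from this work.

```python
import numpy as np

def harmonic_amplitudes(signal, fs, f_rot, orders=(1, 2)):
    """Amplitude of the vibration spectrum at integer multiples of f_rot.

    signal : 1-D vibration record, fs : sample rate in Hz,
    f_rot  : shaft rotation frequency in Hz.
    Returns {order: amplitude at the FFT bin nearest to order * f_rot}.
    """
    n = len(signal)
    window = np.hanning(n)
    # Scaling by 2 / sum(window) recovers the amplitude of a pure sinusoid.
    spectrum = np.abs(np.fft.rfft(signal * window)) * 2.0 / window.sum()
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return {k: float(spectrum[np.argmin(np.abs(freqs - k * f_rot))]) for k in orders}

# Synthetic radial vibration: all numbers below are illustrative assumptions.
fs, f_rot = 5_000.0, 46.0
t = np.arange(10_000) / fs
rng = np.random.default_rng(2)
unbalance_like = 1.0 * np.sin(2 * np.pi * f_rot * t) + 0.1 * rng.normal(size=t.size)
misalign_like = (0.2 * np.sin(2 * np.pi * f_rot * t)
                 + 0.8 * np.sin(2 * np.pi * 2 * f_rot * t)
                 + 0.1 * rng.normal(size=t.size))
u = harmonic_amplitudes(unbalance_like, fs, f_rot)
m = harmonic_amplitudes(misalign_like, fs, f_rot)
print("unbalance-like   :", u)
print("misalignment-like:", m)
```

In the unbalance-like record the component at the rotation frequency dominates, while in the misalignment-like record the component at twice the rotation frequency is the larger one, which mirrors the qualitative rule stated in the text.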
[Figure 2.7 plot: Kullback-Leibler divergence (vertical axis) for each motor condition H, 1B to 10B and 1C to 10C (horizontal axis).]
Figure 2.7: K-L divergence in the case of a finite element motor with one broken
connector. The blue dots are the mean, the blue bars are four
times the standard deviation and the red asterisks are the classification
output. Label H means healthy motor, labels 1-10B mean broken
bars with the relative number, labels 1-10C mean broken connectors
with the relative number.
Figure 2.8: Interpolated PDFs of real healthy motors in the two-dimensional
principal components space estimated by KDE.
Moreover, components of the spectrum above the rotation frequency are due
to bearings, events that occur many times per revolution, signal distortion and
mechanical nonlinearities (e.g. backlash and loose coupling). In the case of a cracked
shaft there is an increase of the vibrations at the rotation frequency and at the second
harmonic. Torsional vibrations are angular oscillations that overlap the normal
rotational motion. Due to the erosion of mechanical components in rotating
machines, torsional vibrations arise and produce the following effects:
• High gear noise.
• Joint damage.
• Accelerated wear and breakage of the gear.
• Deformation of keys.
• Misalignment of coupling hubs.
• Accelerated wear of the windings of electric motors.
• Fatigue failure of shaft.
• Erratic distribution of power.
Table 2.1: Classification accuracy (%) in the case of the finite element motor, changing
ngrid, the number of points at which the PDF is estimated, and the acquisition
time of the current signals in steady-state. Label H means healthy
motor, labels 1-10B mean broken bars with the relative number,
labels 1-10C mean broken connectors with the relative number.

ngrid                  128 × 128          64 × 64            32 × 32
Acquisition time (s)   0.3      0.15      0.3      0.15      0.3      0.15
H                      100      100       100      100       100      100
1B                     100      100       100      100       100      100
2B                     100      100       100      100       100      100
3B                     100      100       100      100       100      99.93
4B                     100      100       100      100       100      99.89
5B                     100      100       100      99.96     100      99.70
6B                     100      100       100      100       100      99.87
7B                     100      99.98     100      99.94     100      98.19
8B                     99.82    95.34     99.89    95.97     99.55    91.41
9B                     100      99.74     100      99.53     100      98.56
10B                    100      99.43     100      99.32     99.99    99.43
1C                     100      100       100      100       100      99.99
2C                     99.98    96.89     99.88    95.99     99.56    91.01
3C                     99.88    91.71     99.72    95.74     98.48    93.75
4C                     99.79    97.61     99.84    98.10     99.93    96.87
5C                     99.98    98.96     99.99    98.49     99.96    96.37
6C                     100      100       100      100       100      99.98
7C                     100      100       100      100       100      99.94
8C                     100      100       100      100       100      99.88
9C                     100      99.99     100      100       100      99.47
10C                    100      99.89     100      99.77     100      96.95
Mean                   99.97    99.03     99.97    99.18     99.88    98.15
Non-integer multiples of the shaft speed may arise from belt drives, gears, etc. Often,
a fault arising in a rotating machine increases the vibration amplitude
associated with the fault. For instance, if a fault occurs in gears, the vibration
amplitude of a whole family of sidebands increases in a specific region of the
frequency spectrum, while a ball-bearing fault is characterized by an increment
in the amplitude of a family of harmonics. In Machine Vibration Signature
Analysis (MVSA), the Fourier transform is used to determine the vibration
spectrum [34], and the signature at different frequencies is identified and compared
with the initial measurement to detect faults in the machine. The shortcoming
of this approach is that Fourier analysis is limited to stationary
signals, while vibrations are not stationary by nature.
Figure 2.9: Interpolated PDFs of real motors with cracked rotor in the two-dimensional principal components space estimated by KDE.
Figure 2.10: Interpolated PDFs of real motors with wrong rotor in the two-dimensional principal components space estimated by KDE.
Figure 2.11: K-L divergence in the case of real healthy motors. The blue dots
are the mean, the blue bars are four times the standard deviation
and the red asterisks are the classification output.
Figure 2.12: K-L divergence in the case of real motors with cracked rotor. The
blue dots are the mean, the blue bars are four times the standard
deviation and the red asterisks are the classification output.
In order to model the vibration signal, Multi-Scale Principal Component
Analysis (MSPCA) is taken into account [35]. MSPCA deals with processes
that operate at different scales, and have contributions from:
• events occurring at different localizations in time and frequency;
Figure 2.13: K-L divergence in the case of real motors with wrong rotor. The
blue dots are the mean, the blue bars are four times the standard
deviation and the red asterisks are the classification output.
Table 2.2: Classification accuracy (%) in the case of real motors, changing ngrid, the number of points in which the PDF is estimated, and the current signals acquisition time in steady-state. Motor conditions are: healthy, motor with cracked rotor and motor with wrong rotor.

ngrid                  128 × 128                64 × 64                  32 × 32
Acquisition time (s)   0.7     0.5     0.3      0.7     0.5     0.3      0.7     0.5     0.3
Healthy                100     100     100      100     100     100      100     100     100
Cracked rotor          98.82   95.08   77.08    99.00   94.74   86.54    98.29   94.02   81.45
Wrong rotor            98.97   99.47   99.49    99.18   99.36   98.56    99.85   99.41   99.21
Mean                   99.26   98.18   92.19    99.39   98.03   95.03    99.38   97.81   93.55
• stochastic processes whose energy or power spectrum changes with time
and/or frequency;
• variables measured at different sampling rate or containing missing data.
MSPCA transforms the process data at different scales by means of the Wavelet
Transform (WT). The information at each scale is captured by PCA
modeling. These models, which represent the process conditions, can be used
to identify each fault and defect. The WT is appropriate for extracting process
information from vibration data since wavelets, with their time-frequency
localization and multi-resolution property, provide a useful framework
for multi-scale data representation [36].
To isolate the defects, a KDE algorithm is applied to the PCA residuals, and
thresholds are computed for each sensor signal to determine whether, for each wavelet
matrix, the signal is involved in the defect. KDE is widely
recognized as a robust method for numerically estimating the PDF of data,
and it is used in particular where the Gaussian assumption does not hold [37]. Fault isolation is carried out by contribution
plots, which take the spatial correlations into account. This approach is based on
quantifying the contribution of each process variable to the individual scores of
the PCA representation and, for each process variable, summing the contributions only of those variables responsible for the out-of-control status. Diagnosis
can be performed using the contribution plots because they represent the signatures of the conditions of the rotating electrical machines. In the QC context, a
supervised classifier, taking the PCA contributions as input, is used to diagnose
each motor defect. The results show that the signatures identified by the PCA
contributions are unique for each considered defect.
Principal Component Analysis
PCA is introduced in Section 2.3.1; here an improved PCA fault detection
index is described. A deviation of the new data sample X from the normal
correlation could change the projections onto the subspaces S_p or S_r.
Consequently, the magnitude of either X̃ or X̂ could increase over the values
obtained with normal data. The Squared Prediction Error (SPE) is a statistic
that measures the lack of fit of a model to data. The SPE statistic indicates
the difference, or residual, between a sample and its projection onto the p
components retained in the model. The exact description of the distribution
of SPE is given in [38]:
\[ SPE \equiv \|\tilde{X}\|^2 = \|X (I - P P^T)\|^2. \tag{2.13} \]
The process is considered faultless if:
\[ SPE \le \delta^2, \tag{2.14} \]
where δ 2 is a confidence limit for SPE. A confidence limit expression for SPE,
when x follows a normal distribution, is developed in [39, 36, 34]. The detectability of a fault is given by conditions proven in [40] and recalled in the
following. Defining:
\[ X = X^* + f\,\Xi, \tag{2.15} \]
where the sample vector for normal operating conditions is denoted by $X^*$, $f$
represents the magnitude of the fault and $\Xi$ is a fault direction vector. Necessary and sufficient conditions for detectability are:
• $\tilde{\Xi} = (I - P P^T)\,\Xi \neq 0$, with $\tilde{\Xi}$ the projection of $\Xi$ on the residual
subspace;
• $\|\tilde{f}\| = \|(I - P P^T) f\| > 2\delta$, with $\tilde{f}$ the projection of $f$ on the residual
subspace.
The SPE index has two main drawbacks for fault detection: the first
is that its threshold is estimated under the assumption of a normal distribution;
the second is that the SPE is a weighted sum, with unit
coefficients, of the squared residuals $\tilde{X}_i^2$. To improve fault detection, both
drawbacks are addressed by considering the process faultless if,
for each i:
\[ \tilde{X}_i^2 \le \delta_i, \qquad i = 1, \dots, m, \tag{2.16} \]
where $\delta_i$ is a confidence limit for $\tilde{X}_i^2$. To estimate the confidence limit $\delta_i$ even
when the normality assumption on $\tilde{X}_i^2$ is not valid, the PDF is estimated
directly from $\tilde{X}_i^2$ through a nonparametric approach. In [37, 41, 42],
KDE is considered because it is a well-established nonparametric approach to
estimate the PDF of statistical signals and evaluate the control limits. Assume
y is a random variable and its density function is denoted by p(y). This means
that:
\[ P(y < k) = \int_{-\infty}^{k} p(y)\, dy. \tag{2.17} \]
Hence, by knowing p(y), an appropriate control limit can be determined for a
specific confidence bound α, using Eq. (2.17). Replacing p(y) in Eq. (2.17)
with the estimate of the probability density function of $\tilde{X}_i^2$, denoted $\hat{p}(\tilde{X}_i^2)$,
the control limits are estimated by:
\[ \int_{-\infty}^{\delta_i} \hat{p}(\tilde{X}_i^2)\, d\tilde{X}_i^2 = \alpha. \tag{2.18} \]
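As an illustration of Eq. (2.18), the following minimal sketch estimates a control limit from squared residuals using a Gaussian-kernel KDE and bisection. The Gaussian kernel, the rule-of-thumb bandwidth, the function names and the synthetic data are all assumptions made for this example; the text does not specify the kernel or bandwidth actually used.

```python
import math
import random


def kde_cdf(k, samples, h):
    """CDF at k of a Gaussian-kernel density estimate with bandwidth h."""
    return sum(0.5 * (1.0 + math.erf((k - s) / (h * math.sqrt(2.0))))
               for s in samples) / len(samples)


def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth (assumed here; not taken from the text)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * std * n ** (-0.2)


def control_limit(samples, alpha, tol=1e-9):
    """Find delta such that the estimated CDF at delta equals alpha (Eq. 2.18)."""
    h = rule_of_thumb_bandwidth(samples)
    lo = min(samples) - 10.0 * h  # estimated CDF is practically 0 here
    hi = max(samples) + 10.0 * h  # estimated CDF is practically 1 here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


random.seed(0)
# Hypothetical squared residuals of one variable under normal operation.
residuals_sq = [random.gauss(0.0, 1.0) ** 2 for _ in range(500)]
delta_i = control_limit(residuals_sq, alpha=0.99)
```

Because the estimated CDF is monotone in the limit, bisection is sufficient here; no assumption on the shape of the residual distribution is needed, which is the point of replacing the normal-theory threshold.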
PCA and Eqs. 2.16 and 2.18 are applied to the Tennessee Eastman Process
(TEP) [43] in order to show the advantages of these solutions. The process
consists of five major unit operations: a reactor, a product condenser, a vapor-liquid separator, a recycle compressor, and a product stripper. The process
has 12 manipulated variables, 22 continuous process measurements, and 19
compositions sampled less frequently. In this study, a total of 33 variables suggested
by [44] are used for monitoring. Detailed descriptions of the selected variables
are given in [44]. All composition measurements are excluded because
they are hard to measure on-line in practice. A sampling interval of 3 min was
used to collect the simulated data for the training and testing sets. Both the
training and testing data sets for each fault are composed of 960 samples (48 hours).
All faults in the test data set were induced from sample 160 (8
hours). Detailed fault information is presented in [44]. Since the TEP
is open-loop unstable, after approximately 2 hours (simulated time) the reactor
pressure exceeds the upper bound of 3000 kPa and the simulation shuts
down, so the decentralized control strategy described in [45] is used. The TEP
model can be downloaded from the website address [46]. Table 2.3 shows the
results for the faults that are more difficult to detect. The results point out that,
for faults that are hardly noticeable, the fault detection index (Eq. 2.18)
detects these faults more accurately than the SPE index.

Table 2.3: TEP results with improved detection index

Indices     SPE (%)    X̃_i^2 (%)
Fault 3     2.33       10.42
Fault 5     0.63       1.07
Fault 9     4.07       12.56
Fault 12    23.84      30.65
Fault 15    0.7        1.55
Fault 16    0.47       0.64
Fault 18    71.42      72.72
Mean        14.78      18.52
Fault isolation and diagnosis are performed by the PCA contributions: defining the new observation vector $x_j \in R^m$, the total contribution of the ith
process variable $X_i$ is
\[ CONT_i = \sum_{j=1}^{N} \tilde{x}_{ij}^2, \qquad i = 1, \dots, m. \tag{2.19} \]
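The residual of Eq. (2.13) and the contributions of Eq. (2.19) can be sketched as follows. This is only an illustrative example: the loading matrix is given directly instead of being computed by PCA, its columns are assumed orthonormal, and the function names and the toy data are hypothetical.

```python
import math


def residual(x, loadings):
    """Return x(I - P P^T) for a row vector x.

    `loadings` is the list of columns of P, assumed orthonormal.
    """
    r = list(x)
    for p in loadings:
        score = sum(xi * pi for xi, pi in zip(x, p))  # t = x p
        r = [ri - score * pi for ri, pi in zip(r, p)]
    return r


def spe(x, loadings):
    """Squared prediction error of one sample (Eq. 2.13)."""
    return sum(ri ** 2 for ri in residual(x, loadings))


def contributions(samples, loadings):
    """CONT_i: sum over the samples of the squared residual of variable i (Eq. 2.19)."""
    cont = [0.0] * len(samples[0])
    for x in samples:
        for i, ri in enumerate(residual(x, loadings)):
            cont[i] += ri ** 2
    return cont


# Toy model with m = 3 variables and one retained component (hypothetical).
s = 1.0 / math.sqrt(2.0)
P_columns = [[s, s, 0.0]]
batch = [[1.0, 1.0, 0.0],   # lies in the model subspace: zero residual
         [1.0, 1.0, 2.0]]   # third variable deviates from the model
cont = contributions(batch, P_columns)
```

In this toy case only the third variable carries a non-zero contribution, which is how a contribution plot points to the sensor involved in a fault.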
Wavelet Transform
The Wavelet Transform (WT) is defined as the integral of the signal f(t) multiplied by scaled and shifted versions of a basic wavelet function φ(t), which is a real-valued
function whose Fourier transform satisfies the admissibility criteria [47].
The wavelet transform c(·, ·) of a signal f(t) is then defined as:
\[ c(a, b) = \int_{R} f(t)\, \frac{1}{\sqrt{a}}\, \phi\!\left(\frac{t - b}{a}\right) dt, \qquad a \in R^+ - \{0\}, \quad b \in R, \tag{2.20} \]
where a is the so-called scaling parameter and b is the time localization parameter.
Both a and b can be continuous or discrete variables. Multiplying each coefficient by an appropriately scaled and shifted wavelet yields the constituent
wavelets of the original signal. For signals of finite energy, continuous wavelet
synthesis provides the reconstruction formula:
\[ f(t) = \frac{1}{K_\phi} \int_{R} \int_{R^+} c(a, b)\, \frac{1}{\sqrt{a}}\, \phi\!\left(\frac{t - b}{a}\right) \frac{da}{a^2}\, db \tag{2.21} \]
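where:
\[ K_\phi = \int_{-\infty}^{+\infty} \frac{|\hat{\phi}(\xi)|^2}{|\xi|}\, d\xi \tag{2.22} \]
denotes a (wavelet-specific) normalization parameter in which $\hat{\phi}$ is the Fourier
transform of φ. Mother wavelets must satisfy the following properties:
\[ \int_{-\infty}^{+\infty} |\phi(t)|\, dt < \infty, \qquad \int_{-\infty}^{+\infty} |\phi(t)|^2\, dt = 1, \qquad \int_{-\infty}^{+\infty} \phi(t)\, dt = 0. \tag{2.23} \]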
To avoid intractable computations when operating at every scale of the Continuous WT (CWT), scales and positions can be chosen based on powers of two, i.e.
dyadic scales and positions. The Discrete WT (DWT) analysis is more efficient
and just as accurate [47, 48]. In this scheme a and b are given by:
\[ a = a_0^j, \qquad b = b_0 a_0^j k, \qquad (j, k) \in Z^2, \quad Z := \{0, \pm 1, \pm 2, \cdots\}. \tag{2.24} \]
The variables $a_0$ and $b_0$ are fixed constants that are set to $a_0 = 2$ and $b_0 = 1$ [48].
The discrete wavelet analysis can be described mathematically as:
\[ c(a, b) = c(j, k) = \sum_{n \in Z^+} f(n)\, \phi_{j,k}(n), \qquad a = 2^j, \quad b = 2^j k, \quad j \in Z, \; k \in Z, \tag{2.25} \]
where the simplified notation $f(n) = f(n \cdot t_c)$, $n \in Z^+$, denotes the continuous-time
signal f(t) sampled with sampling time $t_c$. The inverse
transform, also called discrete synthesis, is defined as:
\[ f(n) = \sum_{j \in Z} \sum_{k \in Z} c(j, k)\, \phi_{j,k}(n). \tag{2.26} \]
In [49], a signal is decomposed into various scales with different time and frequency resolutions; this algorithm is known as the multiresolution signal decomposition. Defining:
\[ \phi_{j,k} = 2^{-j/2} \phi\left(2^{-j} t - k\right), \quad \psi_{j,k} = 2^{-j/2} \psi\left(2^{-j} t - k\right), \quad V_j = \mathrm{span}\{\phi_{j,k}, k \in Z\}, \quad W_j = \mathrm{span}\{\psi_{j,k}, k \in Z\}, \quad (j, k) \in Z^2, \tag{2.27} \]
the scaling function $\phi_{j,k}$ forms an orthonormal basis of $V_j$ and the orthogonal
wavelet $\psi_{j,k}$ forms an orthonormal basis of $W_j$. In [48] it is
shown that:
\[ V_j \perp W_j, \qquad V_m = W_{m+1} \oplus V_{m+1}, \qquad V_m, W_m \subset L^2(R). \tag{2.28} \]
Defining f(n) = f as an element of $V_0 = W_1 \oplus V_1$, f can be decomposed into its
components along $V_1$ and $W_1$:
\[ f = P_1 f + Q_1 f, \tag{2.29} \]
with $P_j$ the orthogonal projection onto $V_j$ and $Q_j$ the orthogonal projection
onto $W_j$. With $j \ge 1$ and $f(n) = c_n^0$, it follows that:
\[ f(n) = \sum_{k \in Z} c_k^1 \phi_{1,k}(n) + \sum_{k \in Z} d_k^1 \psi_{1,k}(n), \]
\[ c_k^1 = \sum_{n \in Z} h(n - 2k)\, c_n^0, \qquad d_k^1 = \sum_{n \in Z} g(n - 2k)\, c_n^0, \]
\[ h(n - 2k) = \langle \phi_{1,k}(n), \phi_{0,n}(n) \rangle, \qquad g(n - 2k) = \langle \psi_{1,k}(n), \phi_{0,n}(n) \rangle, \qquad (k, n) \in Z^2, \tag{2.30} \]
where the terms g and h are high-pass and low-pass filter coefficients derived
from the bases ψ and φ. Considering a dataset of N samples (n = 1, . . . , N),
and introducing a vector notation, $c_k^1$ and $d_k^1$ can be rewritten as [48]:
\[ c_1 = H c_0, \qquad d_1 = G c_0, \tag{2.31} \]
with
\[ H = \begin{bmatrix} h(0) & h(1) & \cdots & h(N) \\ h(-2) & h(-1) & \cdots & h(N-2) \\ \vdots & \vdots & \cdots & \vdots \\ h(-2k) & h(1-2k) & \cdots & h(N-2k) \end{bmatrix}, \tag{2.32} \]
\[ G = \begin{bmatrix} g(0) & g(1) & \cdots & g(N) \\ g(-2) & g(-1) & \cdots & g(N-2) \\ \vdots & \vdots & \cdots & \vdots \\ g(-2k) & g(1-2k) & \cdots & g(N-2k) \end{bmatrix}. \tag{2.33} \]
The procedure can be iterated, obtaining:
\[ c_j = H c_{j-1}, \qquad d_j = G c_{j-1}. \tag{2.34} \]
Then:
\[ c_j = H_j c_0, \qquad d_j = G_j c_0, \tag{2.35} \]
where $H_j$ is obtained by applying the H filter j times, and $G_j$ is obtained by
applying the H filter j − 1 times and the G filter once. Hence any signal may
be decomposed into its contributions in different regions of the time-frequency
space by projection on the corresponding wavelet basis function. The lowest
frequency content of the signal is represented on a set of scaling functions.
The number of wavelet and scaling function coefficients decreases dyadically at
coarser scales due to the dyadic discretization of the dilation and translation
parameters. The algorithms for computing the wavelet decomposition are based
on representing the projection of the signal on the corresponding basis function
as a filtering operation [49]. Convolution with the filter H represents projection
on the scaling function, and convolution with the filter G represents projection
on a wavelet. Thus, the signal f(n) is decomposed at different scales into detail
scale matrices and approximation scale matrices. Denoting by L the number of decomposition
levels, the approximation scale $A_L$ and the detail scales $D_j$, j = 1, ..., L, are
composed of the $c_j$ and $d_j$ of each of the m variables of the data matrix X:
\[ A_j = [c_{j1}, c_{j2}, \dots, c_{jm}], \qquad D_j = [d_{j1}, d_{j2}, \dots, d_{jm}], \qquad j = 1, \dots, L. \tag{2.36} \]
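The filter-bank recursion of Eq. (2.34) can be illustrated with a short sketch. The Haar filters are used here only because they are the simplest orthonormal pair; this is an assumption of the example and not the wavelet used in the experiments. The function names and the test signal are hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)
H_HAAR = (S, S)    # low-pass (scaling) filter h, Haar case
G_HAAR = (S, -S)   # high-pass (wavelet) filter g, Haar case


def analysis_step(c_prev):
    """One decomposition level: filter c_{j-1} with h and g, keep every second output."""
    starts = range(0, len(c_prev) - 1, 2)
    c = [H_HAAR[0] * c_prev[n] + H_HAAR[1] * c_prev[n + 1] for n in starts]
    d = [G_HAAR[0] * c_prev[n] + G_HAAR[1] * c_prev[n + 1] for n in starts]
    return c, d


def dwt(signal, levels):
    """Return (c_L, [d_1, ..., d_L]) for a signal whose length is a multiple of 2**levels."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    c, details = list(signal), []
    for _ in range(levels):
        c, d = analysis_step(c)  # both outputs are computed from c_{j-1}
        details.append(d)
    return c, details


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
c3, details = dwt(signal, levels=3)
```

The number of coefficients halves at each level, and for an orthonormal filter pair the energy of the signal equals the total energy of the approximation and detail coefficients.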
To select the wavelet decomposition level L, the number of decomposition levels
is chosen so that the approximation signal
$A_L$ has an associated frequency band whose upper limit is below the fundamental frequency f, as described by the following condition:
\[ 2^{-(L+1)} f_s < f, \tag{2.37} \]
where $f_s$ is the sampling frequency of the signals and f is the impeller rotational frequency [50]. From this condition, the decomposition level of the
approximation signal is the integer L given by:
\[ L = \lceil \log_2 (f_s / f) + 1 \rceil. \tag{2.38} \]
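As a small worked example of Eqs. (2.37) and (2.38), the sketch below evaluates the level for a sampling frequency of 300 Hz and a rotation frequency of 75 Hz, the values quoted in the results of this chapter. The function names are hypothetical.

```python
import math


def decomposition_level(fs, f):
    """Decomposition level L as written in Eq. (2.38)."""
    return math.ceil(math.log2(fs / f) + 1)


def band_limit_below_fundamental(fs, f, level):
    """Check the condition of Eq. (2.37): 2^-(L+1) * fs < f."""
    return 2.0 ** -(level + 1) * fs < f


L = decomposition_level(fs=300.0, f=75.0)
ok = band_limit_below_fundamental(300.0, 75.0, L)
```

With these values the level is L = 3, and the upper limit of the approximation band, 300/16 = 18.75 Hz, lies below the 75 Hz rotation frequency.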
2.4.2 MSPCA Formulation
WT and PCA can be combined to extract both the correlation within each sensor
and the cross-correlation among sensors; in this way it is possible to extract the maximum information from multivariate sensor data. MSPCA can be used as a tool
for fault detection and diagnosis by means of statistical indexes. In particular,
faults are detected by using Eqs. 2.16 and 2.18 and the isolation is conducted by
the contributions method (Eq. 2.19). In this way it is possible to detect which
sensor is most affected by the fault [36]. Two fundamental theorems exist for the
MSPCA formulation; they state that the PCA principles remain unchanged under the wavelet transformation. These theorems are useful for applying the MSPCA
methodology [35].
Theorem 2.4.1 Let $W = [H_L', G_L', G_{L-1}', \cdots, G_1']' \in R^{N \times N}$ be the orthonormal matrix representing the orthonormal wavelet transformation operator containing the filter coefficients [35]. The principal component loadings obtained by
the PCA of X and WX are identical, whereas the principal component scores
of WX are the wavelet transform of the scores of X.
Theorem 2.4.2 MSPCA reduces to conventional PCA if neither the principal
components nor the wavelet coefficients at any scale are eliminated.
The developed MSPCA-based FDD procedure consists of two stages. In the first,
the training step, the faultless data are processed and a model of these data is
built. The MSPCA training steps are summarized below:
T1. Data are preprocessed and outlier replacement algorithm is used [18, 51];
T2. The Wavelet analysis is used, to refine the data, with a level of detail L
which is chosen by Eq. (2.38);
T3. Normalize mean and standard deviation of detail and approximation matrices and apply PCA to the approximation matrix AL , of order L, and
to the L detail matrices Dj , where j = 1, ...L;
T4. The PCA transformation matrix P and the signal covariance matrix S
are computed for each approximation and detail matrices;
T5. The X̃i signals (Eq. 2.13) are computed, for each wavelet matrix;
T6. The δi thresholds are computed, for each detail matrix and for the approximation matrix of order L, using the KDE algorithm (Eq. 2.18) and
a confidence bound α;
In the second phase, the diagnosis step, the model previously obtained is compared on-line with the new data and a statistical index of failure is calculated.
The MSPCA diagnosis steps are summarized below:
D1. Steps T1 to T5 (all training steps except the threshold computation)
are repeated for each new dataset; the data are standardized as in the
training step T3, and the PCA and $\tilde{X}_i$ signals are computed using the P
and S matrices obtained in the training step;
D2. If any of the $\tilde{X}_i$ signals is over the thresholds $\delta_i$, the fault is detected
and the diagnosis is performed by the $\|\tilde{X}_i\|$ contributions; otherwise the next
data set is analyzed (return to D1);
D3. Compute all the residual contributions $\|\tilde{X}_i\|$, for each sensor, for all
detail and approximation matrices, and diagnose the fault type.
2.4.3 Developed Experimental Setup
The diagnosis system has been prototyped on single-phase electric motors with
their respective pallets. The motors have a power of 45 W at 4500 RPM, with the
impeller mounted on the rotor shaft. Fig. 2.14 shows the motor mounted on
its pallet and the accelerometers installed along the radial and axial directions.
Laboratory equipment is used to make measurements and to validate the
procedure.
In rotating machines vibrations arise along two main directions, axial and
radial, so two accelerometers along these axes are used. The NI (National
Instruments) CompactRIO (NI cRIO) 9004 is used for data acquisition [52].
The NI 9233 module of the NI cRIO 9004 is used to acquire the signals from the accelerometers. It is characterized by a 24-bit (delta-sigma) resolution analog-to-digital
converter (ADC) with a sampling frequency up to 50 kS/s. An oversampling
frequency is used for the ADC and then a digital band-pass filter is
applied. The filter band has been designed on the basis of an accurate study of
the considered machines. The motor current is measured by a current sensor; the
signal is acquired by an NI 9215 module and is processed in the same way as the
accelerometer signals. A scalable system has been developed where data are stored
and different MSPCA settings have been tested and prototyped. The scalable
system makes it possible to monitor many machines with simultaneous data acquisition
and fault detection analysis.
Figure 2.14: Single-phase 25W motor for kitchen hoods mounted on pallet;
PCB accelerometers are installed on pallet.
2.4.4 Results
The approach described in section 2.4.2 has been tested using a
Daubechies mother wavelet of order 15, denoted db15 (the
kernel φ of section 2.4.1). The detail level L is chosen considering the motor
rotation frequency, 75 Hz. The vibration analysis for this type of rotating machine highlights that frequencies above twice the rotation frequency are not
of interest, so a sampling frequency of 300 Hz is chosen. Therefore, applying
Eq. (2.38), the level of detail obtained is L = 3. The dimension p of the principal components subspace is chosen by Kaiser's rule [18]. Based on the
results stated in [51], a training procedure with different outlier replacement
techniques is proposed.
The diagnosis algorithm is performed offline; at least $2^L$ samples are needed
for the DWT analysis. Incoming batch data samples are then fed into the
MSPCA model and the PCA residual contributions are computed for the matrices $D_j$, j = 1, . . . , L, and $A_L$. In the following, these matrices are called scale
matrices, and they are compared with the respective thresholds. When, at
any scale, the number of residual contribution samples over the thresholds is
greater than n · α · γ, where α is the significance level used for the threshold $\delta_i$
calculation (stated in section 2.4.2), n is the number of samples in the batch and
γ is a corrective index (fixed equal to 5), a defect is detected and the motor is
considered defective instead of healthy.
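The detection rule above can be sketched as follows. This is a minimal illustration in which `alpha` is taken as the significance level (the expected fraction of samples above the threshold in healthy conditions); the function name and the numerical values are hypothetical.

```python
def defect_detected(contributions, threshold, alpha, gamma=5.0):
    """Flag a defect when the exceedance count is greater than n * alpha * gamma.

    contributions: residual contribution samples of one variable at one scale
    threshold:     control limit delta_i obtained in training
    alpha:         significance level used for the threshold (assumed meaning)
    gamma:         corrective index (5 in the text)
    """
    n = len(contributions)
    exceedances = sum(1 for c in contributions if c > threshold)
    return exceedances > n * alpha * gamma


# Hypothetical batch of 100 samples, threshold 1.0, alpha = 0.01:
# more than 100 * 0.01 * 5 = 5 exceedances are needed to flag a defect.
healthy_batch = [0.2] * 97 + [1.5] * 3
faulty_batch = [0.2] * 90 + [1.5] * 10
healthy_flag = defect_detected(healthy_batch, threshold=1.0, alpha=0.01)
faulty_flag = defect_detected(faulty_batch, threshold=1.0, alpha=0.01)
```

The corrective index makes the rule tolerant of the exceedances expected by chance in healthy data, so that isolated samples over the threshold do not trigger a detection.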
Once a defect is detected, the isolation and diagnosis tests are performed.
At this step the PCA contributions are computed for each scale matrix that
violates the $\delta_i$ limit; in other words, the residual contributions are computed
for each scale at which a defect has been detected. Fault isolation makes it possible to identify
which sensors are involved in the defect. By using several scales for the DWT
analysis, it is possible to cluster the residual contributions of each scale and
define a unique signature of the motor defect, as in classical MVSA. More
in detail, the signature of each defect is given by the contributions of each
variable at each scale. This signature is used to diagnose the defect.
[Figure 2.15 plot. Panel title: A3 matrix; vertical axis: contribution weights; horizontal axis: Axial, Radial, Lem.]
Figure 2.15: Impeller with backlash: contribution weights of approximation
matrix A3 .
Fig. 2.15 shows the impeller backlash defect isolation for the A3 scale matrix;
in particular, the thresholds are the contribution levels in the case of a healthy
motor. Moreover, the current sensor gives the main contribution to this defect,
as highlighted in Table 2.4.
A misalignment defect involves all the scale matrices, as shown in Figs. 2.16,
2.17, 2.18 and 2.19. When the impeller is misaligned, the effect propagates along
the rotating machine. Figs. 2.16, 2.17, 2.18 and 2.19 confirm this result, which is also
highlighted in Table 2.4: the radial contribution is higher at high frequencies,
while the axial contribution is greater in the other scale matrices.
The unbalanced impeller defect is detected and isolated in Fig. 2.20 and
classified in Table 2.4, where it is diagnosed as an unbalance defect. In this defect
the radial contribution is the main one, as stated in the vibration analysis
(section 2.4).
[Figure 2.16 plot. Panel title: A3 matrix; vertical axis: logarithmic, 10^0 to 10^6; horizontal axis: Axial, Radial, Lem.]
Figure 2.16: Misaligned impeller: contribution weights of approximation matrix A3.
[Figure 2.17 plot. Panel title: D1 matrix; vertical axis: logarithmic, 10^0 to 10^4; horizontal axis: Axial, Radial, Lem.]
Figure 2.17: Misaligned impeller: contribution weights of D3 scale matrix.
The following experiments show the application of the algorithm to
data acquired from accelerometers mounted on kitchen hoods. In this case
study, two accelerometers are placed by a robotic arm on the top of each
kitchen hood when it reaches the quality control bench of the production line.
The motor rotation frequency is 50 Hz and the sampling frequency
is 20 kHz. Figs. 2.21 and 2.22 show the contribution plot of each sensor for the approximation
matrix A7. In the production line, some kitchen hoods could
be assembled with an unbalanced impeller; this defect causes higher energy
consumption and more noise than in faultless kitchen hoods. These figures show
that the contributions for the first 11 kitchen hoods, which are faultless, are
lower than those for the last 8, which are assembled with defective motors.
[Plots. Panel titles: D2 matrix, D3 matrix; vertical axis: logarithmic, 10^0 to 10^4; horizontal axis: Axial, Radial, Lem.]
[Figure 2.20 plot. Panel title: A3 matrix; horizontal axis: Axial, Radial, Lem.]
Figure 2.20: Unbalanced impeller: contribution weights of approximation matrix A3 .
Table 2.4: FDD results for motors installed in kitchen hoods.

            Scale   Backlash   Misalignment   Broken
Detection   A3      X          X              X
            D1      -          X              -
            D2      -          X              -
            D3      -          X              -
Isolation   A3      Lem        Ax.            Rad.
            D1      -          Rad.           -
            D2      -          Ax.            -
            D3      -          Ax.            -
[Figure 2.21 plot. Panel title: Approximation matrix A7; vertical axis: Contribution Accelerometer 1; horizontal axis: motor number.]
Figure 2.21: Contribution weights of A7 scale matrix for the first accelerometer
in the case of unbalanced impeller.
[Figure 2.22 plot. Panel title: Approximation matrix A7; vertical axis: Contribution Accelerometer 2; horizontal axis: motor number.]
Figure 2.22: Contribution weights of A7 scale matrix for the second accelerometer
in the case of unbalanced impeller.
Chapter 3
Modeling of Complex Systems with
FDI and Prognosis Applications
3.1 Introduction
Complex systems typically belong to different classes, such as natural
(e.g., biological evolution), financial (e.g., stock markets and economies), social
and industrial systems. Complex systems consist of a large number of nonlinearly interacting components, often called agents, which display collective
behaviour that does not follow trivially from the behaviours of the individual
parts. These systems are open: they exchange information with the environment and constantly modify their internal structure and activity patterns in
the self-organization process. Complex systems share several structural and
behavioural characteristics. In the case of complex industrial systems, one
of these properties is the hierarchical structure, which is related to the multilevel
organization of the system. Another property is the strong coupling
between the agents of the complex system. Because of this
coupling, a fault or failure in one or more components can
lead to cascading faults and failures, which may have catastrophic consequences on the
functioning of the system.
This chapter proposes two solutions to model and diagnose two different
complex systems: a paper mill plant and a turbofan engine. These solutions
are applied in order to monitor these complex systems in the FDI and prognostic context. A data-driven FDI algorithm based on MSPCA (defined in
section 2.4.2) is applied in the case of a paper mill plant. A paper mill plant is
an industrial complex system, which consists of several coupled nonlinear systems (e.g. paper machine, stock preparation) and subprocesses (e.g. fan pump,
tine unit, mixing unit, pope reel, pulping, broke handling machine, deflaking).
A paper mill plant is often monitored by many sensors, which measure many
different variables; these can be discrete or logic variables and can be acquired at different sampling
frequencies. These variables could also describe qualitatively the states of
the systems. The second solution is related to the prognostic management of a
turbofan engine. A common turbofan consists of coupled nonlinear systems
such as a compressor, a combustor, and a turbine which drives the compressor.
In addition, it has a fan in front of the core compressor and a second power turbine behind the core turbine to drive the fan. A turbofan engine is a complex
system that requires adequate monitoring to ensure flight safety and timely
maintenance.
This chapter is based on the problem and the results presented in [53, 54].
3.2 Modeling and Diagnosis of a Paper Mill Plant:
FDI Application
In paper mill plants, increasing efficiency and reducing costs
is a primary goal driven by competition. Fault detection and diagnosis can help by minimizing the
loss of production. In particular, a signal-based
FDI procedure is developed for the stock preparation subprocess. MSPCA is used to monitor some critical
variables of the stock preparation of a paper mill plant in order to diagnose
faults and malfunctions. MSPCA simultaneously extracts both the cross-correlation across the sensors (PCA approach) and the auto-correlation within a sensor
(wavelet approach). The advantage of MSPCA is validated on the considered paper mill plant, where several sensors are installed to control and monitor the
automation system.
Chemical process plant safety, production specifications, environmental regulations, operational constraints and plant economics are some of the main
reasons driving a growing interest in the research and development of more robust methods for process monitoring and control. Another reason, which motivates the use of advanced techniques to monitor the plant, is that modern
control paradigms are heavily dependent on the quality of the data provided by
sensors. This results in a growing need to discriminate normal plant situations from abnormal ones. Though infrequent, these abnormal situations
have had a significant impact on the safety and economy of the process industry [36]. It is well known that chemical processes are well equipped with
measurement sensors such as temperature, flow-rate, pressure and electrical
quantity sensors. The availability of many sensors provides valuable redundancy
for fault detection and identification because sensor measurements are highly
correlated under normal conditions. These correlations are mainly due to the physical and chemical principles governing the process operations, such as mass and
energy balances [40]. The paper industry is a productive sector with a high
level of automation and, considering that a modern paper mill has hundreds of
sensors and actuators connected to its automation systems, it is evident that
some systematic methods are needed to process the data.
On-line process monitoring with fault detection and diagnosis can provide
efficiency improvements for a wide range of processes, as stated in [55, 10]. A
large number of applications have been reviewed, e.g. by Isermann and Ballé
[56] and Patton et al. [57]. Venkatasubramanian et al. [58, 59, 60] published
a series of articles reviewing monitoring methods with particular attention to
the field of chemical processes. They classified the methods as model-based,
signal-based and knowledge-based. Signal-based approaches to FDI in paper
mills and, more generally, in chemical plants are consolidated and well studied because,
for large-scale processes, the development of model-based FDI methods requires
considerable and possibly prohibitive effort, and moreover because large amounts
of data are collected, as stated in [61, 62].
A common feature of the statistical methods (a subset of the signal-based
methods) is the ability to reduce the correlations between variables and the dimensionality of the data. These characteristics enable efficient extraction of the
relevant information from the data. MSPCA, which is a combination of PCA
and wavelet analysis, is able to remove the autocorrelations of variables by
means of wavelet analysis, and to eliminate cross-correlations between variables with PCA [36]. MSPCA, proposed by Bakshi [35], deals with processes
that operate at different scales, and have contributions from:
• events occurring at different localizations in time and frequency;
• stochastic processes whose energy or power spectrum changes with time
and/or frequency;
• variables measured at different sampling rate or containing missing data.
The primary motivation for jointly using PCA and Wavelet Transform comes
from the idea that, in PCA, the correlation among sensors is used to transform
the multivariate space into a subspace which preserves maximum variance of
the original space. However, PCA fails to make use of correlation within the
sensor along the time line. Wavelets, on the other hand, capture the correlation
within a sensor whereas PCA correlates across sensors [36]. Thus, wavelets and
PCA based analysis of multivariate data combine two extremes: signal trend
and correlation. MSPCA is a technique that extracts maximum information
from multivariate sensor data.
Fault detection and diagnosis of the stock preparation process of a paper mill
plant is considered. Recursive MSPCA is applied for on-line fault detection
and diagnosis: once a fault is detected, a multi-scale fault identification is
performed.
The mathematical framework is introduced in Section 2.4.1. To apply recursive
MSPCA for on-line fault detection and diagnosis, only the following equation
is needed here; it defines the projection of a single sample onto the
principal component space. A new observation vector $x \in \mathbb{R}^m$ can
be projected into the lower-dimensional score space with:
$$ t = xP \in \mathbb{R}^p. \tag{3.1} $$
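The projection of Eq. 3.1 can be sketched in a few lines of Python. The helper
names (fit_loadings, project, spe), the SVD-based estimate of P, the mean
centring and the random training data are all assumptions made for this
illustration; the SPE is computed here as the squared norm of the part of the
sample that is not explained by the p retained components.

```python
import numpy as np

def fit_loadings(X, p):
    """Return the column means and an m x p loading matrix P from training data X (n x m)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centred data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:p].T

def project(x, mean, P):
    """Score vector t = x P (Eq. 3.1) for one centred observation x."""
    return (x - mean) @ P

def spe(x, mean, P):
    """Squared prediction error of x with respect to the PCA model."""
    xc = x - mean
    residual = xc - (xc @ P) @ P.T
    return float(residual @ residual)

# Hypothetical training set: 50 samples of m = 4 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
mean, P = fit_loadings(X, p=2)
t = project(X[0], mean, P)
```

With p equal to m the residual vanishes, so the SPE is only informative when
some directions are discarded.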
3.2.2 Case Study
Variable name     Description
CIC T8            Mixture consistency [%]
DIC205 MV         Mix short fiber consistency [%]
DIC219 MV         Mix long fiber consistency [%]
DIC226 MV         Mix CTMP fiber consistency [%]
PT207             Broke Machine inlet pressure [bar]
LT242             Long fiber tank T6 level [%]
LIV T13           Broke Machine water level [%]
LIV T3            Broke Machine short fiber level [%]
LIV T1 T2         Short fiber tank T1 and T2 level [%]
JT MOTOR P20      High press. water pump current [A]
FCV 212 AO1 3     Short fiber input valve [%]
JT MOTOR E26      Broke Machine motor current [A]
JT MOTOR E16      Broke deflaking motor current [A]
FCV406            KROFTA Sludge Mix in valve [%]
FB MACC           Broke Machine Set point [%]
Table 3.1: Stock preparation sensors
An important aspect of the developed work is the application of the MSPCA
formulation to an industrial data set for which no a priori assumptions can be
made. Data from the stock preparation process of a paper mill plant are
analyzed. In a first test some faults are simulated in order to validate the
developed algorithm; then real faults are detected and isolated during normal
process operation. Stock preparation is the process of the paper mill plant
where pulp is usually refined and blended to the appropriate proportion of
hardwood, softwood or recycled fiber, and diluted to as uniform and constant a
consistency as possible. The pH is controlled and various fillers such as
whitening agents, size and wet or dry strength additives are added if
necessary. Additional fillers such as clay or titanium dioxide increase
opacity to improve the quality of
print. In the considered process, different types of pulp are normally treated in
separate but similar process lines until combined at a blend chest. From high
density storage or from slusher/pulper the pulp is pumped to a low density
storage chest (tank). From there it is typically diluted to about 4% consistency before being pumped to an unrefined stock chest. From the unrefined
stock chest, stock is again pumped, with consistency control, through a refiner.
Refining is an operation whereby the pulp slurry passes between a pair of discs,
one of which is stationary and the other rotating at speeds of typically 1000
or 1200 RPM for 50 and 60 Hz AC, respectively. The discs have raised bars
on their faces and pass each other with narrow clearance. This action unravels
the outer layer of the fibers, causing the fibrils of the fibers to partially
detach and bloom outward, increasing the surface area to promote bonding.
Refining thus increases tensile strength. For example, tissue paper is relatively
unrefined whereas packaging paper is more highly refined. Hardwood fibers are
typically 1 mm long and smaller in diameter than the 4 mm length typical of
softwood fibers. Refining can cause the softwood fiber tube to collapse resulting
in undesirable properties in the sheet. From the refined stock chest, or blend
chest, the stock consistency is again controlled as the stock is pumped to a
machine chest. It may be refined, or additives may be added, en route to the
machine chest. The machine chest is basically a consistency leveling chest
with about 15 minutes of retention. This retention time is enough to allow any
variations in the consistency entering the chest to be leveled out by the
action of the basis weight valve, which receives feedback from the on-line
basis weight measuring scanner. The stock preparation process ends in the
machine chest, from which the paper machine takes the refined pulp and
produces the paper. This brief description of the stock preparation process
shows the complex dynamics of paper mill plants. An analytical description of
the plant is not suitable for fault detection and diagnosis, so a signal-based
approach is considered. Knowledge of the process is necessary to choose the
most meaningful process signals to monitor for FDI purposes and to set up the
MSPCA procedure properly. In particular, Table 3.1 summarizes the set of
sensors chosen to apply the MSPCA FDI procedure; based on process knowledge,
these sensors measure the most relevant dynamics of the stock preparation
process.
Figure 3.1: Simulated abrupt fault. (a) SPE of reconstructed PCA versus
samples; (b) SPE of approximation matrix A3 versus samples
3.2.3 Results
The MSPCA approach has been implemented using a Haar wavelet kernel and a
decomposition level L = 3; the dimension p of the principal component subspace
is chosen by Kaiser's rule [18]. For the PCA training stage a faultless
dataset is selected, and outlier elimination is performed to make the analysis
procedure robust. The training procedure with different outlier elimination
techniques is presented in [51]; it assumes that the signals are Gaussian
distributed. Here, outlier elimination is performed using a Huber estimator
for matrix T with parameter k = 4 [51]. The fault detection and diagnosis
algorithm runs on-line recursively, by means of a moving window of l samples
that uses, at each step, at least the $2^L$ samples needed by the DWT. A fault
is detected when the number of SPE values over the threshold is greater than
$l \cdot \alpha \cdot m$, where $\alpha$ is the significance level used in the
sixth step of the MSPCA algorithm (treated in Section 2.4.2) and l = 96. The
paper mill is a complex process with different levels of joint dynamics, and
the algorithm is highly sensitive to small changes that can be caused by
transients and peaks in the signals. These problems lead to false alarms, and
for this reason the factor m has been added. To prevent false alarms, m = 8
has been set on the basis of the experimental data. If a fault is detected in
detail or approximation level j at sample n, the time at which the fault
occurs is $n \cdot 2^j$.

Figure 3.2: Simulated abrupt fault. (a) SPE contribution of approximation
matrix A3 per sensor; (b) SPE contribution of reconstructed PCA per sensor
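The window-based alarm rule and the conversion from a coefficient index to the
original time axis can be sketched as follows. This is a simplified
illustration on a single SPE series: the function names are hypothetical, and
the value alpha = 0.01 is an assumption, since the significance level is not
stated in this section.

```python
import numpy as np

def detect_fault(spe_values, threshold, window=96, alpha=0.01, m=8):
    """Return the index of the last sample of the first window in which the
    number of SPE values over the threshold exceeds window * alpha * m,
    or None if no window raises an alarm."""
    over = np.asarray(spe_values) > threshold
    limit = window * alpha * m
    for end in range(window, over.size + 1):
        if over[end - window:end].sum() > limit:
            return end - 1
    return None

def fault_time(n, j):
    """Map index n at decomposition level j back to the original sample axis."""
    return n * 2 ** j

# Hypothetical SPE series with ten consecutive values over the threshold.
spe_series = np.zeros(200)
spe_series[100:110] = 10.0
alarm_index = detect_fault(spe_series, threshold=1.0)
```

With l = 96, alpha = 0.01 and m = 8 the limit is 7.68, so at least eight
exceedances inside one window are needed before an alarm is raised, which is
what makes the rule robust to isolated peaks.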
To validate and test the proposed MSPCA method, two faults are simulated on
the nominal plant signals. Subsequently the algorithm is applied to stock
preparation process data acquired during normal operation. In particular,
attention is focused on three sections of the stock preparation process:
pulping, the broke handling machine and deflaking. These subprocesses are
monitored by 15 sensors. During the observation period a fault arising in the
broke machine is detected and identified. Tables 3.2, 3.3 and 3.4 show the
matrices in which the fault is detected, the percentage change of the values
over the threshold with respect to the faultless case, and the fault time.
Simulated Faults
To validate the algorithm and set its parameters, since no a priori assumption
is made about the process, two faults are simulated. In the first simulation,
an abrupt fault is added to sensor 5, which measures the inlet pressure to the
broke handling machine. The fault has an amplitude of 4% of the sensor average
value in the faultless case and occurs at sample 1912. As shown in
Figs. 3.1(a) and 3.1(b) and Table 3.2, the fault is detected by the SPE of the
approximation matrix A3 at window 237, with a delay of 80 samples. The
reconstructed PCA does not detect the fault, even though its SPE increases.
This shows that adding the wavelet decomposition makes fault detection more
sensitive than PCA alone. The fault is identified in sensor 5, which has the
greatest variation in contributions, as shown in Figs. 3.2(a) and 3.2(b).

Table 3.2: Fault diagnosis with simulated abrupt fault.
          A3          D1      D2      D3      Reconst. PCA
Fault     X 275%      0.9%    1%      0.5%    74%
Window    237
Sample    1896-1992

In the second simulation, a sinusoidal signal is added to sensor 3, which
measures the current of the P17 pump with filtered water at normal pressure.
The fault has an amplitude of 1.5% of the average value of the sensor in the
faultless case and occurs at sample 634. As shown in Fig. 3.3 and Table 3.3,
the fault is detected by the SPE of the detail matrix D1 at window 312, with a
delay of 86 samples. The fault is identified in sensor 3, which has the
greatest variation in contributions, as shown in Fig. 3.4.
Figure 3.3: Simulated harmonic fault. SPE of detail matrix D1 versus samples
Figure 3.4: Simulated harmonic fault. SPE contribution of detail matrix D1 per
sensor
Table 3.3: Fault diagnosis with simulated harmonic fault.
          A3        D1          D2       D3       Reconst. PCA
Fault     31.8%     X 89.5%     19.8%    19.5%    12.3%
Window              312
Sample              624-720

Real Fault
The last experiment concerns a real fault that occurred in the stock
preparation subprocess of the paper mill during normal operation. The fault is
identified on sensors 5 and 13, which measure the inlet pressure to the broke
handling machine and the broke deflaking motor current. From sample 600 the
motor idles; this is regarded as a soft fault, which is attributed not to the
motor itself but to a malfunction of the machine such as a jam. After this
interval an abrupt fault occurs, which is identified in both sensors. As shown
in Figs. 3.6(a) and 3.6(b) and Table 3.4, the fault is detected by the SPE of
the approximation matrix A3 at window 74, with a delay of 88 samples. The SPE
of the reconstructed PCA also detects three faults, at samples 3979, 4180 and
4481, which correspond to peaks in sensor 5. The fault is identified in
sensors 5 and 13, which have the greatest variation in contributions at all
levels, as shown in Figs. 3.7, 3.8(a), 3.8(b), 3.9(a) and 3.9(b). Considering
the SPE contribution at each level, it is possible to define the signature of
the isolated fault and identify its nature. The signature associated with the
fault is unique: no other fault generated a similar signature in the
contribution plots.

Table 3.4: Fault diagnosis with real fault.
          A3          D1      D2      D3      Reconst. PCA
Fault     X 1218%     124%    120%    156%    X 345%
Window    74
Sample    592-688                             3979-4100
                                              4180-4361
                                              4481-4770
Figure 3.5: Signals of broke handling with fault. (a) Sensor 5, inlet pressure
[bar] versus samples; (b) Sensor 13, broke deflaking motor current [A] versus
samples
Figure 3.6: Real fault. (a) SPE of reconstructed PCA in logarithmic scale; (b)
SPE of approximation matrix A3 in logarithmic scale
Figure 3.7: Real fault. SPE contribution of reconstructed PCA
Figure 3.8: Real fault. (a) SPE contribution of approximation matrix A3; (b)
SPE contribution of detail matrix D3
Figure 3.9: Real fault. (a) SPE contribution of detail matrix D2; (b) SPE
contribution of detail matrix D1

3.3 Modeling and diagnosis of a Turbofan Engine: Prognosis Application
The residual lifetime of a system is a determining factor for machinery and
environmental safety. In this section the problem of estimating the remaining
useful life (RUL) of turbofan engines is addressed.
Prognosis of systems, plants and machinery is the forecast of the remaining
operational life, future condition, or probability of reliable operation, and
it relies on equipment that acquires condition monitoring data. This approach
to modern maintenance practice promises to reduce downtime, spares inventory,
maintenance costs and safety hazards [63]. Prognosis is effective under the
assumption that the failure mechanisms of a system involve several degraded
health-states, or that the system is subject to wear. Tracking and forecasting
the evolution of health-states and impending failures, in the form of the RUL,
is a critical challenge and is regarded as one of the main topics of Condition
Based Maintenance (CBM) [64]. CBM is a maintenance technology that employs
tasks such as monitoring, classification and forecasting to increase system
readiness and safety while reducing costs through reduced maintenance and
inventory, increased capacity, and enhanced logistics and supply chain
performance.
Many approaches exist to monitor the health state or to estimate the RUL of
systems; they can be divided into physics-based prognostic models and
data-driven prognostic models. Physics-based models typically involve building
mathematical models that describe the physical relations of the system,
failure and wear propagation [63, 65, 66]. Data-driven approaches attempt to
derive models directly from collected Condition Monitoring (CM) data and
produce prediction outputs directly in terms of CM data. Conventional
data-driven methods include simple projection models such as exponential
smoothing [67, 68]. Most of these trend forecasting techniques assume that
there is some drift in the measured system signals that reflects the health
degradation. Artificial Neural Networks (ANNs) are among the most commonly
used data-driven techniques in prognostic algorithms. In [69] a Recurrent
Wavelet Neural Network (RWNN) is developed to predict rolling element bearing
crack propagation; the network tracks the enlarging crack. In [70] a
Neuro-Fuzzy (NF) network is used to predict the spur gear condition value one
step ahead. The fuzzy inference structure is determined by experts, whereas
the fuzzy membership functions are trained by a neural network. An adaptive
training technique was proposed in [71] to improve the NF model.
Multiple-step-ahead prediction is performed in [72] for rolling element
bearing condition monitoring and estimation, by feeding the predicted value
back into the network input until the desired prediction horizon is reached.
Some rules are used to vary the data sampling period according to the rate of
change of consecutive condition index values. Particle filtering has also been
implemented to provide non-linear projection in forecasting the growth of a
crack on a turbine engine blade [73]. In [74] a recursive Bayesian technique
is proposed to calculate the failure probability based on the joint density
function of many CM data features. The use of Hidden Markov Models (HMMs) in
bearing fault prognosis is proposed in [75]. In an HMM a system is modeled as
a stochastic process in which the next state depends only on the current state
and not on the earlier ones. Typically HMMs are trained to estimate health or
fault states. Indeed, HMMs are able to estimate unobservable health-states
using observable sensor signals or features computed by other algorithms. Some
approaches combine fault diagnosis and prognosis in a unified framework, or
need to extract features from the data, which are then used by the HMM to
estimate the health-state. In [76, 75] the features are computed by Principal
Component Analysis, where the measured signals are vibrations. Features can
also be extracted by amplitude demodulation, as in [77]. Whether sensor
signals are used directly or features are computed, the HMM inputs need to be
chosen to be as reliable as possible. Indeed, appropriate features capture
unique properties of fault conditions and health state. Another motivation for
generating features is to reduce the computational complexity of HMM
algorithms.
In the present work an HMM is used to estimate the RUL of a turbofan. Features
are extracted by an ANN that is trained to identify the faultless parameters
of the turbofan in different flight conditions. Residuals are obtained at the
end of each flight and a set of indexes is generated. The HMM uses these
indexes to compute the RUL estimate as the number of remaining flights. These
models give estimates of residual life and health-state by modeling
observations (inputs) as probability density functions. It is thus possible to
define a model composed of a set of states, each described by a probability
density function, which permits the use of Bayesian inference algorithms to
estimate health conditions. Data are generated by the model simulator C-MAPSS
(Commercial Modular Aero-Propulsion System Simulation). Signals consist of
time series of sensed measurements typically available from aircraft gas
turbine engines. The data were used as the challenge dataset for the
Prognostics and Health Management (PHM) data competition at PHM’08 [78].
3.3.1 Problem Definition and Process Model
Turbofan engines constitute a complex system that requires adequate monitoring to ensure flight safety and timely maintenance [79, 80, 81]. Therefore
it is essential to assess prognostic techniques, that can help to provide early
detection and isolation of precursor and/or incipient fault condition to a component failure, and can also help manage as well as predict the progression of
various faults to component failure. The prognostic module would also perform failure prognosis, which involves both forecasting of system degradation
based on observed system condition (current diagnostic state and available operating data), and prediction of useful remaining life of the turbofan engine.
Prognostics has taken center stage in condition-based maintenance where it is
desirable to estimate RUL of a system. Estimating the RUL of a component
or system with uncertainty bounds that are narrow enough offers the prospect
for increased system safety along with more cost-effective maintenance. Based
on RUL estimation, the operation team can perform on-demand maintenance
or CBM, in contrast to the traditional time-based practice in which components
are managed to life limits based upon fleet-wide statistics and average
expected usage. The traditional approach is necessarily conservative, requiring
the replacement of parts irrespective of how much of their useful life is actually
expended. For example, aircraft engine turbine discs are usually retired at the
time when 1 out of every 1000 discs has initiated a short detectable fatigue
crack. On a life-distribution plot, this is the “−3 sigma” life curve. This
implies that over 99.9% of expensive turbine rotor discs are retired before
their useful life has been consumed, a practice that is extremely wasteful
[82]. Conventional maintenance strategies (such as corrective and preventive
maintenance) are therefore not adequate to fulfill the needs of expensive,
high-availability systems such as the turbofan engine. In contrast,
condition-based predictive maintenance assesses the future health of critical
engine components from observed data and available knowledge about the system,
and is used to make proactive decisions about preventive and/or evasive
actions. Its objectives are to maximize the service life of
replaceable/serviceable components, minimize operational risks and save costs,
while keeping the same safety margin obtained with inefficient schedule-based
preventive maintenance. To accomplish this demanding task, engine monitoring
systems (EMS) have become increasingly standard in the last two decades, in
step with advances in aircraft engines and computer technology. The goal is
therefore to automate the monitoring and prognosis procedures in order to
reduce costs and maintain high system reliability. Moreover, in turbofan
engines the control system is an essential part of the jet engine, which makes
it a mechatronic system. Turbofans
are most effective when they can operate at or near their mechanical, thermal,
flow or pressure limitations such as rotor speeds, turbine temperatures, internal pressures, etc. Controlling at but not exceeding a limit is a very important
aspect of engine control which must, therefore, provide both regulation and
limit management. Minimum control requirements include a main fuel control
to provide limit protection. More advanced controls schedule engine geometry
and provide fan and booster stall protection, control variable parasitic engine
flows, and need to monitor many engine parameters [81].
A common turbofan has as its “core” a compressor, combustor, and a turbine
which drives the compressor. In addition it has a fan in front of the core
compressor and a second power turbine behind the core turbine to drive the
fan as shown in Fig. 3.10. The flow capacity of the fan is designed to be
larger than the compressor so that the excess air can be bypassed around the
core and exhausted through a separate nozzle. The bypass approach reduces
engine specific thrust but increases propulsion efficiency thereby reducing fuel
consumption and is the engine chosen for subsonic commercial airplanes [81].
Some of the particular features of turbofan engine are [83]:
1. single stage fan with high pressure ratio.
2. a low pressure compressor.
3. high stage pressure rise mixed-flow compressor.
4. double-annular combustor.
5. high pressure turbine.
6. low pressure turbines.
7. variable cycle capability with forward blocker doors and an aft variable
area bypass injector.
8. advanced exhaust nozzle technology.
The diagram in Fig. 3.10 shows the main elements of the turbofan engine
model [78]. The simulated turbofan is equipped with 21 sensors that describe
the engine state and has 5 inputs that are considered as external conditions.
Tables 3.5 and 3.6 summarize the acquired signals with their descriptions.

Figure 3.10: Simplified diagram of turbofan engine

Variable name     Description
alt               Altitude
MN                Mach number
TRA               Throttle resolver angle
Wf                Fuel flow
Fn                Net thrust
Table 3.5: C-MAPSS inputs
Variable name     Description
T2                Total temperature at fan inlet
T24               Total temperature at LPC outlet
T30               Total temperature at HPC outlet
T50               Total temperature at LPT outlet
P2                Pressure at fan inlet
P15               Total pressure in bypass-duct
P30               Total pressure at HPC outlet
Nf                Physical fan speed
Nc                Physical core speed
epr               Engine pressure ratio (P50/P2)
Ps30              Static pressure at HPC outlet
phi               Ratio of fuel flow to Ps30
NRf               Corrected fan speed
NRc               Corrected core speed
BPR               Bypass Ratio
farB              Burner fuel-air ratio
htBleed           Bleed Enthalpy
Nf_dmd            Demanded fan speed
PCNfR_dmd         Demanded corrected fan speed
W31               HPT coolant bleed
W32               LPT coolant bleed
Table 3.6: C-MAPSS outputs

3.3.2 Hidden Markov Model and Prognosis Procedure
HMMs are a class of Markov models composed of a set of states, each of which
maps the observations to a probability density function. The resulting model
consists of two stochastic processes, one of which is not directly observable
but can be estimated through the other, which produces the sequence of
observations. These models have many applications, notably in speech
recognition, where they have been studied extensively [84]. Define the states
as $S = \{S_1, S_2, \ldots, S_N\}$, with N the number of states, the state at
time t as $q_t$ and the observations as $O = o_1 o_2 \ldots o_T$; the HMM is
then defined as $\lambda = (A, B, \pi)$ where:
• $A = \{a_{ij}\}$ is the state transition probability distribution,
$a_{ij} = P[q_{t+1} = S_j \mid q_t = S_i]$;
• $\pi = \{\pi_i\}$ is the initial state distribution,
$\pi_i = P[q_1 = S_i]$;
• $B = \{b_j(O)\}$ is the continuous observation probability distribution,
$b_j(O) = \sum_{m=1}^{K} c_{jm} \mathcal{N}(O \mid \mu_{jm}, \sigma_{jm})$.
Here $1 \le i, j \le N$, $1 \le t \le T$, $c_{jm}$ are the mixture
coefficients, $\mathcal{N}(O \mid \mu_{jm}, \sigma_{jm})$ are the Gaussian
components of a Gaussian Mixture Model (GMM) and K is the number of mixture
components [85, 86]. The probability of being in state $S_i$ at time t and in
state $S_j$ at time t + 1, given the model and the observation sequence, is:
$$ \xi_t(i, j) = P[q_t = S_i, q_{t+1} = S_j \mid O, \lambda], \tag{3.2} $$
so the expected number of transitions from $S_i$ to $S_j$ is:
$$ \varepsilon(i, j) = \sum_{t=1}^{T-1} \xi_t(i, j). \tag{3.3} $$
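Equations 3.2 and 3.3 can be evaluated with scaled forward-backward
recursions. The sketch below assumes that the emission likelihoods
lik[t, j] = b_j(o_t) have already been computed (for instance from the GMM of
each state); the function name, the argument layout and the toy two-state
model are hypothetical.

```python
import numpy as np

def expected_transitions(A, pi, lik):
    """Return xi (shape T-1 x N x N, Eq. 3.2), its sum over time (Eq. 3.3)
    and ln P(O | lambda), given lik[t, j] = b_j(o_t)."""
    T, N = lik.shape
    alpha = np.zeros((T, N))
    beta = np.ones((T, N))
    scale = np.zeros(T)
    # Forward pass, normalised at every step to avoid numerical underflow.
    alpha[0] = pi * lik[0]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * lik[t]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]
    # Backward pass using the same scaling factors.
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (lik[t + 1] * beta[t + 1])) / scale[t + 1]
    xi = (alpha[:-1, :, None] * A[None, :, :]
          * (lik[1:] * beta[1:])[:, None, :] / scale[1:, None, None])
    return xi, xi.sum(axis=0), float(np.log(scale).sum())

# Toy left-right model with two states and four observations.
A = np.array([[0.9, 0.1], [0.0, 1.0]])
pi = np.array([1.0, 0.0])
lik = np.array([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]])
xi, eps, log_likelihood = expected_transitions(A, pi, lik)
```

Each forward or backward step costs on the order of N^2 operations, and the
scaling factors give the log-likelihood as a by-product.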
The three basic problems [84] for Hidden Markov Models are:
• given the observation sequence $O = o_1 o_2 \ldots o_T$ and a model
$\lambda = (A, B, \pi)$, compute $P(O \mid \lambda)$, the probability of the
observation sequence given the model;
• given the observation sequence $O = o_1 o_2 \ldots o_T$ and the model
$\lambda = (A, B, \pi)$, choose a corresponding state sequence
$q_1, q_2, \ldots, q_T$ which best explains the observations;
• adjust the model parameters $\lambda = (A, B, \pi)$ to maximize
$P(O \mid \lambda)$.
These problems are solved by three efficient and well-defined procedures: the
forward-backward algorithm [87], the Viterbi algorithm [88] and the Baum-Welch
algorithm [89]. The computational complexity of the forward-backward inference
algorithm is of order $N^2 T$.
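For the second problem, a minimal log-domain Viterbi decoder might look like
the following sketch. As an assumption, it takes precomputed emission
likelihoods lik[t, j] = b_j(o_t) rather than raw observations, and the
function name and toy model are hypothetical.

```python
import numpy as np

def viterbi(A, pi, lik):
    """Most likely state sequence (0-based indices) given lik[t, j] = b_j(o_t)."""
    T, N = lik.shape
    with np.errstate(divide="ignore"):  # log(0) = -inf marks forbidden transitions
        log_A, log_pi, log_lik = np.log(A), np.log(pi), np.log(lik)
    delta = log_pi + log_lik[0]
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        cand = delta[:, None] + log_A      # cand[i, j]: best score ending in i, then i -> j
        back[t] = cand.argmax(axis=0)
        delta = cand.max(axis=0) + log_lik[t]
    path = np.zeros(T, dtype=int)
    path[-1] = delta.argmax()
    for t in range(T - 1, 0, -1):          # backtrack from the best final state
        path[t - 1] = back[t, path[t]]
    return path

# Toy left-right model: state 0 is healthy, state 1 is degraded.
A = np.array([[0.9, 0.1], [0.0, 1.0]])
pi = np.array([1.0, 0.0])
lik = np.array([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]])
path = viterbi(A, pi, lik)
```

Working with logarithms keeps long sequences from underflowing, and the zero
entries of a left-right transition matrix simply become forbidden moves.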
The HMM can be used to describe the fault progression process of physical
systems. Fig. 3.11 shows the HMM that represents the fault progression, where
each state is a health-state; this model is called a left-right model.

Figure 3.11: Fault progression process described by a HMM

The HMM-based scheme is useful for prognostics because, by monitoring the
progression of the state sequence, it is possible to obtain qualitative
information on the current degradation state and the wear progression,
quantitative information on the RUL, and a prediction of the system life
evolution. To obtain this information, the following algorithms are used:
1. the progression of states and the current state are calculated by the
Viterbi algorithm [88, 90];
2. the RUL is estimated using the transition matrix, which quantifies the
evolution of the failure progression. The mean number of time steps to the
failure state given the current state, $\text{steps}^*$, is calculated with
Monte Carlo simulation [64]. In each of n simulation runs, the next
health-state is estimated from the transition probabilities by generating a
uniformly distributed random number between 0 and 1. This process is repeated,
taking the calculated next state as the current state, until the failure state
is reached. The number of transitions is then counted as the RUL value.
Repeating this for all runs yields n values $RUL_i$, and $\text{steps}^*$ is
their mean:
$$ \text{steps}^* = \frac{1}{n} \sum_{i=1}^{n} RUL_i. \tag{3.4} $$
The advantage of this method is that it requires no assumptions about the
knowledge of the HMM model.
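The Monte Carlo estimate of Eq. 3.4 can be sketched as below. The function
name, the default number of runs, the fixed random seed and the max_steps
guard against an unreachable failure state are assumptions added for this
illustration.

```python
import numpy as np

def monte_carlo_rul(A, current_state, failure_state, n=1000, seed=0, max_steps=100000):
    """Mean number of transitions needed to reach failure_state (Eq. 3.4)."""
    rng = np.random.default_rng(seed)
    cdf = np.cumsum(A, axis=1)               # row-wise cumulative transition probabilities
    last = A.shape[0] - 1
    ruls = np.empty(n)
    for run in range(n):
        state, steps = current_state, 0
        while state != failure_state:
            if steps >= max_steps:
                raise RuntimeError("failure state not reached")
            u = rng.random()                  # uniform random number in [0, 1)
            state = min(int(np.searchsorted(cdf[state], u, side="right")), last)
            steps += 1
        ruls[run] = steps
    return float(ruls.mean())

# Deterministic chain 0 -> 1 -> 2: exactly two transitions to failure.
chain = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
rul_chain = monte_carlo_rul(chain, current_state=0, failure_state=2, n=50)

# Two-state model that leaves state 0 with probability 0.5: two steps on average.
coin = np.array([[0.5, 0.5], [0.0, 1.0]])
rul_coin = monte_carlo_rul(coin, current_state=0, failure_state=1, n=20000)
```

The estimate converges at the usual Monte Carlo rate, so the number of runs n
trades accuracy against computation time.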
The prognosis algorithm consists of two procedures: training and prognosis.
The training consists of the following steps:
• the initial parameters of the HMM are chosen so as to improve the HMM
training:
1. the initial state distribution $\pi = \{\pi_i\}$ is uniform;
2. the initial state transition probability distribution $A = \{a_{ij}\}$ is
set based on the structure of the chosen HMM;
3. the dataset is clustered with the GMM algorithm. The outputs are $c_{jm}$,
$\mu_{jm}$ and $\sigma_{jm}$, which give $B = \{b_j(O)\}$, the continuous
observation probability distribution;
• the number of HMM states is determined by computing the Bayesian Information
Criterion (BIC) [91]:
$$ BIC = L \ln T - 2 \ln P(O \mid \lambda), \tag{3.5} $$
where L is the number of parameters. The number of states is the one that
minimizes this index;
• to avoid overfitting during the training step, a constant $\sigma_{min}$ is
added to the diagonal of the matrix $\sigma_{jm}$. This term prevents
$\sigma_{jm}$ from becoming singular, and the following expression for the GMM
covariance matrix update is used:
$$ \sigma_{jm} = \frac{\sum_{t=1}^{T} \gamma_t(j, m) (o_t - \mu_{jm}) (o_t - \mu_{jm})'}{\sum_{t=1}^{T} \gamma_t(j, m)} + \sigma_{min} I, \tag{3.6} $$
where $\gamma_t(j, m)$ is the probability of being in state j at time t with
the m-th mixture component accounting for $o_t$;
• the HMM is trained by the Baum-Welch algorithm [89].
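Two of the training ingredients above, the BIC of Eq. 3.5 and the regularized
covariance update of Eq. 3.6, are easy to sketch in isolation. The function
names and the toy data are hypothetical, and gamma is assumed to hold the
occupation probabilities of one (state, mixture) pair for every time step.

```python
import numpy as np

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion of Eq. 3.5: L ln T - 2 ln P(O | lambda)."""
    return n_params * np.log(n_obs) - 2.0 * log_likelihood

def regularized_covariance(obs, gamma, mu, sigma_min=1e-16):
    """Covariance update of Eq. 3.6 for a single (state, mixture) pair.

    obs   : T x d observations
    gamma : length-T occupation probabilities gamma_t(j, m)
    mu    : length-d component mean
    """
    diff = obs - mu
    weighted = (gamma[:, None] * diff).T @ diff
    return weighted / gamma.sum() + sigma_min * np.eye(obs.shape[1])

# Degenerate cluster: all observations coincide with the mean, so the plain
# estimate would be the zero matrix and therefore singular.
obs = np.ones((5, 2))
gamma = np.full(5, 0.2)
sigma = regularized_covariance(obs, gamma, mu=np.ones(2), sigma_min=1e-3)
```

In a model-selection loop the HMM would be trained for each candidate number
of states and the one with the smallest BIC retained.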
The prognosis steps are:
• when new data are collected, inference is performed to obtain
$\varepsilon_{data}(i, j)$;
• the new data matrix $\varepsilon_{data}(i, j)$ is subtracted from the model
matrix $\varepsilon_{model}(i, j)$:
$$ \varepsilon_{new}(i, j) = \varepsilon_{model}(i, j) - \varepsilon_{data}(i, j), \tag{3.7} $$
to obtain $\varepsilon_{new}(i, j)$, so that the new transition matrix is:
$$ A_{new} = \{a_{ij}\} = \frac{\varepsilon_{new}(i, j)}{\sum_{j=1}^{N} \varepsilon_{new}(i, j)}, \quad 1 \le i, j \le N; \tag{3.8} $$
• the Viterbi algorithm is applied to evaluate the current state;
• from the current state and the new transition matrix, the RUL is calculated
(Eq. 3.4).
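The update of Eqs. 3.7 and 3.8 amounts to a matrix subtraction followed by a
row normalization. The sketch below follows the equations as printed; the
function name, the check on the row sums and the numerical values are
assumptions made for this illustration.

```python
import numpy as np

def updated_transition_matrix(eps_model, eps_data):
    """Combine expected transition counts (Eq. 3.7) and normalise each row (Eq. 3.8)."""
    eps_new = eps_model - eps_data
    row_sums = eps_new.sum(axis=1, keepdims=True)
    if np.any(row_sums <= 0.0):
        raise ValueError("every row of eps_new must have a positive sum")
    return eps_new / row_sums

# Hypothetical expected transition counts for a two-state left-right model.
eps_model = np.array([[8.0, 2.0], [0.0, 10.0]])
eps_data = np.array([[1.0, 0.5], [0.0, 2.0]])
A_new = updated_transition_matrix(eps_model, eps_data)
```

The resulting matrix can be passed directly to a Monte Carlo RUL estimate,
since each row is again a probability distribution.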
3.3.3 Features Extraction
The HMM inference algorithm has a computational complexity of order $N^2 T$
using the forward-backward procedure. However, the simulator signals cannot be
used directly, because the simulator generates 21 turbofan variables sampled
at 1 Hz for flights of about 1 hour, and the number of flights varies with the
degradation rate of the turbofan. Feature extraction is therefore needed to
reduce the number of variables and samples. The computational complexity is
reduced by an Artificial Neural Network (ANN): since what matters is
estimating the number of remaining flights as a measure of the RUL, the ANN is
used to extract a small set of scalar indexes that describe the condition of
all sensors over the whole flight. Another objective is to estimate the RUL of
faultless engines, using some faultless simulations to train the prognostic
procedure.
The ANN model is trained to fit the engine parameters given the flight
condition inputs. Then, for each flight, the following error indexes are
evaluated from the residuals of all parameters over the entire flight:
• MSE, Mean Square Error;
• Std, Standard Deviation of the error;
• Ave, Average error.
The ANN used is a 3-layer Multi Layer Perceptron (MLP) network with 5 input
neurons, 21 hidden-layer neurons and 21 output neurons. The ANN inputs and
outputs are those described in Section 3.3.1 and reported in Tables 3.5 and
3.6 respectively. The ANN training step is performed using a set of faultless
turbofan simulations. A clustering technique is first applied in order to
reduce the number of samples: for each cluster the number of samples is
reduced, and the ANN is trained with the resulting dataset. Once the ANN is
trained, the procedure that calculates the features is applied; it is
summarized in the following algorithm. For each flight:
1. collect the data samples of the current flight simulation;
2. simulate the behaviour of the turbofan parameters through the ANN;
3. compute the residuals of the ANN parameter identification;
4. compute the features of the current flight as MSE, Std and Ave of the
residuals.
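Steps 3 and 4 of the algorithm above can be sketched as follows. The function
name and the synthetic data are hypothetical, and pooling the residuals of all
21 parameters into one MSE, one Std and one Ave per flight is an assumption
about how the indexes are aggregated.

```python
import numpy as np

def flight_features(measured, predicted):
    """Per-flight indexes computed from the residuals of all parameters.

    measured, predicted : arrays of shape (samples, parameters)
    """
    residuals = measured - predicted
    return {
        "MSE": float(np.mean(residuals ** 2)),
        "Std": float(np.std(residuals)),
        "Ave": float(np.mean(residuals)),
    }

# Synthetic flight: 3600 samples (about 1 hour at 1 Hz) of 21 parameters, with
# a constant offset of 2 between the measurements and the model output.
predicted = np.zeros((3600, 21))
measured = predicted + 2.0
features = flight_features(measured, predicted)
```

Each flight is thus compressed from tens of thousands of values to three
scalars, which is what keeps the HMM observation sequence short.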
3.3.4 Implementation and Results
Data are collected from the C-MAPSS simulator from data competition at
PHM’08. Data are referred to a turbofan engine that simulates parameters
values from flight conditions. Each flight is recorded at sampling frequency of
1Hz and consists of 7 flight conditions repeated for every flight. Then an engine
is simulated until its wear index reaches zero, this means that the turbofan
ended its remaining operational life. These simulations are repeated for several
engines in different conditions, faultless cases, fault Fan, fault High Pressure
Turbine (HPT), fault High Pressure Compressor (HPC) and fault Low Pressure
Turbine (LPT).
The HMM training step is performed using the dataset generated by the
neural network in its training step, this mean that the training dataset is the
same for both algorithms and it is a faultless dataset. HMM training steps are
summarized below:
1. using GMM algorithm, cluster the degradation curves previously extracted by the neural network, which represent the faultless operation
of turbofan;
63
2. evaluate BIC index to choice the N number of states for HMM;
3. set the lower bound of the covariance matrix σmin to 10−16 and train the
HMM with the data and GMM;
Once the training is done, it is possible to run the prognosis steps on the fault
data, summarized as follows:
1. once a new data sample is obtained from the ANN, calculate the current
health state by the Viterbi algorithm;
2. update the state transition matrix with the new observations (Eq. 3.8)
and calculate the RUL with the Monte Carlo technique (Eq. 3.4).
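The Monte Carlo RUL computation of step 2 can be illustrated with the sketch below. It is not the estimator of Eq. 3.4: it only assumes that the RUL is the expected number of flights needed to reach the failure state, starting from the current health state, under the current transition matrix. The function name, its arguments and the toy matrices in the usage note are hypothetical.

```python
import random


def monte_carlo_rul(transition, current_state, failure_state,
                    n_runs=1000, max_steps=10_000, seed=0):
    """Estimate the RUL (in flights) by simulating state trajectories.

    transition: row-stochastic matrix as a list of lists, where
    transition[i][j] is the probability of moving from state i to j.
    """
    rng = random.Random(seed)
    states = range(len(transition))
    total_steps = 0
    for _ in range(n_runs):
        state, steps = current_state, 0
        # Walk the chain until the failure state (or a safety cap) is hit.
        while state != failure_state and steps < max_steps:
            state = rng.choices(states, weights=transition[state])[0]
            steps += 1
        total_steps += steps
    return total_steps / n_runs
```

With a strictly deterministic chain such as `[[0, 1, 0], [0, 0, 1], [0, 0, 1]]`, the estimate from state 0 to state 2 is exactly 2 flights; when each state can also persist, the estimate approaches the mean hitting time as the number of runs grows.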
Fig. 3.12 shows a sample of the features extracted by the ANN from flight data;
these features are used to train the HMM. Fig. 3.13 shows the BIC index,
computed by Eq. 3.5, whose optimum is the minimum value reached. In this case
the number of HMM states is 30; in other words, there are 30 health states from
the normal operating condition to the failure state. Fig. 3.14 shows the
clustering of the training data obtained by the GMM, as a scatter plot of the
dataset. Once the ANN and the HMM are trained, the prognosis simulations can be
run on the test datasets. In particular, a fault arises at an unknown flight in
the simulated turbofan to generate a test bench for the integrated procedure.
The simulated faults are non-safety-critical, for obvious reasons, and they
decrease the turbofan life. This difference can be seen by comparing the number
of flights of the faultless dataset in Fig. 3.12 with that of the test datasets
in Figs. 3.15, 3.16, 3.17 and 3.18.
Figure 3.12: Faultless features extracted from ANN training data (axes: Flights, MSE).
Figure 3.13: Bayesian information criterion of fan engine 1 (axes: States number, BIC).
Figure 3.14: Cluster of training data (axes: First MSE dataset, Second MSE dataset).
The following simulations report how the developed procedure is able to track
the true RUL even when perturbations occur. Fig. 3.15 shows the prognostic
results for the turbofan life when a fault occurs in the fan. In particular,
Fig. 3.15(a) shows the RUL tracking and Fig. 3.15(b) shows the health-state
progress. The fan fault is visible at flight 60, where the RUL has a jump. The
next simulation is performed in the case of a fault in the HPT. In this case the
jump in the RUL estimate is more evident, as shown in Fig. 3.16(a). Again the
state sequence, shown in Fig. 3.16(b), quickly tracks the degradation evolution
when the fault occurs. The last two simulations start with an initial error on
the estimate because the algorithm does not know the initial RUL value and
cannot take into account the degradation induced by the fault. The prognosis of
the turbofan affected by the HPC fault is shown in Fig. 3.17. The RUL shown in
Fig. 3.17(a) converges to the real RUL after the fault occurrence. The
health-state sequence is reported in Fig. 3.17(b). The last simulation involves
the turbofan with a fault on the LPT and is shown in Fig. 3.18. Fig. 3.18(a)
shows the initial RUL overestimate and the correct tracking reached after the
fault occurrence. At the same time the health-state sequence shown in
Fig. 3.18(b) follows the degradation of the turbofan engine.
Figure 3.15: (a) Turbofan estimated RUL in presence of fan fault; the solid blue
line is the estimated RUL, the dashed red line is the true RUL (axes: Flights,
RUL (Flights)). (b) Turbofan health-state sequence in presence of fan fault; the
solid blue line is the health-state sequence, the dashed red line is the failure
state (axes: Flights, States sequence).
Figure 3.16: (a) Turbofan estimated RUL in presence of HPT fault; the solid blue
line is the estimated RUL, the dashed red line is the true RUL (axes: Flights,
RUL (Flights)). (b) Turbofan health-state sequence in presence of HPT fault
(solid blue line; axes: Flights, States sequence).
Figure 3.17: (a) Turbofan estimated RUL in presence of HPC fault; the solid blue
line is the estimated RUL, the dashed red line is the true RUL (axes: Flights,
RUL (Flights)). (b) Turbofan health-state sequence in presence of HPC fault
(solid blue line; axes: Flights, States sequence).
These simulations highlight three important considerations. First, the initial
error of the RUL derives from the difference between the working point of the
new data and that of the trained model. Second, once the fault occurs, the RUL
suddenly drops, which shows the robustness of the algorithm and its ability to
correct the predicted RUL in presence of faults. Third, the RUL converges for
all types of fault.
Figure 3.18: (a) Turbofan estimated RUL in presence of LPT fault; the solid blue
line is the estimated RUL, the dashed red line is the true RUL (axes: Flights,
RUL (Flights)). (b) Turbofan health-state sequence in presence of LPT fault
(solid blue line; axes: Flights, States sequence).
Chapter 4
Modeling and diagnosis of EEG
signals in BCI Applications
4.1 Introduction
Brains are an aggregate of neurons that communicate with each other to achieve
a common goal. Neurons are the elementary units, and both electrical and
chemical messages constitute the base through which neurons communicate.
Through these messages, neurons are able to achieve a coherent oscillatory
activity, or neuronal synchronization, among large and sometimes distant populations, so neurons behave as coupled oscillators. The brain is assumed to
be a classical example of a complex, self-organizing system. As such, it exhibits hallmarks of nonlinearity, multistability, and “nondiffusivity” (large coherent fluctuations). Brain oscillations as measured by electroencephalographic
(EEG) recordings are usually classified as δ (1–4 Hz), θ (4–7 Hz), α (8–12
Hz), β (13–30 Hz), γ (30–100+ Hz) and µ (8–13 Hz) rhythms. These oscillations
are produced by large ensembles of synchronized neuronal activity and
the resulting electrophysiological signals in the different frequency bands are
associated with different functional states (e.g. sleep, wake, perception and
attention). Computational studies adopt a variety of abstractions in order to
deal with complex dynamical systems like the brain. In this context, one
mathematical model of brain oscillations is the so-called Kuramoto model of
coupled phase oscillators [92]. The Kuramoto model, which is a nonlinear,
non-stationary and networked dynamical system, posits that the activity of a
local system (neuron, neural column or cortical area) can be sufficiently
represented by its circular phase alone. Moreover, this model is able to
reproduce synchronization phenomena in coupled systems such as the brain.
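The synchronization behaviour mentioned above can be illustrated with a small numerical sketch. It assumes the usual all-to-all form of the Kuramoto model, dθ_i/dt = ω_i + (K/N) Σ_j sin(θ_j − θ_i), integrated with a plain forward Euler step, and measures synchrony with the order parameter r = |(1/N) Σ_j exp(iθ_j)|. The function names, step size and coupling value are illustrative choices and are not taken from [92].

```python
import cmath
import math


def order_parameter(phases):
    """Return r in [0, 1]; r close to 1 means the phases are synchronized."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))


def simulate_kuramoto(omegas, phases, coupling, dt=0.01, steps=2000):
    """Forward-Euler integration of the all-to-all Kuramoto model."""
    n = len(phases)
    phases = list(phases)
    for _ in range(steps):
        derivs = [
            omegas[i] + coupling / n * sum(math.sin(pj - phases[i]) for pj in phases)
            for i in range(n)
        ]
        # Phases are kept in [0, 2*pi) since only the angle matters.
        phases = [(p + dt * d) % (2.0 * math.pi) for p, d in zip(phases, derivs)]
    return phases
```

Starting from ten identical oscillators with phases spread over 2.7 rad, a coupling K = 2 drives r from about 0.67 towards 1, whereas with K = 0 the oscillators simply rotate and r does not change.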
Brain-Computer Interfaces (BCIs) are devices which translate the brain activity
of the user into specific signals, which may be used for communicating
or controlling external devices [4, 5] without the use of peripheral nerves and
muscles [6]. BCIs represent an interesting option for people affected by
neuromuscular disorders but whose brain activity is normal, such as patients
affected by Amyotrophic Lateral Sclerosis (ALS). Three different types of
stimuli are commonly adopted to drive a BCI: visual, tactile and auditory
stimuli. Visual stimuli were the first to be studied, and typically
lead to the best classification results [93, 94]. Visual stimuli, however,
cannot be used when the user's sight has been compromised (e.g. limited
horizontal eye movement, inability to focus the gaze, etc.), which is the
most critical problem faced by both visual BCI and non-BCI systems (such
as Eye Gaze systems [95]). In these cases tactile and auditory stimuli
can be adopted instead. Tactile BCI proved to be a good choice for navigation
purposes [96, 97, 98], but only recently has it been used as a communication
device [99]. Visual BCI systems have been intensively researched in the
literature; however, they cannot be adopted by users suffering from visual
impairments. Auditory BCI systems represent a valid alternative, even if they
yield lower classification scores.
This chapter proposes an auditory BCI paradigm for systems based on P300
signals, which are generated by auditory stimuli characterized by different
sound typologies and locations. This paradigm is able to improve the
classification scores with respect to classical auditory BCI systems. A Head
Related Transfer Function approach is chosen to virtualize the auditory stimuli.
When virtualized audio is used, the user has to focus attention on both the type
and the location of the stimulus, thus generating P300 signals whose amplitude
is higher than that generated without audio virtualization. The auditory
algorithm processes the Electroencephalography (EEG) signals in order to model,
by data-driven algorithms, patterns that describe the user's intentions, in
particular whether or not the user is focusing attention on an auditory
stimulus. These patterns can be used as typical features in order to diagnose
the user's intentions. Supervised classification is performed by Support Vector
Machines, in which Gaussian radial basis functions are used as kernel functions,
in order to diagnose the user's attention to a specific stimulus. The system has
been validated with 14 users, who were asked to choose one among five common
spoken words, previously virtualized and transmitted to stereophonic headphones.
Classification results prove that the proposed auditory BCI system performs
similarly to common visual BCI P300 systems, and therefore represents an
alternative to visual BCI for users with visual impairments.
This chapter is based on the problem and the results presented in [100].
4.2 Auditory BCI
Different typologies of EEG signals have been used in the literature for developing auditory BCIs (e.g. cortical potentials, sensorimotor rhythm, steady state
evoked potential and P300), and auditory BCIs represent at the moment the
most suitable alternative to visual BCI. Slow Cortical Potential (SCP) signals
were studied in [101]. The users involved in the experiment received either
visual, auditory or combined visual/auditory feedback of their SCPs. Results
showed that even if the visual feedback led to the highest number of correct
answers, auditory stimuli could be used as well. In [102], instead, the authors
adopted an auditory BCI driven by the Sensory Motor Rhythm (SMR) signal.
Experimental results showed that auditory stimuli led to similar final results as
visual stimuli, even if in the first case the training time was longer. A different
approach to the auditory paradigm exploits Steady-State Auditory Evoked
Potentials (SSAEP). These are elicited by click trains or by
amplitude/frequency-modulated tones. A steady-state response is represented by a
significant amount of power at the modulation amplitude/frequency of a stimulus
[103]. Many of the auditory BCIs available in the literature, however, are based
on the P300 component of the Event Related Potential (ERP). In [104] P300
responses to two simultaneous auditory stimulus streams were classified. The
users had to choose one of the two streams and focus their attention by counting
the target stimulus. The outcome of the experiment was that a user could
possibly direct his/her attention using auditory stimuli only. In [105] a
four-choice BCI was tested with both healthy users and patients affected by ALS.
The users were presented auditory and visual stimuli and they had to choose
the words “yes” or “no” among “yes”, “no”, “pass” and “end”, according to a
classic Oddball Paradigm (OP). The results showed that a target probability of
25% was enough to elicit a reliable P300 signal both in healthy users and ALS
patients. A P300 speller driven by auditory stimuli was first presented in [106].
The authors created a 5 × 5 letter matrix similar to that adopted in common
visual P300 BCI spellers (see [107]). Column and row flashes were replaced
with auditory stimuli that were coded to particular columns and rows in the
matrix (i.e. spoken number of column and row). Even if the presence of a
visual support matrix was still needed, more than half of the users were able to
focus their attention so that the auditory stimulus could be correctly detected
and classified, even if with an average accuracy and bit rate lower than those
achievable through visual BCIs. Similar results were obtained in [108], where
the authors extended the letter matrix to 36 characters and added visual cues
early in the training phase. A larger amount of choices did not compromise
the classification performances, while the addition of visual cues allowed for a
better accuracy during the online phase. The above-mentioned articles prove
that auditory BCI is a possible alternative to visual BCI, although at the cost
of lower classification scores and average bit rates. An alternative method to
improve performance was presented in [109], where the authors adopted
spatial auditory stimuli. Users had to sit in the middle of a room surrounded
by five speakers with a 45° angle between them. Each speaker was given a unique
complex audio stimulus, so that the discriminating cue was both the physical
property and the spatial location of the stimulus. The results showed an
increase in the classification score with respect to the case where only a
single speaker was adopted. Moreover, by increasing the number of runs (the
number of times that the audio stimuli were repeated) it was also possible to
achieve results similar to those of visual BCIs, although with a negative impact
on the bit rate. The main drawback of that solution was that the user had to
stay still in the middle of a room surrounded by speakers. The present work
tries to overcome this obstacle by using a single pair of stereo headphones
where audio stimuli are virtualized. Sound virtualization has already been
studied in [110] to show that spatial location can be a determining cue for BCI
applications. The auditory paradigm aims to give the user the opportunity to
choose one among five different audio stimuli, while retaining the possibility
for the user to be moved within the home environment. Moreover, the audio
stimuli presented to the users are simple words referring to common daily life
activities, rather than audio tones set at specific frequencies [101, 104, 109,
110], numbers [106], or instrumental sounds [102, 108]. It is the authors' claim
that the use of words of the common language in auditory BCIs can lead to a
straightforward communication paradigm, while reducing the training time needed
to use the BCI correctly.
4.3 Spatial Audio
Given an audio source in a room, the human ear can perceive mainly two
pieces of information: the sound and the position of the source. In an anechoic
chamber, for a source in front of the listener, the human auditory system can
recognize variations of the sound source direction of about 1° on the horizontal
plane. For a source behind or beside the listener, the sensitivity decreases
significantly, to about 10°. On the vertical plane there are no differences
between sources in front of and behind the listener, and also in this case the
resolution is of the order of 10° [111, 112, 113].
In order to obtain spatial audio, one of the most used techniques is binaural
recording: the aim is to get a very realistic recording of a sound event,
which takes place in a real environment, through a single pair of microphones
placed at the ears of an artificial head. Spatial audio is needed here because
it is later used as the auditory stimulus fed directly into the user's
headphones: binaural recording thus represents a natural approach to obtain
highly realistic sound images. In this context the Head Related Transfer
Function (HRTF) assumes great importance. The HRTF is an impulse response
that describes how a sound coming from a well-defined direction is perceived
by the human ear. With a set of two HRTFs, one for each ear, any direction
of sound source propagation can be synthesized (Fig. 4.1(b)).
Figure 4.1: Spatial hearing. (a) The five audio stimuli directions, played by
headphones, with an offset of 45°. (b) Schematic of the left and right HRTFs
relative to a sound source coming from a well-defined direction α.
Therefore, given the left and right HRTFs relative to a desired sound direction
α, a mono signal s becomes directive through the operation of convolution:
\begin{align}
\mathrm{out}_R &= s * \mathrm{HRTF}_R(\alpha) \tag{4.1} \\
\mathrm{out}_L &= s * \mathrm{HRTF}_L(\alpha) \tag{4.2}
\end{align}
Databases of HRTFs for several sound directions in an anechoic environment
can be found in the literature: the one used here was produced by the MIT Media
Lab [114].
Five audio signals, namely the words “bathroom”, “bedroom”, “kitchen”,
“help” and “stop”, have been virtualized through the use of ten different HRTFs,
i.e. five different sound directions per ear with an offset of 45° (Fig. 4.1(a)).
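The virtualization of Eqs. (4.1) and (4.2) amounts to two discrete convolutions, which can be sketched as follows. The sketch assumes that each HRTF is available as a finite impulse response sampled at the same rate as the mono signal; the function names are hypothetical and the impulse responses in the usage note are toy values, not data from the MIT Media Lab database [114].

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def spatialize(mono, hrir_left, hrir_right):
    """Return the (left, right) headphone signals for one sound direction."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)
```

For instance, `spatialize([1.0, 2.0, 3.0], [1.0], [0.0, 0.0, 0.5])` leaves the left channel unchanged and returns a right channel that is delayed by two samples and halved in amplitude, a crude stand-in for the interaural time and level differences that real HRTFs encode.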
4.4 Testing Methodologies
4.4.1 Participants
Fourteen healthy subjects (10 males, 4 females, mean age 25.4, standard
deviation ± 2.85, range 22–33) participated in the study. All subjects were
volunteering group members and had some previous experience with visual
BCI, mainly based on imagined movement and P300 tasks. None had previous
experience with auditory BCI. The lack of experience is not a major issue:
the proposed BCI system, based on auditory stimuli represented by common
spoken words, is simpler to use than auditory BCI systems in which stimuli
are represented by tones or instrumental sounds, and thus requires only a short
training phase.
4.4.2 Data Acquisition
The EEG was recorded monopolarly using an electrode cap with 8 active
high-purity gold (Au) electrodes (g.tec medical engineering GmbH) following the
American Electroencephalographic Society modified version of the 10-20 system
[115]. The electrodes are located at positions Fz, Cz, PO7, P3, Pz, P4, PO8 and
Oz (see Fig. 4.2). Channels are referenced to the left earlobe and grounded
to the left mastoid. Signals were acquired and amplified using a g.MobiLab+
(g.tec medical engineering GmbH, Germany). Data collection and stimulus
presentation were controlled by the BCI2000 software package [116].
4.4.3 EEG Signals Modeling
Prior to the recording periods, participants were asked to minimize eye
movements and muscle contractions during the experiment. Each participant was
equipped with stereophonic headphones, and was requested to repeatedly fulfill
the following auditory task: listen to a sequence of five words and focus
his/her attention when the target word was played (i.e. mentally counting how
many times the target word was heard). Each run contained 1 target word and
4 non-target words: both the sequence of the five words and their spatial
orientation were randomly chosen. The users were not requested to consciously
identify the spatial orientation of the word; however, this association is made
unconsciously by the users, thus increasing the P300 activity as already shown
in [109]. Each run was repeated 150 times, for a total of 750 audio stimuli, of
which 150 were target stimuli and 600 non-target stimuli. A target probability
of 1 in 5 has been shown to be rare enough to produce a P300 response [105]. A
stimulus duration of 1500 ms and an Inter Stimulus Interval (ISI) of 250 ms were
chosen. The electrooculogram (EOG) was not recorded, so artifact rejection was
not performed; artifact reduction was instead implemented using the following
filters: a high-pass filter at 0.1 Hz, a low-pass filter at 30 Hz and a notch
filter at 50 Hz. A Common Average Reference (CAR) spatial filter was applied to
the temporally filtered signals [117]. The acquired signals were segmented into
epochs of 800 ms starting at the onset of a stimulus. The data, originally
sampled at a rate of 256 Hz, were moving-average filtered and decimated to
20 Hz. This resulted in 150 target trials (i.e. the number of target audio
stimuli heard) and 600 non-target trials.
Figure 4.2: Electrode set for recording and analysis. The eight data channels
follow the International 10-20 electrode system; the reference and ground
electrodes are selected as the left earlobe and the left mastoid, respectively.
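Part of the pre-processing chain described above can be sketched as follows: the CAR spatial filter, the extraction of an 800 ms epoch at 256 Hz, and a block-wise moving-average decimation. This is only an illustration; the temporal filters are omitted, the function names are hypothetical, and the integer decimation factor is an assumption, since the exact ratio 256/20 = 12.8 is not an integer and the original implementation is not described.

```python
def common_average_reference(samples):
    """Subtract, at each time point, the mean over all channels.

    samples: list of time points, each a list with one value per channel.
    """
    referenced = []
    for row in samples:
        mean = sum(row) / len(row)
        referenced.append([value - mean for value in row])
    return referenced


def extract_epoch(samples, onset_index, fs=256.0, length_s=0.8):
    """Return the samples in the window starting at the stimulus onset."""
    n = int(round(fs * length_s))
    return samples[onset_index:onset_index + n]


def moving_average_decimate(channel, factor):
    """Average consecutive blocks of `factor` samples (one output per block)."""
    return [sum(channel[i:i + factor]) / factor
            for i in range(0, len(channel) - factor + 1, factor)]
```

With these settings an epoch contains 205 samples per channel; a decimation factor of 13 (an assumed value) would reduce it to 15 samples per channel, and the 8 channels concatenated would give the feature vector of one trial.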
A Support Vector Machine (SVM) was used for data classification [118, 119],
with the following Gaussian radial basis function used as kernel function:
$$\phi(\|x_i - x_j\|) = e^{-a\|x_i - x_j\|}, \tag{4.3}$$
where $x_i$ and $x_j$ are the i-th and j-th data samples. The kernel function
parameter a is chosen as the value that maximizes the average of the target and
non-target classification accuracies. To increase sensitivity, the outcomes of
multiple runs for the same task can be averaged. In this way, the influence of
single trials is decreased and the selection score becomes more robust. One
possibility is to average the raw trial time series for each task and classify
the average as a single trial. Another option is to classify each original trial
individually and average the classifier scores, which implies the use of two or
more iterations (i.e. the number of runs repeated before the classifier
generates the output). The second approach was adopted, since it showed better
performance.
Datasets from the BCI experiments contained four times more non-target
stimuli than targets. Although the classification task is essentially binary,
the chance level for classification is therefore 80%, which could be obtained by
simply assigning all samples to the non-target group. For this reason, different
types of accuracy indexes are considered to evaluate the performance.
• The classification accuracy, which refers to the binary classification
and is defined as the percentage of trials in which target and
non-target stimuli are correctly scored.
• The target accuracy, which is defined as the percentage of trials in
which a target stimulus is correctly scored.
• The non-target accuracy, which is defined as the percentage of trials
in which a non-target stimulus is correctly scored.
• The selection accuracy, which denotes the percentage of trials in which
the BCI system returns the target action intended by the user.
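The first three indexes, and the 80% chance level discussed above, can be checked with a short sketch. It assumes that labels and predictions are given as booleans (True for target) and that both classes are present; the function name is hypothetical.

```python
def accuracy_indexes(labels, predictions):
    """Return (classification, target, non-target) accuracy in percent.

    labels, predictions: equal-length sequences of booleans, True = target.
    Both classes must be present in `labels`.
    """
    pairs = list(zip(labels, predictions))
    target_preds = [p for label, p in pairs if label]
    non_target_preds = [p for label, p in pairs if not label]
    classification = 100.0 * sum(label == p for label, p in pairs) / len(pairs)
    target = 100.0 * sum(target_preds) / len(target_preds)
    non_target = 100.0 * sum(not p for p in non_target_preds) / len(non_target_preds)
    return classification, target, non_target
```

A classifier that labels all 750 stimuli as non-target scores 80% classification accuracy and 100% non-target accuracy but 0% target accuracy, which is why the classification accuracy alone is not informative here.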
The selection accuracy index is evaluated for all iterations; it is therefore
based on the average of the classifier scores for each trial. In order to have a
single target output from the BCI system, only the target with the largest
classification output is chosen; thus multiple targets are not allowed and one
target always exists.
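The selection rule can be sketched as follows, assuming that each iteration provides one real-valued classifier score per word (for example the SVM decision value) and that the scores are averaged over the iterations before the largest one is picked. The function name and the scores in the usage note are made up for illustration.

```python
def select_word(scores_per_iteration):
    """Average the classifier scores over iterations and pick the best word.

    scores_per_iteration: list with one dict per iteration, mapping each
    word to the classifier score obtained for it in that iteration.
    """
    words = list(scores_per_iteration[0])
    n = len(scores_per_iteration)
    mean_scores = {w: sum(it[w] for it in scores_per_iteration) / n for w in words}
    # Exactly one word is always returned: the one with the largest mean score.
    return max(mean_scores, key=mean_scores.get), mean_scores
```

A single noisy iteration may favour a non-target word, while the average over several iterations can recover the word the user attended to, at the price of a longer selection time.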
4.4.4 Information Transfer Rate
The Information Transfer Rate (ITR) measures the amount of information carried
by every selection and is a performance index for the evaluation of BCI
systems. The ITR facilitates performance comparison with other BCI applications
and is calculated in bits per selection with the following formula [120]:
$$B = \log_2 N + P \log_2 P + (1 - P) \log_2\left(\frac{1 - P}{N - 1}\right), \tag{4.4}$$
where N represents the number of classes (five in the present case study)
and P is the selection accuracy. The ITR in bits per minute is obtained by
multiplying the bit rate B by the classification speed V, that is, the average
number of selections per minute:
$$ITR = B \cdot V. \tag{4.5}$$
Eqs. (4.4) and (4.5) show that even though the selection accuracy may increase
when using two or more iterations, the ITR may stay the same or even decrease
when V decreases, that is to say when a selection takes more time. This is
typically the case for our auditory BCI, which requires audio stimuli of long
duration based on words of common language rather than digital tones.
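Eqs. (4.4) and (4.5) can be evaluated with the sketch below. The duration of one run (7.5 s) is the value stated in Section 4.5.2; the function names are hypothetical and the terms with P = 0 or P = 1 are handled through their limits.

```python
import math


def bits_per_selection(p, n_classes):
    """Bit rate B of Eq. (4.4) for selection accuracy p (0 <= p <= 1)."""
    b = math.log2(n_classes)
    if p > 0.0:
        b += p * math.log2(p)
    if p < 1.0:
        b += (1.0 - p) * math.log2((1.0 - p) / (n_classes - 1))
    return b


def itr_bits_per_minute(p, n_classes, iterations, run_duration_s=7.5):
    """ITR of Eq. (4.5), with V the number of selections per minute."""
    selections_per_minute = 60.0 / (iterations * run_duration_s)
    return bits_per_selection(p, n_classes) * selections_per_minute
```

With N = 5, a 70% selection accuracy after 5 iterations gives about 1.3 bits/min and an 80% accuracy after 10 iterations gives about 1 bit/min, so the gain in accuracy is offset by the lower selection speed. At chance level (P = 0.2) the bit rate is zero.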
4.5 Results
4.5.1 Classification Performance
Table 4.1 gives the classification, target and non-target accuracy for the BCI
experiment when the SVM is required to perform a classification within a
single run. In this case only one subject reaches 70% target accuracy, while
the remaining subjects scored a target accuracy below the 70% limit, which
is assumed to be the minimal limit for useful BCI operation [121]. Note that a
target accuracy lower than the non-target accuracy is considered normal:
whenever the ratio between target and non-target words is small, the
classifier tends to weight non-target words more than target ones. When
using multiple iterations, instead, the score went up quickly for most of the
subjects, as shown in Fig. 4.3, which summarizes the selection accuracy as a
function of the iterations required by the SVM to perform classification, for
users 1, 5, 6, 8, 9 and 13.
Table 4.1: Classification accuracy, target accuracy and non-target accuracy for
auditory stimuli (stimulus duration 1500 ms, ISI 250 ms) within a
single run. Peak amplitude for the auditory condition is determined
as the maximum amplitude in the range from 0 to 800 ms.
Participant   Classification   Target         Non-target
              accuracy (%)     accuracy (%)   accuracy (%)
1             76.8             42.0           85.5
2             78.8             44.0           86.0
3             84.0             51.6           90.7
4             82.9             48.4           90.0
5             84.0             52.0           90.4
6             78.0             33.9           86.8
7             86.2             58.1           92.0
8             82.5             47.3           89.5
9             78.9             36.7           87.3
10            87.3             61.3           92.7
11            82.5             47.3           89.5
12            80.2             40.6           88.1
13            90.6             71.0           94.7
14            85.1             54.8           91.3
Mean          82.5             49.1           89.6
SD            3.9              9.7            2.6
Figure 4.3: Selection scores, for auditory stimuli (stimulus duration 1500 ms,
ISI 250 ms), plotted as a function of the number of iterations for
the users 1, 5, 6, 8, 9 and 13 (axes: Iterations, Selection accuracy %).
The average number of iterations needed to reach the 70% selection score is 5,
as shown in Fig. 4.4. The mean selection score for a single run is about 50%, as
shown in Fig. 4.4 and Table 4.1. The participants reached the 80% selection
score after ten iterations and 90% after fourteen iterations.
Figure 4.4: Mean selection accuracy, for auditory stimuli (stimulus duration
1500 ms, ISI 250 ms), plotted as a function of the number of iterations for
fourteen participants.
Fig. 4.5 shows the boxplot of the selection score for all iterations. On each
box, the central mark is the median and the edges of the box are the lower and
upper quartiles. When the lower quartile is considered, seven subjects are over
the 70% selection accuracy. Considering the median, six subjects are over the
70% selection accuracy. If the upper quartile is considered, instead, eleven
participants are over the selection score limit, and only subjects number 6 and
11 are below. Subject 6, with the minimum selection value, does not reach the
70% selection score. The maximum selection score achieved is 99% and the
minimum is 32.3%. Selection scores are comparable to those achievable with
visual and auditory P300 spellers [106].
4.5.2 ITR Performance
ITR performances are shown in Fig. 4.6 for six participants. When using
multiple iterations, the ITR for most subjects went down quickly.
This is a consequence of the reduction of the classification speed (V): since
the ISI is 250 ms and the stimulus duration is 1500 ms, each additional
iteration increases the classification time of 1750 ms by the number of trials
within each run (i.e. 7.5 s).
The worst ITR is 0.2 bits/min; the best result is 3.8 bits/min, which is
achieved by the seventh participant. The average number of iterations needed to
reach the 70% selection score is 5. At this iteration value, the ITR is 1.3
bits/min, as shown in Fig. 4.7. The mean ITR for one run is 2.4 bits/min, as
shown in Fig. 4.7. When the participants reach the 80% selection score after ten
iterations, the ITR is 1 bit/min. After fourteen iterations, which corresponds
to 90% selection accuracy, the ITR is 0.9 bits/min. Fig. 4.8 shows the boxplot
of the ITR for all iterations. The boxplot shows that seven subjects are, for
all iterations, below 1 bit/min and seven subjects are above this value. The
best ITR median is 1.8 bits/min, which is achieved by subject 13. Subject 6
shows the worst ITR performance.
Figure 4.5: Selection accuracy boxplot, for auditory stimuli (stimulus duration
1500 ms, ISI 250 ms), of all participants. Boxplot is evaluated with
all iterations.
Figure 4.6: ITR for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms),
plotted as a function of the number of iterations for the subjects 1,
5, 6, 8, 9 and 13 (axes: Iterations, ITR bits/min).
Figure 4.7: Mean ITR, for auditory stimuli (stimulus duration 1500 ms, ISI 250
ms), plotted as a function of the number of iterations for fourteen
participants (axes: Iterations, ITR bits/min).
Figure 4.8: ITR boxplot for auditory stimuli (stimulus duration 1500 ms, ISI
250 ms), of all participants. Boxplot is evaluated for all iterations
(axes: Participants, ITR bits/min).
ITR values are not high compared to visual and auditory P300 spellers [106].
ITR performance, as shown in Eqs. (4.4) and (4.5), depends on speed and
selection accuracy. Speed is related to stimulus duration and ISI. In the
present study, the time interval between the onset of one stimulus and the next
is 1.75 s, which is much longer than in visual and auditory P300 speller based
systems. This long time interval entails a lower ITR but a more natural way of
communicating with the user, because the subject does not have to pay attention
to different tones, timbres or pitches but to single words only. ITR results are
comparable to those achievable by auditory P300 in BCI [108, 109].
Chapter 5
Concluding Remarks
In this dissertation, a contribution to complex system dynamics modeling and
diagnosis is presented. With particular attention to real systems, several
applications are discussed as case studies. This concluding chapter summarizes
the results achieved by the solutions proposed in the previous chapters,
giving an insight into possible future works.
5.1 Modeling and Diagnosis of Electric Motor in a
Quality Control Scenario
The first contribution of the dissertation is the development of two data-driven
diagnostic modules which can be applied to detect faults and defects of electric
motors. The first one is based on stator current signals. An FDD solution is
proposed in order to model and diagnose the fault dynamics, which cannot be
described by analytical equations. The FDD procedure uses PCA in data
pre-processing to reduce the current space to two dimensions. The PDF of the
PCA-transformed signals is estimated by KDE. The PDFs are the models that can
be used to identify each fault and defect. Diagnosis has been carried out using
the K-L divergence, which measures the difference between two probability
distributions. This divergence is used as a distance measure between the
classified statistical signatures obtained by KDE. The results show that the
proposed data-driven diagnosis procedure is able to detect and diagnose
different induction motor faults and defects. The second one is based on
vibration signals. An FDD solution is proposed in order to model and diagnose
the fault dynamics, which cannot be described by analytical equations. The FDD
procedure used is MSPCA, which guarantees robustness and reliability of the
detection and diagnosis of defects. The identified signature is unique for each
defect, as proved by experiments on single-phase motors. The laboratory test
bench is composed of a high-performance programmable automation system for
monitoring. Nevertheless, after some analysis it was observed that, for the
considered machines, vibration phenomena arise at low frequencies. A low-cost
measurement system consisting of MEMS (Micro Electro-Mechanical Systems) sensors
is therefore used.
Possible future works on the FDD solutions for rotating electric machines
should be mainly focused on the improvement of fault diagnosis. In the case
of vibration signals, this could be achieved using the Wavelet Packet Transform
(WPT), which is an extension of the classical wavelet analysis applied in
MSPCA, together with a filter that chooses the detail and approximation scale
matrices obtained by the WPT so as to maximize the separation of the classes
related to the motor conditions. A possible solution is to use the Common
Spatial Pattern (CSP) as the filter. In the case of current signals, a possible
future work is the extension of the algorithm to an on-line FDD procedure, in
order to avoid one of the major drawbacks of the algorithm, namely its batch
data processing, which requires several current samples to be acquired before
the fault diagnosis procedure can run.
5.2 Modeling of Complex Systems with FDI and Prognosis Applications
The second contribution of the dissertation is the development of two solutions
for modeling and diagnosing two different complex systems: a paper mill plant
and a turbofan engine. These solutions are applied in order to monitor these
complex systems in FDI and prognostic contexts. Since these complex systems
consist of several coupled nonlinear systems, they cannot be modeled by using
a single model for all operating conditions. For this reason the proposed
solutions are sensitive to changes in the operating condition: when it changes,
the model obtained by data-driven procedures may not be reliable for FDI and
prognosis applications.
Future work on complex dynamics modeling should focus mainly on improving
the FDI and prognosis solutions in order to obtain
models more robust to changes of operating conditions. Furthermore, since
complex systems have a nonlinear behaviour, it seems natural to extend these
algorithms with kernel models.
5.3 Modeling and diagnosis of EEG signals in BCI Applications
The last contribution of the dissertation is the development of an auditory BCI
paradigm for systems based on P300 signals, which are generated by auditory
stimuli characterized by different sound typologies and locations. In order
to achieve this objective, EEG signals are modeled by data-driven approaches, since
this would not have been possible through analytical models. The main contribution
has been the development of an auditory BCI paradigm, which diagnoses the
user's intention using auditory stimuli such as common spoken words. Future
work should focus mainly on improving the diagnosis of the user's
intention. A possible solution could be the development of a hybrid modeling
solution for EEG signals based on data-driven approaches and the Kuramoto model,
since this model is able to reproduce synchronization phenomena, which could
be linked to the P300 signal, as recent findings indicate.
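To illustrate the synchronization phenomenon mentioned above, the following is a minimal Python sketch of the standard Kuramoto model of coupled phase oscillators, integrated with a simple Euler scheme. It is not the hybrid EEG model proposed as future work: the number of oscillators, the natural frequencies, the coupling strengths and the step size are hypothetical values chosen only to show that sufficiently strong coupling drives the order parameter r towards 1.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Return r in [0, 1]; r close to 1 means the phases are synchronized."""
    return abs(sum(cmath.exp(1j * theta) for theta in phases)) / len(phases)


def simulate_kuramoto(omegas, coupling, phases, dt=0.01, steps=2000):
    """Euler-integrate d(theta_i)/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i)."""
    phases = list(phases)
    n = len(phases)
    for _ in range(steps):
        # Mean-field identity: (1/N) * sum_j sin(theta_j - theta_i) = r * sin(psi - theta_i),
        # where r * exp(i * psi) is the average of exp(i * theta_j) over all oscillators.
        mean_field = sum(cmath.exp(1j * theta) for theta in phases) / n
        r, psi = abs(mean_field), cmath.phase(mean_field)
        phases = [theta + dt * (omega + coupling * r * math.sin(psi - theta))
                  for theta, omega in zip(phases, omegas)]
    return phases


rng = random.Random(1)
n_oscillators = 50                                          # assumed population size
omegas = [rng.gauss(0.0, 0.5) for _ in range(n_oscillators)]  # assumed natural frequencies
initial = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_oscillators)]

r_uncoupled = order_parameter(simulate_kuramoto(omegas, 0.0, initial))
r_coupled = order_parameter(simulate_kuramoto(omegas, 4.0, initial))
print(f"r without coupling: {r_uncoupled:.2f}, r with coupling: {r_coupled:.2f}")
```

In a hybrid scheme, quantities such as r could serve as model-based features alongside the data-driven ones, but how to relate them to the P300 response is left open here, as it is in the text.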
Bibliography
[1] A. Zecevic and D. Siljak, Control of Complex Systems: Structural Constraints and Uncertainty, ser. Communications and Control Engineering.
Springer, 2010.
[2] A. Giantomassi, Modeling estimation and identification of complex system
dynamics. LAP Lambert Academic Publishing, October 2012.
[3] C. Aldrich and L. Auret, Unsupervised Process Monitoring and Fault
Diagnosis with Machine Learning Methods, ser. Advances in Computer
Vision and Pattern Recognition. Springer, 2013.
[4] N. Birbaumer and L. Cohen, “Brain-computer interfaces: communication
and restoration of movement in paralysis,” The Journal of physiology, vol.
579, no. 3, pp. 621–636, 2007.
[5] D. J. McFarland and J. R. Wolpaw, “Brain-computer interfaces for communication and control,” Communications of the ACM, vol. 54, no. 5,
pp. 60–66, 2011.
[6] J. Wolpaw, N. Birbaumer, D. McFarland, G. Pfurtscheller, and
T. Vaughan, “Brain-computer interfaces for communication and control,”
Clinical Neurophysiology, vol. 6, no. 113, 2002.
[7] W. Thomson and M. Fenger, “Current signature analysis to detect induction motor faults,” IEEE Ind. Appl. Mag., vol. 7, no. 4, pp. 26–34,
2001.
[8] F. Ferracuti, A. Giantomassi, S. Iarlori, G. Ippoliti, and S. Longhi, “Induction motor fault detection and diagnosis using kde and kullback-leibler
divergence,” in Industrial Electronics Society, IECON 2013 - 39th Annual
Conference of the IEEE, 2013, pp. 2923–2928.
[9] F. Ferracuti, A. Giantomassi, and S. Longhi, “Mspca with kde thresholding to support qc in electrical motors production line,” in Manufacturing
Modelling, Management, and Control, vol. 7, no. 1, 2013, pp. 1542–1547.
[10] F. Ferracuti, A. Giantomassi, G. Ippoliti, and S. Longhi, “Multi-scale
pca based fault diagnosis for rotating electrical machines,” in European
89
Bibliography
Workshop on Advanced Control and Diagnosis, 8th ACD, Ferrara, Italy,
2010, pp. 296 – 301.
[11] M. Stuart, E. Mullins, and E. Drew, “Statistical quality control and improvement,” European Journla of Operational Research, vol. 88, pp. 203–
214, 1995.
[12] K. Linderman, R. G. Schroeder, S. Zaheer, and A. S. Choo, “Six sigma: a
goal-theoretic perspective,” Journal of Operations Management, vol. 21,
no. 2, pp. 193 – 203, 2003.
[13] M. El Hachemi Benbouzid, “A review of induction motors signature
analysis as a medium for faults detection,” IEEE Trans. Ind. Electron.,
vol. 47, no. 5, pp. 984–993, 2000.
[14] S. Nandi and H. Toliyat, “Condition monitoring and fault diagnosis of
electrical machines-a review,” in Industry Applications Conference, 1999.
Thirty-Fourth IAS Annual Meeting. Conference Record of the 1999 IEEE,
vol. 1, 1999, pp. 197–204 vol.1.
[15] N. Feki, G. Clerc, and P. Velex, “Gear and motor fault modeling and
detection based on motor current analysis,” Electric Power Systems Research, vol. 95, no. 0, pp. 28 – 37, 2013.
[16] Z. I. Botev, J. F. Grotowski, and D. P. Kroese, “Kernel density estimation
via diffusion,” Annals of Statistics, vol. 38, no. 5, pp. 2916–2957, 2010.
[17] M. P. Wand and M. C. Jones, “Multivariate plug-in bandwidth selection,”
Computational Statistics, vol. 9, pp. 97–116, 1994.
[18] I. T. Jolliffe, Principal component analysis.
Berlin: Springer, 2002.
[19] M. Manish, H. H. Yuea, S. J. Qin, and C. Lingb, “Multivariate process
monitoring and fault diagnosis by multi-scale PCA,” Comput. Chem.
Eng., vol. 26, pp. 1281–1293, 2002.
[20] E. Parzen, “On Estimation of a Probability Density Function and Mode,”
The Annals of Mathematical Statistics, vol. 33, no. 3, pp. 1065–1076,
1962.
[21] M. P. Wand and M. C. Jones, Kernel Smoothing.
CRC, Dec. 1994.
Chapman and Hall
[22] A. Mugdadi and I. A. Ahmad, “A bandwidth selection for kernel density
estimation of functions of random variables,” Computational Statistics &
Data Analysis, vol. 47, no. 1, pp. 49 – 62, 2004.
90
Bibliography
[23] D. Comaniciu, “An algorithm for data-driven bandwidth selection,” IEEE
T. Pattern Anal., vol. 25, no. 2, pp. 281–288, 2003.
[24] S. J. Sheather, “Density estimation,” Statist. Sci, pp. 588–597, 2004.
[25] S. Kullback and R. A. Leibler, “On information and sufficiency,” Annals
of Mathematical Statistics, vol. 22, pp. 49–86, 1951.
[26] J. Bangura, R. Povinelli, N. A. O. Demerdash, and R. Brown, “Diagnostics of eccentricities and bar/end-ring connector breakages in polyphase
induction motors through a combination of time-series data mining
and time-stepping coupled FE-state-space techniques,” IEEE Trans. Ind
Appl., vol. 39, no. 4, pp. 1005–1013, 2003.
[27] E. Keogh, “UCR time series data mining archive,” 2013. [Online].
Available: http://www.cs.ucr.edu/~eamonn/iSAX/iSAX.html
[28] Y. Fan and G. Zheng, “Research of high-resolution vibration signal detection technique and application to mechanical fault diagnosis,” Mechanical
Systems and Signal Processing, vol. 21, no. 2, pp. 678 – 687, 2007.
[29] B.-S. Yang and K. J. Kim, “Application of dempster-shafer theory in
fault diagnosis of induction motors using vibration and current signals,”
Mechanical Systems and Signal Processing, vol. 20, no. 2, pp. 403 – 420,
2006.
[30] F. Immovilli, A. Bellini, R. Rubini, and C. Tassoni, “Diagnosis of bearing
faults in induction machines by vibration or current signals: A critical
comparison,” Industry Applications, IEEE Transactions on, vol. 46, no. 4,
pp. 1350 –1359, july-aug. 2010.
[31] V. T. Tran, B.-S. Yang, M.-S. Oh, and A. C. C. Tan, “Fault diagnosis of
induction motor based on decision trees and adaptive neuro-fuzzy inference,” Expert Systems with Applications, vol. 36, no. 2, Part 1, pp. 1840
– 1849, 2009.
[32] N. Sawalhi and R. Randall, “Simulating gear and bearing interactions in
the presence of faults: Part I. the combined gear bearing dynamic model
and the simulation of localised bearing faults,” Mechanical Systems and
Signal Processing, vol. 22, no. 8, pp. 1924 – 1951, 2008.
[33] N. Sawalhi and R. Randall, “Simulating gear and bearing interactions
in the presence of faults: Part II: Simulation of the vibrations produced
by extended bearing faults,” Mechanical Systems and Signal Processing,
vol. 22, no. 8, pp. 1952 – 1966, 2008.
91
Bibliography
[34] P. Rodriguez, A. Belahcen, and A. Arkkio, “Signatures of electrical faults
in the force distribution and vibration pattern of induction motors,” Electric Power Applications, IEE Proceedings -, vol. 153, no. 4, pp. 523 –529,
july 2006.
[35] B. R. Bakshi, “Multiscale PCA with application to multivariate statistical
process monitoring,” AIChE Journal, vol. 44, pp. 1596–1610, 1998.
[36] M. Manish, H. H. Yuea, S. J. Qin, and C. Lingb, “Multivariate process monitoring and fault diagnosis by multi-scale PCA,” Computers &
Chemical Engineering, vol. 26, pp. 1281–1293, 2002.
[37] P. E. Odiowei and Y. Cao, “Nonlinear dynamic process monitoring using canonical variate analysis and kernel density estimations,” Industrial
Informatics, IEEE Transactions on, vol. 6, no. 1, pp. 36 –45, feb. 2010.
[38] J. E. Jackson, A User’s Guide to Principal Components.
Wiley-Interscience, 2003.
New York:
[39] J. Jackson and G. Mudholkar, “Control procedures for residuals associated with principal component analysis,” Technometrics, vol. 21, pp.
341–349, 1979.
[40] R. Dunia and S. Qin, “Joint diagnosis of process and sensor faults using
principal component analysis,” Control Engineering Practice, vol. 6, pp.
457–469, 1998.
[41] J. Yu, “Bearing performance degradation assessment using locality preserving projections,” Expert Systems with Applications, vol. 38, no. 6, pp.
7440 – 7450, 2011.
[42] J. Yu, “Bearing performance degradation assessment using locality preserving projections and gaussian mixture models,” Mechanical Systems
and Signal Processing, vol. 25, no. 7, pp. 2573 – 2588, 2011.
[43] J. Downs and E. Vogel, “A plant-wide industrial process control problem,” Computers & Chemical Engineering, vol. 17, no. 3, pp. 245 – 255,
1993.
[44] L. H. Chiang, E. L. Russel, and R. D. Braatz, Fault detection and diagnosis in industrial systems. Berlin: Springer, 2001.
[45] N. L. Ricker, “Decentralized control of the tennessee eastman challenge
process,” Journal of Process Control, vol. 6, no. 4, pp. 205 – 221, 1996.
[46] “Tennessee eastman process model,” 2002. [Online]. Available: http:
//depts.washington.edu/control/LARRY/TE/download.html
92
Bibliography
[47] X. Li, S. Dong, and Z. Yuan, “Discrete wavelet transform for tool breakage monitoring,” Int. Journal of machine tool manufacture, vol. 99, pp.
1944–1955, 1999.
[48] I. Daubechies, “Orthonormal bases of compactly supported wavelets,”
Communications on Pure and Applied Mathematics, vol. 41, pp. 909–
996, 1988.
[49] S. Mallat, “A theory for multiresolution signal decomposition: the
wavelet representation,” Pattern Analysis and Machine Intelligence,
vol. 11, pp. 674–693, 1989.
[50] J. Antonino-Daviu, M. Riera-Guasp, J. Roger-Folch, F. MartinezGimenez, and A. Peris, “Application and optimization of the discrete
wavelet transform for the detection of broken rotor bars in induction machines,” Applied and Computational Harmonic Analysis, vol. 21, pp. 268
–279, 2006.
[51] K. A. Ho, K. J. Tvarlapat, M. J. Piovoso, and R. Hajare, “A method of
robust multivariate outlier replacement,” Computer and Chemical Engineering, vol. 26, pp. 17–39, 2002.
[52] NI, “National instruments inc.” 2009. [Online]. Available:
//www.ni.com
http:
[53] F. Ferracuti, A. Giantomassi, S. Longhi, and N. Bergantino, “Multi-scale
pca based fault diagnosis on a paper mill plant,” in Emerging Technologies
Factory Automation (ETFA), 2011 IEEE 16th Conference on, 2011, pp.
1–8.
[54] A. Giantomassi, F. Ferracuti, A. Benini, S. Longhi, G. Ippoliti, and
A. Petrucci, “Hidden markov model for health estimation and prognosis of turbofan engines,” in ASME 2011 International Design Engineering Technical Conferences & Computers and Information in Engineering
Conference, 2011, pp. 1–6.
[55] H. Cheng, M. Nikus, and S. L. Jamsa-Jounela, “Evaluation of pca methods with improved fault isolation capabilities on a paper machine simulator,” Chemometrics and Intelligent Laboratory Systems, vol. 92, pp.
186–199, 2008.
[56] R. Isermann and P. Ballé, “Trends in the application of model-based
fault detection and diagnosis of technical processes,” Control Eng. Pract.,
vol. 5, pp. 709–719, 1997.
93
Bibliography
[57] R. J. Patton, F. J. Uppal, and C. J. Lopez-Toribio, “Soft computing approaches to fault diagnosis for dynamic systems: a survey,” in IFAC
Symposium on Fault Detection, Supervision and Safety dor Technical
Processes, Budapest, Hungary, 2000, pp. 298–311.
[58] V. Venkatasubramanian, R. Rengaswamy, and S. N. Kavuri, “A review
of process fault detection and diagnosis part I: quantitative model-based
methods,” Comput. Chem. Eng., vol. 27, pp. 293–311, 2000.
of process fault detection and diagnosis part II: qualitative models and
search strategies,” Comput. Chem. Eng., vol. 27, pp. 313–326, 2000.
of process fault detection and diagnosis part III: process history based
methods,” Comput. Chem. Eng., vol. 27, pp. 327–346, 2000.
[61] R. Isermann, Fault-Diagnosis Systems.
Berlin: Springer-Verlag, 2006.
[62] L. H. Chiang, E. L. Russell, and R. D. Braatz, “Fault diagnosis in chemical processes using fischer discriminant analysis, discriminant partial least
squares, and principal component analysis,” Chemometrics and Intelligent Laboratory Systems, vol. 50, pp. 243–252, 2000.
[63] A. Heng, S. Zhang, A. C. C. Tan, and J. Mathew, “Rotating machinery
prognostics: State of the art, challenges and opportunities,” Mechanical
Systems and Signal Processing, vol. 23, no. 3, pp. 724–739, April 2009.
[64] F. Camci and R. B. Chinnam, “Health-state estimation and prognostics
in machinery processes,” IEEE Transactions on Automation Science and
Engineering, vol. 7, no. 3, pp. 581–597, 2010.
[65] Y. Li, S. Billington, C. Zhang, T. Kurfess, S. Danyluk, and S. Liang,
“Adaptive prognostics for rolling element bearing condition,” Mechanical
Systems and Signal Processing, vol. 13, no. 1, pp. 103–113, January 1999.
[66] Y. Li, T. Kurfess, and S. Y. Liang, “Stochastic prognostics for rolling
element bearings,” Mechanical Systems and Signal Processing, vol. 14,
no. 5, pp. 747–762, September 2000.
[67] S. Goto, Y. Afachi, S. Katafuchi, T. Furue, Y. Uchida, M. Sueyoushi,
H. Hatazaki, and M. Nakamura, “On-line deterioration prediction and
residual life evaluation of rotating equipment based on vibration measurement,” in Proceedings of the SICE Conference, Japan, 2008, pp. 812–817.
94
Bibliography
[68] C. Ciandrini, M. Gallieri, A. Giantomassi, G. Ippoliti, and S. Longhi,
“Fault detection and prognosis methods for a monitoring system of rotating electrical machines,” in Industrial Electronics (ISIE), 2010 IEEE
International Symposium on, 2010, pp. 2085–2090.
[69] P. Wang and G. Vachtsevanos, “Fault prognostics using dynamic wavelet
neural networks,” Artificial Intelligence for Engineering Design, Analysis
and Manufacturing, vol. 15, no. 11, pp. 349–365, January 2002.
[70] W. Q. Wang, M. F. Golnaraghi, and F. Ismail, “Prognosis of machine
health condition using neuro-fuzzy systems,” Mechanical Systems and
Signal Processing, vol. 18, no. 4, pp. 813–831, July 2004.
[71] W. Wang, “An adaptive predictor for dynamic system forecasting,” Mechanical Systems and Signal Processing, vol. 21, no. 2, pp. 809–823,
February 2007.
[72] Y. Shao and K. Nezu, “Prognosis of remaining bearing life using neural
networks,” Proceedings of the institution of Mechanical Engineers, Part I:
Journal of Systems and Control Engineering, vol. 214, no. 3, pp. 217–230,
2000.
[73] M. Orchard, B. Wu, and G. Vachtsevanos, “A particle filter framework
for failure prognosis,” in Proceedings of the World Tribology Congress III,
Washington, 2005.
[74] S. Zhang, L. Ma, Y. Sun, and J. Mathew, “Asset health reliability estimation based on condition data,” in Proceedings of the 2nd WCEAM
and the 4th ICCM, Harrogate, UK, 2007, pp. 2195–2204.
[75] X. Zhang, R. Xu, C. Kwan, S. Y. Liang, Q. Xie, and L. Haynes, “An integrated approach to bearing fault diagnostics and prognostics,” in Proceedings of American Control Conference, Portland OR, USA, 2005, pp.
2750–2755.
[76] C. Kwan, X. Zhang, R. Xu, and L. Haynes, “A novel approach to fault
diagnostics and prognostics,” in Proceedings of the IEEE International
Conference on Robotics and Automation, Taipei, Taiwan, 2003, pp. 604–
609.
[77] H. Ocak and K. A. Loparo, “A new bearing fault detection and diagnosis
scheme based on hidden markov modeling of vibration signals,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and
Signal Processing ICASSP ’01, Taipei, Taiwan, 2001, pp. 3141–3144.
95
Bibliography
[78] A. Saxena, K. Goebel, D. Simon, and N. Eklund, “Damage propagation
modeling for aircraft engine run-to-failure simulation,” in International
Conference on Prognostics and Health Management PHM’08, Denver CO,
USA, 2008.
[79] I. Y. Tumer and A. Bajwa, “A survey of aircraft engine health monitoring
systems,” in Joint Propulsion Conference, Los Angeles CA, USA, 1999.
[80] M. Kurosaki, T. Morioka, K. Ebina, M. Maruyama, T. Yasuda, and
M. Endoh, “Fault detection and identification in an IM270 gas turbine
using measurements for engine control,” Journal of Engineering for Gas
Turbines and Power, vol. 126, no. 4, pp. 726–732, October 2004.
[81] H. A. S. III and H. Brown, “Control of jet engines,” Control Engineering
Practice, vol. 7, pp. 1043–1059, 1999.
[82] S. Vittal, P. Hajela, and A. Joshi, “Review of approaches to gas turbine
life management,” in Proceeding of 10th AIAA/ISSMO Multidisciplinary
Analysis and Optimization Conference, New York , NY, USA, 2004.
[83] S. Chatterjee and J. Litt, “Online model parameter estimation of jet
engine degradation for autonomous propulsion control,” in NASA, Technical Manual TM2003-212608, 2003.
[84] L. R. Rabiner, “A tutorial on hidden markov models and selected applications in speech recognition,” Proceedings of the IEEE, vol. 77, no. 2,
pp. 257–286, February 1989.
[85] L. A. Liporace, “Maximum likelihood estimation for multivariate observations of markov sources,” IEEE Trans. Informat. Theory, vol. 28, no. 5,
pp. 729–734, September 1982.
[86] B. H. Juang, S. E. Levinson, and M. M. Sondhi, “Maximum likelihood
estimation for multivariate mixture observations of markov chains,” IEEE
Trans. Informat. Theory, vol. 32, no. 2, pp. 307–309, March 1986.
[87] L. E.Baum and J. A. Egon, “An inequality with applications to statistical
estimation for probabilistic functions of markov process and to a model
for ecology,” Bull. Amer. Meteorol. Soc., vol. 73, pp. 360–363, 1967.
[88] A. J. Viterbi, “Error bounds for convolutional codes and an asymptotically optimal decoding algorithm,” IEEE Trans. Informat. Theory,
vol. 13, no. 2, pp. 260–269, April 1967.
[89] L. E.Baum, “An inequality and associated maximization technique in
statistical estimation for probabilistic functions of markov processes,”
Inequalities, vol. 73, pp. 1–8, 1972.
96
Bibliography
[90] G. D. Forney, “The viterbi algorithm,” Proceedings of the IEEE, vol. 61,
no. 3, pp. 268–278, March 1973.
[91] G. Schwarz, “Estimating the dimension of a model,” Annals of Statistics,
vol. 6, no. 2, pp. 461–464, march 1978.
[92] Y. Kuramoto, Chemical oscillations, waves, and turbulence, ser. Chemistry Series. Dover Publications, 2003, originally published: Springer
Berlin, New York, Heidelberg, 1984.
[93] A. Lenhardt, M. Kaper, and H. Ritter, “An adaptive P300-based online
brain-computer interface,” Neural Systems and Rehabilitation Engineering, IEEE Transactions on, vol. 16, no. 2, pp. 121–130, 2008.
[94] C. Guger, S. Daban, E. Sellers, C. Holzner, G. Krausz, R. Carabalona,
F. Gramatica, and G. Edlinger, “How many people are able to control a
P300-based brain-computer interface (BCI)?” Neuroscience letters, vol.
462, no. 1, pp. 94–98, 2009.
[95] A. Calvo, A. Chiò, E. Castellina, F. Corno, L. Farinetti, P. Ghiglione,
V. Pasian, and A. Vignola, “Eye tracking impact on quality-of-life of als
patients,” in Computers Helping People with Special Needs, ser. Lecture
Notes in Computer Science, K. Miesenberger, J. Klaus, W. Zagler, and
A. Karshmer, Eds. Springer Berlin Heidelberg, 2008, vol. 5105, pp.
70–77.
[96] J. Van Erp, “Presenting directions with a vibrotactile torso display,”
Ergonomics, vol. 48, no. 3, pp. 302–313, 2005.
[97] M. Thurlings, J. Erp, A. Brouwer, and P. Werkhoven, “EEG-based navigation from a human factors perspective,” Brain-Computer Interfaces,
pp. 71–86, 2010.
[98] A. Brouwer and J. Van Erp, “A tactile P300 brain-computer interface,”
Frontiers in neuroscience, vol. 4, 2010.
[99] M. van der Waal, M. Severens, J. Geuze, and P. Desain, “Introducing the
tactile speller: an ERP-based brain-computer interface for communication,” Journal of Neural Engineering, vol. 9, no. 4, 2012.
[100] F. Ferracuti, A. Freddi, S. Iarlori, S. Longhi, and P. Peretti, “Auditory
paradigm for a p300 bci system using spatial hearing,” in Intelligent
Robots and Systems (IROS), 2013 IEEE/RSJ International Conference
on, 2013, pp. 871–876.
97
Bibliography
[101] T. Hinterberger, N. Neumann, M. Pham, A. Kübler, A. Grether, N. Hofmayer, B. Wilhelm, H. Flor, and N. Birbaumer, “A multimodal brainbased feedback and communication system,” Experimental Brain Research, vol. 154, pp. 521–526, 2004.
[102] F. Nijboer, A. Furdea, I. Gunst, J. Mellinger, D. J. McFarland, N. Birbaumer, and A. Kübler, “An auditory brain-computer interface (BCI),”
Journal of Neuroscience Methods, vol. 167, no. 1, pp. 43–50, 2008.
[103] D.-W. Kim, H.-J. Hwang, J.-H. Lim, Y.-H. Lee, K.-Y. Jung, and C.H. Im, “Classification of selective attention to auditory stimuli: Toward
vision-free brain-computer interfacing,” Journal of Neuroscience Methods, vol. 197, no. 1, pp. 180 – 185, 2011.
[104] N. Hill, T. Lal, K. Bierig, N. Birbaumer, and B. Schölkopf, “An auditory
paradigm for brain–computer interfaces,” Advances in neural information
processing systems, pp. 569–576, 2004.
[105] E. W. Sellers and E. Donchin, “A P300-based brain-computer interface:
Initial tests by ALS patients,” Clinical Neurophysiology, vol. 117, no. 3,
pp. 538–548, 2006.
[106] A. Furdea, S. Halder, D. Krusienski, D. Bross, F. Nijboer, N. Birbaumer,
and A. Kübler, “An auditory oddball (P300) spelling system for braincomputer interfaces,” Psychophysiology, vol. 46, no. 3, pp. 617–625, 2009.
[107] L. Farwell and E. Donchin, “Talking off the top of your head: toward
a mental prosthesis utilizing event-related brain potentials,” Electroencephalography and Clinical Neurophysiology, vol. 70, no. 6, pp. 510–523,
1988.
[108] D. Klobassa, T. Vaughan, P. Brunner, N. Schwartz, J. Wolpaw, C. Neuper, and E. Sellers, “Toward a high-throughput auditory P300-based
brain-computer interface,” Clinical neurophysiology: official journal of
the International Federation of Clinical Neurophysiology, vol. 120, no. 7,
p. 1252, 2009.
[109] M. Schreuder, B. Blankertz, and M. Tangermann, “A new auditory multiclass brain-computer interface paradigm: spatial hearing as an informative cue,” PLoS One, vol. 5, no. 4, p. e9813, 2010.
[110] R. Sonnadara, C. Alain, L. Trainor et al., “Effects of spatial separation
and stimulus probability on the event-related potentials elicited by occasional changes in sound location,” Brain research, vol. 1071, no. 1, pp.
175–185, 2006.
98
Bibliography
[111] W. M. Hartmann, “Localization of Sound in Rooms,” J. Acoust. Soc.
Amer., vol. 74, no. 5, pp. 1380–1391, 1983.
[112] J. Blauert, Spatial Hearing - Revised Edition: The Psychophysics of Human Sound Localization. MIT Press, 1996.
[113] J. S. Bradley and G. A. Soulodre, “The Influence of Late Arriving Energy
on Spatial Impression,” J. Acoust. Soc. Amer., vol. 97, no. 4, pp. 2263–
2271, 1995.
[114] M. Gardner and K. Martin, “HRTF Measurements of a KEMAR DummyHead Microphone,” MIT Media Lab, Tech. Rep., 1994.
[115] F. Sharbrough, G. E. Chatrian, R. P. Lesser, H. Luders, M. Nuwer, and
T. W. Picton, “American Electroencephalographic Society guidelines for
standard electrode position nomenclature,” J. Clin. Neurophysiol., vol. 8,
pp. 200–202, 1991.
[116] G. Schalk, D. McFarland, T. Hinterberger, N. Birbaumer, and J. Wolpaw,
“BCI2000: a general-purpose brain-computer interface (BCI) system,”
IEEE Transactions on Biomedical Engineering, vol. 51, no. 6, pp. 1034–
1043, 2004.
[117] D. J. McFarland, L. M. McCane, S. V. David, and J. R. Wolpaw, “Spatial
filter selection for EEG-based communication,” Electroencephalography
and Clinical Neurophysiology, vol. 103, no. 3, pp. 386–394, 1997.
[118] V. N. Vapnik, Statistical learning theory, 1st ed.
Wiley, Sep. 1998.
[119] V. Vapnik, “An overview of statistical learning theory,” Neural Networks,
IEEE Transactions on, vol. 10, no. 5, pp. 988 –999, sep 1999.
[120] J. Wolpaw, H. Ramoser, D. McFarland, and G. Pfurtscheller, “EEGbased communication: improved accuracy by response verification,”
IEEE Transactions on Rehabilitation Engineering, vol. 6, no. 3, pp. 326
–333, 1998.
[121] A. Kübler, N. Neumann, B. Wilhelm, T. Hinterberger, and N. Birbaumer, “Predictability of brain-computer communication,” Journal of
Psychophysiology, vol. 18, no. 2, pp. 121–129, 2004.
99
