Chapter 4 Modeling and diagnosis of EEG signals in BCI Applications
Università Politecnica delle Marche
Scuola di Dottorato di Ricerca in Scienze dell’Ingegneria
Curriculum in Ingegneria Informatica Gestionale e dell’Automazione

Modeling and Diagnosis of Complex Systems Dynamics by Data-Driven Approaches

Ph.D. Dissertation of: Francesco Ferracuti
Advisor: Prof. Giuseppe Orlando
Coadvisor: Prof. Gianluca Ippoliti
Curriculum Supervisor: Prof. Sauro Longhi
XII edition - new series

Università Politecnica delle Marche, Scuola di Dottorato di Ricerca in Scienze dell’Ingegneria, Facoltà di Ingegneria, Via Brecce Bianche – 60131 Ancona (AN), Italy

To Arianna, for her support and understanding

Acknowledgments
I would like to thank Prof. Longhi for giving me the chance to investigate different aspects of research while granting me freedom of choice, for his significant contribution throughout my Ph.D. course, and for the time and effort he devoted to my research activity. I also wish to thank Dr. Gianluca Ippoliti, whose support was decisive for the results obtained in these years. My thanks to Prof. Orlando for his cordiality and availability. Finally, special thanks go to all my colleagues and friends, whose company has made academic work enjoyable.

Ancona, January 2013
Francesco Ferracuti

Abstract
Complex systems are found in almost all fields of contemporary science, and are associated with a wide variety of financial, physical, biological, information and social systems.
Complex systems consist of a large number of nonlinearly interacting components that display collective behaviour which does not follow trivially from the behaviours of the individual parts. Although complex systems have many different properties, the most important are dimensionality, uncertainty, nonlinearity and coupling between components. The procedures to obtain analytical models are usually classified into physical modeling and identification. These procedures cannot be applied easily to complex systems, because the properties of these systems make modeling difficult. Since a mathematical model is a description of system behaviour, accurate modeling of a complex system is very difficult to achieve in practice. Furthermore, it may sometimes even be impossible to describe the system by analytical equations. The present dissertation addresses two issues regarding the modeling and diagnosis of complex systems. The first is modeling a complex system when an analytical model is not obtainable. The second is diagnosing the system behaviour. Diagnosis should detect whether the complex system is operating normally or a change is occurring due to abnormal events and, in addition, should identify the probable causes of those abnormal events. Modeling of complex systems is addressed by developing data-driven procedures that learn the complex system dynamics from data provided by sensors installed on the system to monitor its physical variables. Diagnosis of complex systems is addressed by developing machine learning procedures that classify the probable causes of deviations from normal system behaviour. Considerable attention is paid to the modeling and diagnosis of real systems; for this reason, several applications are discussed as case studies.
The first application deals with modeling and diagnosing defects and faults in a Quality Control scenario for electric motors. The second application deals with modeling and diagnosing a complex industrial system, namely a paper mill plant. The third application deals with estimating the residual useful life of a turbofan engine, and the last deals with modeling Electroencephalography signals by data-driven algorithms in order to diagnose the user's intentions. This solution addresses the modeling problem in the Brain Computer Interface context. Since the modeling and diagnosis problem is tackled by data-driven procedures, the developed algorithms can be applied to a wide class of rotating electrical machines and complex industrial systems, and not only to those mentioned.

Summary
Complex systems can be found in almost all fields of contemporary science, and can be of different natures: financial, physical, biological, informational, social, etc. Complex systems consist of a large number of components that interact nonlinearly with one another and exhibit a collective behaviour that does not derive simply from the behaviour of the individual parts. Although such systems have numerous properties, the most important are dimensionality, uncertainty, nonlinearity and coupling between components. The procedures for obtaining analytical models of given systems are usually classified into physical modeling and identification. These procedures can be difficult to implement when applied to complex systems, because the peculiar characteristics of such systems make modeling difficult. Since a mathematical model is a description of the behaviour of a system, accurate modeling of complex systems is very difficult to achieve in practice.
Moreover, it may sometimes even be impossible to describe the system through analytical equations. In light of this, the present work sets out to address two problems concerning the modeling and diagnosis of complex systems: the first concerns specifically the modeling of a complex system when an analytical model is not obtainable; the second refers to the diagnosis of the system behaviour. The latter activity should detect whether the complex system is normal or whether a change due to anomalous events is taking place, as well as the probable causes of such events. Modeling of complex systems is addressed by developing data-driven methods, which are capable of learning the dynamics of the complex system directly from the data provided by sensors installed on the system in order to monitor its physical variables. Diagnosis of complex systems is instead addressed by developing machine learning methods so as to classify the probable causes of deviation from normal system events. In the work, ample attention is devoted to the problem of modeling and diagnosing complex systems with reference to real systems, presenting several practical applications and case studies. The first case study concerns the modeling and diagnosis of defects and faults of electric motors in a quality control scenario, while the second refers to a complex industrial system, namely a paper mill. The third case addresses the estimation of the remaining useful life of a turbofan engine, and the last deals with the problem of modeling electroencephalographic signals through data-based algorithms.
Since the modeling and diagnosis problem is addressed through data-based procedures, the developed algorithms can be applied to a wide class of rotating electrical machines and complex industrial systems, and not only to those reported.

Contents
1 Introduction
1.1 Data-Driven approach to Modeling and Diagnosis of Complex Dynamic Systems
1.1.1 Data-Driven Modeling of Complex Systems Dynamics
1.1.2 Data-Driven Diagnosis of Complex Systems Dynamics
1.2 Data-Driven Modeling and Diagnosis of Complex Systems Dynamics: Applications
1.2.1 Rotating Machines
1.2.2 Industrial Systems
1.2.3 Brain Computer Interface
1.3 Thesis Outline
2 Modeling and Diagnosis of Electric Motors in a Quality Control Scenario
2.1 Introduction
2.2 Quality Control
2.3 Data-Driven Modeling and Diagnosis of Induction Motor by Current Signals
2.3.1 Recalled Results
2.3.2 Developed Algorithm
2.3.3 Case Study
2.3.4 Results
2.4 Data-Driven Modeling and Diagnosis by Vibration Signals
2.4.1 Recalled Results
2.4.2 MSPCA Formulation
2.4.3 Developed Experimental Setup
2.4.4 Results
3 Modeling of Complex Systems with FDI and Prognosis Applications
3.1 Introduction
3.2 Modeling and Diagnosis of a Paper Mill Plant: FDI Application
3.2.1 Recalled Results
3.2.2 Case Study
3.2.3 Results
3.3 Modeling and diagnosis of a Turbofan Engine: Prognosis Application
3.3.1 Problem Definition and Process Model
3.3.2 Hidden Markov Model and Prognosis Procedure
3.3.3 Features Extraction
3.3.4 Implementation and Results
4 Modeling and diagnosis of EEG signals in BCI Applications
4.1 Introduction
4.2 Auditory BCI
4.3 Spatial Audio
4.4 Testing Methodologies
4.4.1 Participants
4.4.2 Data Acquisition
4.4.3 EEG Signals Modeling
4.4.4 Information Transfer Rate
4.5 Results
4.5.1 Classification Performance
4.5.2 ITR Performance
5 Concluding Remarks
5.1 Modeling and Diagnosis of Electric Motor in a Quality Control Scenario
5.2 Modeling of Complex Systems with FDI and Prognosis Applications
5.3 Modeling and diagnosis of EEG signals in BCI Applications

List of Figures
2.1 Efficiency characterization of tested induction motors
2.2 Interpolated PDFs of a finite element healthy motor in the two-dimensional principal components space estimated by KDE.
2.3 Interpolated PDFs of a finite element motor with one broken bar in the two-dimensional principal components space estimated by KDE.
2.4 Interpolated PDFs of a finite element motor with one broken connector in the two-dimensional principal components space estimated by KDE.
2.5 K-L divergence in the case of a finite element healthy motor
2.6 K-L divergence in the case of a finite element motor with one broken bar
2.7 K-L divergence in the case of a finite element motor with one broken connector
2.8 Interpolated PDFs of real healthy motors in the two-dimensional principal components space estimated by KDE.
2.9 Interpolated PDFs of real motors with cracked rotor in the two-dimensional principal components space estimated by KDE.
2.10 Interpolated PDFs of real motors with wrong rotor in the two-dimensional principal components space estimated by KDE.
2.11 K-L divergence in the case of real healthy motors
2.12 K-L divergence in the case of real motors with cracked rotor
2.13 K-L divergence in the case of real motors with wrong rotor
2.14 Single-phase 25W motor for kitchen hoods mounted on pallet; PCB accelerometers are installed on pallet.
2.15 Impeller with backlash: contribution weights of approximation matrix A3
2.16 Misaligned impeller: contribution weights of approximation matrix A3
2.17 Misaligned impeller: contribution weights of D3 scale matrix.
2.18 Misaligned impeller: contribution weights of D2 scale matrix.
2.19 Misaligned impeller: contribution weights of D1 scale matrix.
2.20 Unbalanced impeller: contribution weights of approximation matrix A3
2.21 Contribution weights of A7 scale matrix for the first accelerometer in the case of unbalanced impeller.
2.22 Contribution weights of A7 scale matrix for the first accelerometer in the case of unbalanced impeller.
3.1 Simulated abrupt fault. (a) SPE of reconstructed PCA; (b) SPE of approximation matrix A3
3.2 Simulated abrupt fault. (a) SPE contribution of approximation matrix A3; (b) SPE contribution of reconstructed PCA
3.3 Simulated harmonic fault. SPE of detail matrix D1
3.4 Simulated harmonic fault. SPE contribution of detail matrix D1
3.5 Signals of broke handling with fault. (a) Sensor 5, inlet pressure; (b) Sensor 13, motor broke deflacking current
3.6 Real fault. (a) SPE of reconstructed PCA in logarithmic scale; (b) SPE of approximation matrix A3 in logarithmic scale
3.7 Real fault. SPE contribution of reconstructed PCA
3.8 Real fault. (a) SPE contribution of approximation matrix A3; (b) SPE contribution of detail matrix D3
3.9 Real fault. (a) SPE contribution of detail matrix D2; (b) SPE contribution of detail matrix D1
3.10 Simplified diagram of turbofan engine
3.11 Fault progression process described by a HMM
3.12 Faultless features extracted from ANN training data
3.13 Bayesian information criterion of fan engine 1
3.14 Cluster of training data
3.15 (a) Turbofan estimated RUL in presence of FAN fault; (b) Turbofan health states sequence in presence of FAN fault
3.16 (a) Turbofan estimated RUL in presence of HPT fault; (b) Turbofan health states sequence in presence of HPT fault
3.17 (a) Turbofan estimated RUL in presence of HPC fault; (b) Turbofan health states sequence in presence of HPC fault
3.18 (a) Turbofan estimated RUL in presence of LPT fault; (b) Turbofan health states sequence in presence of LPT fault
4.1 Spatial hearing
4.2 Electrode set for recording and analysis
4.3 Selection scores, for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), plotted as a function of the number of iterations for the users 1, 5, 6, 8, 9 and 13.
4.4 Mean selection accuracy, for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), plotted as a function of the number of iterations for fourteen participants.
4.5 Selection accuracy boxplot, for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), of all participants. Boxplot is evaluated with all iterations.
4.6 ITR for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), plotted as a function of the number of iterations for the subjects 1, 5, 6, 8, 9 and 13.
4.7 Mean ITR, for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), plotted as a function of the number of iterations for fourteen participants.
4.8 ITR boxplot for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), of all participants. Boxplot is evaluated for all iterations.

List of Tables
2.1 Classification accuracy in the case of finite element motor
2.2 Classification accuracy in the case of real motors
2.3 TEP results with improved detection index
2.4 FDD results for motors installed in kitchen hoods.
3.1 Stock preparation sensors
3.2 Fault diagnosis with simulated abrupt fault.
3.3 Fault diagnosis with simulated harmonic fault.
3.4 Fault diagnosis with real fault.
3.5 C-MAPSS inputs
3.6 C-MAPSS outputs
4.1 Classification accuracy, target accuracy and non-target accuracy for auditory stimuli

Chapter 1 Introduction

Complex systems are found in almost all fields of contemporary science. Typically, complex systems belong to different classes, such as natural (e.g., biological evolution), financial (e.g., stock markets and economies), social and industrial systems. Complex systems consist of a large number of nonlinearly interacting components, often called agents, which display collective behaviour that does not follow trivially from the behaviours of the individual parts. These systems are open: they exchange information with the environment and constantly modify their internal structure and activity patterns in a self-organization process [1]. Complex systems have several characteristic properties of structure and behaviour.
In the case of complex industrial systems, one of these properties is the hierarchical structure, which is related to the multilevel organization of the system. Another property is the strong coupling between the agents of the complex system. Computational problems arising from dimensionality, and uncertainty, are other typical characteristics of complex systems. Technological advances in the process industries in recent years have resulted in increasingly complicated systems, processes and products, which pose considerable challenges in their modeling, analysis, design, manufacturing and management for successful operation and use over their life cycles. This increasing complexity motivates the search for more efficient mathematical frameworks able to model these systems. A model is a mathematical representation of a physical, biological or information system. Models of real systems are of fundamental importance in virtually all disciplines. Models can serve different purposes, such as analysis (i.e., gaining a better understanding of the system), prediction or simulation of a system's behaviour, and control. In engineering, models are required for the design of new processes and for the analysis of existing processes [2]. The procedures to obtain analytical models are usually classified into physical modeling and identification. However, in many contexts (such as industrial, natural and financial ones) developing a mathematical model of the process is a difficult task due to the complexity of the processes involved. Modeling may demand considerable engineering effort and thus become impractical for complex processes. Modeling complex systems with physical models can be very difficult or impossible; on the other hand, mathematical models that do describe the dynamics of complex systems may not be easy to manipulate.
Complex systems for which an analytical model is not available can be modeled by data-driven approaches. Data-driven approaches, simply speaking, are based on data matrices, which usually contain measurements of physical process variables, and on computational intelligence and machine learning algorithms. Machine learning is a form of automatic computing that learns models from data. In particular, it is the study of computational methods and algorithms that improve the performance of machines by automating the acquisition of information and knowledge from experience or data [3]. Machine learning paradigms include artificial neural networks (e.g., multilayer perceptrons, self-organizing maps, radial basis function neural networks); instance-based learning (e.g., case-based reasoning, nearest neighbour methods); rule induction; genetic algorithms, where knowledge is typically represented by Boolean features, sometimes as the conditions and actions of rules; statistics; and analytical learning. Machine learning algorithms are of fundamental importance in virtually all disciplines, with many applications in engineering contexts. In this field, the modeling objective for complex systems is often to monitor the processes in order to evaluate and analyze the performance of these systems and diagnose their behaviour. The strong coupling between processes is the cause of cascading failures in complex systems such as industrial processes: a fault in one or more agents can propagate to other agents, leading to cascading faults that may drive the whole system into failure, with potentially catastrophic consequences for its functioning. A basic diagnosis should detect whether the complex system is operating normally or a change is occurring due to abnormal events.
This objective is related to Fault Detection and Isolation applications, in which the systems are modeled by model-based or data-driven approaches in order to detect the location and time of a fault. In addition, the diagnosis should identify the probable causes of the fault, so that appropriate supervisory control decisions and actions can be taken to bring the process back to a normal, safe operating state.

1.1 Data-Driven approach to Modeling and Diagnosis of Complex Dynamic Systems

In this section a brief description of the modeling and diagnosis issues is presented, with some details and considerations concerning the described case studies.

1.1.1 Data-Driven Modeling of Complex Systems Dynamics

Data-driven modeling does not require mathematical knowledge of the system, but it needs data matrices that are representative of the system dynamics. The data-driven construction of models for complex systems can be cast in a general framework consisting of a number of different mathematical units. The first is a data matrix representing the process or system being monitored. The data are then processed to extract a relevant feature matrix from the data matrix. Finally, machine learning algorithms are applied to the feature matrix to build a data-driven model that describes the dynamics of the considered complex system. In particular, the model must be able to simulate and predict the system dynamics and to generalize the dynamics learnt from the feature matrix used for the modeling. The data matrix usually contains measurements of physical process variables, which can be of different kinds: discrete, logical, acquired at different sampling frequencies, and also qualitative.
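The three units of this framework (data matrix, feature matrix, learned model) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the synthetic three-sensor data, the window mean and RMS features, the choice of PCA as the learned model, and the helper names `build_feature_matrix`, `fit_pca_model`, `squared_prediction_error` and `simulate_sensors`. The squared prediction error is used here only as a simple indicator of how far new data fall from the learned model.

```python
import numpy as np


def build_feature_matrix(data, window):
    """Turn a data matrix (samples x sensors) into a feature matrix.

    Each sensor is cut into non-overlapping windows; the per-window mean
    and RMS are used as features (an illustrative choice only).
    """
    n_samples, n_sensors = data.shape
    n_windows = n_samples // window
    blocks = data[: n_windows * window].reshape(n_windows, window, n_sensors)
    mean = blocks.mean(axis=1)
    rms = np.sqrt((blocks ** 2).mean(axis=1))
    return np.hstack([mean, rms])  # shape: (n_windows, 2 * n_sensors)


def fit_pca_model(features, n_components):
    """Learn a PCA model of normal behaviour from a feature matrix."""
    mu = features.mean(axis=0)
    sigma = features.std(axis=0)
    sigma[sigma == 0] = 1.0  # avoid division by zero for constant features
    z = (features - mu) / sigma
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return {"mu": mu, "sigma": sigma, "loadings": vt[:n_components].T}


def squared_prediction_error(model, features):
    """Residual energy of each feature vector outside the PCA subspace."""
    z = (features - model["mu"]) / model["sigma"]
    p = model["loadings"]
    residual = z - z @ p @ p.T
    return (residual ** 2).sum(axis=1)


def simulate_sensors(rng, n_samples, bias=0.0):
    """Three correlated synthetic sensors; `bias` offsets the second one."""
    latent = rng.normal(size=(n_samples, 1))
    data = latent * np.array([1.0, 0.8, -0.6])
    data += 0.05 * rng.normal(size=data.shape)
    data[:, 1] += bias
    return data


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    healthy = build_feature_matrix(simulate_sensors(rng, 2000), window=20)
    faulty = build_feature_matrix(simulate_sensors(rng, 400, bias=1.0), window=20)
    model = fit_pca_model(healthy, n_components=2)
    print("median SPE, healthy:", np.median(squared_prediction_error(model, healthy)))
    print("median SPE, faulty: ", np.median(squared_prediction_error(model, faulty)))
```

In this toy setting the biased sensor breaks the correlation learned from the healthy data, so its feature vectors fall outside the retained subspace and the error grows. In practice, the features and the learning algorithm would be chosen according to the system at hand.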
Data-driven models can capture the dynamics of a particular system or process if a considerable amount of data describing the dynamics is available and if the modeled system does not change considerably during the period covered by the model. The latter limitation may be overcome by using several data-driven models that together cover the whole dynamics of the system. In this dissertation the modeling issue is tackled by data-driven approaches for different complex systems. The choice of the artificial intelligence algorithms and of the relevant features for the considered system is problem-dependent. For example, in the case of FDI for a paper mill plant, which is a critical process monitored by many sensors, the Principal Component Analysis algorithm is considered because it is computationally fast and describes the correlation between process variables easily. In the case of FDI for electric motors, the Wavelet Transform is considered because the faults can be described by vibration signals in the frequency and time domains. Moreover, the solution based on Kernel Density Estimation and Kullback-Leibler divergence, described in section 2.3, is able to map the frequency features of electrical signals into a 2-D space.

1.1.2 Data-Driven Diagnosis of Complex Systems Dynamics

In the context of complex systems, the objective of diagnosis is to determine the system conditions and behaviours from the measurements of physical variables and the model provided by data-driven modeling. Data-driven diagnosis is linked to the concept of classification. In machine learning theory a distinction is made between supervised and unsupervised classification. In unsupervised learning problems, unlabelled data are used, and the grouping of the data is determined by a cost function to be minimized. Supervised algorithms generally give better classification results, but they require more information than unsupervised classifiers.
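The difference between the two settings can be shown with a toy sketch. It is an illustration under stated assumptions, not the classifiers used in this dissertation: a nearest-centroid classifier stands in for the supervised case, k-means (which reduces the within-cluster sum of squares) stands in for the unsupervised case, and the function names and synthetic two-cluster data are hypothetical.

```python
import numpy as np


def pairwise_distances(x, centres):
    """Euclidean distance from every row of x to every row of centres."""
    return np.linalg.norm(x[:, None, :] - centres[None, :, :], axis=2)


def fit_nearest_centroid(x, labels):
    """Supervised: the labels decide which samples form each class centroid."""
    classes = np.unique(labels)
    centres = np.stack([x[labels == c].mean(axis=0) for c in classes])
    return classes, centres


def predict_nearest_centroid(classes, centres, x):
    """Assign each sample the label of its closest class centroid."""
    return classes[pairwise_distances(x, centres).argmin(axis=1)]


def kmeans(x, k, n_iter=100):
    """Unsupervised: no labels; reduce the within-cluster sum of squares."""
    # Deterministic farthest-point initialisation (illustrative choice).
    centres = x[:1].astype(float)
    while len(centres) < k:
        nearest = pairwise_distances(x, centres).min(axis=1)
        centres = np.vstack([centres, x[nearest.argmax()]])
    for _ in range(n_iter):
        assign = pairwise_distances(x, centres).argmin(axis=1)
        updated = np.array([
            x[assign == j].mean(axis=0) if np.any(assign == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(updated, centres):
            break
        centres = updated
    assign = pairwise_distances(x, centres).argmin(axis=1)
    cost = float(((x - centres[assign]) ** 2).sum())
    return assign, centres, cost


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(5.0, 0.5, (50, 2))])
    labels = np.repeat([0, 1], 50)
    classes, centres = fit_nearest_centroid(x, labels)
    predicted = predict_nearest_centroid(classes, centres, x)
    print("supervised accuracy:", (predicted == labels).mean())
    assign, _, cost = kmeans(x, k=2)
    print("k-means cost:", cost)
```

The supervised routine needs `labels` to be fitted, whereas `kmeans` sees only `x`; its cluster indices are arbitrary and have to be matched to physical conditions afterwards.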
In supervised learning, the training data consist of a set of examples, each of which is a pair comprising an input vector and an output vector, so labelled data are required to train supervised classifiers. If the output is continuous, a regression function is learnt; if the output is discrete, a classification function is learnt. Both supervised and unsupervised algorithms provide a parameterized model, which outputs the labeling of the input data. As for modeling, the diagnosis issue is also problem-dependent. For example, in the case of the Quality Control application, supervised classification is considered because knowledge of the defects is available. In the case of the paper mill plant, a classifier is not considered for diagnosis because knowledge of the faults is not available a priori, so the monitoring algorithm is only able to detect and isolate the faults.

1.2 Data-Driven Modeling and Diagnosis of Complex Systems Dynamics: Applications

Several complex systems are described in this dissertation. The modeling and diagnosis issues are tackled by data-driven approaches for industrial complex systems such as electric motors, a paper mill plant and a turbofan engine. The complex system addressed in chapter 4, which belongs to the class of biological systems, concerns the Brain Computer Interface.

1.2.1 Rotating Machines

Rotating machines are among the most important devices in many industrial applications and are frequently integrated in commercially available equipment and industrial processes. Rotating machines are well-known dynamical systems with accurate analytical models and extensive results in the literature. Although these models, which are mainly nonlinear, describe the dynamics accurately, they do not capture all the intrinsic dynamics.
Unmodeled dynamics include vibrations, thermal drift (which significantly changes the electrical and mechanical parameters of the machines) and faults. Mechanical faults in rotating machines, such as misalignment, broken bars, and gear and bearing defects, are not simple to model analytically; for this reason, modeling and diagnosis, with Fault Detection and Isolation applications, are dealt with by data-driven approaches. Another reason to support data-driven methods is that the use of analytical models requires knowledge of their parameters. In manufacturing industries, where a wide variety of rotating machines may be produced with many different models and parameters, accurate knowledge of the model parameters is not available. Data-driven approaches, in turn, require monitoring a sample of faultless reference machines in order to tune the necessary parameters. In this dissertation both electrical and mechanical machines are considered. Chapter 2 deals with modeling and diagnosing defects by Fault Detection and Isolation applications in a Quality Control scenario for electric motors. Data-driven modeling and diagnosis solutions are proposed in order to model and diagnose mechanical faults that could not be diagnosed by analytical models. Chapter 3 deals with the monitoring problem of a complex system, namely a turbofan engine. In this case a prognosis application is proposed.

1.2.2 Industrial Systems

A paper mill plant is a classical example of a complex system, which consists of several coupled nonlinear systems (e.g., paper machine, stock preparation) and subprocesses (e.g., fan pump, tine unit, mixing unit, pope reel, pulping, broke handling machine, deflaking). A paper mill plant is often monitored by many sensors, which measure many different variables that can be discrete, logical and acquired at different sampling frequencies.
These variables may also describe the states of the system qualitatively. One of the problems in complex industrial systems such as a paper mill plant concerns the high number of monitored variables. This is a classical example of the dimensionality problem in complex systems. Chapter 3 proposes a Fault Detection and Isolation application for monitoring processes in a paper mill plant, which helps the early detection and identification of faults. The modeling and diagnosis algorithm uses the correlation among sensors to transform the original multivariate variable space into a subspace that preserves the maximum information of the original space. By means of this algorithm the main information is retained and the irrelevant information is discarded. However, this transformation does not exploit the correlation within each sensor along the time axis, so a second data-driven algorithm is used together with the first to capture the correlation within a sensor as well.

1.2.3 Brain Computer Interface

The brain is assumed to be a classical example of a complex, self-organized system. As such, it exhibits hallmarks of nonlinearity, multistability, and “nondiffusivity”. Brain oscillations are produced by large ensembles of synchronized neuronal activity, and the resulting electrophysiological signals in the different frequency bands are associated with different functional states (e.g., sleep, wake, perception and attention). Computational studies adopt a variety of abstractions in order to deal with complex dynamical systems like the brain. Brain-Computer Interfaces are devices which translate the brain activity of the user into specific signals, which may be used for communicating or controlling external devices [4, 5] without the use of peripheral nerves and muscles [6]. Brain-Computer Interfaces represent an interesting option for people affected by neuromuscular disorders whose brain activity is normal, such as patients affected by Amyotrophic Lateral Sclerosis.
In this context, Electroencephalography signals are monitored and modeled by data-driven algorithms in order to diagnose when the user focuses attention on auditory stimuli and when the user does not.

1.3 Thesis Outline

The remainder of the thesis is organized in 4 chapters.

Chapter 2 - This chapter proposes two data-driven Fault Detection and Diagnosis algorithms based on stator current and vibration signals in a Quality Control scenario. Several experiments are carried out on real electric motors in order to model the fault dynamics and diagnose the probable causes of the faults.

Chapter 3 - This chapter proposes two solutions to model and diagnose two different complex systems: a paper mill plant and a turbofan engine. These solutions are applied in order to monitor these complex systems in a Fault Detection and Isolation context and in a Prognostic context. A brief description of the two complex systems is provided as well.

Chapter 4 - This chapter deals with modeling the Electroencephalography signals, which measure the electric activity of the brain, and with diagnosing the user intention from these signals. The proposed solution is an auditory Brain Computer Interface paradigm for systems based on P300 signals, which are generated by auditory stimuli characterized by different sound typologies and locations.

Chapter 5 - The final chapter summarizes the obtained results, provides a critical analysis of them and gives an insight into possible future work.

Chapter 2 Modeling and Diagnosis of Electrical Motors in a Quality Control Scenario

2.1 Introduction

Electric motors are among the most important electrical machines in many industrial applications and are frequently integrated in commercially available equipment and industrial processes. Electric motor equipment often provides tools and solutions to monitor the machines in order to ensure reliability and safety for equipment and personnel.
In spite of these tools, many companies still face unexpected system failures and reduced machine lifetime. Environmental conditions, duty, design errors and installation issues may combine to reduce the residual useful life to far less than the designed lifetime of the electrical machine. Advances in sensors, algorithms, and architectures should provide the necessary technologies for effective incipient failure detection. Fault Detection and Diagnosis (FDD) algorithms can be used not only for fault diagnosis but also to improve the Quality Control (QC) of these machines. These algorithms can be integrated in test benches at the end or in the middle of the production line in order to test the quality of the machines. When the electric motors reach the test benches, the FDD procedures acquire sensor measurements and detect whether the motor is normal or defective; in the latter case further inspections can diagnose the defect type in order to improve the production efficiency, the machine reliability and the customer satisfaction. In this context, FDD algorithms have the advantage, compared to standard FDD algorithms, that they can be implemented offline, because it may not be necessary to detect the defects early. In the FDD scenario for electric motors in particular, several solutions have been proposed, which are mainly based on two different measurements: stator current and vibration signals. Electric motors are well-known dynamical systems with accurate analytical models and extensive results in the literature. Model-Based Fault Detection and Isolation (FDI) algorithms for electric motors are fast, online and reliable, and make it possible to protect the power system and the machine from incipient faults, but the literature contains only a few diagnosis contributions for electric motors.
Model-Based solutions are very limited because analytical models that describe the dynamics of the faults occurring in electric motors do not exist for vibration and current analysis; therefore FDD solutions are mainly based on data-driven algorithms. Failure surveys report that faults in induction motors are: stator related (38%), rotor related (10%), bearing related (40%) and others (12%) [7]. Typical faults that occur in rotating electrical machines are:
• Unbalance.
• Bent shaft.
• Eccentricity.
• Misalignment.
• Backlash.
• Gear defects.
• Bearing defects.
• Electrical faults.
• Shaft cracks.
These faults and defects are not described by analytical models; however, they can be described by signal-based approaches. Another reason to support signal-based methods is that the use of models requires knowledge of the model parameters. In manufacturing industries, where a wide variety of electric motors is produced with many different models and parameters, accurate knowledge of the model parameters is not available. Signal-based approaches, on the other hand, require monitoring a sample of faultless reference motors in order to tune the necessary parameters in a fast training stage. This chapter proposes two data-driven FDD algorithms based on Motor Current Signature Analysis (MCSA) and vibration signals in a QC scenario, in order to develop a monitoring system and improve the reliability of electric motors. These procedures make it possible to diagnose defects and faults of electric motors at the end of the production line in a motor production plant. In the first algorithm, the three-phase stator current and the vibration signals are processed in order to model patterns related to healthy and faulty motors, which can be used as typical features of each motor condition. After the modeling stage, a distance is used as an index to identify the dissimilarity between two patterns.
This index allows the automatic identification of each fault and defect. Several simulations and experiments are carried out in order to verify the effectiveness of the proposed methodology: broken rotor bars and connectors are simulated, while experiments on real induction motors at the end of the production line are presented. In the case of vibration signals, a probabilistic algorithm is used for fault diagnosis on the residual signals. Several simulations and experiments are carried out on a test bench for single-phase electric motors, which are mounted on kitchen hoods. This chapter is based on the problem and the results presented in [8, 9, 10].

2.2 Quality Control

In industry, QC is a collection of methods that improve quality and efficiency in processes and production, and in many other aspects of industry. In 1924, Walter Shewhart designed the first control chart and gave a rationale for its use in process monitoring and control [11]. The main concept of QC is “proactiveness”: in order to ensure the product quality, processes and related signals are monitored to detect when they “go out of control”. In recent years, manufacturing industries have been devoting much attention and effort to the introduction of QC in the production lines, and industries with large volumes of low-tech products are concentrating many investigations on the efficient introduction of QC in their production lines. One of the major problems these manufacturing industries face is customer satisfaction, because customers usually purchase lots of products that contain some unwanted defective components. In order to satisfy customers, manufacturing industries carry out some spot checks at the end of the production lines. This method does not ensure the quality of the products or the total removal of defective products.
If, for example, a company produces electric motors for kitchen hoods at a −4σ defect level [12], the total number of defective motors could be about 30 considering 10000 motors produced per day, each one built by 30 assembling operations. The checks needed to find all the possible defects are too many and too expensive when compared with the product cost. A desirable QC solution for these manufacturing industries should be minimally invasive and effective, and should have a low payback period. In addition, the testing could be made systematic using a low-cost system based on a reduced set of sensors integrated in the test bench. When the electric motor reaches the end of the production line, the FDD system acquires sensor measurements and detects whether the product is defective or not. Moreover, by isolating and identifying the defect type, the FDD procedure helps to estimate in which subprocess the defect is introduced, and makes it possible to remove the defective products and to improve the quality of the processes as a proactive measure for the QC methodology. Statistics can be computed in order to investigate the efficiency of the production subprocesses and then to improve those subprocesses that introduce more defects in the products. In this way, the tests performed at the end of the production lines make it possible to remove the defective products, estimate the efficiency drop of the subprocesses that introduce defects in the products, and improve the quality of the processes as a proactive measure for the QC methodology.

2.3 Data-Driven Modeling and Diagnosis of Induction Motor by Current Signals

In induction motors, faults and defects are often correlated with the three-phase stator current signals, which can be processed to model the motor behaviour by patterns that represent the normal and abnormal motor conditions.
MCSA is a noninvasive, on-line monitoring solution based on current measurements, which are available from the inverter, for the diagnosis of faults in induction motors [7, 13, 14]. MCSA techniques use data-driven procedures to model, from the current signals, patterns that are indicative of normal and abnormal motor conditions. Moreover, MCSA procedures can be used to detect and diagnose not only classic motor faults (e.g. rotor eccentricity), but also gear faults (e.g. tooth spalls) [15]. In order to model the three-phase stator current signals, Principal Component Analysis (PCA) and Kernel Density Estimation (KDE) are taken into account. PCA is used in data pre-processing to reduce the current space to two dimensions. The Probability Density Function (PDF) of the PCA-transformed signals is estimated by KDE, which is a non-parametric method useful to assess the data distribution [16]. The PDFs are the models that can be used to identify each fault and defect. The advantage of non-parametric approaches with respect to parametric ones is that they offer greater flexibility in modeling a given dataset, and they are not affected by the problems discussed in [16] (and references therein). In the tests, KDE with a Gaussian kernel function is considered and the plug-in bandwidth selection procedure is applied [17]. Diagnosis is carried out using the Kullback-Leibler (K-L) divergence, which measures the difference between two probability distributions. This divergence is used as a distance measure between the classified statistical signatures obtained by KDE. K-L is an index that makes it possible to identify the dissimilarity between two given probability distributions (which can also be multidimensional): one concerns the modeled signatures and the other concerns the acquired data samples. The classification of each motor condition is performed by means of the K-L divergence.
2.3.1 Recalled Results

Principal Component Analysis

PCA is a dimensionality reduction technique that produces a lower dimensional representation in a way that preserves the correlation structure between the process variables capturing the variability in the data [18]. By PCA, the correlation among sensors is used to transform the multivariate space into a subspace which preserves maximum variance of the original space in minimum number of dimensions. In other words, PCA rotates the original coordinate system along the direction of maximum variance. Considering a data matrix $X \in \mathbb{R}^{N \times m}$ of $N$ sample rows and $m$ variable columns that are normalized to zero mean, with mean values vector $\mu$, the matrix $X$ can be decomposed as follows:

\[ X = \hat{X} + \tilde{X}, \tag{2.1} \]

where $\hat{X}$ is the projection on the Principal Component Subspace (PCS) $S_p$, and $\tilde{X}$, the residual matrix, is the projection on the Residual Subspace (RS) $S_r$ [19]. Defining the loading matrix $P$, whose columns are the right singular vectors of $X$, and selecting the columns of the loading matrix $P \in \mathbb{R}^{m \times p}$, which correspond to the loading vectors associated with the first $p$ singular values, it follows that:

\[ \hat{X} = X P P^{T} \in S_p. \tag{2.2} \]

The residual matrix $\tilde{X}$ is the difference between the data matrix $X$ and its projection into the first $p$ principal components retained in the PCA model:

\[ \tilde{X} = X (I - P P^{T}) \in S_r, \tag{2.3} \]

therefore the residual matrix captures the variations in the observations space spanned by the loading vectors associated with the $r = m - p$ smallest singular values. The projections of the observations in $X$ into the lower-dimensional space are contained in the score matrix:

\[ T = X P \in \mathbb{R}^{N \times p}. \tag{2.4} \]

PCA is here applied to the currents of a three-phase induction motor in order to reduce the input space from the three original dimensions to two, because the currents are highly correlated.
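The decomposition in Eqs. (2.1)-(2.4) can be illustrated with a few lines of NumPy. The following is a minimal sketch, not the implementation used in the thesis: the helper name `fit_pca`, the sample rate, the frequency and the noise level are assumptions chosen only for illustration, and the input is a synthetic three-channel sinusoidal signal standing in for measured stator currents.

```python
import numpy as np

def fit_pca(X, p):
    """Fit a PCA model: return the mean vector mu, the loading matrix P (m x p)
    and all singular values of the centered data (hypothetical helper)."""
    mu = X.mean(axis=0)
    # Rows of Vt are the right singular vectors of the centered data matrix.
    _, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:p].T, s

# Assumed test signal: three phase-shifted sinusoids plus weak white noise.
rng = np.random.default_rng(0)
t = np.arange(2000) / 20000.0
phases = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
X = np.sin(2 * np.pi * 50.0 * t[:, None] - phases)
X = X + 0.01 * rng.standard_normal(X.shape)

mu, P, s = fit_pca(X, p=2)
Xc = X - mu                              # zero-mean data
T = Xc @ P                               # scores, Eq. (2.4)
X_hat = Xc @ P @ P.T                     # projection on the PCS, Eq. (2.2)
X_tilde = Xc @ (np.eye(3) - P @ P.T)     # residual, Eq. (2.3)

print("singular values:", np.round(s, 3))
print("residual energy fraction:", (X_tilde ** 2).sum() / (Xc ** 2).sum())
```

With this synthetic input the third singular value is much smaller than the first two, so retaining two components discards almost no signal energy; with measured data the ratio would depend on the noise level.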
Indeed, for a healthy motor with a three-phase connection without neutral, ideal motor conditions and a balanced voltage supply, the stator currents are given by Eq. (2.5), where $i_a$, $i_b$ and $i_c$ denote the three stator currents, $I_{max}$ their maximum value, $f$ their frequency, $\varphi$ their phase angle and $t$ the time. In this case each stator current is given by a combination of the others:

\[
\begin{aligned}
i_a(t) &= I_{max} \sin(2\pi f t - \varphi) \\
i_b(t) &= I_{max} \sin(2\pi f t - 2\pi/3 - \varphi) \\
i_c(t) &= I_{max} \sin(2\pi f t - 4\pi/3 - \varphi).
\end{aligned}
\tag{2.5}
\]

The PCA transform (2.4), applied to the signals in Eq. (2.5), makes the smallest singular value equal to zero. This implies that the information carried by the last principal component, which is associated with the smallest singular value, is null, so the last principal component can be discarded and the original space reduced from three dimensions to two without losing information. This is justified by the fact that, in Eq. (2.5), each stator current is perfectly correlated with the sum of the others. When Gaussian white noise with standard deviation $\sigma$ is added to the stator current signals (2.5), the smallest singular value is no longer equal to zero, but depends on the ratio between $I_{max}$ and $\sigma$.

Kernel Density Estimation

Consider $N$ independent and identically distributed (i.i.d.) random vectors $X = [X_1, \ldots, X_N]$, where $X_i = [X_{i1}, \ldots, X_{id}]$, whose distribution function $F(x) = P[X \le x]$ is absolutely continuous with unknown PDF $f(x)$. The estimated density at $x$ is given by [20]:

\[ f(x) = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{|H|^{1/2}} K\left( H^{-1/2} (x - X_i) \right), \tag{2.6} \]

where $K$ is the kernel function and $H$ is the bandwidth matrix. In the present study a two-dimensional Gaussian kernel function is used, so $d$ is 2, and a further simplification, which follows from the restriction of the kernel bandwidth to the class $\{H = h^2 I : h > 0\}$, leads to the single bandwidth estimator, so the estimated density $f(x)$ becomes [21]:

\[ f(x) = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{(2\pi h^2)^{d/2}} \, e^{-\frac{\| x - X_i \|^2}{2h^2}}, \tag{2.7} \]

where $x \in \mathbb{R}^d$ ranges over a grid whose size $n_{grid}$ is the number of points at which the PDF is estimated. It is well known that the value of the bandwidth $h$ and the shape of the kernel function are of critical importance [22]. In many computational-intelligence methods that employ KDE, finding the appropriate bandwidth $h$ is the key issue [22, 23, 24]. In the present work the Asymptotic Mean Integrated Squared Error (AMISE) with the plug-in bandwidth selection procedure is used to choose the bandwidth $h$ automatically [17]. In the proposed algorithm, KDE is used to model a specific pattern for each motor condition; indeed, the features of the current signals are mapped into the two-dimensional principal components space, where they represent specific signatures of the motor conditions.

Kullback-Leibler Divergence

Given two continuous PDFs $f_1(x)$ and $f_2(x)$, a measure of “divergence” or “distance” of $f_1(x)$ versus $f_2(x)$ is given by [25]:

\[ I_{1:2}(X) = \int_{\mathbb{R}^d} f_1(x) \log \frac{f_1(x)}{f_2(x)} \, dx, \tag{2.8} \]

and that of $f_2(x)$ versus $f_1(x)$ is given by:

\[ I_{2:1}(X) = \int_{\mathbb{R}^d} f_2(x) \log \frac{f_2(x)}{f_1(x)} \, dx. \tag{2.9} \]

Therefore the K-L divergence between $f_1(x)$ and $f_2(x)$ is:

\[ J(f_1; f_2) = I_{1:2}(X) + I_{2:1}(X) = \int_{\mathbb{R}^d} \left( f_1(x) - f_2(x) \right) \log \frac{f_1(x)}{f_2(x)} \, dx. \tag{2.10} \]

The above equation is known as the symmetric K-L divergence, which represents a non-negative measure between two PDFs. In the present work $d$ is 2 and a discrete form of the K-L divergence is adopted:

\[ J(f_1; f_2) = \sum_{i=1}^{n_{grid}} \sum_{j=1}^{d} \left( f_1(x_{ij}) - f_2(x_{ij}) \right) \log \frac{f_1(x_{ij})}{f_2(x_{ij})}. \tag{2.11} \]

The K-L divergence makes it possible to define a fault index: if $f_\Omega$ is the PDF, estimated by KDE in the PCs space, of the incoming current measurements, the present motor condition is the one that minimizes the K-L divergence between $f_\Omega$ and $f_i$, the $i$-th PDF related to each modeled motor condition:

\[ c = \arg\min_{i} J(f_\Omega; f_i), \tag{2.12} \]

where $c$ is the classification output.
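The single-bandwidth estimator, the discrete symmetric K-L divergence and the minimum-divergence rule can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions: the function names, the elliptical "signatures", the grid range and the fixed bandwidth `h = 0.1` are all hypothetical (the plug-in bandwidth selection of [17] is not implemented), the Gaussian kernel uses the standard two-dimensional normalization constant, the discrete divergence is taken as a plain sum over all grid points, and a small `eps` is added to both densities to avoid taking the logarithm of zero.

```python
import numpy as np

def kde_on_grid(scores, grid, h):
    """Gaussian KDE of 2-D scores (Eq. 2.7 with d = 2) on a square grid.
    Returns the density at the len(grid)**2 grid points as a flat array."""
    gx, gy = np.meshgrid(grid, grid, indexing="ij")
    dx = gx.ravel()[:, None] - scores[:, 0][None, :]
    dy = gy.ravel()[:, None] - scores[:, 1][None, :]
    kernel = np.exp(-(dx ** 2 + dy ** 2) / (2.0 * h ** 2))
    return kernel.sum(axis=1) / (scores.shape[0] * 2.0 * np.pi * h ** 2)

def symmetric_kl(f1, f2, eps=1e-12):
    """Discrete symmetric K-L divergence (Eq. 2.11); eps avoids log(0)."""
    f1, f2 = f1 + eps, f2 + eps
    return float(np.sum((f1 - f2) * np.log(f1 / f2)))

def classify(f_new, models):
    """Eq. (2.12): pick the stored model with the smallest divergence."""
    divs = {name: symmetric_kl(f_new, f) for name, f in models.items()}
    return min(divs, key=divs.get), divs

def ring(n, rx, ry, rng, noise=0.05):
    """Hypothetical 2-D signature: noisy points on an ellipse."""
    a = rng.uniform(0.0, 2.0 * np.pi, n)
    pts = np.stack([rx * np.cos(a), ry * np.sin(a)], axis=1)
    return pts + noise * rng.standard_normal(pts.shape)

rng = np.random.default_rng(0)
grid, h = np.linspace(-2.0, 2.0, 32), 0.1   # assumed grid and bandwidth
models = {
    "healthy": kde_on_grid(ring(500, 1.0, 1.0, rng), grid, h),
    "faulty": kde_on_grid(ring(500, 1.0, 0.6, rng), grid, h),
}
f_new = kde_on_grid(ring(500, 1.0, 1.0, rng), grid, h)
label, divs = classify(f_new, models)
print(label, divs)
```

Because the same grid and bandwidth are used for the stored models and for the new data, the divergences are directly comparable; only their ranking matters for the classification output.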
2.3.2 Developed Algorithm

The developed FDD procedure based on KDE consists of two stages: training and FDD monitoring. In the first stage, one KDE model is computed from the feature signals for each motor condition, in order to have one KDE model for the healthy motor and one KDE model for each faulty case. The training steps are summarized below:

T1. Stator current signals for each motor condition are acquired;
T2. Data are normalized;
T3. The PCA transform (2.4) is applied to the stator current signals, which are projected into the two-dimensional principal components space;
T4. The matrices P and µ are stored;
T5. KDE is performed on the lower-dimensional principal components space (2.4) using a grid of ngrid points and a bandwidth h for the Gaussian kernel function (2.7);
T6. The PDFs estimated by KDE (2.7) are stored.

In the diagnosis stage, the previously obtained models are compared with the new data and a statistical fault index is calculated. The diagnosis steps are summarized below:

D1. Stator current signals are acquired;
D2. Data are normalized;
D3. The matrices P and µ, previously computed (T3), are applied to the signals;
D4. KDE is performed on the lower-dimensional principal components space (2.4) using the same grid of ngrid points and the same bandwidth h used in the training step (T5);
D5. The symmetric K-L divergence (2.11) is computed between the PDF estimated by KDE (2.7) from the acquired current signals and the PDFs stored in the training step (one for each condition) (T6);
D6. Diagnosis is evaluated using Eq. (2.12).

Faults are identified using Eq. (2.12), where fΩ is the PDF, estimated by KDE in the PCs space, of the incoming current measurements and fi is the i-th PDF related to each modeled motor condition.
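The training steps T1-T6 and the diagnosis steps D1-D6 can be put together in a compact end-to-end sketch. Everything specific in it is an assumption made for illustration: the synthetic currents, the amplitude modulation used as a stand-in "fault", the normalization by the overall standard deviation, the single PCA model fitted on the healthy reference data, and the fixed grid and bandwidth. It is not the author's implementation and does not reproduce the benchmark or the real motor data.

```python
import numpy as np

FS, F0 = 20000.0, 50.0   # assumed sample rate [Hz] and supply frequency [Hz]

def currents(n, rng, modulation=0.0, noise=0.02):
    """Synthetic three-phase currents; amplitude modulation stands in for a
    fault signature (purely illustrative, not a physical fault model)."""
    t = np.arange(n) / FS
    phases = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
    envelope = 1.0 + modulation * np.sin(2 * np.pi * 20.0 * t)
    x = envelope[:, None] * np.sin(2 * np.pi * F0 * t[:, None] - phases)
    return x + noise * rng.standard_normal(x.shape)

def normalize(x):
    """T2/D2: scale by the overall standard deviation (assumed scheme)."""
    return x / x.std()

def fit_pca(x, p=2):
    """T3/T4: mean vector mu and loading matrix P."""
    mu = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mu, full_matrices=False)
    return mu, vt[:p].T

def kde(scores, grid, h):
    """T5/D4: Gaussian KDE (Eq. 2.7, d = 2) on a fixed square grid."""
    gx, gy = np.meshgrid(grid, grid, indexing="ij")
    dx = gx.ravel()[:, None] - scores[:, 0][None, :]
    dy = gy.ravel()[:, None] - scores[:, 1][None, :]
    k = np.exp(-(dx ** 2 + dy ** 2) / (2.0 * h ** 2))
    return k.sum(axis=1) / (scores.shape[0] * 2.0 * np.pi * h ** 2)

def sym_kl(f1, f2, eps=1e-12):
    """D5: discrete symmetric K-L divergence (Eq. 2.11)."""
    f1, f2 = f1 + eps, f2 + eps
    return float(np.sum((f1 - f2) * np.log(f1 / f2)))

rng = np.random.default_rng(1)
grid, h = np.linspace(-3.0, 3.0, 32), 0.15
conditions = {"healthy": 0.0, "faulty": 0.3}

# Training (T1-T6): one stored PDF per condition, one shared PCA model.
train = {c: normalize(currents(2000, rng, m)) for c, m in conditions.items()}
mu, P = fit_pca(train["healthy"])
models = {c: kde((x - mu) @ P, grid, h) for c, x in train.items()}

def diagnose(x):
    """Diagnosis (D1-D6): return the condition minimizing Eq. (2.12)."""
    f_new = kde((normalize(x) - mu) @ P, grid, h)
    divs = {c: sym_kl(f_new, f) for c, f in models.items()}
    return min(divs, key=divs.get)

print(diagnose(currents(2000, rng, 0.0)), diagnose(currents(2000, rng, 0.3)))
```

Storing `mu`, `P`, the grid and the bandwidth at training time is what keeps the densities of new measurements comparable with the stored ones, which is the role of steps T4, T6, D3 and D4.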
The K-L divergence is used as an input for the fault decision algorithm, which makes it possible to take decisions automatically on the operating state and condition of the machine and to detect any abnormal operating condition.

2.3.3 Case Study

In order to verify the effectiveness of the proposed methodology, several simulations are carried out using one benchmark, together with some experiments using real asynchronous motors. The benchmark uses a Time Stepping Coupled Finite Element-State Space modeling approach to generate current signals for induction motors, as described in [26]. The simulation dataset consists of twenty-one different motor conditions: one healthy condition, ten broken bar conditions and ten broken connector conditions. Twenty time series are generated for each motor condition. Each signal consists of 1500 samples. The dataset can be downloaded from the UCR time series data mining archive [27]. The characteristics of the three-phase induction motors are: input voltage 208 V, frequency 60 Hz, number of rotor bars 34, pole number 2 and power 1.2 hp. The sample rate is 33.3 kHz and the processed data, for each test, are related to 0.5 s of acquisition. White noise with standard deviation σ = 0.2 is added to the simulated current signals. The results are the average of 200 Monte Carlo simulations in which the training and testing data sets are randomly changed. The real tests are carried out using three-phase induction motors whose parameters are: input voltage 380 V, frequency 50 Hz, power 0.75 kW, sample rate 20 kHz. Two different faults are tested: wrong rotor and cracked rotor. Wrong rotor refers to a non-compliant rotor; in particular, a single-phase rotor is assembled instead of a three-phase rotor. Ten motors are tested for each faulty case and for the healthy case. The acquisition time is 14 s. The processed data, for each test, are related to 0.7 s of acquisition.
In this case study the results are the average of 2000 Monte Carlo simulations in which the training and testing data sets are randomly changed. The motors with a defective rotor installed show an efficiency drop of about 3% at the operating point of 2800 RPM, as shown in Fig. 2.1. It is therefore important to detect this defect in the context of energy efficiency and QC.

2.3.4 Results

The proposed approach processes the three-phase stator currents in order to perform defect detection and diagnosis as described in Section 2.3.2. The following two subsections show the results related to the two cases described previously. Figs. 2.2 to 2.13 show the simulation and experimental results. The classification accuracy is considered as an index to evaluate the performance of the proposed algorithm, as shown in Tables 2.1 and 2.2. This index is obtained using the probability distributions of the K-L distances of each class, approximated as normal distributions and estimated by Monte Carlo trials. The simulations are carried out changing ngrid, the number of points at which the PDF is estimated, and the acquisition time of the current signals in steady state.

Figure 2.1: Efficiency characterization of tested induction motors (efficiency in % versus speed in RPM). Blue solid line refers to the healthy motor, red dashed line refers to the motor with defective rotor.

Figs. 2.5, 2.6, 2.7, 2.11, 2.12 and 2.13 show the K-L distances for all Monte Carlo trials. On each vertical line, the central dot is the mean and the horizontal edges mark four times the standard deviation. The figures show the results with ngrid = 64 × 64 points and an acquisition time, for the benchmark and the real motors, equal to 0.4 s and 0.7 s respectively.
This setting of the algorithm parameters gives better results for these cases, taking into account the classification accuracy and the processing time. For the real motors the algorithm takes about 2.5 s to produce the classification output (Eq. 2.12): about 1 s to acquire the current signals, of which 0.25 s in transient state and 0.7 s in steady state, and about 1.45 s to evaluate the PDF and the classification output (Eq. 2.12). Setting ngrid = 32 × 32 points reduces the processing time to 1.5 s, but decreases the classification accuracy, as shown in Tables 2.1 and 2.2. The tests are performed for both cases using the asymmetric K-L divergence (Eq. 2.9). The results related to the symmetric K-L divergence, described in the next subsections, are comparable to those achieved with the asymmetric K-L divergence.

Broken Rotor Bars and Connectors Diagnosis

In the following, Figs. 2.2, 2.3 and 2.4 represent the patterns of the healthy motor, of one broken bar and of one broken connector; these figures show how the PDFs, estimated by KDE in the principal components space, are used as the specific patterns for the motor conditions. The simulation results, given in Figs. 2.5, 2.6 and 2.7, show the fault diagnosis for broken rotor bars and connectors, setting ngrid = 64 × 64 and the acquisition time of the current signals in steady state equal to 0.4 s. Fig. 2.5 shows the K-L divergence between the PDFs, estimated by KDE, of all motor conditions (i.e. healthy, from one to ten broken rotor bars and from one to ten broken connectors) and the PDF estimated by KDE from the stator current signals of a healthy motor. The results show that the minimum K-L distance corresponds exactly to the healthy condition. Fig. 2.6 shows the K-L divergence between all PDFs and the PDF estimated from stator current signals affected by one broken rotor bar.
In this case the graph shows that the minimum K-L distance corresponds exactly to the broken bar condition. The last graph, Fig. 2.7, shows the diagnosis of one broken connector. Even in this case the K-L divergence detects and identifies the fault, that is, one broken connector. In the Monte Carlo simulations, all fault types are diagnosed with 100% accuracy, hence the K-L divergence figures for the other fault types are not reported. Moreover, the classification accuracy is 100% with an acquisition time above 0.4 s for each fault type, while below 0.4 s the classification accuracy decreases, as shown in Table 2.1.

Figure 2.2: Interpolated PDFs of a finite element healthy motor in the two-dimensional principal components space estimated by KDE.

Real Induction Motors Diagnosis

Figs. 2.8, 2.9 and 2.10 represent the patterns of three real motor conditions: healthy, cracked rotor and wrong rotor; these figures show that the PDFs, estimated by KDE in the principal components space, are distinct and can therefore be used as a specific pattern for each motor condition. The simulation results given in Figs. 2.11, 2.12 and 2.13 show the fault diagnosis for cracked and wrong rotors, setting ngrid = 64 × 64 and the acquisition time of the current signals in steady state equal to 0.7 s. Fig. 2.11 shows the K-L divergence between the PDFs, estimated by KDE, of all motor conditions (i.e. healthy, cracked and wrong rotors) and the PDF estimated by KDE from the stator current signals of a healthy motor. The results show that the minimum K-L distance corresponds exactly to the healthy condition. Fig. 2.12 shows the K-L divergence between all PDFs and the PDF estimated from stator current signals where cracked rotors are diagnosed. In this case the graph shows that the minimum K-L distance corresponds exactly to the cracked rotor condition. The last graph, Fig. 2.13, shows the wrong rotor diagnosis. Even in this case the K-L divergence detects and identifies the fault. In the Monte Carlo simulations, all fault types are diagnosed with the accuracy shown in Table 2.2. It can be noticed that the classification accuracy in the case of a healthy motor is always 100%; therefore the algorithm is able to detect whether the motors are healthy or whether there are some faults or defects. In Figs. 2.12 and 2.13 the blue lines of the motors with cracked and wrong rotors never overlap the blue lines of the healthy motors, so, in these tests, the algorithm never confuses healthy motors with unhealthy ones.

Figure 2.3: Interpolated PDFs of a finite element motor with one broken bar in the two-dimensional principal components space estimated by KDE.

Figure 2.4: Interpolated PDFs of a finite element motor with one broken connector in the two-dimensional principal components space estimated by KDE.

Figure 2.5: K-L divergence in the case of a finite element healthy motor. The blue dots are the mean, the blue bars span four times the standard deviation and the red asterisks are the classification output. Label H means healthy motor, labels 1-10B mean broken bars with the relative number, labels 1-10C mean broken connectors with the relative number.

Figure 2.6: K-L divergence in the case of a finite element motor with one broken bar. The blue dots are the mean, the blue bars span four times the standard deviation and the red asterisks are the classification output. Label H means healthy motor, labels 1-10B mean broken bars with the relative number, labels 1-10C mean broken connectors with the relative number.

2.4 Data-Driven Modeling and Diagnosis by Vibration Signals

In electric motors, faults and defects are often correlated with the vibration signals, which can be processed to model the motor behaviour by patterns that represent the normal and abnormal motor conditions. Vibration analysis is widely accepted as a tool to detect faults of a rotating machine, as it is non-destructive and reliable, and it permits continuous monitoring without stopping the machine [28, 29, 30, 31, 32, 33]. In particular, it is possible to detect different faults that can arise in rotating machines by analyzing the vibration power spectrum. The most common defects in these machines are unbalance and misalignment. Unbalance mainly generates a radial component at the rotation frequency and an axial component at the same frequency. Unbalance may be caused by poor balancing, shaft bending (e.g. due to thermal expansion) and rotor distortion by magnetic forces (a well-known problem in high-power electrical machines). Misalignment generates a radial component at double the rotation frequency and an axial component at the rotation frequency. Misalignment may be caused by misaligned couplings, misaligned bearings or a crooked shaft. High misalignment can produce a sub-synchronous instability phenomenon. This effect is due to the oil whirl and a decrease in the bearing load. Moreover, components of the spectrum above the rotation frequency are due to bearings, events that occur many times per revolution, signal distortion and mechanical nonlinearities (e.g. backlash and loose coupling). In the case of a cracked shaft, the vibrations increase at the rotation frequency and at its second harmonic. Torsional vibrations are angular oscillations that overlap the normal rotational motion.
Due to the erosion of mechanical components in rotating machines, torsional vibrations arise and lead to the following effects:
• High gear noise.
• Joint damage.
• Accelerated wear and breakage of the gear.
• Deformation of keys.
• Misalignment of coupling hubs.
• Accelerated wear of the windings of electric motors.
• Fatigue failure of shaft.
• Erratic distribution of power.
Non-integer multiples of the shaft speed may arise from belt drives, gears, etc. Often, a fault arising in a rotating machine increases the vibration amplitude associated with the fault. For instance, if a fault occurs in gears, the vibration amplitude of a whole family of sidebands increases in a specific region of its frequency spectrum, while a ball-bearing fault is characterized by an increment in the amplitude of a family of harmonics.

Figure 2.7: K-L divergence in the case of a finite element motor with one broken connector. The blue dots are the mean, the blue bars span four times the standard deviation and the red asterisks are the classification output. Label H means healthy motor, labels 1-10B mean broken bars with the relative number, labels 1-10C mean broken connectors with the relative number.

Figure 2.8: Interpolated PDFs of real healthy motors in the two-dimensional principal components space estimated by KDE.

Table 2.1: Classification accuracy (%) in the case of finite element motor, changing ngrid, the number of points at which the PDF is estimated, and the acquisition time of the current signals in steady state. Label H means healthy motor, labels 1-10B mean broken bars with the relative number, labels 1-10C mean broken connectors with the relative number.

ngrid                   128 × 128        64 × 64          32 × 32
Acquisition time (s)    0.3     0.15     0.3     0.15     0.3     0.15
H                       100     100      100     100      100     100
1B                      100     100      100     100      100     100
2B                      100     100      100     100      100     100
3B                      100     100      100     100      100     99.93
4B                      100     100      100     100      100     99.89
5B                      100     100      100     99.96    100     99.70
6B                      100     100      100     100      100     99.87
7B                      100     99.98    100     99.94    100     98.19
8B                      99.82   95.34    99.89   95.97    99.55   91.41
9B                      100     99.74    100     99.53    100     98.56
10B                     100     99.43    100     99.32    99.99   99.43
1C                      100     100      100     100      100     99.99
2C                      99.98   96.89    99.88   95.99    99.56   91.01
3C                      99.88   91.71    99.72   95.74    98.48   93.75
4C                      99.79   97.61    99.84   98.10    99.93   96.87
5C                      99.98   98.96    99.99   98.49    99.96   96.37
6C                      100     100      100     100      100     99.98
7C                      100     100      100     100      100     99.94
8C                      100     100      100     100      100     99.88
9C                      100     99.99    100     100      100     99.47
10C                     100     99.89    100     99.77    100     96.95
Mean                    99.97   99.03    99.97   99.18    99.88   98.15

Figure 2.9: Interpolated PDFs of real motors with cracked rotor in the two-dimensional principal components space estimated by KDE.

Figure 2.10: Interpolated PDFs of real motors with wrong rotor in the two-dimensional principal components space estimated by KDE.
In Machine Vibration Signature Analysis (MVSA), the Fourier transform is used to determine the vibration spectrum [34], and the signature at different frequencies is identified and compared with an initial measurement to detect faults in the machine. The shortcoming of this approach is that Fourier analysis is limited to stationary signals, while vibrations are not stationary by nature.

Figure 2.11: K-L divergence in the case of real healthy motors. The blue dots are the mean, the blue bars are four times the standard deviation and the red asterisks are the classification output.

Figure 2.12: K-L divergence in the case of real motors with cracked rotor. The blue dots are the mean, the blue bars are four times the standard deviation and the red asterisks are the classification output.

Figure 2.13: K-L divergence in the case of real motors with wrong rotor. The blue dots are the mean, the blue bars are four times the standard deviation and the red asterisks are the classification output.

In order to model the vibration signal, Multi-Scale Principal Component Analysis (MSPCA) is taken into account [35]. MSPCA deals with processes that operate at different scales and have contributions from:
• events occurring at different localizations in time and frequency;
• stochastic processes whose energy or power spectrum changes with time and/or frequency;
• variables measured at different sampling rates or containing missing data.

Table 2.2: Classification accuracy (%) in the case of real motors, varying $n_{grid}$, the number of points at which the PDF is estimated, and the acquisition time of the current signals in steady state. Motor conditions are: healthy, motor with cracked rotor and motor with wrong rotor.

n_grid               | 128 × 128 | 128 × 128 | 128 × 128 | 64 × 64 | 64 × 64 | 64 × 64 | 32 × 32 | 32 × 32 | 32 × 32
Acquisition time (s) | 0.7       | 0.5       | 0.3       | 0.7     | 0.5     | 0.3     | 0.7     | 0.5     | 0.3
Healthy              | 100       | 100       | 100       | 100     | 100     | 100     | 100     | 100     | 100
Cracked rotor        | 98.82     | 95.08     | 77.08     | 99.00   | 94.74   | 86.54   | 98.29   | 94.02   | 81.45
Wrong rotor          | 98.97     | 99.47     | 99.49     | 99.18   | 99.36   | 98.56   | 99.85   | 99.41   | 99.21
Mean                 | 99.26     | 98.18     | 92.19     | 99.39   | 98.03   | 95.03   | 99.38   | 97.81   | 93.55

MSPCA transforms the process data at different scales by means of the Wavelet Transform (WT). The information at each scale is captured by PCA modeling. These models, which represent the process conditions, can be used to identify each fault and defect. WT is appropriate for extracting process information from vibration data since wavelets, with their time-frequency localization and multi-resolution property, provide a useful framework for multi-scale data representation [36]. To isolate the defects, a KDE algorithm is applied to the PCA residuals, and thresholds are computed for each sensor signal to determine whether, for each wavelet matrix, the signals are involved in the defect or not. The KDE method is widely recognized as a robust methodology to determine the data PDF numerically; in particular, this estimation technique is introduced where the Gaussian assumption does not hold [37]. Fault isolation is carried out by contribution plots that take into account the spatial correlations. This approach is based on quantifying the contribution of each process variable to the individual scores of the PCA representation and, for each process variable, summing the contributions only of those variables responsible for the out-of-control status. Diagnosis can be performed using the contribution plots because they represent the signatures of the conditions of the rotating electrical machines.
In the QC context, a supervised classifier, taking the PCA contributions as input, is used to diagnose each motor defect. The results show that the signatures identified by the PCA contributions are unique for each considered defect.

2.4.1 Recalled Results

Principal Component Analysis

PCA is introduced in section 2.3.1; here an improved PCA fault detection index is described. A deviation of the new data sample $X$ from the normal correlation could change the projections onto the subspaces, either $S_p$ or $S_r$. Consequently, the magnitude of either $\tilde{X}$ or $\hat{X}$ could increase over the values obtained with normal data. The Square Prediction Error (SPE) is a statistic that measures the lack of fit of a model to data. The SPE statistic indicates the difference, or residual, between a sample and its projection onto the $p$ components retained in the model. The exact description of the distribution of SPE is given in [38]:

$$ \mathrm{SPE} \equiv \|\tilde{X}\|^2 = \|X(I - PP^T)\|^2. \tag{2.13} $$

The process is considered faultless if:

$$ \mathrm{SPE} \le \delta^2, \tag{2.14} $$

where $\delta^2$ is a confidence limit for SPE. A confidence limit expression for SPE, when $x$ follows a normal distribution, is developed in [39, 36, 34]. The detectability of a fault is given by conditions proven in [40] and recalled in the following. Defining:

$$ X = X^* + f\,\Xi, \tag{2.15} $$

where the sample vector for normal operating conditions is denoted by $X^*$, $f$ represents the magnitude of the fault and $\Xi$ is a fault direction vector, necessary and sufficient conditions for detectability are:
• $\tilde{\Xi} = (I - PP^T)\,\Xi \neq 0$, with $\tilde{\Xi}$ the projection of $\Xi$ on the residual subspace;
• $\|\tilde{f}\| = \|(I - PP^T)\,f\| > 2\delta$, with $\tilde{f}$ the projection of $f$ on the residual subspace.
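As an illustration of Eqs. (2.13) and (2.14), the following minimal sketch computes the SPE of a single sample and compares it with a confidence limit. It is not the implementation used in this work: the loading matrix `P`, the samples and the limit `delta_sq` are hypothetical values chosen only for the example, and `P` is assumed to have orthonormal columns.

```python
import math


def spe(x, P):
    """Return (SPE, residual) for a sample x and an m x p loading matrix P.

    P is given as a list of m rows with p entries each and is assumed to
    have orthonormal columns, so x (I - P P^T) is the residual of x.
    """
    m, p = len(P), len(P[0])
    # Scores: t = x P
    t = [sum(x[i] * P[i][k] for i in range(m)) for k in range(p)]
    # Projection onto the principal subspace: x_hat = t P^T
    x_hat = [sum(t[k] * P[i][k] for k in range(p)) for i in range(m)]
    # Residual: x_tilde = x - x_hat = x (I - P P^T)
    residual = [x[i] - x_hat[i] for i in range(m)]
    return sum(r * r for r in residual), residual


def is_faultless(x, P, delta_sq):
    """Eq. (2.14): the sample is considered faultless if SPE <= delta^2."""
    value, _ = spe(x, P)
    return value <= delta_sq


# Hypothetical example: m = 3 variables, p = 1 retained component.
P = [[1.0 / math.sqrt(2.0)], [1.0 / math.sqrt(2.0)], [0.0]]
normal_sample = [1.0, 1.0, 0.0]    # lies in the principal subspace
faulty_sample = [1.0, -1.0, 2.0]   # orthogonal to the principal subspace
delta_sq = 0.5                     # hypothetical confidence limit
```

With these values the first sample has a residual close to zero, while the second sample is entirely residual, so only the second one violates the limit.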
The drawbacks of the SPE index for fault detection are mainly two: the first is related to the assumption of a normal distribution used to estimate the threshold of this index; the second is that the SPE is a weighted sum, with unitary coefficients, of the quadratic residuals $\tilde{X}_i$. To improve fault detection, these two drawbacks are addressed by considering the process faultless if, for each $i$:

$$ \tilde{X}_i^2 \le \delta_i, \qquad i = 1, \dots, m, \tag{2.16} $$

where $\delta_i$ is a confidence limit for $\tilde{X}_i^2$. To estimate the confidence limit $\delta_i$ even when the normality assumption on $\tilde{X}_i^2$ is not valid, the solution is to estimate the PDF directly from $\tilde{X}_i^2$ through a non-parametric approach. In [37, 41, 42], KDE is considered because it is a well-established non-parametric approach to estimate the PDF of statistical signals and to evaluate the control limits. Assume $y$ is a random variable and its density function is denoted by $p(y)$. This means that:

$$ P(y < k) = \int_{-\infty}^{k} p(y)\,dy. \tag{2.17} $$

Hence, by knowing $p(y)$, an appropriate control limit can be determined for a specific confidence bound $\alpha$, using Eq. (2.17). Replacing $p(y)$ in Eq. (2.17) with the estimate of the probability density function of $\tilde{X}_i^2$, called $\hat{p}(\tilde{X}_i^2)$, the control limits are estimated by:

$$ \int_{-\infty}^{\delta_i} \hat{p}(\tilde{X}_i^2)\, d\tilde{X}_i^2 = \alpha. \tag{2.18} $$

PCA and Eqs. 2.16 and 2.18 are applied to the Tennessee Eastman Process (TEP) [43] in order to show the advantages of these solutions. The process consists of five major unit operations: a reactor, a product condenser, a vapor-liquid separator, a recycle compressor, and a product stripper. The process has 12 manipulated variables, 22 continuous process measurements, and 19 composition measurements sampled less frequently. In this study, a total of 33 variables suggested by [44] are used for monitoring. Detailed descriptions of the selected variables are given in [44]. All composition measurements are excluded because they are hard to measure on-line in practice.
A sampling interval of 3 min was used to collect the simulated data for the training and testing sets. Both the training and testing data sets for each fault are composed of 960 samples (48 hours). All faults in the test data set were induced from sample 160 (8 hours). The detailed fault information is presented in [44]. Since the TEP is open-loop unstable, after approximately 2 hours (simulated time) the reactor pressure would exceed the upper bound of 3000 kPa and the simulation would shut down, so the decentralized control strategy described in [45] is used. The TEP model can be downloaded from the website address [46]. Table 2.3 shows the results related to the faults that are more difficult to detect.

Table 2.3: TEP results with improved detection index

Fault | SPE (%) | $\tilde{X}_i^2$ (%)
3     | 2.33    | 10.42
5     | 0.63    | 1.07
9     | 4.07    | 12.56
12    | 23.84   | 30.65
15    | 0.7     | 1.55
16    | 0.47    | 0.64
18    | 71.42   | 72.72
Mean  | 14.78   | 18.52

The results point out that, for faults that are barely noticeable, the fault detection index (Eq. 2.18) detects these faults more accurately than the SPE index. Fault isolation and diagnosis are performed by the PCA contributions: defining the new observation vector $x_j \in \mathbb{R}^m$, the total contribution of the $i$th process variable $X_i$ is

$$ \mathrm{CONT}_i = \sum_{j=1}^{N} \tilde{x}_{ij}^2, \qquad i = 1, \dots, m. \tag{2.19} $$

Wavelet Transform

The Wavelet Transform (WT) is defined as the integral of the signal $f(t)$ multiplied by scaled, shifted versions of a basic wavelet function $\phi(t)$, which is a real-valued function whose Fourier transform satisfies the admissibility criteria [47]. The wavelet transformation $c(\cdot,\cdot)$ of a signal $f(t)$ is then defined as:

$$ c(a, b) = \int_{\mathbb{R}} f(t)\, \frac{1}{\sqrt{a}}\, \phi\!\left(\frac{t-b}{a}\right) dt, \qquad a \in \mathbb{R}^+ - \{0\},\; b \in \mathbb{R}, \tag{2.20} $$

where $a$ is the so-called scaling parameter and $b$ is the time localization parameter.
Both $a$ and $b$ can be continuous or discrete variables. Multiplying each coefficient by an appropriately scaled and shifted wavelet yields the constituent wavelets of the original signal. For signals of finite energy, continuous wavelet synthesis provides the reconstruction formula:

$$ f(t) = \frac{1}{K_\phi} \int_{\mathbb{R}} \int_{\mathbb{R}^+} c(a, b)\, \frac{1}{\sqrt{a}}\, \phi\!\left(\frac{t-b}{a}\right) \frac{da}{a^2}\, db, \tag{2.21} $$

where

$$ K_\phi = \int_{-\infty}^{+\infty} \frac{|\hat{\phi}(\xi)|^2}{|\xi|}\, d\xi \tag{2.22} $$

denotes a (wavelet-specific) normalization parameter in which $\hat{\phi}$ is the Fourier transform of $\phi$. Mother wavelets must satisfy the following properties:

$$ \int_{-\infty}^{+\infty} |\phi(t)|\,dt < \infty, \qquad \int_{-\infty}^{+\infty} |\phi(t)|^2\,dt = 1, \qquad \int_{-\infty}^{+\infty} \phi(t)\,dt = 0. \tag{2.23} $$

To avoid intractable computations when operating at every scale of the Continuous WT (CWT), scales and positions can be chosen as powers of two, i.e. dyadic scales and positions. The Discrete WT (DWT) analysis is more efficient and just as accurate [47, 48]. In this scheme $a$ and $b$ are given by:

$$ a = a_0^j, \qquad b = b_0 a_0^j k, \qquad (j, k) \in \mathbb{Z}^2, \qquad \mathbb{Z} := \{0, \pm 1, \pm 2, \cdots\}. \tag{2.24} $$

The variables $a_0$ and $b_0$ are fixed constants that are set to $a_0 = 2$ and $b_0 = 1$ [48]. The discrete wavelet analysis can be described mathematically as:

$$ c(a, b) = c(j, k) = \sum_{n \in \mathbb{Z}^+} f(n)\, \phi_{j,k}(n), \qquad a = 2^j,\; b = 2^j k,\; j \in \mathbb{Z},\; k \in \mathbb{Z}, \tag{2.25} $$

where the simplified notation $f(n) = f(n \cdot t_c)$, $n \in \mathbb{Z}^+$, denotes the discretization of the continuous-time signal $f(t)$ with sampling time $t_c$. The inverse transform, also called discrete synthesis, is defined as:

$$ f(n) = \sum_{j \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} c(j, k)\, \phi_{j,k}(n). \tag{2.26} $$

In [49], a signal is decomposed into various scales with different time and frequency resolutions; this algorithm is known as the multiresolution signal decomposition. Defining:

$$ \begin{aligned} \phi_{j,k} &= 2^{-j/2}\, \phi\!\left(2^{-j} t - k\right), & \psi_{j,k} &= 2^{-j/2}\, \psi\!\left(2^{-j} t - k\right), \\ V_j &= \mathrm{span}\{\phi_{j,k},\, k \in \mathbb{Z}\}, & W_j &= \mathrm{span}\{\psi_{j,k},\, k \in \mathbb{Z}\}, \end{aligned} \qquad (j, k) \in \mathbb{Z}^2, \tag{2.27} $$

the scaling function $\phi_{j,k}$ forms the orthonormal basis of $V_j$ and the orthogonal wavelet $\psi_{j,k}$ forms the orthonormal basis of $W_j$. In [48] it is shown that:

$$ V_j \perp W_j, \qquad V_m = W_{m+1} \oplus V_{m+1}, \qquad V_m, W_m \subset L^2(\mathbb{R}). \tag{2.28} $$

Defining $f(n) = f$ as an element of $V_0 = W_1 \oplus V_1$, $f$ can be decomposed into its components along $V_1$ and $W_1$:

$$ f = P_1 f + Q_1 f, \tag{2.29} $$

with $P_j$ the orthogonal projection onto $V_j$ and $Q_j$ the orthogonal projection onto $W_j$. Defining $j \ge 1$ and $f(n) = c_n^0$, it results:

$$ \begin{aligned} f(n) &= \sum_{k \in \mathbb{Z}} c_k^1\, \phi_{1,k}(n) + \sum_{k \in \mathbb{Z}} d_k^1\, \psi_{1,k}(n), \\ c_k^1 &= \sum_{n \in \mathbb{Z}} h(n - 2k)\, c_n^0, \qquad d_k^1 = \sum_{n \in \mathbb{Z}} g(n - 2k)\, c_n^0, \\ h(n - 2k) &= \langle \phi_{1,k}(n), \phi_{0,n}(n) \rangle, \qquad g(n - 2k) = \langle \psi_{1,k}(n), \phi_{0,n}(n) \rangle, \qquad (k, n) \in \mathbb{Z}^2, \end{aligned} \tag{2.30} $$

where the terms $g$ and $h$ are high-pass and low-pass filter coefficients derived from the bases $\psi$ and $\phi$. Considering a dataset of $N$ samples ($n = 1, \dots, N$) and introducing a vector notation, $c_k^1$ and $d_k^1$ can be rewritten as [48]:

$$ c^1 = H c^0, \qquad d^1 = G c^0, \tag{2.31} $$

with

$$ H = \begin{bmatrix} h(0) & h(1) & \cdots & h(N) \\ h(-2) & h(-1) & \cdots & h(N-2) \\ \vdots & \vdots & \cdots & \vdots \\ h(-2k) & h(1-2k) & \cdots & h(N-2k) \end{bmatrix}, \tag{2.32} $$

$$ G = \begin{bmatrix} g(0) & g(1) & \cdots & g(N) \\ g(-2) & g(-1) & \cdots & g(N-2) \\ \vdots & \vdots & \cdots & \vdots \\ g(-2k) & g(1-2k) & \cdots & g(N-2k) \end{bmatrix}. \tag{2.33} $$

The procedure can be iterated, obtaining:

$$ c_j = H c_{j-1}, \qquad d_j = G c_{j-1}. \tag{2.34} $$

Then:

$$ c_j = H_j c_0, \qquad d_j = G_j c_0, \tag{2.35} $$

where $H_j$ is obtained by applying the $H$ filter $j$ times, and $G_j$ is obtained by applying the $H$ filter $j - 1$ times and the $G$ filter once. Hence any signal may be decomposed into its contributions in different regions of the time-frequency space by projection on the corresponding wavelet basis function. The lowest frequency content of the signal is represented on a set of scaling functions. The number of wavelet and scaling function coefficients decreases dyadically at coarser scales due to the dyadic discretization of the dilation and translation parameters.
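The filter-and-downsample recursion can be sketched in a few lines. The example below is only illustrative and swaps in the Haar filters, $h = (1/\sqrt{2}, 1/\sqrt{2})$ and $g = (1/\sqrt{2}, -1/\sqrt{2})$, because they fit in two taps; it is not the wavelet used in the experiments of this chapter. The function names are hypothetical, and the signal length is assumed to be divisible by $2^L$.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(c):
    """One analysis level with Haar filters.

    The previous approximation coefficients are low-pass filtered and
    downsampled by two (approximation) and high-pass filtered and
    downsampled by two (detail).
    """
    if len(c) % 2:
        raise ValueError("signal length must be even")
    half = len(c) // 2
    approx = [(c[2 * k] + c[2 * k + 1]) / SQRT2 for k in range(half)]
    detail = [(c[2 * k] - c[2 * k + 1]) / SQRT2 for k in range(half)]
    return approx, detail


def haar_decompose(signal, levels):
    """Return (c_L, [d_1, ..., d_L]) by iterating haar_step on the approximation."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# Hypothetical 8-sample signal decomposed over L = 3 levels.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_decompose(signal, 3)
```

The number of coefficients halves at each level (4, 2 and 1 detail coefficients here), which is the dyadic decrease mentioned above, and the transform being orthonormal means the total energy of the coefficients equals that of the signal.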
The algorithms for computing the wavelet decomposition are based on representing the projection of the signal on the corresponding basis function as a filtering operation [49]. Convolution with the filter $H$ represents projection on the scaling function, and convolution with the filter $G$ represents projection on a wavelet. Thus, the signal $f(n)$ is decomposed at different scales into the detail scale matrices and the approximation scale matrices. Defining $L$ as the number of decomposition levels, the approximation scale $A_L$ and the detail scales $D_j$, $j = 1, \dots, L$, are the composition of $c_j$ and $d_j$ for each of the $m$ variables of the data matrix $X$:

$$ A_j = [c_{j1}, c_{j2}, \dots, c_{jm}], \qquad D_j = [d_{j1}, d_{j2}, \dots, d_{jm}], \qquad j = 1, \dots, L. \tag{2.36} $$

The wavelet decomposition level $L$ is selected as the minimum number of decomposition levels that yields an approximation signal $A_L$ whose associated frequency band has its upper limit below the fundamental frequency $f$, as described by the following condition:

$$ 2^{-(L+1)} f_s < f, \tag{2.37} $$

where $f_s$ is the sampling frequency of the signals and $f$ is the impeller rotational frequency [50]. From this condition, the decomposition level of the approximation signal is the integer $L$ given by:

$$ L = \lceil \log_2(f_s / f) + 1 \rceil. \tag{2.38} $$

2.4.2 MSPCA Formulation

WT and PCA can be combined to extract both the correlation within each sensor and the cross-correlation among sensors; in this way it is possible to extract maximum information from multivariate sensor data. MSPCA can be used as a tool for fault detection and diagnosis by means of statistical indexes. In particular, faults are detected by using Eqs. 2.16 and 2.18 and the isolation is conducted by the contributions method (Eq. 2.19). In this way it is possible to detect which sensor is most affected by the fault [36].
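The control limit of Eq. (2.18) can be obtained numerically once a kernel is fixed. The sketch below is one possible way to do it and is not taken from this work: it assumes a Gaussian kernel, a Silverman-type rule-of-thumb bandwidth when none is supplied, and a bisection search on the estimated cumulative distribution. All function names and the sample data are hypothetical.

```python
import math
import statistics


def kde_cdf(x, samples, h):
    """CDF at x of a Gaussian-kernel density estimate with bandwidth h."""
    scale = h * math.sqrt(2.0)
    total = sum(0.5 * (1.0 + math.erf((x - s) / scale)) for s in samples)
    return total / len(samples)


def kde_control_limit(samples, alpha, h=None, tol=1e-9):
    """Find delta such that the estimated CDF at delta equals alpha (Eq. 2.18)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if h is None:
        # Rule-of-thumb bandwidth (an assumption, not prescribed by the text).
        h = 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)
    if h <= 0.0:
        raise ValueError("bandwidth must be positive")
    # The estimated CDF is essentially 0 at lo and 1 at hi, and it is monotone.
    lo = min(samples) - 10.0 * h
    hi = max(samples) + 10.0 * h
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical squared residuals of one variable under normal conditions.
samples = [i / 100.0 for i in range(1, 101)]
bandwidth = 0.05
delta = kde_control_limit(samples, 0.95, h=bandwidth)
```

A Gaussian kernel places some probability mass below zero even though squared residuals are non-negative; this mainly affects the lower tail, whereas the limit of interest lies in the upper tail, but a boundary-corrected kernel could be preferred in practice.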
Two fundamental theorems exist for the MSPCA formulation; they state that the PCA principles remain unchanged under the wavelet transformation. These theorems are useful to apply the MSPCA methodology [35].

Theorem 2.4.1 Let $W = [H_L', G_L', G_{L-1}', \cdots, G_1']' \in \mathbb{R}^{N \times N}$ be the orthonormal matrix representing the orthonormal wavelet transformation operator containing the filter coefficients [35]. The principal component loadings obtained by the PCA of $X$ and $WX$ are identical, whereas the principal component scores of $WX$ are the wavelet transform of the scores of $X$.

Theorem 2.4.2 MSPCA reduces to conventional PCA if neither the principal components nor the wavelet coefficients at any scale are eliminated.

The developed MSPCA-based FDD procedure consists of two stages. In the first stage, the training step, the faultless data are processed and a model of these data is built. The MSPCA training steps are summarized below:
T1. Data are preprocessed and an outlier replacement algorithm is used [18, 51];
T2. The wavelet analysis is used to refine the data, with a level of detail $L$ chosen by Eq. (2.38);
T3. The mean and standard deviation of the detail and approximation matrices are normalized, and PCA is applied to the approximation matrix $A_L$, of order $L$, and to the $L$ detail matrices $D_j$, where $j = 1, \dots, L$;
T4. The PCA transformation matrix $P$ and the signal covariance matrix $S$ are computed for each approximation and detail matrix;
T5. The $\tilde{X}_i$ signals (Eq. 2.13) are computed for each wavelet matrix;
T6. The $\delta_i$ thresholds are computed, for each detail matrix and for the approximation matrix of order $L$, using the KDE algorithm (Eq. 2.18) and a confidence bound $\alpha$.

In the second stage, the diagnosis step, the previously obtained model is compared on-line with the new data and a statistical index of failure is calculated. The MSPCA diagnosis steps are summarized below:
D1.
The previous steps except the threshold computation, i.e. T1, T2, T3, T4 and T5, are repeated for each new dataset; the data are standardized as in training step T3, and the PCA and the $\tilde{X}_i$ signals are computed using the $P$ and $S$ matrices obtained in the training step;
D2. If any of the $\tilde{X}_i$ signals is over the thresholds $\delta_i$, the fault is detected and the diagnosis is performed by the $\|\tilde{X}_i\|$ contributions; otherwise the next data set is analyzed (return to D1);
D3. All the residual contributions $\|\tilde{X}_i\|$ are computed, for each sensor and for all detail and approximation matrices, and the fault type is diagnosed.

2.4.3 Developed Experimental Setup

The diagnosis system has been prototyped on single-phase electric motors with their respective pallets. The motors have a power of 45 W at 4500 RPM, with the impeller mounted on the rotor shaft. Fig. 2.14 shows the motor mounted on its pallet and the accelerometers installed along the radial and axial directions. Laboratory equipment is used to take measurements and to validate the procedure. In rotating machines, vibrations arise along two main directions, axial and radial, and two accelerometers along these axes are used. The NI (National Instruments) CompactRIO (NI cRIO) 9004 is used for data acquisition [52]. The NI 9233 module of the NI cRIO 9004 is used to acquire the signals from the accelerometers. It is characterized by a 24-bit resolution (delta-sigma) analog-to-digital converter (ADC) with a sampling frequency of up to 50 kS/s. The ADC is operated with oversampling and a digital band-pass filter is then applied. The filter band has been designed on the basis of an accurate study of the considered machines. The motor current is measured by a current sensor; the signal is acquired by an NI 9215 module and is processed in the same way as the accelerometer signals. A scalable system has been developed where data are stored and different MSPCA settings have been tested and prototyped.
The scalable system makes it possible to monitor many machines with simultaneous data acquisition and fault detection analysis.

Figure 2.14: Single-phase 25W motor for kitchen hoods mounted on pallet; PCB accelerometers are installed on pallet.

2.4.4 Results

The approach described in section 2.4.2 has been tested using a Daubechies mother wavelet of order 15, denoted db15 (the kernel $\phi$ of section 2.4.1). The detail level $L$ is chosen considering the motor rotation frequency, 75 Hz. The vibration analysis for this type of rotating machine highlights that frequencies above twice the rotation frequency are not of interest, so a sampling frequency of 300 Hz is chosen. Therefore, applying Eq. (2.38), the level of detail obtained is $L = 3$. The dimension $p$ of the principal components subspace is chosen by Kaiser's rule [18]. Based on the results stated in [51], a training procedure with different outlier replacement techniques is proposed. The diagnosis algorithm is performed off-line; at least $2^L$ samples are needed for the DWT analysis. Incoming batch data samples are then fed into the MSPCA model and the PCA residual contributions are computed for the matrices $D_j$, $j = 1, \dots, L$, and $A_L$. In the following, these matrices are called scale matrices, and they are compared with the respective thresholds. When, at any scale, the number of residual contribution samples over the thresholds is greater than $n \cdot \alpha \cdot \gamma$, where $\alpha$ is the significance level used for the calculation of the threshold $\delta_i$ (stated in section 2.4.2), $n$ is the number of samples in the batch and $\gamma$ is a corrective index (fixed equal to 5), a defect is detected and the motor is considered defective instead of healthy. Once a defect is detected, the isolation and diagnosis tests are performed.
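The two numerical choices described in this paragraph, the decomposition level from Eq. (2.38) and the batch detection rule based on $n \cdot \alpha \cdot \gamma$, can be sketched as follows. This is a simplified illustration under stated assumptions: the data layout (one list of squared residuals and one threshold per scale, i.e. a single sensor) and all names and numeric values other than $f_s = 300$ Hz, $f = 75$ Hz and $\gamma = 5$ are hypothetical.

```python
import math


def decomposition_level(fs, f):
    """Eq. (2.38): L = ceil(log2(fs / f) + 1)."""
    return math.ceil(math.log2(fs / f) + 1)


def scales_with_defect(sq_residuals_by_scale, thresholds, alpha, gamma=5.0):
    """Return the scales at which the batch rule flags a defect.

    A scale is flagged when the number of squared residual samples above
    its threshold is greater than n * alpha * gamma, with n the number of
    samples of the batch at that scale.
    """
    flagged = []
    for scale, sq_residuals in sq_residuals_by_scale.items():
        n = len(sq_residuals)
        exceedances = sum(1 for r in sq_residuals if r > thresholds[scale])
        if exceedances > n * alpha * gamma:
            flagged.append(scale)
    return flagged


# Level used in this section: fs = 300 Hz, f = 75 Hz.
level = decomposition_level(300.0, 75.0)

# Hypothetical batch of n = 100 squared residuals per scale, alpha = 0.01.
batch = {
    "A3": [2.0] * 7 + [0.1] * 93,   # 7 samples above the threshold
    "D1": [2.0] * 4 + [0.1] * 96,   # 4 samples above the threshold
}
limits = {"A3": 1.0, "D1": 1.0}
flagged = scales_with_defect(batch, limits, alpha=0.01)
```

With $n = 100$, $\alpha = 0.01$ and $\gamma = 5$ the rule tolerates up to 5 exceedances per scale, so only the A3 scale of this hypothetical batch is flagged; the corrective index trades sensitivity for fewer false alarms on healthy motors.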
At this step the PCA contributions are computed for each scale matrix that violates the $\delta_i$ limit; in other words, the residual contributions are computed for each scale at which a defect has been detected. Fault isolation makes it possible to detect which sensors are involved in the defect. By using several scales for the DWT analysis, it is possible to cluster the residual contributions of each scale and define a unique signature of the motor defect, as in classical MVSA. In more detail, the signature of each defect is given by the contributions of each variable at each scale. This signature is used to diagnose the defect.

Figure 2.15: Impeller with backlash: contribution weights of approximation matrix A3 (sensors: Axial, Radial, Lem).

Fig. 2.15 shows the impeller backlash defect isolation for the A3 scale matrix; in particular, the thresholds are the contribution levels in the case of a healthy motor. Moreover, the current sensor gives the main contribution to this defect, as highlighted in Table 2.4. A misalignment defect involves all the scale matrices and is shown in Figs. 2.16, 2.17, 2.18 and 2.19. When the impeller is misaligned the effect propagates along the rotating machine; Figs. 2.16, 2.17, 2.18 and 2.19 confirm this result, also highlighted in Table 2.4, where the radial contribution at high frequencies is higher and the axial contribution is greater in the other scale matrices. The unbalanced impeller defect is detected and isolated in Fig. 2.20 and is also classified in Table 2.4, where it is diagnosed as an unbalance defect. In this defect the radial contribution is the main contribution, as stated in the vibration analysis (section 2.4). The following experiments show the application of the algorithm using the data acquired from accelerometers mounted on the kitchen hoods.
In this case study two accelerometers are placed by a robotic arm on the top of the kitchen hoods when, in the production line, the kitchen hoods reach the quality control bench. The motor rotation frequency is 50 Hz and the sampling frequency is 20 kHz. Figs. 2.21 and 2.22 show the contribution plot at approximation matrix A7 of each sensor. In the production line, some kitchen hoods could be assembled with an unbalanced impeller; this defect introduces higher energy consumption and more noise than in faultless kitchen hoods. These figures show that the contributions for the first 11 kitchen hoods, which are faultless, are lower than those for the last 8, which are assembled with defective motors.

Figure 2.16: Misaligned impeller: contribution weights of approximation matrix A3 (sensors: Axial, Radial, Lem).

Figure 2.17: Misaligned impeller: contribution weights of D3 scale matrix (sensors: Axial, Radial, Lem).

Figure 2.18: Misaligned impeller: contribution weights of D2 scale matrix (sensors: Axial, Radial, Lem).

Figure 2.19: Misaligned impeller: contribution weights of D1 scale matrix (sensors: Axial, Radial, Lem).

Figure 2.20: Unbalanced impeller: contribution weights of approximation matrix A3 (sensors: Axial, Radial, Lem).

Table 2.4: FDD results for motors installed in kitchen hoods.
Columns: Scale, Backlash, Misalignment, Broken.
Detection (scales A3, D1, D2, D3): X - X X X X X -
Isolation (scales A3, D1, D2, D3): Lem - Ax. Rad. Ax. Ax. Rad. -

Figure 2.21: Contribution weights of A7 scale matrix for the first accelerometer in the case of unbalanced impeller (contribution of accelerometer 1 versus motor number).

Figure 2.22: Contribution weights of A7 scale matrix for the second accelerometer in the case of unbalanced impeller (contribution of accelerometer 2 versus motor number).

Chapter 3 Modeling of Complex Systems with FDI and Prognosis Applications

3.1 Introduction

Typically, complex systems belong to different system classes such as natural (e.g., biological evolution), financial (e.g., stock markets and economies), social and industrial systems. Complex systems consist of a large number of nonlinearly interacting components, often called agents, which display collective behaviour that does not follow trivially from the behaviours of the individual parts. These systems are open: they exchange information with the environment and constantly modify their internal activity structure and patterns in the self-organization process. Complex systems have several characteristic properties of structure and behaviour. In the case of complex industrial systems one of these properties is the hierarchical structure, which is related to the multilevel organization of the system. Another property is the strong coupling between the agents defined in the complex system. This property is the cause of cascading failures in complex systems such as complex industrial systems. Due to the strong coupling between agents, a failure in one or more components can lead to cascading failures which may have catastrophic consequences on the functioning of the system. Moreover, a fault in one or more agents can lead to cascading faults which may drive the system to failure.
This chapter proposes two solutions to model and diagnose two different complex systems: a paper mill plant and a turbofan engine. These solutions are applied in order to monitor these complex systems in the FDI and prognostic context. A data-driven FDI algorithm based on MSPCA (defined in section 2.4.2) is applied in the case of a paper mill plant. A paper mill plant is an industrial complex system, which consists of several coupled nonlinear systems (e.g., paper machine, stock preparation) and subprocesses (e.g., fan pump, tine unit, mixing unit, pope reel, pulping, broke handling machine, deflaking). A paper mill plant is often monitored by many sensors, which measure many different variables that can be discrete or logic and acquired at different sampling frequencies. These variables could also describe qualitatively the states of the systems. The second solution is related to the prognostic management of a turbofan engine. A common turbofan consists of coupled nonlinear systems such as a compressor, a combustor, and a turbine which drives the compressor. In addition, it has a fan in front of the core compressor and a second power turbine behind the core turbine to drive the fan. A turbofan engine is a complex system that requires adequate monitoring to ensure flight safety and timely maintenance. This chapter is based on the problem and the results presented in [53, 54].

3.2 Modeling and Diagnosis of a Paper Mill Plant: FDI Application

In paper mill plants, increasing efficiency and reducing costs in the face of competition is a primary purpose. Fault detection and diagnosis can help by minimizing the loss of production. In particular, for the stock preparation subprocess a signal-based FDI procedure is developed. MSPCA is used to monitor some critical variables of the stock preparation of a paper mill plant in order to diagnose faults and malfunctions.
MSPCA simultaneously extracts both the cross-correlation across the sensors (PCA approach) and the auto-correlation within a sensor (wavelet approach). The advantage of MSPCA is validated on the considered paper mill plant, where several sensors are installed to control and monitor the automation system. Chemical process plant safety, production specifications, environmental regulations, operational constraints and plant economics are some of the main reasons driving a growing interest in the research and development of more robust methods for process monitoring and control. Another reason, which motivates the use of advanced techniques to monitor the plant, is that modern control paradigms are heavily dependent on the quality of the data provided by sensors. This results in a growing need to discriminate normal plant situations from abnormal ones. Though infrequent, these abnormal situations have caused a significant impact on the safety and economy of the process industry [36]. It is well known that chemical processes are well equipped with measurement sensors such as temperature, flow-rate, pressure and electrical quantity sensors. The availability of many sensors provides valuable redundancy for fault detection and identification because sensor measurements are highly correlated under normal conditions. These correlations are mainly due to the physical and chemical principles governing the process operations, such as mass and energy balances [40]. The paper industry is a productive sector with a high level of automation and, considering that a modern paper mill has hundreds of sensors and actuators connected to its automation systems, it is evident that some systematic methods are needed to process the data. On-line process monitoring with fault detection and diagnosis can provide efficiency improvements for a wide range of processes, as stated in [55, 10]. A large number of applications have been reviewed, e.g. by Isermann and Ballé [56] and Patton et al. [57]. Venkatasubramanian et al. [58, 59, 60] published an article series reviewing monitoring methods with particular attention to the field of chemical processes. They classified the methods as model-based, signal-based and knowledge-based. Signal-based approaches to FDI in paper mills and, more generally, in chemical plants are consolidated and well studied because, for large-scale processes, the development of model-based FDI methods requires considerable and possibly prohibitive effort, and moreover because large amounts of data are collected, as stated in [61, 62]. A common feature of the statistical methods (a subset of the signal-based methods) is the ability to reduce the correlations between variables and the dimensionality of the data. These characteristics enable efficient extraction of the relevant information from the data. MSPCA, which is a combination of PCA and wavelet analysis, is able to remove the autocorrelations of variables by means of wavelet analysis and to eliminate cross-correlations between variables with PCA [36]. MSPCA, proposed by Bakshi [35], deals with processes that operate at different scales and have contributions from:
• events occurring at different localizations in time and frequency;
• stochastic processes whose energy or power spectrum changes with time and/or frequency;
• variables measured at different sampling rates or containing missing data.

The primary motivation for jointly using PCA and the Wavelet Transform comes from the idea that, in PCA, the correlation among sensors is used to transform the multivariate space into a subspace which preserves the maximum variance of the original space. However, PCA fails to make use of the correlation within a sensor along the time line. Wavelets, on the other hand, capture the correlation within a sensor, whereas PCA correlates across sensors [36]. Thus, wavelet and PCA based analysis of multivariate data combines two extremes: signal trend and correlation.
MSPCA is a technique that extracts maximum information from multivariate sensor data. The detection and diagnosis of faults in the stock preparation process of a paper mill plant is considered. Recursive MSPCA is applied for on-line fault detection and diagnosis: once a fault is detected, a multi-scale fault identification is performed.

3.2.1 Recalled Results

The mathematical framework is introduced in Section 2.4.1. To apply recursive MSPCA to on-line fault detection and diagnosis, only the following equation is recalled, which defines the projection of a single sample onto the principal component space. A new observation vector $x \in \mathbb{R}^m$ can be projected into the lower-dimensional score space with

\[ t = xP \in \mathbb{R}^p. \tag{3.1} \]

3.2.2 Case Study

Table 3.1: Stock preparation sensors
Variable name     Description
CIC T8            mixture consistency [%]
DIC205 MV         Mix short fiber consistency in [%]
DIC219 MV         Mix long fiber consistency in [%]
DIC226 MV         Mix CTMP fiber consistency in [%]
PT207             Broke Machine inlet pressure [bar]
LT242             long fiber tank T6 level [%]
LIV T13           Broke Machine water level [%]
LIV T3            Broke Machine short fiber level [%]
LIV T1 T2         short fiber tank T1 and T2 level [%]
JT MOTOR P20      high press. water pump current [A]
FCV 212 AO1 3     short fiber input valve [%]
JT MOTOR E26      Broke Machine motor current [A]
JT MOTOR E16      Broke deflaking motor current [A]
FCV406 KROFTA     Sludge Mix in valve [%]
FB MACC           Broke Machine Set point [%]

An important aspect of the developed work consists of applying the MSPCA formulation to an industrial data set for which no a priori assumptions could be introduced. Data from the stock preparation process of a paper mill plant are analyzed. In a first test, some faults are simulated in order to validate the developed algorithm; then real faults are detected and isolated during normal process operation.
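As an illustration of Eq. (3.1), the sketch below projects one observation onto the score space and computes the squared prediction error (SPE) as the squared norm of the part of the observation that the retained components do not reconstruct. It assumes that P is an m × p matrix with orthonormal columns and that x has already been mean-centred and scaled; the helper names and the toy loading matrix are hypothetical and are not taken from this work.

```python
import math


def project(x, P):
    """Scores t = x P for a row vector x (length m) and loadings P (m x p)."""
    m, p = len(P), len(P[0])
    return [sum(x[i] * P[i][k] for i in range(m)) for k in range(p)]


def reconstruct(t, P):
    """Back-projection x_hat = t P^T into the original m-dimensional space."""
    return [sum(t[k] * P[i][k] for k in range(len(t))) for i in range(len(P))]


def spe(x, P):
    """Squared prediction error: squared norm of the residual x - x_hat."""
    x_hat = reconstruct(project(x, P), P)
    return sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))


# Toy example with m = 3 sensors and p = 1 retained component (assumed values).
s = 1.0 / math.sqrt(2.0)
P = [[s], [s], [0.0]]
print(project([1.0, 1.0, 0.0], P))  # lies in the model subspace
print(spe([1.0, 1.0, 2.0], P))      # the third sensor is not explained by P
```

An observation that lies in the subspace spanned by the columns of P has an SPE of zero; a deviation that breaks the correlation structure shows up as a larger SPE, which is the quantity plotted in the figures of this section.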
The stock preparation is a process of the paper mill plant where pulp is usually refined and blended to the appropriate proportion of hardwood, softwood or recycled fiber, and diluted to as uniform and constant a consistency as possible. The pH is controlled and various fillers, such as whitening agents, size and wet or dry strength additives, are added if necessary. Additional fillers such as clay or titanium dioxide increase opacity to improve print quality. In the considered process, different types of pulp are normally treated in separate but similar process lines until they are combined at a blend chest. From high density storage or from the slusher/pulper, the pulp is pumped to a low density storage chest (tank). From there it is typically diluted to about 4% consistency before being pumped to an unrefined stock chest. From the unrefined stock chest, stock is again pumped, with consistency control, through a refiner. Refining is an operation whereby the pulp slurry passes between a pair of discs, one of which is stationary while the other rotates at speeds of typically 1000 or 1200 RPM for 50 and 60 Hz AC, respectively. The discs have raised bars on their faces and pass each other with narrow clearance. This action unravels the outer layer of the fibers, causing the fibrils of the fibers to partially detach and bloom outward, increasing the surface area to promote bonding. Refining thus increases tensile strength. For example, tissue paper is relatively unrefined, whereas packaging paper is more highly refined. Hardwood fibers are typically 1 mm long and smaller in diameter than softwood fibers, which are typically 4 mm long. Refining can cause the softwood fiber tube to collapse, resulting in undesirable properties in the sheet. From the refined stock, or blend chest, the consistency of the stock is again controlled as it is pumped to a machine chest.
It may be refined, or additives may be added, en route to the machine chest. The machine chest is basically a consistency leveling chest with about 15 minutes of retention. This retention time is enough to allow any variations in the consistency entering the chest to be leveled out by the action of the basis weight valve, which receives feedback from the on-line basis weight measuring scanner. The stock preparation process ends in the machine chest, where the paper machine draws the refined pulp and produces the paper. This brief description of the stock preparation process shows the complex dynamics of paper mill plants. An analytical description of the plant is not suitable for fault detection and diagnosis, so a signal-based approach is considered. Knowledge of the considered process is necessary to choose the most meaningful process signals to monitor for FDI purposes and to allow a proper setup of the MSPCA procedure. In particular, Table 3.1 summarizes the set of sensors chosen to apply the MSPCA FDI procedure; taking the process knowledge into account, these sensors measure the most relevant dynamics of the stock preparation process.

[Figure 3.1: Simulated abrupt fault. (a) SPE of reconstructed PCA; (b) SPE of approximation matrix A3. Horizontal axis: samples.]

3.2.3 Results

The MSPCA approach has been implemented using a Haar wavelet kernel and a level of detail L = 3; the dimension p of the principal component subspace is chosen by Kaiser's rule [18]. For the training stage of PCA, a faultless dataset is selected and an outlier elimination is performed in order to make the analysis procedure robust. In [51] the training procedure with different outlier elimination techniques is presented. The hypothesis under which this procedure can be implemented is the Gaussian distribution of the signals.
Here, the outlier elimination is performed using a Huber estimator for matrix T with parameter k = 4 [51].

[Figure 3.2: Simulated abrupt fault. (a) SPE contribution of approximation matrix A3; (b) SPE contribution of reconstructed PCA. Horizontal axis: sensors 1-15.]

The fault detection and diagnosis algorithm is performed on-line recursively, by means of a moving window of l samples using, at each step, at least $2^L$ samples, as needed by the DWT. A fault is detected when the number of SPE values over the threshold is greater than $l \cdot \alpha \cdot m$, where α is the significance level used in the sixth step of the MSPCA algorithm (treated in Section 2.4.2) and l = 96. The paper mill is a complex process with different levels of joint dynamics, and the algorithm is highly sensitive to small changes that can be caused by transients and peaks in the signals. These problems lead to false alarms, and for this reason a variable called m has been added. To prevent false alarms, m = 8 was set on the basis of the experimental data. If a fault is detected in the detail or approximation level j at sample n, the time at which the fault occurs is $n \cdot 2^j$. In order to validate and test the proposed MSPCA method, two faults are simulated on the nominal plant signals. Subsequently, the algorithm has been used on stock preparation process data during normal operation. In particular, the attention has been focused on three sections of the stock preparation process: pulping, the broke handling machine and deflaking. These subprocesses are monitored by 15 sensors. During an observation period, a fault arising in the broke machine is detected and identified.
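The detection rule described above (a count of SPE exceedances inside a moving window, compared with l·α·m, and the mapping n·2^j back to the original time axis) can be sketched as follows. This is a minimal sketch, not the algorithm implemented in this work: the function names are hypothetical, the value of α used in the example is an assumption because it is not stated in this section, and the window is applied directly to a list of SPE values.

```python
def window_is_faulty(spe_window, threshold, alpha, m=8):
    """True when the exceedance count in the window is greater than l * alpha * m."""
    l = len(spe_window)
    exceedances = sum(1 for value in spe_window if value > threshold)
    return exceedances > l * alpha * m


def first_faulty_window(spe_values, threshold, l=96, alpha=0.01, m=8):
    """Slide a window of l values; return the start index of the first alarm."""
    for start in range(len(spe_values) - l + 1):
        if window_is_faulty(spe_values[start:start + l], threshold, alpha, m):
            return start
    return None


def fault_sample(n, level):
    """Map coefficient index n at decomposition level j to the sample n * 2**j."""
    return n * 2 ** level


# Synthetic SPE trace (assumed values): quiet for 200 samples, then high.
trace = [0.1] * 200 + [5.0] * 100
print(first_faulty_window(trace, threshold=1.0))
print(fault_sample(237, 3))
```

With l = 96, α = 0.01 and m = 8 the alarm requires more than 7.68 exceedances, so single peaks and short transients do not raise an alarm. The call `fault_sample(237, 3)` reproduces the correspondence between window 237 of A3 and sample 1896 reported in Table 3.2.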
Tables 3.2, 3.3 and 3.4 report the matrices in which the fault is detected, the percentage change of the values above the threshold with respect to the faultless case, and the fault time.

Simulated Faults

To validate the algorithm and set its parameters, since no a priori assumption is made about the process, two faults are simulated. In the first simulation, an abrupt fault is added to sensor number 5, which measures the inlet pressure of the broke handling machine. The fault has an amplitude of 4% of the sensor average value in the faultless case and occurs at sample 1912. As shown in Figs. 3.1(a) and 3.1(b) and in Table 3.2, the fault is detected by the SPE of the approximation matrix A3 at window 237, with a delay of 80 samples. The reconstructed PCA does not detect the fault, even though the SPE increases. This shows that adding the wavelet analysis increases the sensitivity of fault detection compared with the case without wavelets. The fault is identified in sensor 5, which has the greatest variation in contributions, as shown in Figs. 3.2(a) and 3.2(b).

Table 3.2: Fault diagnosis with simulated abrupt fault.
              A3           D1      D2     D3      Reconst. PCA
Fault         X
Change        275%         0.9%    1%     0.5%    74%
Window        237
Sample        1896-1992

In the second simulation, a sinusoidal signal is added to sensor number 3, which measures the current of the P17 pump with filtered water at normal pressure. The fault has an amplitude of 1.5% of the average value of the sensor in the faultless case and occurs at sample 634. As shown in Fig. 3.3 and Table 3.3, the fault is detected by the SPE of the detail matrix D1 at window 312, with a delay of 86 samples. The fault is identified in sensor 3, which has the greatest variation in contributions, as shown in Fig. 3.4.

Figure 3.3: Simulated harmonic fault.
SPE of detail matrix D1

Figure 3.4: Simulated harmonic fault. SPE contribution of detail matrix D1

Table 3.3: Fault diagnosis with simulated harmonic fault.
              A3       D1         D2       D3       Reconst. PCA
Fault                  X
Change        31.8%    89.5%      19.8%    19.5%    12.3%
Window                 312
Sample                 624-720

Real Fault

The last experiment concerns a real fault that occurred in the stock preparation subprocess of the paper mill during normal operation. The fault is identified on sensors 5 and 13, which measure the inlet pressure of the broke handling machine and the current of the broke deflaking motor. From sample 600 the motor is idling; this is regarded as a soft fault, which is not a motor fault but is due to a malfunction of the machine such as a jam. After this interval an abrupt fault occurs, which is identified in both sensors. As shown in Figs. 3.6(a) and 3.6(b) and in Table 3.4, the fault is detected by the SPE of the approximation matrix A3 at window 74, with a delay of 88 samples. The SPE of the reconstructed PCA also detects three faults, at samples 3979, 4180 and 4481, which correspond to peaks in sensor 5. The fault is identified in sensors 5 and 13, which have the greatest variation in contributions at all levels, as shown in Figs. 3.7, 3.8(a), 3.8(b), 3.9(a) and 3.9(b). Considering the SPE contribution at each level, it is possible to define the signature of the isolated fault and identify its nature. The signature associated with the fault is unique, and no other fault generated a similar signature in the contribution plots.

Table 3.4: Fault diagnosis with real fault.
              A3         D1      D2      D3      Reconst. PCA
Fault         X                                  X
Change        1218%      124%    120%    156%    345%
Window        74
Sample        592-688                            3979-4100, 4180-4361, 4481-4770

[Figure 3.5: Signals of broke handling with fault. (a) Sensor 5, broke handling machine inlet pressure [bar]; (b) Sensor 13, broke deflaking motor current [A]. Horizontal axis: samples.]

3.3 Modeling and diagnosis of a Turbofan Engine: Prognosis Application

The residual lifetime of systems is a determining factor for machinery and environment safety. This section addresses the problem of estimating the remaining useful life (RUL) of turbofan engines. The prognosis of systems, plants and machinery is the forecast of the remaining operational life, the future condition, or the probability of reliable operation, and it relies on equipment that acquires condition monitoring data. This approach to modern maintenance practice promises to reduce downtime, spares inventory, maintenance costs and safety hazards [63].

[Figure 3.6: Real fault. (a) SPE of reconstructed PCA in logarithmic scale; (b) SPE of approximation matrix A3 in logarithmic scale. Horizontal axis: samples.]

The assumption under which prognosis becomes effective is that the failure mechanisms of systems involve several degraded health-states, or that systems are subject to wear. Tracking and forecasting the evolution of health-states and impending failures, in the form of the Remaining Useful Life (RUL), is a critical challenge and is regarded as one of the main topics of Condition Based Maintenance (CBM) [64].
CBM is a maintenance technology that employs such tasks as monitoring, classification and forecasting to increase system readiness and safety while reducing costs through reduced maintenance and inventory, increased capacity, and enhanced logistics and supply chain performance.

[Figure 3.7: Real fault. SPE contribution of reconstructed PCA. Horizontal axis: sensors 1-15.]

Many approaches exist to monitor the health state or to estimate the RUL of systems; they can be divided into physics-based prognostic models and data-driven prognostic models. Physics-based models typically involve building mathematical models that describe the physical relations of the system and the propagation of failure and wear [63, 65, 66]. Data-driven approaches attempt to derive models directly from collected Condition Monitoring (CM) data, and they produce their prediction output directly in terms of CM data. Conventional data-driven methods include simple projection models such as exponential smoothing [67, 68]. Most of these trend forecasting techniques assume that there is some drift in the measured system signals that reflects the health degradation. Artificial Neural Networks (ANNs) are among the most commonly used data-driven techniques in prognostic algorithms. In [69] a Recurrent Wavelet Neural Network (RWNN) is developed to predict crack propagation in rolling element bearings. The network tracks the enlarging crack. In [70] a Neuro-Fuzzy (NF) network is used to predict the spur gear condition value one step ahead. The fuzzy inference structure is determined by experts, whereas the fuzzy membership functions are trained by the neural network. An adaptive training technique was proposed in [71] to improve the NF model.
Multiple-step-ahead prediction is also performed in [72] for rolling element bearing condition monitoring and estimation, by feeding the predicted value back into the network input until the desired prediction horizon is reached. Some rules are used to vary the data sampling period according to the rate of change of consecutive condition index values. Particle filtering has also been implemented to provide non-linear projection in forecasting the growth of a crack on a turbine engine blade [73].

[Figure 3.8: Real fault. (a) SPE contribution of approximation matrix A3; (b) SPE contribution of detail matrix D3. Horizontal axis: sensors 1-15.]

In [74] a recursive Bayesian technique is proposed to calculate the failure probability based on the joint density function of many CM data features. The use of Hidden Markov Models (HMMs) in bearing fault prognosis is proposed in [75]. In an HMM a system is modeled as a stochastic process in which the next state depends only on the current state and not on the earlier ones. HMMs are typically trained to estimate health or fault states. Indeed, HMMs are able to estimate unobservable health-states using observable sensor signals or features computed by other algorithms.

[Figure 3.9: Real fault. (a) SPE contribution of detail matrix D2; (b) SPE contribution of detail matrix D1. Horizontal axis: sensors 1-15.]

Some approaches combine fault diagnosis and prognosis in a unified framework, or need to extract features from the data, which are then used by HMMs to estimate the health-state.
In [76, 75] the features are computed by Principal Component Analysis, where the measured signals are vibrations. In [77] the features are extracted by amplitude demodulation. Whether sensor signals are used directly or features are computed, the inputs of the HMM need to be chosen to be as reliable as possible. Indeed, appropriate features are able to capture unique properties of fault conditions and health states. Another motivation that leads authors to generate features is the reduction of the computational complexity of HMM algorithms.

In the present work an HMM is used to estimate the RUL of a turbofan. Features are extracted by an ANN that is trained to identify the faultless parameters of the turbofan in different flight conditions. Residuals are obtained at the end of each flight and a set of indexes is generated. The HMM uses these indexes and computes the RUL estimate as the number of remaining flights. These models give estimates of residual life and health-state by modeling the observations (inputs) as probability density functions. Thus it is possible to define a model composed of a set of states, each described by a probability density function. This permits the use of Bayesian inference algorithms to estimate health conditions. Data are generated by the model simulator C-MAPSS (Commercial Modular Aero-Propulsion System Simulation). Signals consist of time series of sensed measurements typically available from aircraft gas turbine engines. The data were used as the challenge for the Prognostics and Health Management (PHM) data competition at PHM'08 [78].

3.3.1 Problem Definition and Process Model

Turbofan engines constitute a complex system that requires adequate monitoring to ensure flight safety and timely maintenance [79, 80, 81].
Therefore it is essential to assess prognostic techniques that can help to provide early detection and isolation of the precursor and/or incipient fault condition of a component failure, and that can also help to manage and predict the progression of various faults to component failure. The prognostic module would also perform failure prognosis, which involves both forecasting the system degradation based on the observed system condition (current diagnostic state and available operating data) and predicting the remaining useful life of the turbofan engine. Prognostics has taken center stage in condition-based maintenance, where it is desirable to estimate the RUL of a system. Estimating the RUL of a component or system with uncertainty bounds that are narrow enough offers the prospect of increased system safety along with more cost-effective maintenance. Based on the RUL estimate, the operation team can perform on-demand maintenance or CBM, unlike the traditional time-based practice in which components are managed to life limits based upon fleet-wide statistics and average expected usage. The traditional approach is necessarily conservative, requiring the replacement of parts irrespective of how much of their useful life is actually expended. For example, aircraft engine turbine discs are usually retired at the time when 1 out of every 1000 discs has initiated a short detectable fatigue crack. On a life-distribution plot, this is the "−3 sigma" life curve. This implies that over 99.9% of expensive turbine rotor discs are retired before their useful life has been consumed, a practice that is extremely wasteful [82]. Conventional maintenance strategies (such as corrective and preventive maintenance) are therefore not adequate to fulfill the needs of expensive, high-availability systems such as the turbofan engine.
In contrast, condition-based predictive maintenance assesses the future health of critical engine components based on observed data and the available knowledge about the system. It is used to make proactive decisions about preventive and/or evasive actions, with the objectives of maximizing the service life of replaceable/serviceable components, minimizing operational risks and saving costs, while keeping the same safety margin obtained with inefficient schedule-based preventive maintenance. To accomplish this demanding task, engine monitoring systems (EMS) have become increasingly standard in the last two decades, in step with advances in aircraft engines and computer technology. The goal is therefore to automate the monitoring and prognosis procedures in order to reduce costs and maintain high system reliability. Moreover, in turbofan engines the control system is an essential part of the jet engine, which is therefore a mechatronic system. Turbofans are most effective when they can operate at or near their mechanical, thermal, flow or pressure limits, such as rotor speeds, turbine temperatures and internal pressures. Controlling at a limit without exceeding it is a very important aspect of engine control, which must therefore provide both regulation and limit management. Minimum control requirements include a main fuel control to provide limit protection. More advanced controls schedule the engine geometry, provide fan and booster stall protection, control variable parasitic engine flows, and need to monitor many engine parameters [81]. A common turbofan has as its "core" a compressor, a combustor and a turbine that drives the compressor. In addition, it has a fan in front of the core compressor and a second power turbine behind the core turbine to drive the fan, as shown in Fig. 3.10.
The flow capacity of the fan is designed to be larger than that of the compressor, so that the excess air can be bypassed around the core and exhausted through a separate nozzle. The bypass approach reduces the engine specific thrust but increases the propulsion efficiency, thereby reducing fuel consumption, and this is the engine type chosen for subsonic commercial airplanes [81]. Some of the particular features of the turbofan engine are [83]:
1. single stage fan with high pressure ratio;
2. a low pressure compressor;
3. high stage pressure rise mixed-flow compressor;
4. double-annular combustor;
5. high pressure turbine;
6. low pressure turbines;
7. variable cycle capability with forward blocker doors and an aft variable area bypass injector;
8. advanced exhaust nozzle technology.
The diagram in Fig. 3.10 shows the main elements of the turbofan engine model [78]. The simulated turbofan is equipped with 21 sensors that describe the engine state and has 5 inputs that are considered as external conditions. Tables 3.5 and 3.6 summarize the acquired signals with their descriptions.

[Figure 3.10: Simplified diagram of turbofan engine]

Table 3.5: C-MAPSS inputs
Variable name   Description
alt             Altitude
MN              Mach number
TRA             Throttle resolver angle
Wf              Fuel flow
Fn              Net thrust

Table 3.6: C-MAPSS outputs
Variable name   Description
T2              Total temperature at fan inlet
T24             Total temperature at LPC outlet
T30             Total temperature at HPC outlet
T50             Total temperature at LPT outlet
P2              Pressure at fan inlet
P15             Total pressure in bypass-duct
P30             Total pressure at HPC outlet
Nf              Physical fan speed
Nc              Physical core speed
epr             Engine pressure ratio (P50/P2)
Ps30            Static pressure at HPC outlet
phi             Ratio of fuel flow to Ps30
NRf             Corrected fan speed
NRc             Corrected core speed
BPR             Bypass Ratio
farB            Burner fuel-air ratio
htBleed         Bleed Enthalpy
Nf_dmd          Demanded fan speed
PCNfR_dmd       Demanded corrected fan speed
W31             HPT coolant bleed
W32             LPT coolant bleed

3.3.2 Hidden Markov Model and Prognosis Procedure

HMMs are a class of Markov models composed of a set of states, each of which maps the observations through its own probability density function. The resulting model consists of two stochastic processes, one of which is not directly observable but can be estimated through the other one, which produces the sequence of observations. These models have many applications in speech recognition, where they were first studied [84]. Define the states as $S = \{S_1, S_2, \ldots, S_N\}$, with N the number of states, the state at time t as $q_t$ and the observations as $O = o_1 o_2 \ldots o_T$. The HMM is defined as $\lambda = (A, B, \pi)$, where:
• $A = \{a_{ij}\}$ is the state transition probability distribution, $a_{ij} = P[q_{t+1} = S_j \mid q_t = S_i]$;
• $\pi = \{\pi_i\}$ is the initial state distribution, $\pi_i = P[q_1 = S_i]$;
• $B = \{b_j(O)\}$ is the continuous observation probability distribution, $b_j(O) = \sum_{m=1}^{K} c_{jm} \, \mathcal{N}(O \mid \mu_{jm}, \sigma_{jm})$.
Here $1 \le i, j \le N$, $1 \le t \le T$, $c_{jm}$ are the mixture coefficients, $\mathcal{N}(O \mid \mu_{jm}, \sigma_{jm})$ are the Gaussian distributions of a Gaussian Mixture Model (GMM) and K is the number of mixture components [85, 86]. The probability of being in state $S_i$ at time t and in state $S_j$ at time t + 1, given the model and the observation sequence, is

\[ \xi_t(i, j) = P[q_t = S_i, q_{t+1} = S_j \mid O, \lambda], \tag{3.2} \]

so the expected number of transitions from $S_i$ to $S_j$ is

\[ \varepsilon(i, j) = \sum_{t=1}^{T-1} \xi_t(i, j). \tag{3.3} \]

The three basic problems [84] in Hidden Markov Models are:
• given the observation sequence $O = o_1 o_2 \ldots o_T$ and a model $\lambda = (A, B, \pi)$, compute $P(O \mid \lambda)$, the probability of the observation sequence given the model;
• given the observation sequence $O = o_1 o_2 \ldots o_T$ and the model $\lambda = (A, B, \pi)$, choose a corresponding state sequence $q_1 q_2 \ldots q_T$ that best explains the observations;
• adjust the model parameters $\lambda = (A, B, \pi)$ to maximize $P(O \mid \lambda)$.
These problems are solved by three efficient and well-defined procedures: the forward-backward algorithm [87], the Viterbi algorithm [88] and the Baum-Welch algorithm [89]. The computational complexity is that of the forward-backward inference algorithm, which is $N^2 T$. The HMM can be used to describe the fault progression process of physical systems. Fig. 3.11 shows the HMM, which represents the fault progression with each state being a health-state; this model is called a left-right model. The HMM-based scheme is useful for prognostics because, by monitoring the progression of the state sequence, it is possible to obtain qualitative information on the current degradation state and the wear progression and quantitative information on the RUL, and to predict the evolution of the system life. To obtain this information, the following algorithms are used:
1. the progression of the states and the current state are calculated by the Viterbi algorithm [88, 90];
2. the RUL is estimated using the transition matrix, which quantifies the evolution of the failure progression. The mean number of time steps to the failure state, steps*, given the current state, is calculated by Monte Carlo simulation [64]: considering n simulations, during each of the n simulation runs the next health-state is estimated based on the transition probabilities by generating a uniformly distributed random number between 0 and 1. This process is repeated, taking the calculated next state as the current state, until the failure-state is reached.
Then, the number of transitions is counted as the RUL value. This is repeated for all n runs, yielding n values $RUL_i$, and steps* is their mean:

\[ steps^{*} = \frac{1}{n} \sum_{i=1}^{n} RUL_i. \tag{3.4} \]

[Figure 3.11: Fault progression process described by a HMM]

The advantage of this method is that it requires no assumptions about the knowledge of the HMM. The prognosis algorithm consists of two procedures: the training and the prognosis. The training consists of the following steps:
• the initial parameters of the HMM are chosen to improve the HMM training:
  1. the initial state distribution $\pi = \{\pi_i\}$ is uniformly distributed;
  2. the initial state transition probability distribution $A = \{a_{ij}\}$ is set based on the structure of the chosen HMM;
  3. the dataset is clustered with the GMM algorithm. The outputs are $c_{jm}$, $\mu_{jm}$ and $\sigma_{jm}$, which give $B = \{b_j(O)\}$, the continuous observation probability distribution;
• the number of HMM states is determined by computing the Bayesian Information Criterion (BIC) [91]:

\[ BIC = L \ln T - 2 \ln P(O \mid \lambda), \tag{3.5} \]

where L is the number of parameters. The minimum of this index gives the number of states of the HMM;
• to avoid overfitting during the training step, a constant called $\sigma_{min}$ is added to the diagonal of the matrix $\sigma_{jm}$. This term prevents the matrix $\sigma_{jm}$ from becoming singular, and the following expression for the update of the GMM covariance matrix is used:

\[ \sigma_{jm} = \frac{\sum_{t=1}^{T} \gamma_t(j, m) \, (o_t - \mu_{jm})(o_t - \mu_{jm})'}{\sum_{t=1}^{T} \gamma_t(j, m)} + \sigma_{min} I, \tag{3.6} \]

where $\gamma_t(j, m)$ is the probability of being in state j at time t with the m-th mixture component accounting for $o_t$;
• the HMM is trained by the Baum-Welch algorithm [89].
The prognosis consists of the following steps:
• when new data are collected, the inference is performed to obtain $\varepsilon_{data}(i, j)$;
• subtract the new data matrix $\varepsilon_{data}(i, j)$ from the model HMM matrix $\varepsilon_{model}(i, j)$,

\[ \varepsilon_{new}(i, j) = \varepsilon_{model}(i, j) - \varepsilon_{data}(i, j), \tag{3.7} \]

to obtain $\varepsilon_{new}(i, j)$, so that the new transition matrix is

\[ A_{new} = \{a_{ij}\} = \frac{\varepsilon_{new}(i, j)}{\sum_{j=1}^{N} \varepsilon_{new}(i, j)}, \quad 1 \le i, j \le N; \tag{3.8} \]

• apply the Viterbi algorithm to evaluate the current state;
• from the current state and the new transition matrix, calculate the RUL with Eq. (3.4).

3.3.3 Features Extraction

The HMM inference algorithm has a computational complexity of $N^2 T$ using the forward-backward procedure. However, it is not possible to use the simulator signals directly, because the simulator generates 21 turbofan variables sampled at 1 Hz for a flight of about 1 hour, and the number of flights varies with the degradation rate of the turbofan. A feature extraction is therefore needed to reduce the number of variables and samples. The computational complexity is reduced by means of an Artificial Neural Network (ANN): since what matters is to estimate the number of remaining flights as a measure of the RUL, the ANN is used to extract a set of scalar indexes that describe the situation of all sensors over the whole flight. Another objective is to estimate the faultless RUL of the engines, using some faultless simulations to train the prognostic procedure. The ANN model is trained to fit the engine parameters given the flight condition inputs. Then, for each flight, the following error indexes are evaluated from the residuals of all parameters over the entire flight:
• MSE, Mean Square Error;
• Std, Standard Deviation of the error;
• Ave, Average error.
The ANN used is a 3-layer Multi Layer Perceptron (MLP) network with 5 input neurons and 21 neurons in both the hidden layer and the output layer. The ANN inputs and outputs are those described in Section 3.3.1 and reported in Tables 3.5 and 3.6, respectively. The ANN training step is performed using a set of faultless turbofan simulations.
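The three per-flight error indexes listed above can be computed as in the following minimal sketch. It assumes that the residuals of all parameters and all samples of a flight are pooled into a single set and that the population form of the standard deviation is used; both choices, as well as the function name `flight_features` and the toy numbers, are assumptions made for illustration, since the exact definitions are not given in the text.

```python
import math


def flight_features(measured, predicted):
    """Return the MSE, Std and Ave of the pooled residuals of one flight.

    measured, predicted: lists of samples, each sample a list of parameter
    values (21 values per sample in the setting described in the text).
    """
    residuals = [
        y - y_hat
        for row, row_hat in zip(measured, predicted)
        for y, y_hat in zip(row, row_hat)
    ]
    if not residuals:
        raise ValueError("no residuals to summarize")
    n = len(residuals)
    ave = sum(residuals) / n
    mse = sum(r * r for r in residuals) / n
    std = math.sqrt(sum((r - ave) ** 2 for r in residuals) / n)
    return {"MSE": mse, "Std": std, "Ave": ave}


# Toy flight with two samples and two parameters (assumed values).
measured = [[1.0, 2.0], [3.0, 4.0]]
predicted = [[0.0, 2.0], [3.0, 3.0]]
print(flight_features(measured, predicted))
```

With these definitions the three indexes are linked by MSE = Std² + Ave², so a degradation that biases the residuals and one that only increases their spread can be told apart.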
Then a clustering technique is applied in order to reduce the number of samples: within each cluster the number of samples is reduced, and the ANN is trained with the resulting dataset. Once the ANN is trained, the procedure that calculates the features is applied. It is summarized in the following algorithm. For each flight:

1. collect the data samples of the current flight simulation;
2. simulate the behaviour of the turbofan parameters through the ANN;
3. compute the residuals of the ANN parameter identification;
4. compute the features of the current flight as the MSE, Std and Ave of the residuals.

3.3.4 Implementation and Results

Data are collected from the C-MAPSS simulator of the PHM'08 data competition. The data refer to a simulated turbofan engine whose parameter values are generated from the flight conditions. Each flight is recorded at a sampling frequency of 1 Hz and consists of 7 flight conditions, repeated for every flight. An engine is simulated until its wear index reaches zero, which means that the turbofan has ended its remaining operational life. These simulations are repeated for several engines in different conditions: faultless cases, Fan fault, High Pressure Turbine (HPT) fault, High Pressure Compressor (HPC) fault and Low Pressure Turbine (LPT) fault. The HMM training step is performed using the dataset generated by the neural network in its own training step; this means that the training dataset is the same for both algorithms and that it is a faultless dataset. The HMM training steps are summarized below:

1. using the GMM algorithm, cluster the degradation curves previously extracted by the neural network, which represent the faultless operation of the turbofan;
2. evaluate the BIC index to choose the number of states $N$ of the HMM;
3. set the lower bound of the covariance matrix $\sigma_{min}$ to $10^{-16}$ and train the HMM with the data and the GMM.

Once the training is done, it is possible to run the prognosis on the fault data, as summarized in the following steps:

1. once a new data sample is obtained from the ANN, calculate the current health state by the Viterbi algorithm;
2. update the state transition matrix with the new observations (Eq. 3.8) and calculate the RUL with the Monte Carlo technique (Eq. 3.4).

Fig. 3.12 shows a sample of the features extracted by the ANN from flight data, which are used for the training step. These features are taken to train the HMM model. Fig. 3.13 shows the BIC index computed by Eq. (3.5); its optimum is the minimum value reached. In this case the number of HMM states is 30; in other words, there are 30 health states from the normal operating condition to the failure condition. Fig. 3.14 shows the data clustering of the training step by means of the GMM in the scatter plot of the dataset.

Figure 3.12: Faultless features extracted from ANN training data (MSE versus flights)

Once the ANN and the HMM are trained, the prognosis simulations can be run on the test datasets. In particular, a fault arises at an unknown flight in the simulated turbofan, in order to generate a test bench for the integrated procedure. The simulated faults are, for obvious reasons, not safety critical, and they decrease the turbofan life. This difference can be seen by comparing the number of flights of the faultless dataset in Fig. 3.12 with that of the test datasets in Figs. 3.15, 3.16, 3.17 and 3.18.
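The Monte Carlo RUL estimate of Eq. (3.4), used in the second prognosis step, can be sketched as follows. This is a minimal illustration rather than the implementation used in this work: the function names, the default number of samples and the toy three-state transition matrix are assumptions introduced here, and the failure state is assumed to be absorbing.

```python
import random


def simulate_rul(transition, current_state, failure_state, rng, max_steps=10_000):
    """Count the transitions of one random walk from current_state to failure_state."""
    states = range(len(transition))
    state, steps = current_state, 0
    # max_steps is a safety guard (an assumption) in case failure is never reached
    while state != failure_state and steps < max_steps:
        # draw the next health state from the row of the transition matrix
        state = rng.choices(states, weights=transition[state])[0]
        steps += 1
    return steps


def monte_carlo_rul(transition, current_state, failure_state, n_samples=1000, seed=0):
    """Average the simulated RUL_i values, as in Eq. (3.4)."""
    rng = random.Random(seed)
    samples = [
        simulate_rul(transition, current_state, failure_state, rng)
        for _ in range(n_samples)
    ]
    return sum(samples) / n_samples


if __name__ == "__main__":
    # Hypothetical left-to-right chain with three health states; state 2 is failure.
    toy_transition = [
        [0.9, 0.1, 0.0],
        [0.0, 0.8, 0.2],
        [0.0, 0.0, 1.0],
    ]
    print(monte_carlo_rul(toy_transition, current_state=0, failure_state=2))
```

For this toy chain the expected number of transitions from state 0 to failure is 1/0.1 + 1/0.2 = 15, so the printed estimate should fluctuate around that value.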
Figure 3.13: Bayesian information criterion of fan engine (BIC versus number of states)

Figure 3.14: Cluster of training data (second MSE dataset versus first MSE dataset)

The following simulations show how the developed procedure is able to track the true RUL even when perturbations occur. Fig. 3.15 shows the prognostic results of the turbofan life when a fault occurs in the fan. In particular, Fig. 3.15(a) shows the RUL tracking and Fig. 3.15(b) shows the health-state progress. The fault on the fan is visible at flight 60, where the RUL has a jump. The next simulation is performed in the case of a fault in the HPT. In this case the jump in the RUL estimate is more evident, as shown in Fig. 3.16(a). Again the state sequence, shown in Fig. 3.16(b), quickly tracks the degradation evolution when the fault occurs.

Figure 3.15: (a) Turbofan estimated RUL in presence of FAN fault, solid blue line is the estimated RUL, dashed red line is true RUL; (b) Turbofan health states sequence in presence of FAN fault, solid blue line is the health states sequence, dashed red line is the failure state

The last two simulations start with an initial error on the estimate, because the algorithm does not know the initial RUL value and cannot take into account the degradation induced by the fault. The prognosis of the turbofan affected by the HPC fault is shown in Fig. 3.17.
The RUL shown in Fig. 3.17(a) converges to the real RUL after the fault occurrence. The health-state sequence is reported in Fig. 3.17(b).

Figure 3.16: (a) Turbofan estimated RUL in presence of HPT fault, solid blue line is the estimated RUL, dashed red line is true RUL; (b) Turbofan health states sequence in presence of HPT fault, solid blue line is the health states sequence, dashed red line is the failure state

The last simulation involves the turbofan with a fault on the LPT and is shown in Fig. 3.18. Fig. 3.18(a) shows the initial RUL overestimate and the correct tracking reached after the fault occurrence. At the same time, the health-state sequence shown in Fig. 3.18(b) follows the degradation of the turbofan engine.

Figure 3.17: (a) Turbofan estimated RUL in presence of HPC fault, solid blue line is the estimated RUL, dashed red line is true RUL; (b) Turbofan health states sequence in presence of HPC fault, solid blue line is the health states sequence, dashed red line is the failure state

These simulations highlight some important considerations. First, the initial error of the RUL derives from the difference between the working point of the new data and that of the trained model.
Second, once the fault occurs the RUL drops suddenly, which shows the robustness of the algorithm and its ability to correct the predicted RUL in presence of faults. Third, the RUL converges for all types of fault.

Figure 3.18: (a) Turbofan estimated RUL in presence of LPT fault, solid blue line is the estimated RUL, dashed red line is true RUL; (b) Turbofan health states sequence in presence of LPT fault, solid blue line is the health states sequence, dashed red line is the failure state

Chapter 4 Modeling and diagnosis of EEG signals in BCI Applications

4.1 Introduction

Brains are an aggregate of neurons that communicate with each other to achieve a common goal. Neurons are the elementary units, and both electrical and chemical messages constitute the base through which neurons communicate. Through these messages, neurons are able to achieve a coherent oscillatory activity, or neuronal synchronization, among large and sometimes distant populations, so neurons behave as coupled oscillators. The brain is assumed to be a classical example of a complex, self-organizing system. As such, it exhibits hallmarks of nonlinearity, multistability, and “nondiffusivity” (large coherent fluctuations). Brain oscillations as measured by electroencephalographic (EEG) recordings are usually classified as δ (1–4 Hz), θ (4–7 Hz), α (8–12 Hz), β (13–30 Hz), γ (30–100+ Hz) and µ (8–13 Hz) rhythms. These oscillations are produced by large ensembles of synchronized neuronal activity, and the resulting electrophysiological signals in the different frequency bands are associated with different functional states (e.g. sleep, wake, perception and attention). Computational studies adopt a variety of abstractions in order to deal with complex dynamical systems like the brain.
In this context, one mathematical model of brain oscillations is the so-called Kuramoto model of coupled phase oscillators [92]. The Kuramoto model, which is a nonlinear, non-stationary and networked dynamical system, posits that the activity of a local system (neuron/neural column/cortical area) can be sufficiently represented by its circular phase alone. Moreover, this model is able to reproduce synchronization phenomena in coupled systems such as the brain.

Brain-Computer Interfaces (BCIs) are devices which translate the brain activity of the user into specific signals, which may be used for communicating or controlling external devices [4, 5] without the use of peripheral nerves and muscles [6]. BCIs represent an interesting option for people affected by neuromuscular disorders but whose brain activity is normal, such as patients affected by Amyotrophic Lateral Sclerosis (ALS). Three different types of stimuli are commonly adopted to drive a BCI: visual, tactile and auditory stimuli. Visual stimuli were the first to be studied, and typically lead to the best classification results [93, 94]. Visual stimuli, however, cannot be used when the user's sight has been compromised (e.g. limited horizontal eye movement, inability to focus the gaze, etc.), which is the most critical problem faced by both visual BCI and non-BCI systems (such as Eye Gaze systems [95]). In these cases tactile and auditory stimuli can be adopted instead. Tactile BCI proved to be a good choice for navigation purposes [96, 97, 98], but only recently has it been used as a communication device [99]. Visual BCI systems have been intensively researched in the literature; however, they cannot be adopted by users suffering from visual impairments. Auditory BCI systems represent a valid alternative, even if they yield lower classification scores.
This chapter proposes an auditory BCI paradigm for systems based on P300 signals which are generated by auditory stimuli characterized by different sound typologies and locations. This paradigm is able to improve the classification scores with respect to the classical auditory BCI systems. A Head Related Transfer Function approach is chosen to virtualize auditory stimuli. When virtualized audio is used, the user has to focus the attention on both the type and the location of the stimulus, thus generating P300 signals whose amplitude is higher than that generated without audio virtualization. The auditory algorithm processes the Electroencephalography (EEG) signals in order to model, by data-driven algorithms, patterns that describe the user intentions, in particular whether or not the user is focusing attention on an auditory stimulus. These patterns can be used as typical features in order to diagnose the user intentions. Supervised classification is performed by Support Vector Machines, in which Gaussian radial basis functions are used as kernel functions, in order to diagnose the user attention on a specific stimulus. The system has been validated with 14 users, who were asked to choose one among five common spoken words, previously virtualized and transmitted to stereophonic headphones. Classification results prove that the proposed auditory BCI system performed similarly to common visual BCI P300 systems, thus representing an alternative to visual BCI for users with visual impairments. This chapter is based on the problem and the results presented in [100].

4.2 Auditory BCI

Different typologies of EEG signals have been used in the literature for developing auditory BCIs (e.g. cortical potentials, sensorimotor rhythm, steady state evoked potential and P300), and auditory BCIs represent at the moment the most suitable alternative to visual BCI. Slow Cortical Potential (SCP) signals were studied in [101].
The users involved in the experiment received either visual, auditory or combined visual/auditory feedback of their SCPs. Results showed that even if the visual feedback led to the highest number of correct answers, auditory stimuli could be used as well. In [102], instead, the authors adopted an auditory BCI driven by the Sensory Motor Rhythm (SMR) signal. Experimental results showed that auditory stimuli led to final results similar to those of visual stimuli, even if in the first case the training time was longer. A different approach to the auditory paradigm exploits the Steady-State Auditory Evoked Potentials (SSAEP). These are elicited by click trains or amplitude/frequency modulated tones. A steady-state response is represented by a significant amount of power at the modulation amplitude/frequency of a stimulus [103].

Many of the auditory BCIs available in the literature, however, are based on the P300 component of the Event Related Potential (ERP). In [104] P300 responses to two simultaneous auditory stimulus streams were classified. The users had to choose one of the two streams and focus their attention by counting the target stimulus. The outcome of the experiment was that a user could possibly direct his/her attention using auditory stimuli only. In [105] a four-choice BCI was tested with both healthy users and patients affected by ALS. The users were presented with auditory and visual stimuli and had to choose the words “yes” or “no” among “yes”, “no”, “pass” and “end”, according to a classic Oddball Paradigm (OP). The results showed that a target probability of 25% was enough to elicit a reliable P300 signal both in healthy users and in ALS patients. A P300 speller driven by auditory stimuli was first presented in [106]. The authors created a 5 × 5 letter matrix similar to that adopted in common visual P300 BCI spellers (see [107]). Column and row flashes were replaced with auditory stimuli that were coded to particular columns and rows in the matrix (i.e. the spoken number of the column and row). Even if the presence of a visual support matrix was still needed, more than half of the users were able to focus their attention so that the auditory stimulus could be correctly detected and classified, although with an average accuracy and bit rate lower than those achievable through visual BCIs. Similar results were obtained in [108], where the authors extended the letter matrix to 36 characters and added visual cues early in the training phase. A larger number of choices did not compromise the classification performance, while the addition of visual cues allowed for a better accuracy during the online phase.

The above mentioned articles prove that auditory BCI is a possible alternative to visual BCI, although at the cost of lower classification scores and average bit rates. An alternative method to improve performance has been presented in [109], where the authors adopted spatial auditory stimuli. Users had to sit in the middle of a room surrounded by five speakers with a 45° angle between them. Each speaker was given a unique complex audio stimulus, so that the discriminating cue was both the physical property and the spatial location of the stimulus. The results showed an increase in the classification score with respect to the case where only a single speaker was adopted. Moreover, by increasing the number of runs (the number of times the audio stimuli were repeated) it was also possible to achieve results similar to those of visual BCIs, although with a negative impact on the bit rate. The main drawback of the proposed solution was that the user had to stay still in the middle of a room surrounded by speakers. The present paper tries to overcome this obstacle by using a single stereo headphone where audio stimuli are virtualized. Sound virtualization has already been studied in [110] to show that spatial location can be a determining cue for BCI applications.
The auditory paradigm aims to give the user the opportunity to choose one of five different audio stimuli, while retaining the possibility for the user to be moved within the home environment. Moreover, the audio stimuli presented to the users are simple words referring to common daily life activities, rather than audio tones set at specific frequencies [101, 104, 109, 110], numbers [106], or instrumental sounds [102, 108]. It is the authors' claim that the use of words of the common language in auditory BCIs can lead to a straightforward communication paradigm, while reducing the training time needed to use the BCI correctly.

4.3 Spatial Audio

Given an audio source in a room, the human ear can perceive mainly two pieces of information: the sound and the position of the source. In an anechoic chamber, when the source is in front of the listener, the human auditory system can recognize variations of the sound source direction of about 1° on the horizontal plane. When the source is behind or beside the listener, the sensitivity decreases significantly, to about 10°. On the vertical plane there are no differences between sources in front of and behind the listener, and also in this case the resolution is of the order of 10° [111, 112, 113]. One of the most used techniques to obtain spatial audio is binaural recording: the aim is to get a very realistic recording of a sound event, which takes place in a real environment, through a single pair of microphones placed at the ears of an artificial head. Spatial audio is needed here so that it can later be used as an auditory stimulus fed directly into the user's headphones: binaural recording thus represents a natural approach to obtain highly realistic sound images. In this context the Head Related Transfer Function (HRTF) assumes great importance. The HRTF is an impulse response that describes how a sound coming from a well-defined direction is perceived by the human ear.
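Since the HRTF is treated here as an impulse response, spatializing a mono signal amounts to one convolution per ear. The following is a minimal sketch of that operation, assuming the two impulse responses are available as plain lists of samples at the same sampling rate as the signal; the function names `convolve` and `spatialize` and the toy values are illustrative assumptions, not part of the system described in this chapter.

```python
def convolve(signal, impulse_response):
    """Full discrete linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            # each input sample contributes a scaled, shifted copy of the response
            out[i + j] += s * h
    return out


def spatialize(mono, hrir_left, hrir_right):
    """Return the (left, right) headphone signals for one sound direction."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


if __name__ == "__main__":
    # Toy values only: a real impulse response has hundreds of taps.
    left, right = spatialize([1.0, 0.5, 0.25], [0.8, 0.1], [0.0, 0.6, 0.2])
    print(left)
    print(right)
```

In practice an FFT-based convolution would be preferred for long signals; the direct form is kept here only because it makes the operation explicit.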
With a set of two HRTFs, one for each ear, any direction of sound source propagation can be synthesized (Fig. 4.1(b)).

Figure 4.1: Spatial hearing. (a) The five audio stimuli directions, played by headphones, with an offset of 45°. (b) Schematic of left and right HRTF relative to a sound source coming from a well-defined direction α.

Therefore, given the left and right HRTFs relative to a desired sound direction $\alpha$, a mono signal $s$ becomes directive through the operation of convolution:

\[ out_R = s * HRTF_R(\alpha) \tag{4.1} \]
\[ out_L = s * HRTF_L(\alpha) \tag{4.2} \]

Databases of HRTFs for several sound directions in an anechoic environment can be found in the literature: the one used here has been realized by the MIT Media Lab [114]. Five audio signals, namely the words “bathroom”, “bedroom”, “kitchen”, “help” and “stop”, have been virtualized through the use of ten different HRTFs, i.e. five different sound directions per ear with an offset of 45° (Fig. 4.1(a)).

4.4 Testing Methodologies

4.4.1 Participants

Fourteen healthy subjects (10 males, 4 females, mean age 25.4, standard deviation ± 2.85, range 22–33) participated in the study. All subjects were volunteering group members and had some previous experience with visual BCI, mainly based on imagined movement and P300 tasks. None had previous experience with auditory BCI. The lack of experience is not a major issue: the proposed BCI system, based on auditory stimuli represented by common spoken words, is simpler to use than auditory BCI systems in which stimuli are represented by tones or instrumental sounds, and thus requires only a short training phase.

4.4.2 Data Acquisition

The EEG was recorded monopolarly using an electrode cap with 8 active high-purity gold (Au) electrodes (g.tec medical engineering GmbH) following the American Electroencephalographic Society modified version of the 10-20 system [115].
The electrodes are located at positions Fz, Cz, PO7, P3, Pz, P4, PO8, and Oz (see Fig. 4.2). Channels are referenced to the left earlobe and grounded to the left mastoid. Signals were acquired and amplified using a g.MobiLab+ (g.tec medical engineering GmbH, Germany). Data collection and stimulus presentation were controlled by the BCI2000 software package [116].

Figure 4.2: Electrode set for recording and analysis. Eight data channels are according to the International 10-20 electrodes system; the reference and ground electrodes are selected as the left earlobe and the left mastoid, respectively.

4.4.3 EEG Signals Modeling

Prior to the recording periods, participants were asked to minimize eye movements and muscle contractions during the experiment. Each participant was equipped with stereophonic headphones, and was requested to repeatedly fulfill the following auditory task: listen to a sequence of five words and focus his/her attention when the target word was played (i.e. mentally counting how many times the target word was heard). Each run contained 1 target word and 4 non-target words: both the sequence of the five words and their spatial orientation were randomly chosen. The users were not requested to consciously identify the spatial orientation of the word; however, this association is made unconsciously by the users, thus increasing the P300 activity as already shown in [109]. Each run was repeated 150 times, for a total of 750 audio stimuli, of which 150 were target stimuli and 600 non-target stimuli. A proportion of one target in every five stimuli has been shown to be rare enough to produce a P300 response [105]. A stimulus duration of 1500 ms and an Inter Stimulus Interval (ISI) of 250 ms were chosen.
The electrooculogram (EOG) was not recorded, so artifact rejection was not performed; artifact reduction was instead implemented using the following filters: a high pass filter at 0.1 Hz, a low pass filter at 30 Hz and a notch filter at 50 Hz. A Common Average Reference (CAR) spatial filter was applied to the temporally filtered signals [117]. Acquired signals were segmented into epochs of 800 ms starting at the onset of a stimulus. The data, which were originally sampled at a rate of 256 Hz, were decimated and moving-average filtered at a frequency of 20 Hz. This resulted in 150 target trials (i.e. number of audio stimuli listened to) and 600 non-target trials. A Support Vector Machine (SVM) was used for data classification [118, 119], with the following Gaussian radial basis function used as kernel function:

\[ \varphi(\| x_i - x_j \|) = e^{-a \| x_i - x_j \|}, \tag{4.3} \]

where $x_i$ and $x_j$ are the $i$-th and $j$-th data samples. The kernel function parameter $a$ is chosen as the value that maximizes the average between the target and non-target classification accuracy.

To increase sensitivity, the outcomes of multiple runs for the same task can be averaged. In this way, the influence of single trials is decreased and the selection score becomes more robust. One possibility is to average the raw trial time series for each task and classify the average as a single trial. Another option is to classify each original trial individually and average over the classifier scores, which implies the use of two or more iterations (i.e. the number of runs repeated before the classifier generates the output). This second approach was adopted, since it showed better performance.

Datasets from the BCI experiments contained four times more non-target stimuli than targets. Although the classification task is essentially binary, the chance level for classification is 80%, which could potentially be obtained by simply assigning all samples to the non-target group.
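The score-averaging approach adopted above can be sketched as follows. This is only an illustration under stated assumptions: the classifier scores are taken as already computed (one score per word and per iteration, larger meaning more target-like), and the function name `select_word` and the numeric scores are hypothetical.

```python
def select_word(scores_per_iteration):
    """Average the classifier scores over iterations and pick the best word.

    scores_per_iteration: list of iterations, each a list with one
    classifier score per candidate word (same word order in every iteration).
    Returns the index of the selected word and the averaged scores.
    """
    n_iterations = len(scores_per_iteration)
    n_words = len(scores_per_iteration[0])
    mean_scores = [
        sum(iteration[w] for iteration in scores_per_iteration) / n_iterations
        for w in range(n_words)
    ]
    # exactly one word is returned: the one with the largest averaged score
    best = max(range(n_words), key=mean_scores.__getitem__)
    return best, mean_scores


if __name__ == "__main__":
    # Hypothetical scores for five words over three iterations.
    scores = [
        [0.2, -0.5, 0.1, -0.3, -0.1],
        [-0.4, -0.2, 0.6, -0.1, -0.3],
        [0.1, -0.3, 0.3, -0.2, 0.0],
    ]
    print(select_word(scores[:1])[0])  # selection after a single iteration
    print(select_word(scores)[0])      # selection after three iterations
```

In this made-up example a single iteration favours word 0, while averaging three iterations favours word 2, which is the kind of single-trial influence that averaging is meant to reduce. The rule always returns exactly one word, but it says nothing about how often single epochs are classified correctly, which matters given the imbalance between non-target and target stimuli.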
Therefore, to evaluate the performance, different types of classification accuracy indexes are considered.

• The classification accuracy, which refers to the binary classification and is defined as the percentage of trials in which the stimulus, whether target or non-target, is correctly scored.
• The target accuracy, which is defined as the percentage of trials in which a target stimulus is correctly scored.
• The non-target accuracy, which is defined as the percentage of trials in which a non-target stimulus is correctly scored.
• The selection accuracy, which denotes the percentage of trials in which the BCI system returns the target action thought by the user.

The selection accuracy index is evaluated for all iterations, therefore it is the average of the classifier scores for each trial. In order to have a single target output from the BCI system, only the target with the largest classification output is chosen; thus multiple targets are not allowed and one target always exists.

4.4.4 Information Transfer Rate

The Information Transfer Rate (ITR) measures the amount of information carried by every selection and is a performance index for the evaluation of BCI systems. The ITR facilitates the performance comparison with other BCI applications, and it is calculated in bits per selection with the following formula [120]:

\[ B = \log_2 N + P \log_2 P + (1 - P) \log_2 \left( \frac{1 - P}{N - 1} \right), \tag{4.4} \]

where $N$ represents the number of classes (five in the present case study) and $P$ is the selection accuracy. The ITR in bits per minute is obtained by multiplying the bit rate $B$ by the classification speed $V$, that is the average number of selections per minute, as follows:

\[ ITR = B \cdot V. \tag{4.5} \]

Eqs. (4.4) and (4.5) show that even though the selection accuracy may increase when using two or more iterations, the ITR may stay the same or even decrease when $V$ decreases, that is to say when selection takes more time.
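A short numerical sketch of Eqs. (4.4) and (4.5) is given below. It is not the evaluation code used for the experiments: the function names are assumptions, and the time per selection is passed in as a parameter rather than derived from the stimulus timing. The limit cases $P = 0$ and $P = 1$ are handled with the usual convention $0 \log_2 0 = 0$.

```python
import math


def bits_per_selection(n_classes, accuracy):
    """Bit rate B of Eq. (4.4) for N classes and selection accuracy P."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    bits = math.log2(n_classes)
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_classes - 1))
    return bits


def itr_bits_per_minute(n_classes, accuracy, seconds_per_selection):
    """ITR of Eq. (4.5): B times the number of selections per minute V."""
    selections_per_minute = 60.0 / seconds_per_selection
    return bits_per_selection(n_classes, accuracy) * selections_per_minute


if __name__ == "__main__":
    # A perfect five-class selection carries log2(5) bits (about 2.32).
    print(bits_per_selection(5, 1.0))
    # Hypothetical timings: doubling the time per selection halves the ITR
    # unless the accuracy gain compensates for it.
    print(itr_bits_per_minute(5, 0.7, 10.0))
    print(itr_bits_per_minute(5, 0.8, 20.0))
```

With five classes an accuracy at chance level (P = 0.2) gives B = 0, so a gain in accuracy obtained through extra iterations pays off only if it outweighs the longer time needed for each selection.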
This is typically the case for the present auditory BCI, which requires audio stimuli of long duration based on words of common language rather than digital tones.

4.5 Results

4.5.1 Classification Performance

Table 4.1 gives the classification, target and non-target accuracy for the BCI experiment when the SVM is required to perform a classification within a single run. In this case only one subject reaches 70% of target accuracy, while the remaining subjects scored a target accuracy below the 70% limit, which is assumed to be the minimal limit for useful BCI operations [121]. Please note that a target accuracy lower than the non-target accuracy is considered normal: whenever the ratio between target and non-target words is small, the classifier tends to weight non-target words more than target ones. When using multiple iterations, instead, the score went up quickly for most of the subjects, as shown in Fig. 4.3, which summarizes the selection accuracy as a function of the iterations required by the SVM to perform classification, for users 1, 5, 6, 8, 9 and 13.

Table 4.1: Classification accuracy, target accuracy and non-target accuracy for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms) within a single run. Peak amplitude for the auditory condition is determined as the maximum amplitude in the range from 0 to 800 ms.

Participant   Classification accuracy (%)   Target accuracy (%)   Non-target accuracy (%)
1             76.8                          42.0                  85.5
2             78.8                          44.0                  86
3             84.0                          51.6                  90.7
4             82.9                          48.4                  90.0
5             84.0                          52.0                  90.4
6             78.0                          33.9                  86.8
7             86.2                          58.1                  92.0
8             82.5                          47.3                  89.5
9             78.9                          36.7                  87.3
10            87.3                          61.3                  92.7
11            82.5                          47.3                  89.5
12            80.2                          40.6                  88.1
13            90.6                          71.0                  94.7
14            85.1                          54.8                  91.3
Mean          82.5                          49.1                  89.6
SD            3.9                           9.7                   2.6
Figure 4.3: Selection scores, for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), plotted as a function of the number of iterations for the users 1, 5, 6, 8, 9 and 13.

The average number of iterations needed to reach the 70% selection score is 5, as shown in Fig. 4.4. The mean selection score for a single run is about 50%, as shown in Fig. 4.4 and Table 4.1. The participants reached the 80% selection score after ten iterations and 90% after fourteen iterations.

Figure 4.4: Mean selection accuracy, for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), plotted as a function of the number of iterations for fourteen participants.

Fig. 4.5 shows the boxplot of the selection score for all iterations. On each box, the central mark is the median, and the edges of the box are the lower and upper quartiles. When the lower quartile is considered, seven subjects are over the 70% selection accuracy. Considering the median, six subjects are over the 70% selection accuracy. If the upper quartile is considered, instead, eleven participants are over the selection score limit, and only subjects number 6 and 11 are below. Subject 6, with the minimum selection value, does not reach the 70% selection score. The maximum selection score achieved is 99% and the minimum is 32.3%. Selection scores are comparable to those achievable with visual and auditory P300 spellers [106].

4.5.2 ITR Performance

ITR performances are shown in Fig. 4.6 for six participants. When using multiple iterations, the ITR for most subjects went down quickly, as shown in Fig. 4.6.
This is a consequence of the reduction of the classification speed ($V$): since the ISI is 250 ms and the stimulus duration is 1500 ms, each additional iteration increases the classification time by 1750 ms multiplied by the number of trials within each run (i.e. 7.5 s). The worst ITR is 0.2 bits/min; the best result is 3.8 bits/min, which is achieved by the seventh participant. The average number of iterations needed to reach the 70% selection score is 5. At this iteration value, the ITR is 1.3 bits/min, as shown in Fig. 4.7. The mean ITR for one run is 2.4 bits/min, as shown in Fig. 4.7. When the participants reach the 80% selection score after ten iterations, the ITR is 1 bits/min. After fourteen iterations, which corresponds to 90% selection accuracy, the ITR is 0.9 bits/min. Fig. 4.8 shows the boxplot of the ITR for all iterations. The boxplot shows that seven subjects are, for all iterations, below 1 bits/min and seven subjects are above this value. The best ITR median is 1.8 bits/min, which is achieved by subject 13. Subject 6 shows the worst ITR performance.

Figure 4.5: Selection accuracy boxplot, for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), of all participants. Boxplot is evaluated with all iterations.

Figure 4.6: ITR for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), plotted as a function of the number of iterations for the subjects 1, 5, 6, 8, 9 and 13.

Figure 4.7: Mean ITR, for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), plotted as a function of the number of iterations for fourteen participants.

Figure 4.8: ITR boxplot for auditory stimuli (stimulus duration 1500 ms, ISI 250 ms), of all participants. Boxplot is evaluated for all iterations.

The ITR values are not high compared to visual and auditory P300 spellers [106]. ITR performance, as shown in Eqs. (4.4) and (4.5), depends on speed and selection accuracy. Speed is related to the stimulus duration and the ISI. In the present study, the time interval between the onset of one stimulus and the onset of the next is 1.75 s, which is much higher than in visual and auditory P300 speller based systems. This long time interval entails a lower ITR but a more natural way of communicating with the user, because the subject does not have to pay attention to different tones, timbres or pitches but to single words only. ITR results are comparable to those achievable by auditory P300 in BCI [108, 109].

Chapter 5 Concluding Remarks

In this dissertation, a contribution to complex system dynamics modeling and diagnosis is presented. With particular attention to real systems, different applications are discussed as case studies. This concluding chapter summarizes the results achieved by the solutions proposed in the previous chapters, and gives an insight into possible future work.

5.1 Modeling and Diagnosis of Electric Motor in a Quality Control Scenario

The first contribution of the dissertation is the development of two data-driven diagnostic modules which can be applied to detect faults and defects of electric motors. The first one is based on stator current signals. A FDD solution is proposed in order to model and diagnose the fault dynamics, which cannot be described by analytical equations.
The FDD procedure uses PCA in data pre-processing to reduce the current space to two dimensions. The PDF of the PCA-transformed signals is estimated by KDE. The PDFs are the models used to identify each fault and defect. Diagnosis is carried out using the K-L divergence, which measures the difference between two probability distributions; it is used as a distance measure between the classified statistical signatures obtained by KDE. The results show that the proposed data-driven diagnosis procedure is able to detect and diagnose different induction motor faults and defects.

The second one is based on vibration signals. An FDD solution is proposed in order to model and diagnose the fault dynamics, which cannot be described by analytical equations. The FDD procedure used is MSPCA, which guarantees robustness and reliability of the detection and diagnosis of defects. The identified signature is unique for each defect, as experiments on single-phase motors show. The laboratory test bench is composed of a high-performance programmable automation monitoring system. Nevertheless, after some analysis it was observed that, for the considered machines, vibration phenomena arise at low frequencies. A low-cost measurement system consisting of MEMS (Micro Electro-Mechanical Systems) sensors is therefore used.

Possible future work on the FDD solutions for rotating electric machines should mainly focus on improving fault diagnosis. In the case of vibration signals, this could be achieved using the Wavelet Packet Transform (WPT), which is an extension of the classical wavelet analysis applied in MSPCA, together with a filter that chooses the detail and approximation scale matrices obtained by the WPT so as to maximize the separation of the classes related to the motor conditions. A possible solution is to use the Common Spatial Pattern (CSP) as the filter.
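The current-signal procedure summarized in this section (PCA projection to two dimensions, KDE of the projected samples, K-L divergence between the resulting PDFs) can be illustrated with a minimal sketch. It is not the implementation used in the dissertation: the function names, the synthetic three-phase data, the fixed bandwidth `h`, the evaluation grid and the choice of fitting the PCA on a reference (healthy) record are all assumptions made for illustration only.

```python
import numpy as np

def fit_pca(X, n_components=2):
    """Mean and leading principal directions of X (n_samples, n_features)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Project samples onto the retained principal directions."""
    return (X - mean) @ components.T

def kde_on_grid(Z, grid, h):
    """Gaussian KDE of 2-D samples Z, evaluated at grid points (M, 2)."""
    diff = grid[:, None, :] - Z[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    kernel = np.exp(-0.5 * sq_dist / h ** 2) / (2.0 * np.pi * h ** 2)
    return kernel.mean(axis=1)

def kl_divergence(p, q, eps=1e-12):
    """Discrete approximation of KL(p || q) for densities sampled on one grid."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def three_phase(n, amp_a, rng, noise=0.05):
    """Synthetic three-phase currents; amp_a != 1 mimics an unbalance (hypothetical fault)."""
    t = np.linspace(0.0, 10.0 * np.pi, n)
    shifts = np.array([0.0, 2.0 * np.pi / 3.0, 4.0 * np.pi / 3.0])
    X = np.cos(t[:, None] - shifts[None, :])
    X[:, 0] *= amp_a
    return X + noise * rng.standard_normal(X.shape)

rng = np.random.default_rng(0)
reference = three_phase(500, 1.0, rng)   # stored healthy signature
healthy = three_phase(500, 1.0, rng)     # new healthy record
faulty = three_phase(500, 1.5, rng)      # new record with unbalanced phase

# Assumption: one PCA basis, fitted on the reference, is reused for every record.
mean, comps = fit_pca(reference)
axis = np.linspace(-2.5, 2.5, 50)
gx, gy = np.meshgrid(axis, axis)
grid = np.column_stack([gx.ravel(), gy.ravel()])

def signature(X):
    """PDF signature of a record: KDE of its PCA scores on the common grid."""
    return kde_on_grid(project(X, mean, comps), grid, h=0.15)

p_ref = signature(reference)
kl_healthy = kl_divergence(signature(healthy), p_ref)
kl_faulty = kl_divergence(signature(faulty), p_ref)
print(f"KL(healthy || reference) = {kl_healthy:.4f}")
print(f"KL(faulty  || reference) = {kl_faulty:.4f}")
```

In this toy setting the record with the unbalanced phase is expected to give a clearly larger divergence from the healthy signature than a second healthy record, which is the property a nearest-signature diagnosis rule would rely on. A data-driven bandwidth selector would normally replace the fixed `h`.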
In the case of current signals, a possible future work is the extension of the algorithm to an on-line FDD procedure. This would remove one of the major drawbacks of the algorithm, namely batch data processing: several current samples must be acquired before the fault diagnosis procedure can be run.

5.2 Modeling of Complex Systems with FDI and Prognosis Applications

The second contribution of the dissertation is the development of two solutions for modeling and diagnosing two different complex systems: a paper mill plant and a turbofan engine. These solutions are applied to monitor these complex systems in FDI and prognostic contexts. Since these complex systems consist of several coupled nonlinear systems, they cannot be modeled by a single model for all operating conditions. For this reason the proposed solutions suffer from changes in the operating conditions: in this case the model, obtained by data-driven procedures, might not be reliable for FDI and prognosis applications. Possible future work on complex dynamics modeling should mainly focus on improving the FDI and prognosis solutions in order to obtain models that are more robust to changes in operating conditions. Furthermore, since complex systems have a nonlinear behaviour, it seems natural to extend these algorithms with kernel models.

5.3 Modeling and diagnosis of EEG signals in BCI Applications

The last contribution of the dissertation is the development of an auditory BCI paradigm for systems based on P300 signals, which are generated by auditory stimuli characterized by different sound typologies and locations. In order to achieve this objective, EEG signals are modeled by data-driven approaches, since it would not have been possible through analytical models. The main contribution has been the development of an auditory BCI paradigm which diagnoses the user's intention through auditory stimuli consisting of common spoken words.
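The performance of this paradigm was reported in Section 4.5 through the ITR, which couples selection accuracy and speed. The sketch below shows that coupling using the widely used definition of Wolpaw et al.; it is assumed here, not confirmed, that Eqs. (4.4) and (4.5) have this form. The number of classes (`N_CLASSES = 5`) is likewise an assumption made for illustration, while the 7.5 s per iteration and the accuracy levels are the values quoted in Section 4.5.

```python
import math

def bits_per_selection(n_classes, accuracy):
    """Information carried by one selection, in bits (Wolpaw-style definition)."""
    if n_classes < 2 or not 0.0 < accuracy <= 1.0:
        raise ValueError("need n_classes >= 2 and 0 < accuracy <= 1")
    bits = math.log2(n_classes)
    if accuracy < 1.0:
        # Penalty terms for errors spread uniformly over the wrong classes.
        bits += accuracy * math.log2(accuracy)
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_classes - 1))
    return bits

def itr_bits_per_min(n_classes, accuracy, seconds_per_selection):
    """ITR = bits per selection times selections per minute."""
    selections_per_min = 60.0 / seconds_per_selection
    return bits_per_selection(n_classes, accuracy) * selections_per_min

N_CLASSES = 5                # assumption: number of selectable stimuli
SECONDS_PER_ITERATION = 7.5  # time added by each iteration (Section 4.5)

# (iterations, selection accuracy) pairs quoted in Section 4.5
for iterations, accuracy in [(5, 0.70), (10, 0.80), (14, 0.90)]:
    duration = iterations * SECONDS_PER_ITERATION
    itr = itr_bits_per_min(N_CLASSES, accuracy, duration)
    print(f"{iterations:2d} iterations, accuracy {accuracy:.0%}: {itr:.2f} bits/min")
```

Under these assumptions the sketch gives roughly 1.3, 1.0 and 0.9 bits/min. It also makes the trade-off explicit: each extra iteration raises accuracy but lengthens the selection time, so the ITR can fall even as accuracy improves.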
Possible future work should mainly focus on improving the diagnosis of user intentions. A possible solution could be the development of a hybrid modeling solution for EEG signals based on data-driven approaches and the Kuramoto model, since this model is able to reproduce synchronization phenomena, which could be linked to the P300 signal, as recent findings indicate.

Bibliography

[1] A. Zecevic and D. Siljak, Control of Complex Systems: Structural Constraints and Uncertainty, ser. Communications and Control Engineering. Springer, 2010.
[2] A. Giantomassi, Modeling estimation and identification of complex system dynamics. LAP Lambert Academic Publishing, October 2012.
[3] C. Aldrich and L. Auret, Unsupervised Process Monitoring and Fault Diagnosis with Machine Learning Methods, ser. Advances in Computer Vision and Pattern Recognition. Springer, 2013.
[4] N. Birbaumer and L. Cohen, “Brain-computer interfaces: communication and restoration of movement in paralysis,” The Journal of physiology, vol. 579, no. 3, pp. 621–636, 2007.
[5] D. J. McFarland and J. R. Wolpaw, “Brain-computer interfaces for communication and control,” Communications of the ACM, vol. 54, no. 5, pp. 60–66, 2011.
[6] J. Wolpaw, N. Birbaumer, D. McFarland, G. Pfurtscheller, and T. Vaughan, “Brain-computer interfaces for communication and control,” Clinical Neurophysiology, vol. 113, no. 6, 2002.
[7] W. Thomson and M. Fenger, “Current signature analysis to detect induction motor faults,” IEEE Ind. Appl. Mag., vol. 7, no. 4, pp. 26–34, 2001.
[8] F. Ferracuti, A. Giantomassi, S. Iarlori, G. Ippoliti, and S. Longhi, “Induction motor fault detection and diagnosis using kde and kullback-leibler divergence,” in Industrial Electronics Society, IECON 2013 - 39th Annual Conference of the IEEE, 2013, pp. 2923–2928.
[9] F. Ferracuti, A. Giantomassi, and S. Longhi, “Mspca with kde thresholding to support qc in electrical motors production line,” in Manufacturing Modelling, Management, and Control, vol.
7, no. 1, 2013, pp. 1542–1547.
[10] F. Ferracuti, A. Giantomassi, G. Ippoliti, and S. Longhi, “Multi-scale pca based fault diagnosis for rotating electrical machines,” in European Workshop on Advanced Control and Diagnosis, 8th ACD, Ferrara, Italy, 2010, pp. 296–301.
[11] M. Stuart, E. Mullins, and E. Drew, “Statistical quality control and improvement,” European Journal of Operational Research, vol. 88, pp. 203–214, 1995.
[12] K. Linderman, R. G. Schroeder, S. Zaheer, and A. S. Choo, “Six sigma: a goal-theoretic perspective,” Journal of Operations Management, vol. 21, no. 2, pp. 193–203, 2003.
[13] M. El Hachemi Benbouzid, “A review of induction motors signature analysis as a medium for faults detection,” IEEE Trans. Ind. Electron., vol. 47, no. 5, pp. 984–993, 2000.
[14] S. Nandi and H. Toliyat, “Condition monitoring and fault diagnosis of electrical machines-a review,” in Industry Applications Conference, 1999. Thirty-Fourth IAS Annual Meeting. Conference Record of the 1999 IEEE, vol. 1, 1999, pp. 197–204.
[15] N. Feki, G. Clerc, and P. Velex, “Gear and motor fault modeling and detection based on motor current analysis,” Electric Power Systems Research, vol. 95, no. 0, pp. 28–37, 2013.
[16] Z. I. Botev, J. F. Grotowski, and D. P. Kroese, “Kernel density estimation via diffusion,” Annals of Statistics, vol. 38, no. 5, pp. 2916–2957, 2010.
[17] M. P. Wand and M. C. Jones, “Multivariate plug-in bandwidth selection,” Computational Statistics, vol. 9, pp. 97–116, 1994.
[18] I. T. Jolliffe, Principal component analysis. Berlin: Springer, 2002.
[19] M. Manish, H. H. Yuea, S. J. Qin, and C. Lingb, “Multivariate process monitoring and fault diagnosis by multi-scale PCA,” Comput. Chem. Eng., vol. 26, pp. 1281–1293, 2002.
[20] E. Parzen, “On Estimation of a Probability Density Function and Mode,” The Annals of Mathematical Statistics, vol. 33, no. 3, pp. 1065–1076, 1962.
[21] M. P. Wand and M. C. Jones, Kernel Smoothing. Chapman and Hall/CRC, Dec. 1994.
[22] A. Mugdadi and I. A. Ahmad, “A bandwidth selection for kernel density estimation of functions of random variables,” Computational Statistics & Data Analysis, vol. 47, no. 1, pp. 49–62, 2004.
[23] D. Comaniciu, “An algorithm for data-driven bandwidth selection,” IEEE T. Pattern Anal., vol. 25, no. 2, pp. 281–288, 2003.
[24] S. J. Sheather, “Density estimation,” Statist. Sci, pp. 588–597, 2004.
[25] S. Kullback and R. A. Leibler, “On information and sufficiency,” Annals of Mathematical Statistics, vol. 22, pp. 49–86, 1951.
[26] J. Bangura, R. Povinelli, N. A. O. Demerdash, and R. Brown, “Diagnostics of eccentricities and bar/end-ring connector breakages in polyphase induction motors through a combination of time-series data mining and time-stepping coupled FE-state-space techniques,” IEEE Trans. Ind Appl., vol. 39, no. 4, pp. 1005–1013, 2003.
[27] E. Keogh, “UCR time series data mining archive,” 2013. [Online]. Available: http://www.cs.ucr.edu/~eamonn/iSAX/iSAX.html
[28] Y. Fan and G. Zheng, “Research of high-resolution vibration signal detection technique and application to mechanical fault diagnosis,” Mechanical Systems and Signal Processing, vol. 21, no. 2, pp. 678–687, 2007.
[29] B.-S. Yang and K. J. Kim, “Application of dempster-shafer theory in fault diagnosis of induction motors using vibration and current signals,” Mechanical Systems and Signal Processing, vol. 20, no. 2, pp. 403–420, 2006.
[30] F. Immovilli, A. Bellini, R. Rubini, and C. Tassoni, “Diagnosis of bearing faults in induction machines by vibration or current signals: A critical comparison,” Industry Applications, IEEE Transactions on, vol. 46, no. 4, pp. 1350–1359, July-Aug. 2010.
[31] V. T. Tran, B.-S. Yang, M.-S. Oh, and A. C. C. Tan, “Fault diagnosis of induction motor based on decision trees and adaptive neuro-fuzzy inference,” Expert Systems with Applications, vol. 36, no. 2, Part 1, pp. 1840–1849, 2009.
[32] N. Sawalhi and R.
Randall, “Simulating gear and bearing interactions in the presence of faults: Part I. the combined gear bearing dynamic model and the simulation of localised bearing faults,” Mechanical Systems and Signal Processing, vol. 22, no. 8, pp. 1924–1951, 2008.
[33] N. Sawalhi and R. Randall, “Simulating gear and bearing interactions in the presence of faults: Part II: Simulation of the vibrations produced by extended bearing faults,” Mechanical Systems and Signal Processing, vol. 22, no. 8, pp. 1952–1966, 2008.
[34] P. Rodriguez, A. Belahcen, and A. Arkkio, “Signatures of electrical faults in the force distribution and vibration pattern of induction motors,” Electric Power Applications, IEE Proceedings -, vol. 153, no. 4, pp. 523–529, July 2006.
[35] B. R. Bakshi, “Multiscale PCA with application to multivariate statistical process monitoring,” AIChE Journal, vol. 44, pp. 1596–1610, 1998.
[36] M. Manish, H. H. Yuea, S. J. Qin, and C. Lingb, “Multivariate process monitoring and fault diagnosis by multi-scale PCA,” Computers & Chemical Engineering, vol. 26, pp. 1281–1293, 2002.
[37] P. E. Odiowei and Y. Cao, “Nonlinear dynamic process monitoring using canonical variate analysis and kernel density estimations,” Industrial Informatics, IEEE Transactions on, vol. 6, no. 1, pp. 36–45, Feb. 2010.
[38] J. E. Jackson, A User’s Guide to Principal Components. New York: Wiley-Interscience, 2003.
[39] J. Jackson and G. Mudholkar, “Control procedures for residuals associated with principal component analysis,” Technometrics, vol. 21, pp. 341–349, 1979.
[40] R. Dunia and S. Qin, “Joint diagnosis of process and sensor faults using principal component analysis,” Control Engineering Practice, vol. 6, pp. 457–469, 1998.
[41] J. Yu, “Bearing performance degradation assessment using locality preserving projections,” Expert Systems with Applications, vol. 38, no. 6, pp. 7440–7450, 2011.
[42] J.
Yu, “Bearing performance degradation assessment using locality preserving projections and gaussian mixture models,” Mechanical Systems and Signal Processing, vol. 25, no. 7, pp. 2573–2588, 2011.
[43] J. Downs and E. Vogel, “A plant-wide industrial process control problem,” Computers & Chemical Engineering, vol. 17, no. 3, pp. 245–255, 1993.
[44] L. H. Chiang, E. L. Russell, and R. D. Braatz, Fault detection and diagnosis in industrial systems. Berlin: Springer, 2001.
[45] N. L. Ricker, “Decentralized control of the tennessee eastman challenge process,” Journal of Process Control, vol. 6, no. 4, pp. 205–221, 1996.
[46] “Tennessee eastman process model,” 2002. [Online]. Available: http://depts.washington.edu/control/LARRY/TE/download.html
[47] X. Li, S. Dong, and Z. Yuan, “Discrete wavelet transform for tool breakage monitoring,” Int. Journal of machine tool manufacture, vol. 99, pp. 1944–1955, 1999.
[48] I. Daubechies, “Orthonormal bases of compactly supported wavelets,” Communications on Pure and Applied Mathematics, vol. 41, pp. 909–996, 1988.
[49] S. Mallat, “A theory for multiresolution signal decomposition: the wavelet representation,” Pattern Analysis and Machine Intelligence, vol. 11, pp. 674–693, 1989.
[50] J. Antonino-Daviu, M. Riera-Guasp, J. Roger-Folch, F. Martinez-Gimenez, and A. Peris, “Application and optimization of the discrete wavelet transform for the detection of broken rotor bars in induction machines,” Applied and Computational Harmonic Analysis, vol. 21, pp. 268–279, 2006.
[51] K. A. Ho, K. J. Tvarlapat, M. J. Piovoso, and R. Hajare, “A method of robust multivariate outlier replacement,” Computer and Chemical Engineering, vol. 26, pp. 17–39, 2002.
[52] NI, “National instruments inc.” 2009. [Online]. Available: http://www.ni.com
[53] F. Ferracuti, A. Giantomassi, S. Longhi, and N.
Bergantino, “Multi-scale pca based fault diagnosis on a paper mill plant,” in Emerging Technologies Factory Automation (ETFA), 2011 IEEE 16th Conference on, 2011, pp. 1–8.
[54] A. Giantomassi, F. Ferracuti, A. Benini, S. Longhi, G. Ippoliti, and A. Petrucci, “Hidden markov model for health estimation and prognosis of turbofan engines,” in ASME 2011 International Design Engineering Technical Conferences & Computers and Information in Engineering Conference, 2011, pp. 1–6.
[55] H. Cheng, M. Nikus, and S. L. Jamsa-Jounela, “Evaluation of pca methods with improved fault isolation capabilities on a paper machine simulator,” Chemometrics and Intelligent Laboratory Systems, vol. 92, pp. 186–199, 2008.
[56] R. Isermann and P. Ballé, “Trends in the application of model-based fault detection and diagnosis of technical processes,” Control Eng. Pract., vol. 5, pp. 709–719, 1997.
[57] R. J. Patton, F. J. Uppal, and C. J. Lopez-Toribio, “Soft computing approaches to fault diagnosis for dynamic systems: a survey,” in IFAC Symposium on Fault Detection, Supervision and Safety for Technical Processes, Budapest, Hungary, 2000, pp. 298–311.
[58] V. Venkatasubramanian, R. Rengaswamy, and S. N. Kavuri, “A review of process fault detection and diagnosis part I: quantitative model-based methods,” Comput. Chem. Eng., vol. 27, pp. 293–311, 2000.
[59] V. Venkatasubramanian, R. Rengaswamy, and S. N. Kavuri, “A review of process fault detection and diagnosis part II: qualitative models and search strategies,” Comput. Chem. Eng., vol. 27, pp. 313–326, 2000.
[60] V. Venkatasubramanian, R. Rengaswamy, and S. N. Kavuri, “A review of process fault detection and diagnosis part III: process history based methods,” Comput. Chem. Eng., vol. 27, pp. 327–346, 2000.
[61] R. Isermann, Fault-Diagnosis Systems. Berlin: Springer-Verlag, 2006.
[62] L. H. Chiang, E. L. Russell, and R. D.
Braatz, “Fault diagnosis in chemical processes using Fisher discriminant analysis, discriminant partial least squares, and principal component analysis,” Chemometrics and Intelligent Laboratory Systems, vol. 50, pp. 243–252, 2000.
[63] A. Heng, S. Zhang, A. C. C. Tan, and J. Mathew, “Rotating machinery prognostics: State of the art, challenges and opportunities,” Mechanical Systems and Signal Processing, vol. 23, no. 3, pp. 724–739, April 2009.
[64] F. Camci and R. B. Chinnam, “Health-state estimation and prognostics in machinery processes,” IEEE Transactions on Automation Science and Engineering, vol. 7, no. 3, pp. 581–597, 2010.
[65] Y. Li, S. Billington, C. Zhang, T. Kurfess, S. Danyluk, and S. Liang, “Adaptive prognostics for rolling element bearing condition,” Mechanical Systems and Signal Processing, vol. 13, no. 1, pp. 103–113, January 1999.
[66] Y. Li, T. Kurfess, and S. Y. Liang, “Stochastic prognostics for rolling element bearings,” Mechanical Systems and Signal Processing, vol. 14, no. 5, pp. 747–762, September 2000.
[67] S. Goto, Y. Afachi, S. Katafuchi, T. Furue, Y. Uchida, M. Sueyoushi, H. Hatazaki, and M. Nakamura, “On-line deterioration prediction and residual life evaluation of rotating equipment based on vibration measurement,” in Proceedings of the SICE Conference, Japan, 2008, pp. 812–817.
[68] C. Ciandrini, M. Gallieri, A. Giantomassi, G. Ippoliti, and S. Longhi, “Fault detection and prognosis methods for a monitoring system of rotating electrical machines,” in Industrial Electronics (ISIE), 2010 IEEE International Symposium on, 2010, pp. 2085–2090.
[69] P. Wang and G. Vachtsevanos, “Fault prognostics using dynamic wavelet neural networks,” Artificial Intelligence for Engineering Design, Analysis and Manufacturing, vol. 15, no. 11, pp. 349–365, January 2002.
[70] W. Q. Wang, M. F. Golnaraghi, and F. Ismail, “Prognosis of machine health condition using neuro-fuzzy systems,” Mechanical Systems and Signal Processing, vol. 18, no.
4, pp. 813–831, July 2004.
[71] W. Wang, “An adaptive predictor for dynamic system forecasting,” Mechanical Systems and Signal Processing, vol. 21, no. 2, pp. 809–823, February 2007.
[72] Y. Shao and K. Nezu, “Prognosis of remaining bearing life using neural networks,” Proceedings of the institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering, vol. 214, no. 3, pp. 217–230, 2000.
[73] M. Orchard, B. Wu, and G. Vachtsevanos, “A particle filter framework for failure prognosis,” in Proceedings of the World Tribology Congress III, Washington, 2005.
[74] S. Zhang, L. Ma, Y. Sun, and J. Mathew, “Asset health reliability estimation based on condition data,” in Proceedings of the 2nd WCEAM and the 4th ICCM, Harrogate, UK, 2007, pp. 2195–2204.
[75] X. Zhang, R. Xu, C. Kwan, S. Y. Liang, Q. Xie, and L. Haynes, “An integrated approach to bearing fault diagnostics and prognostics,” in Proceedings of American Control Conference, Portland OR, USA, 2005, pp. 2750–2755.
[76] C. Kwan, X. Zhang, R. Xu, and L. Haynes, “A novel approach to fault diagnostics and prognostics,” in Proceedings of the IEEE International Conference on Robotics and Automation, Taipei, Taiwan, 2003, pp. 604–609.
[77] H. Ocak and K. A. Loparo, “A new bearing fault detection and diagnosis scheme based on hidden markov modeling of vibration signals,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing ICASSP ’01, Taipei, Taiwan, 2001, pp. 3141–3144.
[78] A. Saxena, K. Goebel, D. Simon, and N. Eklund, “Damage propagation modeling for aircraft engine run-to-failure simulation,” in International Conference on Prognostics and Health Management PHM’08, Denver CO, USA, 2008.
[79] I. Y. Tumer and A. Bajwa, “A survey of aircraft engine health monitoring systems,” in Joint Propulsion Conference, Los Angeles CA, USA, 1999.
[80] M. Kurosaki, T. Morioka, K. Ebina, M. Maruyama, T. Yasuda, and M.
Endoh, “Fault detection and identification in an IM270 gas turbine using measurements for engine control,” Journal of Engineering for Gas Turbines and Power, vol. 126, no. 4, pp. 726–732, October 2004.
[81] H. A. S. III and H. Brown, “Control of jet engines,” Control Engineering Practice, vol. 7, pp. 1043–1059, 1999.
[82] S. Vittal, P. Hajela, and A. Joshi, “Review of approaches to gas turbine life management,” in Proceeding of 10th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, New York, NY, USA, 2004.
[83] S. Chatterjee and J. Litt, “Online model parameter estimation of jet engine degradation for autonomous propulsion control,” in NASA, Technical Manual TM2003-212608, 2003.
[84] L. R. Rabiner, “A tutorial on hidden markov models and selected applications in speech recognition,” Proceedings of the IEEE, vol. 77, no. 2, pp. 257–286, February 1989.
[85] L. A. Liporace, “Maximum likelihood estimation for multivariate observations of markov sources,” IEEE Trans. Informat. Theory, vol. 28, no. 5, pp. 729–734, September 1982.
[86] B. H. Juang, S. E. Levinson, and M. M. Sondhi, “Maximum likelihood estimation for multivariate mixture observations of markov chains,” IEEE Trans. Informat. Theory, vol. 32, no. 2, pp. 307–309, March 1986.
[87] L. E. Baum and J. A. Egon, “An inequality with applications to statistical estimation for probabilistic functions of markov process and to a model for ecology,” Bull. Amer. Meteorol. Soc., vol. 73, pp. 360–363, 1967.
[88] A. J. Viterbi, “Error bounds for convolutional codes and an asymptotically optimal decoding algorithm,” IEEE Trans. Informat. Theory, vol. 13, no. 2, pp. 260–269, April 1967.
[89] L. E. Baum, “An inequality and associated maximization technique in statistical estimation for probabilistic functions of markov processes,” Inequalities, vol. 73, pp. 1–8, 1972.
[90] G. D. Forney, “The viterbi algorithm,” Proceedings of the IEEE, vol. 61, no. 3, pp. 268–278, March 1973.
[91] G.
Schwarz, “Estimating the dimension of a model,” Annals of Statistics, vol. 6, no. 2, pp. 461–464, March 1978.
[92] Y. Kuramoto, Chemical oscillations, waves, and turbulence, ser. Chemistry Series. Dover Publications, 2003, originally published: Springer Berlin, New York, Heidelberg, 1984.
[93] A. Lenhardt, M. Kaper, and H. Ritter, “An adaptive P300-based online brain-computer interface,” Neural Systems and Rehabilitation Engineering, IEEE Transactions on, vol. 16, no. 2, pp. 121–130, 2008.
[94] C. Guger, S. Daban, E. Sellers, C. Holzner, G. Krausz, R. Carabalona, F. Gramatica, and G. Edlinger, “How many people are able to control a P300-based brain-computer interface (BCI)?” Neuroscience letters, vol. 462, no. 1, pp. 94–98, 2009.
[95] A. Calvo, A. Chiò, E. Castellina, F. Corno, L. Farinetti, P. Ghiglione, V. Pasian, and A. Vignola, “Eye tracking impact on quality-of-life of als patients,” in Computers Helping People with Special Needs, ser. Lecture Notes in Computer Science, K. Miesenberger, J. Klaus, W. Zagler, and A. Karshmer, Eds. Springer Berlin Heidelberg, 2008, vol. 5105, pp. 70–77.
[96] J. Van Erp, “Presenting directions with a vibrotactile torso display,” Ergonomics, vol. 48, no. 3, pp. 302–313, 2005.
[97] M. Thurlings, J. Erp, A. Brouwer, and P. Werkhoven, “EEG-based navigation from a human factors perspective,” Brain-Computer Interfaces, pp. 71–86, 2010.
[98] A. Brouwer and J. Van Erp, “A tactile P300 brain-computer interface,” Frontiers in neuroscience, vol. 4, 2010.
[99] M. van der Waal, M. Severens, J. Geuze, and P. Desain, “Introducing the tactile speller: an ERP-based brain-computer interface for communication,” Journal of Neural Engineering, vol. 9, no. 4, 2012.
[100] F. Ferracuti, A. Freddi, S. Iarlori, S. Longhi, and P. Peretti, “Auditory paradigm for a p300 bci system using spatial hearing,” in Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on, 2013, pp. 871–876.
[101] T. Hinterberger, N. Neumann, M.
Pham, A. Kübler, A. Grether, N. Hofmayer, B. Wilhelm, H. Flor, and N. Birbaumer, “A multimodal brain-based feedback and communication system,” Experimental Brain Research, vol. 154, pp. 521–526, 2004.
[102] F. Nijboer, A. Furdea, I. Gunst, J. Mellinger, D. J. McFarland, N. Birbaumer, and A. Kübler, “An auditory brain-computer interface (BCI),” Journal of Neuroscience Methods, vol. 167, no. 1, pp. 43–50, 2008.
[103] D.-W. Kim, H.-J. Hwang, J.-H. Lim, Y.-H. Lee, K.-Y. Jung, and C.-H. Im, “Classification of selective attention to auditory stimuli: Toward vision-free brain-computer interfacing,” Journal of Neuroscience Methods, vol. 197, no. 1, pp. 180–185, 2011.
[104] N. Hill, T. Lal, K. Bierig, N. Birbaumer, and B. Schölkopf, “An auditory paradigm for brain–computer interfaces,” Advances in neural information processing systems, pp. 569–576, 2004.
[105] E. W. Sellers and E. Donchin, “A P300-based brain-computer interface: Initial tests by ALS patients,” Clinical Neurophysiology, vol. 117, no. 3, pp. 538–548, 2006.
[106] A. Furdea, S. Halder, D. Krusienski, D. Bross, F. Nijboer, N. Birbaumer, and A. Kübler, “An auditory oddball (P300) spelling system for brain-computer interfaces,” Psychophysiology, vol. 46, no. 3, pp. 617–625, 2009.
[107] L. Farwell and E. Donchin, “Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials,” Electroencephalography and Clinical Neurophysiology, vol. 70, no. 6, pp. 510–523, 1988.
[108] D. Klobassa, T. Vaughan, P. Brunner, N. Schwartz, J. Wolpaw, C. Neuper, and E. Sellers, “Toward a high-throughput auditory P300-based brain-computer interface,” Clinical neurophysiology: official journal of the International Federation of Clinical Neurophysiology, vol. 120, no. 7, p. 1252, 2009.
[109] M. Schreuder, B. Blankertz, and M. Tangermann, “A new auditory multiclass brain-computer interface paradigm: spatial hearing as an informative cue,” PLoS One, vol. 5, no. 4, p. e9813, 2010.
[110] R. Sonnadara, C.
Alain, L. Trainor et al., “Effects of spatial separation and stimulus probability on the event-related potentials elicited by occasional changes in sound location,” Brain research, vol. 1071, no. 1, pp. 175–185, 2006.
[111] W. M. Hartmann, “Localization of Sound in Rooms,” J. Acoust. Soc. Amer., vol. 74, no. 5, pp. 1380–1391, 1983.
[112] J. Blauert, Spatial Hearing - Revised Edition: The Psychophysics of Human Sound Localization. MIT Press, 1996.
[113] J. S. Bradley and G. A. Soulodre, “The Influence of Late Arriving Energy on Spatial Impression,” J. Acoust. Soc. Amer., vol. 97, no. 4, pp. 2263–2271, 1995.
[114] M. Gardner and K. Martin, “HRTF Measurements of a KEMAR Dummy-Head Microphone,” MIT Media Lab, Tech. Rep., 1994.
[115] F. Sharbrough, G. E. Chatrian, R. P. Lesser, H. Luders, M. Nuwer, and T. W. Picton, “American Electroencephalographic Society guidelines for standard electrode position nomenclature,” J. Clin. Neurophysiol., vol. 8, pp. 200–202, 1991.
[116] G. Schalk, D. McFarland, T. Hinterberger, N. Birbaumer, and J. Wolpaw, “BCI2000: a general-purpose brain-computer interface (BCI) system,” IEEE Transactions on Biomedical Engineering, vol. 51, no. 6, pp. 1034–1043, 2004.
[117] D. J. McFarland, L. M. McCane, S. V. David, and J. R. Wolpaw, “Spatial filter selection for EEG-based communication,” Electroencephalography and Clinical Neurophysiology, vol. 103, no. 3, pp. 386–394, 1997.
[118] V. N. Vapnik, Statistical learning theory, 1st ed. Wiley, Sep. 1998.
[119] V. Vapnik, “An overview of statistical learning theory,” Neural Networks, IEEE Transactions on, vol. 10, no. 5, pp. 988–999, Sep. 1999.
[120] J. Wolpaw, H. Ramoser, D. McFarland, and G. Pfurtscheller, “EEG-based communication: improved accuracy by response verification,” IEEE Transactions on Rehabilitation Engineering, vol. 6, no. 3, pp. 326–333, 1998.
[121] A. Kübler, N. Neumann, B. Wilhelm, T. Hinterberger, and N.
Birbaumer, “Predictability of brain-computer communication,” Journal of Psychophysiology, vol. 18, no. 2, pp. 121–129, 2004.