Application of deep learning in system health management
In 2016, with this wealth of research experience and skills, I decided to apply for the JSPS fellowship programme. I also cherished the opportunity to work at the University of Tokyo in Japan with world-leading experts in Artificial Intelligence. I won the two-year funding and started my research on deep learning applications in system health management. I am investigating various deep learning architectures, including autoencoders and recurrent neural networks, so that they can be used for fault detection and isolation.

As the aerospace industry continuously strives to improve its performance, operational pressures demand a reduction in the time required for diagnostic investigations. Here, there is value in having many data collection sources that can provide rich information (e.g. operating variables, environmental conditions, etc.) when a disruption occurs during operation. However, data sources are most often disparate. The ever-increasing volume of big data produced by modern systems, coupled with the complexity of the contextual components needed to correlate that information, can create barriers that were not anticipated by design engineers during the design phase of the system life-cycle. This eventually results in higher levels of uncertainty during the diagnosis process (Khan et al 2014b). In this context, novel approaches are required that can configure applications, as well as mechanisms for making better decisions at the system level. In the nominal operating environment, such problems warrant advanced capabilities to monitor in-service operations, record and share expert knowledge, and address critical aspects of on-board software.

To address this issue, diagnostic systems based on conventional techniques are being replaced by AI-based ones, which can increase the efficiency of the monitoring technology. AI-based approaches can be categorized into (1) knowledge-driven (knowledge-based) approaches, including expert systems and qualitative reasoning, and (2) data-driven approaches, including statistical process control (SPC), machine learning and neural networks, including deep learning. Figure 1 illustrates some of the AI approaches that have been used for system health monitoring applications over the years. One notable development is the application of deep learning. These architectures aim to model high-level representations of data and classify (predict) patterns by stacking multiple layers of information processing modules in hierarchical architectures. There are advantages to using them, but since this is an evolving research area, its applicability to system health management must be researched with an aim to increase overall system resilience or to deliver cost benefits for maintenance, repair, and overhaul activities.
Health management can be described as the process of diagnosing and preventing system failures, whilst predicting the reliability and remaining useful lifetime (RUL) of a system's components (Sin and Jun 2015). The aim is to collect (relevant) data from various sensor sources and carry out the necessary processing, including the extraction of key features, fault diagnosis and prediction. Based on this analysis, the system will be able to recommend further actions according to user requirements. This phase plays an important role in adding resilience to the overall solution and regulating availability during service operation. Finally, recommended actions will be issued, including fault alarms, alternatives to maintain availability and in-service feedback. Depending on the recommendation, the human operator will either choose to delay any action, if the failure can be tolerated until the next scheduled maintenance, or take immediate action, e.g. in the case of failures that can affect safety.
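To make this loop concrete, the following is a minimal sketch of one monitoring cycle: raw sensor samples are summarised into features, compared against a healthy baseline, and mapped onto the deferred or immediate maintenance decision described above. The function names, threshold and synthetic data are illustrative assumptions of mine, not an existing health-management API.

```python
import numpy as np

def extract_features(window):
    """Summarise a window of raw sensor samples into a few simple features."""
    return np.array([window.mean(), window.std(), window.max() - window.min()])

def diagnose(features, baseline, threshold=3.0):
    """Flag a fault when the features drift too far from the healthy baseline."""
    return np.linalg.norm(features - baseline) > threshold

def recommend(fault_detected, safety_critical):
    """Map the diagnosis onto the two courses of action described above."""
    if not fault_detected:
        return "continue operation"
    return "take immediate action" if safety_critical else "defer to next scheduled maintenance"

# One monitoring cycle over a simulated window of sensor data
window = np.random.normal(loc=0.0, scale=1.0, size=500)
baseline = np.array([0.0, 1.0, 6.0])   # feature values observed under healthy operation
features = extract_features(window)
print(recommend(diagnose(features, baseline), safety_critical=False))
```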
This process is illustrated in Figure 2. Some key aspects that can be noted include:

• Any recommended decision is only as good as the data that was collected to represent the current state of the system operation. There is always some uncertainty in the raw data collected by the sensor sources. This can be a challenge, as the implemented algorithms may not take such factors into account.

• False alarms have been identified as a major annoyance during maintenance activities (Khan 2014a). These are fault calls when no actual fault exists, or calls for a maintenance action when none was needed. System-level false alarms can send serviceable components for repair; or, if the result is questioned, the predefined system-level tests are repeated to gain confidence in the initial conclusions. Such difficulties arise from a lack of knowledge about the extent of degradation of a system's components whilst in operation.

• System models and related algorithms need to be updated from time to time in order to account for any unanticipated conditions. Typically, once a system model is developed it remains unchanged, therefore the ability to adapt models and algorithms according to in-service performance is important.

• Recording and storing acquired on-field knowledge for future application developments and improvements.

• There are often problems in collating meaningful information and analysing all the acquired knowledge to improve diagnosis and resilience. Data and knowledge fusion strategies have been under development to address this issue.

• There may be several problems with the acquired data, including: missing attributes (where several parameters may not have been measured during failure manifestation), improper data formats, corrupt data, bad sensors or even human operator errors; a small preprocessing sketch addressing such issues follows below.
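As a minimal illustration of the data-quality issues in the last bullet, the snippet below (synthetic data and hypothetical column names of my own choosing, not the project's codebase) shows typical clean-up steps: range-checking corrupt readings, interpolating missing attributes and dropping stuck sensor channels.

```python
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    "temperature": [71.2, np.nan, 70.9, 450.0, 71.1],   # one missing value, one corrupt reading
    "vibration":   [0.02, 0.02, 0.02, 0.02, 0.02],      # stuck (bad) sensor channel
    "pressure":    [101.3, 101.1, np.nan, 101.4, 101.2],
})

cleaned = raw.copy()
# Treat physically implausible readings as missing (simple range check)
cleaned.loc[cleaned["temperature"] > 200.0, "temperature"] = np.nan
# Fill missing attributes by interpolating along the time axis
cleaned = cleaned.interpolate(limit_direction="both")
# Drop channels with (near-)zero variance, i.e. likely stuck sensors
cleaned = cleaned.loc[:, cleaned.std() > 1e-9]
print(cleaned)
```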
Deep learning and its application
Many AI techniques have developed progressively over the past few decades, but some have become more popular in recent years, which is largely attributed to increases in computational power and big data. Deep learning is one of them; it is essentially a re-branding of neural networks, and is typically described as the application of neural networks to learning tasks using more than one hidden layer. The focus is to model high-level abstractions in data in order to determine high-level meaning, and this can be applied as supervised, partially supervised, or unsupervised learning. In theory, a neural network with more than two layers (i.e. more than just an input and an output layer) can be classified as a deep architecture; however, it is not just about the number of layers, but rather the idea of automatically constructing more complex features at every step. This means that stacking other algorithms (such as a random forest) several times, using probabilities instead of class labels, can be considered deep learning too. Back-propagation, which has existed for decades, theoretically allows a network with many layers to be trained. Prior to technological advances in computational power, researchers did not have widespread success training neural networks with more than two hidden layers, simply because of the many calculations required to adjust the weights in the network. Training also suffered from the problem of vanishing and exploding gradients. The network weights were typically initialized with random numbers, and training used the gradient of the network's error with respect to those weights.
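As a toy illustration of the vanishing-gradient problem mentioned above (my own sketch; the layer count, width and weight scale are arbitrary assumptions), the snippet below pushes a signal through a stack of randomly initialized sigmoid layers and then back-propagates an error signal, printing how the gradient magnitude shrinks layer by layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_layers, width = 10, 32
x = rng.normal(size=(1, width))
weights = [rng.normal(scale=0.5, size=(width, width)) for _ in range(n_layers)]

# Forward pass, keeping every layer's activations for the backward pass
activations = [x]
for W in weights:
    activations.append(sigmoid(activations[-1] @ W))

# Backward pass: propagate an error signal of ones back through the stack
grad = np.ones_like(activations[-1])
for W, a in zip(reversed(weights), reversed(activations[1:])):
    grad = (grad * a * (1.0 - a)) @ W.T   # chain rule: sigmoid derivative, then weights
    print(f"mean |gradient| after this layer: {np.abs(grad).mean():.2e}")
```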
How many layers? It was mentioned earlier that deep learning works because of the architecture of the network, but more importantly because of the optimization routine applied to that architecture. As can be noted in Figure 4, each hidden layer is connected to many other hidden layers within the overall network. When an optimization routine is applied to the network, each hidden layer can become an optimally weighted, non-linear combination of the layers below it. As the size of each sequential hidden layer decreases, each hidden layer becomes a lower-dimensional projection of the previous one. So, the information from each layer is summarized in each subsequent layer of the deep network as a non-linear, optimally weighted, lower-dimensional projection. Nonetheless, the training process can be a challenging and lengthy task when the network has many layers and multiple connections between layers and neurons; nowadays many researchers implement this training phase on graphics processing units to leverage the power of parallel processing and reduce training time. However, once trained, classifying information is straightforward and fast to complete.
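A minimal Keras sketch of such an architecture is shown below, where each successive hidden layer is a smaller, non-linear projection of the one before it. The layer sizes, the 64-dimensional feature input and the two-class healthy/faulty output are illustrative assumptions rather than a recommended design.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(64,)),                     # e.g. 64 sensor/condition features
    tf.keras.layers.Dense(32, activation="relu"),    # each hidden layer is a smaller,
    tf.keras.layers.Dense(16, activation="relu"),    # non-linear projection of the last
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),  # e.g. healthy vs. faulty
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# Training with model.fit can be run on GPUs; once trained, inference is fast.
```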
Several AI approaches have been used for health monitoring over the years, with interest from the aerospace industry and related applications. Many approaches that have emerged are difficult to implement exclusively within the software, mechanical or electrical domains, and hence cross-domain strategies must be devised. As the device fabrication process continues to improve, failure rates of hardware components have steadily declined over the years, to the point where non-hardware failures are emerging as the more dominant issue. Yet, with the increased scale of system designs, there is now more emphasis on reducing troubleshooting complexity and the time to fix problems when investigating failures with health monitoring systems. AI techniques have helped with some of these aspects, but efforts seem to concentrate on increasing fault detection at lower design levels. Of course, when detections occur closer to the actual fault event, isolation becomes possible. However, at the system level, decision making should be carried out based on a range of learning processes; hence health monitoring for high-value assets will improve downtime and cost implications. It is important to recognise that, despite the view which is prevalent among academic researchers, from a practical industrial viewpoint achieving efficient and effective implementations of any AI method for health monitoring applications is not a completely solved problem. There are therefore many sub-categories for exploration, relating both to designing algorithms for processing and to the computing architectures. This project considered emerging research relating to deep learning for system health management. These methods are known to overcome the vanishing gradient problem, which severely limited the depth of neural networks. It is as simple as that. Typically, architectures were trained using back-propagation with gradient descent, where the weights of each layer were updated as a function of the derivative of the previous layer. However, this approach had limitations as network architectures grew, and hence practitioners would often only use a single hidden layer. But now that larger networks can be implemented, a door of opportunities opens to techniques such as auto-encoders for unsupervised problems, convolutional neural networks for classification, recurrent neural networks for time series, and so on; an auto-encoder-based example is sketched below. Several opportunities exist for fundamental advancement in this area, and the associated design challenges are equally significant: i) justification of resource costs and implementation complexity, ii) gaining confidence in new strategies, iii) overcoming barriers to certification in certain application areas, and iv) understanding the options for integrating deep learning methods at various design levels.
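As a hedged sketch of one of these directions, the snippet below trains an auto-encoder on healthy sensor data only and uses large reconstruction error as an unsupervised fault indicator. The data are synthetic, and the network sizes and the 99th-percentile alarm threshold are assumptions for illustration, not results from this project.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(1)
healthy = rng.normal(0.0, 1.0, size=(2000, 20))      # data from healthy operation only
faulty = rng.normal(0.0, 1.0, size=(50, 20)) + 4.0   # shifted samples standing in for faults

autoencoder = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(8, activation="relu"),   # encoder
    tf.keras.layers.Dense(3, activation="relu"),   # low-dimensional bottleneck
    tf.keras.layers.Dense(8, activation="relu"),   # decoder
    tf.keras.layers.Dense(20),                     # reconstruction of the input
])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(healthy, healthy, epochs=10, batch_size=64, verbose=0)

def reconstruction_error(x):
    """Per-sample mean squared reconstruction error."""
    return np.mean((x - autoencoder.predict(x, verbose=0)) ** 2, axis=1)

# Threshold chosen on healthy data, trading off false alarms against missed faults
threshold = np.percentile(reconstruction_error(healthy), 99)
print("flagged:", int(np.sum(reconstruction_error(faulty) > threshold)), "of", len(faulty), "faulty samples")
```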
Although the effectiveness of the various approaches was not addressed in detail, this work presented some of the requirements and recent advances of the engineering community. It is difficult to assess whether deep learning will remain at the academic frontier in upcoming years, as academic research nowadays tends to have a short shelf-life and gets replaced by new ideas and trendy topics. But due to recent industrial efforts the machine learning field has moved quickly, and a few years from now it may look nothing like what is currently called "deep learning". That being said, there is unprecedented interest in the topic from a number of technology organizations, other academic disciplines, and even the general public. Despite the hype and how academics perceive it, deep learning seems quite valuable in the monetary sense. It has enabled an array of real-world commercial products and services that were not technologically feasible before, and hence it could prove useful for the system health management community.