Date of Completion


Embargo Period



Dr. Ugur Pasaogullari, Dr. Hongyi Xu, Dr. Sanjeev Damle

Field of Study

Advanced Manufacturing for Energy Systems (AMES)


Master of Science

Open Access



Gas turbines are expensive, revenue-generating machines, so there is strong interest in state-of-the-art maintenance techniques that keep them running at healthy performance levels. One way to monitor and evaluate gas turbine health is to train a machine learning model on historical run-to-failure sensor data to differentiate between healthy and unhealthy performance. The biggest barrier to building these models is the scarcity of run-to-failure data. Not only are these data expensive and time-consuming to acquire, but they are often withheld from public release for competitive reasons. This thesis uses a publicly available run-to-failure dataset, previously created with the C-MAPSS gas turbine engine simulation software, to train and test Support Vector Machine (SVM) models that classify the health of a gas turbine engine. In particular, it studies how the models perform when asked to make predictions on a new dataset containing faulty data, defined here as data that violate the laws of thermodynamics. Faulty data can realistically arise if sensors are not calibrated or if the wrong type of sensor is used to measure a given parameter. A novel physics-based filter is introduced that scans the dataset and removes all instances of violations of thermodynamic laws. The ability of the physics-based filter to identify and remove faulty data depends strongly on the signal processing steps, if any, taken before the filter is applied. The trained models are tested on new datasets in which the faulty data have either been left in or been partly or fully removed by the physics-based filter. Testing scenarios in which the filter is applied either before or after signal processing illustrates the importance of evaluating the quality of a dataset before using it for analysis.
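The idea of a physics-based filter, and its sensitivity to the order of processing, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the single rule (total temperature should rise across the high-pressure compressor, so the outlet reading `T30` should exceed the inlet reading `T24`), the column names, the sample values, and the use of a moving average as the signal processing step. The function names `physics_filter` and `moving_average` are hypothetical.

```python
# Illustrative sketch only: the rule, names, and values below are assumptions,
# not the filter or the data used in the thesis.

def violates_thermodynamics(row):
    """Return True if a reading breaks one of the assumed sanity checks."""
    # Absolute temperatures must be strictly positive.
    if row["T24"] <= 0 or row["T30"] <= 0:
        return True
    # Compression does work on the flow, so the compressor outlet
    # temperature (T30) should exceed the inlet temperature (T24).
    return row["T30"] <= row["T24"]

def physics_filter(rows):
    """Keep only the readings that pass every assumed check."""
    return [r for r in rows if not violates_thermodynamics(r)]

def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Hypothetical readings with one faulty T30 sample (600.0 is below T24).
t24 = 642.0
t30_raw = [1589.0, 1590.0, 600.0, 1588.0, 1591.0]

raw_rows = [{"T24": t24, "T30": t} for t in t30_raw]
smoothed_rows = [{"T24": t24, "T30": t} for t in moving_average(t30_raw)]

kept_raw = physics_filter(raw_rows)            # filter applied before smoothing
kept_smoothed = physics_filter(smoothed_rows)  # filter applied after smoothing

print(len(raw_rows) - len(kept_raw), "reading(s) removed when filtering first")
print(len(smoothed_rows) - len(kept_smoothed), "reading(s) removed when smoothing first")
```

In this toy case, filtering the raw readings removes the faulty sample, whereas smoothing first spreads the bad value across its neighbors so that no single reading violates the rule and nothing is removed. This is one plausible mechanism behind the order dependence described above, under the stated assumptions.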

Major Advisor

Dr. Ugur Pasaogullari