Search results
(1 - 19 of 19)
- Title
- EMBEDDED SYSTEM DESIGN FOR TRAFFIC SIGN RECOGNITION USING MACHINE LEARNING ALGORITHMS
- Creator
- Han, Yan
- Date
- 2016, 2016-12
- Description
- Traffic sign recognition (TSR), an important component of an intelligent vehicle system, has been an active research area and has been investigated vigorously over the last decade. It is an important step toward introducing intelligent vehicles into current road transportation systems. Based on image processing and machine learning technologies, TSR systems are being developed by many manufacturers and have been installed on vehicles as part of driving-assistance systems in recent years. Traffic signs are designed and placed so that they are easily identified against their surroundings by human eyes. Hence, an intelligent system that can identify these signs as well as a human does must address many challenges; here, "well" can be interpreted as both accurate and fast. Therefore, developing a reliable, real-time, and robust TSR system is the main motivation for this dissertation. Multiple TSR approaches based on computer vision and machine learning are introduced and implemented on different hardware platforms. The proposed TSR algorithms comprise two parts: sign detection based on color and shape analysis, and sign classification based on machine learning techniques including nearest neighbor search, support vector machines, and deep neural networks. Target hardware platforms include the Xilinx ZedBoard FPGA and the NVIDIA Jetson TX1, which provides GPU acceleration. Overall, on a well-known benchmark suite, 96% detection accuracy is achieved while executing at 1.6 frames per second on the GPU board.
Ph.D. in Computer Engineering, December 2016
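The abstract above describes a two-stage pipeline: color/shape-based detection followed by classification, one option being nearest neighbor search. A minimal sketch of both stages at toy scale; the thresholds, templates, and 2-D feature vectors are invented for illustration and are not from the dissertation:

```python
# Illustrative only: a crude red-pixel test standing in for color-based
# detection, and a 1-nearest-neighbor classifier on hand-made features.

def detect_red_pixel(pixel, r_min=150, gap=50):
    """Flag an (R, G, B) pixel as 'sign red' if red dominates both channels."""
    r, g, b = pixel
    return r >= r_min and (r - g) >= gap and (r - b) >= gap

def nearest_neighbor(feature, templates):
    """Classify a feature vector by Euclidean distance to labeled templates."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(templates, key=lambda t: dist(feature, t[0]))[1]

# Two invented class templates: (feature vector, label).
templates = [((1.0, 0.0), "stop"), ((0.0, 1.0), "yield")]
print(nearest_neighbor((0.9, 0.1), templates))  # closest to the "stop" template
```

A real system would extract features from detected sign regions rather than use hand-made vectors, and would likely prefer an SVM or deep network, as the abstract notes.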
- Title
- DEEP LEARNING FOR IMAGE PROCESSING WITH APPLICATIONS TO MEDICAL IMAGING
- Creator
- Zarshenas, Amin
- Date
- 2019
- Description
- Deep learning is a subfield of machine learning concerned with algorithms that learn hierarchical data representations, and it has proven extremely successful in many computer vision tasks, including object detection and recognition. In this thesis, we aim to design deep-learning models that better perform image processing, tackling three important problems: natural image denoising, computed tomography (CT) dose reduction, and bone suppression in chest radiography ("chest x-ray": CXR).

As the first contribution of this thesis, we address some of the most critical design questions for the task of natural image denoising. To this end, we define a class of deep learning models called neural network convolution (NNC) and investigate several design modules for NNC in image processing. Based on our analysis, we design a deep residual NNC (R-NNC) for this task. An important challenge in image denoising arises when images have varying noise levels: our analysis showed that a single R-NNC trained on images at multiple noise levels cannot handle very high noise levels, and sometimes blurs the high-frequency information in less noisy areas. To address this problem, we design and develop two new deep-learning structures, a noise-specific NNC (NS-NNC) and a DeepFloat model, for image denoising at varying noise levels. Our models achieved the highest denoising performance compared to state-of-the-art techniques.

As the second contribution, we tackle CT dose reduction by means of our NNC. Studies have shown that high CT doses can dramatically increase the risk of radiation-induced cancer in patients; it is therefore very important to reduce the radiation dose as much as possible. For this problem, we introduce a mixture of anatomy-specific (AS) NNC experts. The basic idea is to train multiple NNC models for anatomic segments with different characteristics and to merge their predictions based on the segmentations. Our phantom and clinical analyses showed that more than 90% dose reduction can be achieved with our AS NNC model.

Finally, we exploit our findings from image denoising and CT dose reduction to tackle the challenging task of bone suppression in CXRs. Most lung nodules missed by radiologists, as well as by computer-aided detection systems, overlap with bones in CXRs. Our purpose was to develop an imaging system that virtually separates ribs and clavicles from lung nodules and soft tissue in CXRs. To achieve this, we develop a mixture of anatomy-specific, orientation-frequency-specific (ASOFS) expert deep NNC models. While this model was able to decompose CXRs, to achieve even higher bone suppression performance we employ our deep R-NNC for the bone suppression application. Our model creates bone and soft-tissue images from single CXRs without requiring specialized equipment or increasing the radiation dose.
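The residual design mentioned above (R-NNC) can be illustrated in one dimension: the model predicts a correction that is added back to the noisy input rather than predicting the clean signal directly. In this hedged sketch a fixed 3-tap mean filter stands in for the learned convolution, so the "learning" is absent; only the residual wiring is shown:

```python
# Illustrative only: residual denoising where the correction branch is a
# hand-coded smoother, not a trained network.

def smooth(signal):
    """3-tap mean filter standing in for a learned convolution (edge-padded)."""
    padded = [signal[0]] + list(signal) + [signal[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3
            for i in range(len(signal))]

def residual_denoise(noisy):
    """Residual form: estimate = noisy input + predicted correction."""
    correction = [s - n for n, s in zip(noisy, smooth(noisy))]
    return [n + c for n, c in zip(noisy, correction)]
```

With a fixed smoother the residual path is mathematically redundant; in a trained network, learning the (typically small) correction instead of the full mapping is what eases optimization.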
- Title
- APPLICATION OF MACHINE LEARNING TO ELECTRICAL DATA ANALYSIS
- Creator
- Bao, Zhen
- Date
- 2017, 2017-05
- Description
- The dissertation is composed of four parts: modeling the demand-response capability of internet data centers processing batch computing jobs, cloud-storage-based power consumption management in internet data centers, identifying the hot socket problem in smart meters, and online event detection for non-intrusive load monitoring without known labels. Mathematical models are constructed for each of the four targets, and numerical examples are used to test the effectiveness of the models. The first two parts optimize jobs in data centers to find the best way of utilizing the existing computing and storage resources; mixed-integer programming (MIP) is used in the formulation. The purpose of the third part is to identify the hot socket problem in smart meters: machine learning methods are used to locate badly installed smart meters by analyzing their historical data. The fourth part addresses non-intrusive load monitoring for residential loads, where signal processing and deep learning methods are used to identify specific loads from high-frequency signals.
Ph.D. in Electrical Engineering, May 2017
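The flavor of the data-center scheduling problem in the first two parts can be conveyed with a toy: defer batch jobs to the cheapest electricity-price hours subject to a per-hour capacity limit. The dissertation formulates this as a mixed-integer program; the greedy stand-in below, with invented prices and jobs, only illustrates the objective, not the actual MIP model:

```python
# Illustrative only: greedy assignment of batch-job hours to cheap time slots.
# A real demand-response formulation would be a MIP with deadlines, power
# constraints, and job-level decision variables.

def schedule_jobs(job_hours, prices, capacity):
    """Assign each job's required hours to the cheapest remaining slots.

    job_hours: hours each job needs; prices: electricity price per hour slot;
    capacity: jobs that can run concurrently in one slot.
    """
    free = {h: capacity for h in range(len(prices))}
    order = sorted(range(len(prices)), key=lambda h: prices[h])  # cheapest first
    plan, cost = [], 0.0
    for need in job_hours:
        chosen = []
        for h in order:
            if len(chosen) == need:
                break
            if free[h] > 0:
                free[h] -= 1
                chosen.append(h)
                cost += prices[h]
        plan.append(sorted(chosen))
    return plan, cost

plan, cost = schedule_jobs([2, 1], [5.0, 1.0, 3.0, 2.0], capacity=1)
```

Greedy is optimal only in this unconstrained toy; coupling constraints (deadlines, contiguous execution, storage) are exactly what pushes the real problem into MIP territory.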
- Title
- ACTIVE INFERENCE FOR PREDICTIVE MODELS OF SPATIO-TEMPORAL DOMAINS
- Creator
- Komurlu, Caner
- Date
- 2019
- Description
- Active inference is the method of selectively gathering information during prediction in order to increase a predictive machine learning model's prediction performance. Unlike active learning, active inference does not update the model; rather, it provides the model with useful information during prediction to boost performance. To benefit from active inference, a predictive model needs to exploit correlations among the variables to be predicted: given true values for some of the variables, the model can then make more accurate predictions for the remaining ones.

In this dissertation, I propose active inference methods for predictive models of spatio-temporal domains. I formulate and investigate active inference in two different domains, tissue engineering and wireless sensor networks (WSNs), and develop active inference for dynamic Bayesian networks (DBNs) and feed-forward neural networks (FFNNs).

First, I explore the effect of active inference in the tissue engineering domain. I design a DBN model for vascularization of a tissue development site that predicts probabilities of blood vessel invasion at regional scale through time. Utilizing spatio-temporal correlations between regions, represented as variables in the DBN, I develop an active inference technique to detect the optimal time to stop a wet-lab experiment. The empirical study shows that active inference detects the optimal time, with results coherent with domain simulations and lab experiments.

In the second phase of my research, I develop variance-based active inference techniques for DBNs for the purpose of battery saving in WSNs. I propose the expected variance reduction active inference method to detect the variables that reduce the overall variance the most. I first propose a DBN model of a WSN, then compare its prediction performance with Gaussian processes and linear-chain graphical models on three WSN data sets using several baseline active inference methods. After showing that DBNs outperform the baseline predictive models, I compare the expected variance reduction method with the baseline methods on the DBN and show its superiority on the three WSN data sets.

Finally, to address inference complexity and the limitation to linear correlations imposed by the Gaussian assumption, I replace the DBN representation with a feed-forward neural network model. I first explore techniques to integrate observed values into predictions on neural networks, adopting the input optimization technique. I identify two problems, model error and optimization overfitting; I show that input optimization can mitigate the model error, and I propose a validation-based regularization approach to solve the overfitting problem.
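The variance-based selection idea above has a compact form for jointly Gaussian variables: observing variable i reduces Var(x_j) by Cov(i, j)^2 / Var(x_i), so a greedy scorer can pick the single variable whose observation removes the most total variance. This sketch illustrates that criterion on an invented covariance matrix; it is not the dissertation's DBN implementation:

```python
# Illustrative only: greedy expected-variance-reduction scoring under a
# joint Gaussian assumption, on a hand-made 3x3 covariance matrix.

def expected_variance_reduction(cov, i):
    """Total variance removed across all variables by observing variable i."""
    return sum(cov[i][j] ** 2 / cov[i][i] for j in range(len(cov)))

def best_query(cov):
    """Index of the variable whose observation reduces total variance most."""
    return max(range(len(cov)), key=lambda i: expected_variance_reduction(cov, i))

cov = [[4.0, 1.8, 1.6],
       [1.8, 1.0, 0.2],
       [1.6, 0.2, 1.0]]
print(best_query(cov))  # variable 0: high variance and strongly correlated
```

In a DBN the reductions would be computed by inference in the model at each time slice rather than from a static covariance matrix.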
- Title
- Fast mesh based reconstruction for cardiac-gated SPECT and methodology for medical image quality assessment
- Creator
- Massanes Basi, Francesc
- Date
- 2018
- Description
- This work studies two intricately connected subjects. The first considers tools to improve the quality of single photon emission computed tomography (SPECT) imaging. Currently, SPECT images help physicians evaluate perfusion levels within the myocardium, aid in the diagnosis of various types of carcinomas, and measure pulmonary function. The SPECT technique relies on injecting a radioactive material into the patient's body and then detecting the emitted radiation by means of a gamma camera. However, the amount of radioactive material that can be given to a patient is limited by the negative effects of radiation on the patient's health, which causes SPECT images to be highly corrupted by noise. We focus on cardiac SPECT, which adds the challenge of the heart's continuous motion during the acquisition process. First, we describe the methodology used in SPECT imaging and reconstruction. Our methodology uses a content-adaptive model, which places more samples in the regions of the body that we want reconstructed more accurately and fewer elsewhere. We then describe our algorithm and a novel implementation that lets us use the content-adaptive model to perform the reconstruction, and we show that this implementation outperforms the reconstruction method used in clinical applications.

The second subject evaluates tools for measuring image quality in the context of medical diagnosis. In signal processing, accuracy is typically measured as the similarity between an original signal and its reconstruction; this similarity is traditionally a numeric metric that does not take into account the intended purpose of the reconstructed images. In medical imaging, a reconstructed image is meant to help a physician perform a diagnostic task, so the quality of the reconstruction should be measured by how much it helps in performing that task. A model observer is a computer tool that aims to mimic the performance of a human observer, usually a radiologist, at a relevant diagnostic task. In this work we present a linear model observer designed to automatically select the features needed to model a human observer's response. This is a departure from the model observers currently used in the medical imaging field, which usually rely on ad hoc chosen features. Our model observer depends only on the resolution of the image, not on the type of imaging technique used to acquire it.
- Title
- Public Event Identification Traffic Data Using Machine Learning Approach
- Creator
- Yang, Hanyi
- Date
- 2020
- Description
- This study developed a shock-wave-diagram-based deep learning model (SW-DLM) to predict the occurrence of public events in real time according to their impacts on nearby highway traffic. Specifically, using point traffic volume data as a boundary condition, shock wave analysis is first conducted to understand the impacts and features of a public event at a nearby highway-ramp intersection. Next, the study develops the SWG algorithm to efficiently generate and expand shock wave diagrams in real time according to the data collection rate. Built upon that, the study contributes a novel approach that encodes a shock wave diagram with an optimal grid of pixels, balancing resolution against computational load. Using features extracted from the encoded time-series shock wave diagrams as inputs, a deep learning approach, the long short-term memory (LSTM) model, is applied to predict the occurrence of a public event. Numerical experiments on field data demonstrate that using encoded shock wave diagrams rather than point traffic data significantly improves the accuracy of deep learning in predicting the occurrence of a public event. The SW-DLM delivers satisfactory prediction performance on average as well as on individual days, with or without interference from traffic accidents near the venue of a public event. Implementing this approach in real-time traffic provision tools such as GPS would alert en-route travelers to ongoing events in a transportation network and help them make smart trip plans and avoid traffic congestion. Moreover, it promotes smart-city development by providing a strong capability to monitor the transportation system and conduct real-time traffic management intelligently.
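The encoding step the abstract describes, rasterizing a shock-wave trajectory (position over time) onto a fixed grid of pixels so it can feed a sequence model such as an LSTM, can be sketched as follows. The grid size and trajectory are invented; the study's actual optimal grid and SWG algorithm are not reproduced here:

```python
# Illustrative only: mark grid cells crossed by a (time, position) trajectory,
# yielding a binary image per time step suitable as LSTM input features.

def encode_diagram(trajectory, n_time=8, n_space=8, max_pos=1.0):
    """Rasterize a shock-wave front trajectory onto an n_time x n_space grid."""
    grid = [[0] * n_space for _ in range(n_time)]
    for t, pos in enumerate(trajectory[:n_time]):
        cell = min(int(pos / max_pos * n_space), n_space - 1)  # clamp to grid
        grid[t][cell] = 1
    return grid

# A front moving downstream over four time steps (positions in [0, 1]).
grid = encode_diagram([0.0, 0.2, 0.5, 0.9])
```

Each row of the grid would then be one time step's feature vector; a finer grid raises resolution at the cost of input size, the trade-off the study optimizes.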
- Title
- Reconfigurable High-Performance Computation and Communication Platform for Ultrasonic Applications
- Creator
- Wang, Boyang
- Date
- 2021
- Description
- In industrial and medical applications, ultrasonic signals are used in nondestructive testing (NDT), medical imaging, navigation, and communication. This study presents the architecture of high-performance computational systems designed for ultrasonic nondestructive testing, data compression using machine learning, and a multilayer perceptron neural network for ultrasonic flaw detection and grain size characterization. We researched and developed a real-time software-defined ultrasonic communication system for transmitting information through highly reverberant and dispersive solid channels; orthogonal frequency-division multiplexing is explored to combat the severe multipath effect in solid channels and achieve an optimal bit-rate solution. A reconfigurable, high-performance, low-cost, real-time ultrasonic data acquisition and signal processing platform is designed based on an all-programmable system-on-chip (APSoC). We designed unsupervised learning models using wavelet packet transformation, optimized by a convolutional autoencoder, for massive ultrasonic data compression; the proposed models achieve a compression accuracy of 98% using only 6% of the original data. For ultrasonic signal analysis in NDT applications, we utilized a multilayer perceptron neural network (MLPNN) to detect flaw echoes masked by strong microstructure scattering noise (about zero dB SNR or less) with detection accuracy above 99%. Characterizing materials by their ultrasonic scattering properties for grain size estimation and classification is also of high interest, and we designed an MLPNN that classifies the grain sizes of materials with 99% accuracy. Furthermore, a software-defined ultrasonic communication system based on the APSoC is designed for real-time data transmission through solid channels: transducers with a center frequency of 2.5 MHz transmit and receive information-bearing ultrasonic waves in solid channels where the communication bit rate can reach 1.5 Mbps.
- Title
- AUTOMATION OF ULTRASONIC FLAW DETECTION APPLICATIONS USING DEEP LEARNING ALGORITHMS
- Creator
- Virupakshappa, Kushal
- Date
- 2021
- Description
- The fourth industrial revolution (Industry 4.0) promises to integrate multiple technologies including, but not limited to, automation, cloud computing, robotics, and artificial intelligence. The nondestructive testing (NDT) industry has been shifting towards automation as well, and for ultrasound-based NDT these technological advancements enable smart systems hosting complex signal processing algorithms. This thesis therefore introduces the effective use of AI algorithms in challenging NDT scenarios. The first objective is to investigate and evaluate the performance of both supervised and unsupervised machine learning algorithms and to optimize them for ultrasonic flaw detection using amplitude-scan (A-scan) data. Several inference and optimization algorithms are evaluated, and it is observed that a proper choice of features for a specific inference algorithm leads to accurate flaw detection. The second objective is the hardware realization of the ultrasonic flaw detection algorithms on embedded systems: a support vector machine has been implemented on a Tegra K1 GPU platform, and supervised machine learning algorithms on a Zynq FPGA, for a comparative study. The third main objective is to introduce new deep learning architectures for more complex flaw detection applications, including classification of flaw types and robust detection of multiple flaws in B-scan data. The proposed deep learning pipeline combines a novel grid-based localization architecture with meta-learning, providing a generalized flaw detection solution in which additional flaw types can be used for inference without retraining or changing the architecture. Results show that the proposed algorithm performs well in more complex scenarios with high clutter noise, is comparable with traditional CNNs, and achieves the goal of generality and robustness.
- Title
- DEVELOPMENT OF BIOMARKERS OF SMALL VESSEL DISEASE IN AGING
- Creator
- Makkinejad, Nazanin
- Date
- 2021
- Description
- Age-related neuropathologies, including cerebrovascular and neurodegenerative diseases, play a critical role in cognitive dysfunction and the development of dementia, and methodologies for early prediction of these diseases are much needed. Since multiple pathologies commonly coexist in the brains of older adults, clinical diagnosis lacks the specificity to isolate the pathology of interest, and the gold standard is determined only at autopsy. Magnetic resonance imaging (MRI) provides a non-invasive tool to study the abnormalities in brain characteristics that are unique to each pathology. Ex-vivo MRI proves particularly useful because it eliminates two important biases of in-vivo MRI: no additional pathology develops between imaging and pathologic examination, and frail older adults are not excluded from MRI.

Hence, the aims of this dissertation were two-fold: to study brain correlates of age-related neuropathologies, and to develop and validate classifiers of small vessel diseases by combining ex-vivo MRI and pathology in a large community cohort of older adults. The project is structured as follows.

First, the association of amygdala volume and shape with transactive response DNA-binding protein 43 (TDP-43) pathology was investigated. Using a regularized regression technique, higher TDP-43 was associated with lower amygdala volume. Shape analysis of the amygdala also showed unique patterns of spatial atrophy associated with TDP-43, independent of other pathologies. Lastly, using linear mixed-effect models, amygdala volume was shown to explain an additional portion of the variance in cognitive decline, above and beyond what was explained by the neuropathologies and demographics.

Second, the previous study was extended to other subcortical regions, including the hippocampus, thalamus, nucleus accumbens, caudate, and putamen, and was conducted in a larger dataset. The results showed unique contributions of TDP-43, neurofibrillary tangles (the hallmark of Alzheimer's disease pathology), and atherosclerosis (a cerebrovascular pathology) to atrophy on the surface of subcortical structures. Understanding the independent effects of each pathology on the volume and shape of different brain regions can form a basis for developing classifiers of age-related neuropathologies.

Third, an in-vivo classifier of arteriolosclerosis was developed and validated. Arteriolosclerosis, one of the main pathologies of small vessel disease, is associated with cognitive decline and dementia and currently has no standard biomarker. The classifier was developed ex-vivo using machine learning (ML) techniques and then translated to in-vivo use. It was packaged as a software tool called ARTS, which outputs a score giving the likelihood of arteriolosclerosis for a given input. ARTS was tested and validated in various cohorts and showed high performance in predicting the pathology; a higher ARTS score was also associated with greater cognitive decline in domains specific to small vessel disease.

Fourth, motivated by current trends and the superiority of deep learning (DL) techniques in classification tasks in computer vision and medical imaging, a preliminary study used DL to train an ex-vivo classifier of arteriolosclerosis. Specifically, convolutional neural networks (CNNs) were applied to 3 Tesla ex-vivo MR images directly, without prior information on the brain correlates of arteriolosclerosis. Interestingly, the network learned that white matter hyperintense lesions contributed the most to classification of arteriolosclerosis. These results are encouraging, and future work will exploit DL techniques alongside traditional ML approaches for more automation and possibly better performance.

Finally, a preliminary combined classifier of arteriolosclerosis and small vessel atherosclerosis was developed, since the coexistence of both pathologies in the brain has devastating effects on cognition. The methodology was similar to that used for the arteriolosclerosis classifier, with minor differences. The classifier showed good in-vivo performance, although it still needs to be assessed in more cohorts.

The comprehensive study of age-related neuropathologies and their contribution to abnormalities of subcortical brain structures offers great potential for developing a biomarker of each pathology. Moreover, the high in-vivo performance of the MR-based classifier of arteriolosclerosis demonstrates the potential of ex-vivo studies for developing biomarkers that are precise (because they are based on autopsy, the gold standard) and are expected to work well in-vivo. The implications of this study include biomarkers that could potentially be used for refined participant selection and enhanced monitoring of treatment response in clinical drug and prevention trials.
- Title
- UTILITY OF WATERSHED MODELS: IMPROVING TMDL DEVELOPMENT THROUGH A MARGIN OF SAFETY ESTIMATION AND UNCERTAINTY COMMUNICATION
- Creator
- Nunoo, Robert
- Date
- 2020
- Description
- Watershed models are used to represent the physical, chemical, and biological mechanisms that determine the fate and transport of pollutants in waterbodies (Daniel 2011). These models are generally used for exploratory, planning, and regulatory purposes (Harmel et al. 2014). Among their numerous applications is the development of total maximum daily loads (TMDLs); a TMDL is the amount of pollution a waterbody can receive without becoming impaired. Because of the uncertainty associated with models and the TMDL development process, United States Clean Water Act Section 303(d)(1)(C) requires that a margin of safety (MOS) be specified to account for uncertainty in TMDLs. How the MOS is estimated in TMDLs was identified as a problem by the National Research Council (NRC 2001), yet in the roughly two decades since, there have been very few inventories or audits of approved TMDL studies. This study describes a review, aided by natural language processing and machine learning, of the MOS in TMDLs approved from 2002 to 2016. The study determined whether the incorporated MOS values followed a pattern and examined whether a relationship exists between MOS values and certain ecological conditions. Relatively few TMDLs based their explicit MOS values on some form of calculation; these constituted only 16% of the reviewed sample. The remaining 84% used conventional values, and few of those studies provided reasons for the values selected. A statistical assessment of the MOS values revealed that the MOS depended on the state (location of the waterbody), USEPA region, waterbody type, designated water use, TMDL model used, and data availability. The findings indicate that few TMDL developers follow the National Research Council's suggestion of using a rigorous uncertainty estimation approach to make rational choices for the MOS.

An adaptive approach based on Bayes discrepancy was proposed for estimating the MOS for a TMDL. The approach builds on the Bayesian hierarchical framework for estimating the uncertainty associated with watershed models, and it lets TMDL developers communicate the effects of their watershed model. The approach was applied to a Ferson Creek model of the Fox River watershed to assess variability and uncertainty in the model results and to estimate possible MOS values for two monitoring stations in the watershed. Results suggest that an MOS of 0.04 mg/L could lead to a 0.1 probability of violating the water quality standard for an underpredicting model. The Bayes-discrepancy estimation method will enable TMDL developers and watershed managers to strike a balance between implementation options and water quality concerns.
- Title
- A SCALABLE SIMULATION AND MODELING FRAMEWORK FOR EVALUATION OF SOFTWARE-DEFINED NETWORKING DESIGN AND SECURITY APPLICATIONS
- Creator
- Yan, Jiaqi
- Date
- 2019
- Description
- The world today is densely connected by many large-scale computer networks supporting military applications, social communications, power grid facilities, cloud services, and other critical infrastructures. However, a gap has grown between the complexity of these systems and the increasing need for security and resilience. We believe this gap is now reaching a tipping point, resulting in a dramatic change in the way networks and applications are architected, developed, monitored, and protected. This trend calls for a scalable, high-fidelity network testing and evaluation platform to facilitate the transformation of in-house research ideas into real-world working solutions. With this objective, we investigate means to build a scalable and high-fidelity network testbed using container-based emulation and parallel simulation; our study focuses on the emerging software-defined networking (SDN) technology.

Existing evaluation platforms facilitate the adoption of the SDN architecture and applications in production systems, but their performance is highly dependent on the underlying physical hardware resources. Insufficient resources lead to undesired results, such as low experimental fidelity or slow execution, especially with large-scale network settings. To improve testbed fidelity, we first develop a lightweight virtual time system for Linux containers and integrate it into a widely used SDN emulator. A key issue with an ordinary container-based emulator is that it uses the system clock across all containers even when a container is not scheduled to run, which harms both performance and temporal fidelity, especially under high workloads. We investigate virtual time approaches that precisely scale the time of interactions between containers and physical devices; our evaluation results indicate a definite improvement in fidelity and scalability.

To improve testbed scalability, we investigate how the centralized paradigm of SDN can be utilized to reduce the simulation workload. We explore a model abstraction technique that effectively transforms the SDN network devices into one virtualized switch model. While significantly reducing model execution time and enabling real-time simulation, the abstracted model preserves the end-to-end forwarding behavior of the original network.

With enhanced fidelity and scalability, it becomes realistic to use our network testbed to perform security evaluations of various SDN applications. Communication networks generate and process huge amounts of data, and the logically centralized SDN control plane must process both critical control traffic and potentially big data traffic; on the other hand, it enables many efficient security solutions, such as intrusion detection, mitigation, and prevention. Deep neural networks have recently achieved state-of-the-art results across a range of hard problem spaces, and we study how to utilize big data and deep learning to secure communication networks and host entities. For classifying malicious network traffic, we performed a feasibility study of offline deep-learning-based intrusion detection, constructing the detection engine from multiple advanced deep learning models. For malware classification on individual hosts, another necessity for securing computer systems, existing machine-learning-based methods rely on handcrafted features extracted from raw binary files or disassembled code, and the diversity of such features has made it hard to build generic malware classification systems that work effectively across different operational environments. To strike a balance between generality and performance, we explore new graph convolutional neural network techniques to effectively yet efficiently classify malware programs represented as their control flow graphs.
- Title
- AI IN MEDICINE: ENABLING INTELLIGENT IMAGING, PROGNOSIS, AND MINIMALLY INVASIVE SURGERY
- Creator
- Getty, Neil
- Date
- 2022
- Description
-
While an extremely rich research field, AI in medicine has been much slower to be applied in real-world clinical settings than other applications of AI such as natural language processing (NLP) and image processing/generation. Often the stakes of failure are more dire, access to private and proprietary data more costly, and the burden of proof required by expert clinicians much higher. Beyond these barriers, the typical data-driven approach to validation is interrupted by a need for expertise to analyze results. Whereas the results of a trained ImageNet or machine translation model are easily verified by a computational researcher, analysis in medicine can demand much more multi-disciplinary expertise. AI in medicine is motivated by a great demand for progress in health-care, but an even greater responsibility for high accuracy, model transparency, and expert validation. This thesis develops machine and deep learning techniques for medical image enhancement, patient outcome prognosis, and minimally invasive robotic surgery awareness and augmentation. Each of the works presented was undertaken in direct collaboration with medical domain experts, and the efforts could not have been completed without them. Pursuing medical image enhancement, we worked with radiologists, neuroscientists, and a neurosurgeon. In patient outcome prognosis, we worked with clinical neuropsychologists and a cardiovascular surgeon. For robotic surgery, we worked with surgical residents and a surgeon expert in minimally invasive surgery. Each of these collaborations guided priorities for problem and model design, analysis, and long-term objectives that ground this thesis as a concerted effort towards clinically actionable medical AI. The contributions of this thesis focus on three specific medical domains.
(1) Deep learning for medical brain scans: we developed processing pipelines and deep learning models for image annotation, registration, segmentation, and diagnosis in both traumatic brain injury (TBI) and brain tumor cohorts. A major focus of these works is the efficacy of low-data methods, and techniques for validation of results without any ground-truth annotations. (2) Outcome prognosis for TBI and risk prediction for cardiovascular disease (CVD): we developed feature extraction pipelines and models for TBI and CVD patient clinical outcome prognosis and risk assessment. We design risk prediction models for CVD patients using traditional Cox modeling, machine learning, and deep learning techniques. In these works we conduct exhaustive data and model ablation studies, with a focus on feature saliency analysis, model transparency, and usage of multi-modal data. (3) AI for enhanced and automated robotic surgery: we developed computer vision and deep learning techniques for understanding and augmenting minimally invasive robotic surgery scenes. We developed models to recognize surgical actions from vision and kinematic data. Beyond models and techniques, we also curated novel datasets and prediction benchmarks from simulated and real endoscopic surgeries. We show the potential for self-supervised techniques in surgery, as well as multi-input and multi-task models.
- Title
- ROBUST AND EXPLAINABLE RESULTS UTILIZING NEW METHODS AND NON-LINEAR MODELS
- Creator
- Onallah, Amir
- Date
- 2022
- Description
-
This research focuses on the robustness and explainability of new methods, and on nonlinear analysis compared to traditional methods and linear analysis. Further, it demonstrates that making assumptions, reducing the data, or simplifying the problem has a negative effect on the outcomes. This study utilizes the U.S. Patent Inventor database and the Medical Innovation dataset. Initially, we employ time-series models to enhance the quality of the results for event history analysis (EHA), add insights, and infer meanings, explanations, and conclusions. Then, we introduce newer machine learning algorithms, including machine learning with a time-to-event element, to offer more robust methods than previous papers, reach optimal solutions by removing assumptions and simplifications of the problem, combine all data to encompass the maximum knowledge, and provide nonlinear analysis.
- Title
- Sharpen Quality Investing: A PLS-based Approach
- Creator
- Jiao, Zixuan
- Date
- 2022
- Description
-
I apply a disciplined dimension reduction technique called Partial Least Squares (PLS) to construct a new quality factor by aggregating information from 16 individual signals. It earns significant risk-adjusted returns and outperforms quality factors constructed by alternative techniques, namely PCA, Fama-MacBeth regression, a combination of PCA and Fama-MacBeth regression, and a rank-based approach. I show that my quality factor performs even better during rough economic patches and thus appears to hedge periods of market distress. I further show that adding my quality factor to an opportunity set consisting of the other classical factors increases the maximum Sharpe ratio.
- Title
- Intelligent Job Scheduling on High Performance Computing Systems
- Creator
- Fan, Yuping
- Date
- 2021
- Description
-
The job scheduler is a crucial component in high-performance computing (HPC) systems. It sorts and allocates jobs according to site policies and resource availability, and it plays an important role in the efficient use of system resources and in user satisfaction. Existing HPC job schedulers typically leverage simple heuristics to schedule jobs. However, the rapid growth in system infrastructure and the introduction of diverse workloads pose serious challenges to the traditional heuristic approaches. First, the current approaches concentrate on CPU footprint and ignore the performance of other resources. Second, the scheduling policies are manually designed and only consider some isolated job information, such as job size and runtime estimate. Such a manual design process prevents the schedulers from making informed decisions by extracting the abundant environment information (i.e., system and queue information). Moreover, they can hardly adapt to workload changes, leading to degraded scheduling performance. These challenges call for a new job scheduling framework that can extract useful information from diverse workloads and the increasingly complicated system environment, and finally make well-informed scheduling decisions in real time. In this work, we propose an intelligent HPC job scheduling framework to address these emerging challenges. Our research takes advantage of advanced machine learning and optimization methods to extract useful workload- and system-specific information and to further train the framework to make efficient scheduling decisions under various system configurations and diverse workloads. The framework contains four major efforts. First, we focus on providing more accurate job runtime estimations. Estimated job runtime is one of the most important factors affecting scheduling decisions. However, user-provided runtime estimates are highly inaccurate, and existing solutions are prone to underestimation, which causes jobs to be killed.
We leverage and enhance a machine learning method called the Tobit model to improve the accuracy of job runtime estimates while reducing the underestimation rate. More importantly, using the improved job runtime estimates from our method, TRIP, boosts scheduling performance by up to 45%. Second, we conduct research on multi-resource scheduling. HPC systems have undergone significant changes in recent years. New hardware devices, such as GPUs and burst buffers, have been integrated into production HPC systems, which significantly expands the schedulable resources. Unfortunately, the current production schedulers allocate jobs solely based on CPU footprint, which severely hurts system performance. In our work, we propose a framework that takes all schedulable resources into consideration by transforming this problem into a multi-objective optimization (MOO) problem and rapidly solving it via a genetic algorithm. Next, we leverage reinforcement learning (RL) to automatically learn efficient workload- and system-specific scheduling policies. Existing HPC schedulers either use generalized and simple heuristics or optimization methods that ignore workload and system characteristics. To overcome this issue, we design a new scheduling agent, DRAS, to automatically learn efficient scheduling policies. DRAS leverages advances in deep reinforcement learning and incorporates the key features of HPC scheduling in the form of a hierarchical neural network structure. We develop a three-phase training process to help DRAS effectively learn the scheduling environment (i.e., the system and its workloads) and rapidly converge to an optimal policy. Finally, we explore the problem of scheduling mixed workloads, i.e., rigid, malleable, and on-demand workloads, on a single HPC system. Traditionally, rigid jobs are the main tenants of HPC systems. In recent years, malleable applications, i.e., jobs that can change size before and during execution, have been emerging on HPC systems.
In addition, dedicated clusters were traditionally the main platforms to run on-demand jobs, i.e., jobs that need to be completed in the shortest time possible. As the sizes of on-demand jobs grow, HPC systems become more cost-efficient platforms for on-demand jobs. However, existing studies do not consider the problem of scheduling all three types of workloads. In our work, we propose six mechanisms, which combine checkpointing, shrink, and expansion techniques, to schedule the mixed workloads on one HPC system.
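The Tobit idea for runtime estimation can be illustrated as censored maximum likelihood (a minimal sketch on synthetic data, not TRIP itself; the feature, coefficients, and walltime cap are invented for illustration): observed runtimes are right-censored at the requested walltime, so the likelihood combines the normal density for jobs that finished with the survival probability for jobs that were killed at the limit.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 2000
x = rng.uniform(0.0, 1.0, size=n)                  # e.g. a normalized job feature
latent = 1.0 + 3.0 * x + rng.normal(0.0, 0.5, n)   # true (uncensored) runtime
cap = 3.0                                          # requested walltime limit
y = np.minimum(latent, cap)                        # killed jobs only report the cap
censored = latent >= cap

def negloglik(params):
    b0, b1, log_sigma = params
    sigma = np.exp(log_sigma)                      # keep sigma positive
    mu = b0 + b1 * x
    ll = np.where(
        censored,
        norm.logsf((cap - mu) / sigma),            # P(true runtime >= cap)
        norm.logpdf((y - mu) / sigma) - np.log(sigma),  # density of observed runtime
    )
    return -ll.sum()

fit = minimize(negloglik, x0=np.zeros(3), method="BFGS")
b0_hat, b1_hat, _ = fit.x                          # recovers roughly (1.0, 3.0)
```

A plain least-squares fit on the capped values would be biased downward; the Tobit likelihood avoids that by modeling the censoring explicitly, which is the property that makes it attractive for runtime estimation.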
- Title
- Towards Trustworthy Multiagent and Machine Learning Systems
- Creator
- Xie, Shangyu
- Date
- 2022
- Description
-
This dissertation aims to systematically research "trustworthy" multiagent and machine learning systems in the context of the Internet of Things (IoT), which mainly consists of two aspects: data privacy and robustness. Specifically, data privacy concerns the protection of the data in a given system, i.e., data identified as sensitive or private cannot be disclosed directly to others; robustness refers to the ability of the system to defend against or mitigate potential attacks and threats, i.e., maintaining the stable and normal operation of the system. Starting from the smart grid, a representative multiagent system in the IoT, I demonstrate two works on improving data privacy and robustness in different applications, load balancing and energy trading, which integrate secure multiparty computation (SMC) protocols for normal computation to ensure data privacy. More significantly, the schemes can be readily extended to other applications in IoT, e.g., connected vehicles and mobile sensing systems. For machine learning, I have studied two main areas, computer vision and natural language processing, with respect to privacy and robustness respectively. I first present a comprehensive robustness evaluation study of DNN-based video recognition systems with two novel attacks in both the test and training phases, i.e., adversarial and poisoning attacks. Besides, I also propose adaptive defenses to fully evaluate these two attacks, which can thus further advance the robustness of the system. I also propose a privacy evaluation for language systems and demonstrate how to reveal and address the privacy risks in language models. Finally, I demonstrate a private and efficient data computation framework with cloud computing technology to provide more robust and private IoT systems.
- Title
- Deep Learning Methods For Wireless Networks Optimization
- Creator
- Zhang, Shuai
- Date
- 2022
- Description
-
The resurgence of deep learning techniques has brought forth fundamental changes to how hard problems can be solved. It used to be held that solutions to complex wireless network problems require accurate mathematical modeling of the network operation, but the success of deep learning has shown that a data-driven method can generate powerful and useful representations such that the problem can be solved efficiently with surprisingly competent performance. Network researchers have recognized this and started to capitalize on the learning methods' prowess. But most works follow the existing black-box learning paradigms without much accommodation to the nature and essence of the underlying network problems. This thesis focuses on a particular type of classical problem: multiple commodity flow scheduling in an interference-limited environment. Though it does not permit efficient exact algorithms due to its NP-hard complexity, we use it as an entry point to demonstrate from three angles how learning-based methods can help improve network performance. In the first part, we leverage graph neural network (GNN) techniques and propose a two-stage topology-aware machine learning framework, which trains a graph embedding unit and a link usage prediction module jointly to discover links that are likely to be used in optimal scheduling. The second part of the thesis is an attempt to find a learning method that has a closer algorithmic affinity to the traditional DCG method. We make use of reinforcement learning to incrementally generate a better partial solution such that a high-quality solution may be found in a more efficient manner. As the third part of the research, we revisit the MCF problem from a novel viewpoint: instead of leaning on the neural networks to directly generate good solutions, we use them to associate the current problem instance with historical ones that are similar in structure.
These matched instances’ solutions offer a highly useful starting point to allow efficient discovery of the new instance’s solution.
- Title
- Systematic Serendipity: A Study in Discovering Anomalous Astrophysics
- Creator
- Giles, Daniel K
- Date
- 2020
- Description
-
In the present era of large-scale surveys, big data presents new challenges to the discovery process for anomalous data. Advances in astronomy are often driven by serendipitous discoveries. Anomalous data can be indicative of systematic errors, extreme (or rare) forms of known phenomena, or, most interestingly, truly novel phenomena which exhibit as-of-yet unobserved behaviors. As survey astronomy continues to grow, the size and complexity of astronomical databases will increase, and the ability of astronomers to manually scour data and make such discoveries decreases. In this work, we introduce a machine learning-based method to identify anomalies in large datasets to facilitate such discoveries, and apply this method to long cadence light curves from NASA's Kepler Mission. Our method clusters data based on density, identifying anomalies as data that lie outside of dense regions in a derived feature space. First we present a proof-of-concept case study and test our method on four quarters of the Kepler long cadence light curves. We use Kepler's most notorious anomaly, Boyajian's Star (KIC 8462852), as a rare `ground truth' for testing outlier identification, to verify that objects of genuine scientific interest are included among the identified anomalies. Additionally, we report the full list of identified anomalies for these quarters, and present a sample subset of identified outliers that includes unusual phenomena, objects that are rare in the Kepler field, and data artifacts. By identifying less than 4% of each quarter as outlying data (under 6,000 individual targets for the dataset used), we demonstrate that this anomaly detection method can create a more targeted approach in searching for rare and novel phenomena. We further present an outlier scoring methodology to provide a framework for prioritization of the most potentially interesting anomalies.
We have developed a data mining method based on k-nearest-neighbor distance in feature space to efficiently identify the most anomalous light curves. We test variations of this method, including using principal components of the feature space, removing select features, the effect of the choice of k, and scoring on subset samples. We evaluate the performance of our scoring on known object classes and find that it consistently scores rare (<1000) object classes higher than common classes, meaning that rarer objects are successfully prioritized over common objects. The most common class (miscellaneous stars without any major variability) and rotational variables together compose well over two-thirds of the KIC, yet are considerably underrepresented among the top outliers. We have applied scoring to all long cadence light curves of quarters 1 to 17 of Kepler's prime mission and present outlier scores for all 2.8 million light curves of the roughly 200k objects.
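The k-nearest-neighbor scoring described above can be sketched as follows (synthetic features stand in for the derived light-curve feature space; the feature dimensions, k, and the planted outlier are illustrative): each object's outlier score is its distance to its k-th nearest neighbor, and sorting scores in descending order prioritizes the most isolated objects.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
# Hypothetical derived features for 300 light curves, plus one planted anomaly
features = rng.normal(0.0, 1.0, size=(300, 8))
features[0] = 25.0                                      # the planted outlier

k = 10
nn = NearestNeighbors(n_neighbors=k + 1).fit(features)  # +1: each point is its own nearest neighbor
dists, _ = nn.kneighbors(features)
scores = dists[:, k]                                    # distance to the k-th true neighbor

ranking = np.argsort(scores)[::-1]                      # most anomalous first
```

Points deep inside a dense cluster have small k-th-neighbor distances, while isolated points score high, which is why the top of `ranking` is a natural candidate list for human follow-up.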
- Title
- Machine Learning (ML) for Extreme Weather Power Outage Forecasting in Power Distribution Networks
- Creator
- Bahrami, Anahita
- Date
- 2023
- Description
-
The Midwest region experiences a diverse range of severe weather conditions throughout the year. During the warmer months, thunderstorms, heavy rain, lightning, tornadoes, and high winds pose a threat, while the colder season brings ice storms, snowstorms, high winds, and sleet storms, all of which can cause significant damage to the environment, properties, transportation systems, and power grids. The average climate in the Midwest is influenced by factors such as latitude, solar input, the typical positions and movements of water systems, topography, the Great Lakes, and human activities. The combination of these conditions during different seasons contributes to the development of various types of storms. Therefore, it is crucial to predict the impacts of such atmospheric events on distribution and transmission lines, enabling utilities to assess and implement preventive measures and strategies to minimize the economic losses associated with these disasters. Additionally, the accurate classification of storm modes through an automated system allows operators to study trends in relation to climate change and implement necessary strategies to ensure grid reliability and resilience. In recent years, a significant number of power outages have occurred due to extreme ice formation on transmission and distribution networks, posing a threat to the power grid's resilience and reliability. To prepare power providers for snowstorms, extensive research has been conducted on snow accretion on power lines. Over the past two decades, many scientists have turned to machine learning (ML) algorithms for predicting ice accretion on overhead conductors, as ML models demonstrate superior accuracy compared to statistical forecasting models on challenging and fine-grained problems. However, most existing models primarily focus on predicting ice formation on power lines and fail to forecast the resulting damage to the distribution network.
Therefore, this project proposes a model for predicting power outages caused by snow and ice storms in the distribution network. The goal is to aid in the planning process for disaster response and ensure the resilience and reliability of the power grid. The proposed outage prediction model incorporates statistical and machine learning techniques, taking into account features related to weather conditions, storm events, and information about the power network feeders.