Search results
(21 - 32 of 32)
- Title
- Data-Driven Methods for Soft Robot Control and Turbulent Flow Models
- Creator
- Lopez, Esteban Fernando
- Date
- 2022
- Description
The world today has seen an exponential increase in its use of computers for communication and measurement. Thanks to recent technologies, we are now able to collect more data than ever before. This has ushered in a new age of data-driven methods that can describe systems and behaviors with increasing accuracy. Whereas we once relied on the expertise of a few professionals with domain-specific knowledge developed over years of rigorous study, we can now rely on collected data to reveal patterns, develop novel ideas, and offer solutions to the world's engineering problems. No domain is safe. Within the engineering realm, data-driven methods have seen wide use in control and system identification. In this thesis we explore two areas of data-driven methods, namely reinforcement learning and data-driven causality.

Reinforcement learning is a method by which an agent learns to select actions and behaviors that increase its reward. This method was applied to a soft-robotic concept called the JAMoEBA to solve various tasks of interest in the robotics community, specifically tunnel navigation, obstacle field navigation, and object manipulation. A validation study was conducted to show the complications that arise when applying reinforcement learning to such a complex system. Nevertheless, it was shown that reinforcement learning is capable of solving three key tasks (static tunnel navigation, obstacle field navigation, and object manipulation) using specific simulation and learning hyperparameters.

Data-driven causality encompasses a range of metrics and methods that attempt to uncover causal relationships between variables in a system. Several information-theoretic causal metrics were developed and applied to a nine-mode turbulent flow dataset representing the Moehlis model. It was shown that careful consideration of the method used was required to identify significant causal relationships. Causal relationships were shown to converge over several hundred realizations of the turbulent model. Furthermore, these results match the expected causal relationships given known information about self-sustaining processes in turbulence, validating the method's ability to identify causal relationships in turbulence.
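The abstract does not name the specific information-theoretic causal metrics; transfer entropy is a representative choice for this kind of analysis. A minimal histogram-based sketch, with toy data and all names illustrative rather than taken from the thesis:

```python
import numpy as np

def transfer_entropy(x, y, bins=8):
    """Estimate transfer entropy T_{X->Y} (nats) with 1-step lags,
    using plug-in histogram probability estimates."""
    # Discretize each series into `bins` states.
    xd = np.digitize(x, np.histogram_bin_edges(x, bins)[1:-1])
    yd = np.digitize(y, np.histogram_bin_edges(y, bins)[1:-1])
    y_next, y_now, x_now = yd[1:], yd[:-1], xd[:-1]

    def H(*cols):
        # Joint Shannon entropy of the given discrete columns.
        _, counts = np.unique(np.column_stack(cols), axis=0, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log(p))

    # T_{X->Y} = H(Y_next, Y_now) - H(Y_now)
    #          - H(Y_next, Y_now, X_now) + H(Y_now, X_now)
    return H(y_next, y_now) - H(y_now) - H(y_next, y_now, x_now) + H(y_now, x_now)

rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
y = np.roll(x, 1) + 0.5 * rng.standard_normal(5000)   # y is driven by past x
print(transfer_entropy(x, y), transfer_entropy(y, x)) # expect T_{x->y} >> T_{y->x}
```

Averaging such estimates over many independent realizations, as the abstract describes, is what lets the inferred causal links converge.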
- Title
- Intelligent Job Scheduling on High Performance Computing Systems
- Creator
- Fan, Yuping
- Date
- 2021
- Description
The job scheduler is a crucial component in high-performance computing (HPC) systems. It sorts and allocates jobs according to site policies and resource availability, and it plays an important role in the efficient use of system resources and in user satisfaction. Existing HPC job schedulers typically leverage simple heuristics to schedule jobs. However, the rapid growth in system infrastructure and the introduction of diverse workloads pose serious challenges to these traditional heuristic approaches. First, current approaches concentrate on CPU footprint and ignore the performance of other resources. Second, the scheduling policies are manually designed and consider only isolated job information, such as job size and runtime estimate. Such a manual design process prevents schedulers from making informed decisions by extracting the abundant environment information (i.e., system and queue information). Moreover, they can hardly adapt to workload changes, leading to degraded scheduling performance. These challenges call for a new job scheduling framework that can extract useful information from diverse workloads and the increasingly complicated system environment, and can make well-informed scheduling decisions in real time.

In this work, we propose an intelligent HPC job scheduling framework to address these emerging challenges. Our research takes advantage of advanced machine learning and optimization methods to extract useful workload- and system-specific information and to teach the framework to make efficient scheduling decisions under various system configurations and diverse workloads. The framework contains four major efforts. First, we focus on providing more accurate job runtime estimates. Estimated job runtime is one of the most important factors affecting scheduling decisions. However, user-provided runtime estimates are highly inaccurate, and existing solutions are prone to underestimation, which causes jobs to be killed. We leverage and enhance a machine learning method called the Tobit model to improve the accuracy of job runtime estimates while reducing the underestimation rate. More importantly, using TRIP's improved job runtime estimates boosts scheduling performance by up to 45%. Second, we conduct research on multi-resource scheduling. HPC systems have undergone significant changes in recent years. New hardware devices, such as GPUs and burst buffers, have been integrated into production HPC systems, significantly expanding the schedulable resources. Unfortunately, current production schedulers allocate jobs solely based on CPU footprint, which severely hurts system performance. In our work, we propose a framework that takes all schedulable resources into consideration by transforming the problem into a multi-objective optimization (MOO) problem and rapidly solving it via a genetic algorithm. Next, we leverage reinforcement learning (RL) to automatically learn efficient workload- and system-specific scheduling policies. Existing HPC schedulers use either generalized, simple heuristics or optimization methods that ignore workload and system characteristics. To overcome this issue, we design a new scheduling agent, DRAS, to automatically learn efficient scheduling policies. DRAS leverages advances in deep reinforcement learning and incorporates the key features of HPC scheduling in the form of a hierarchical neural network structure. We develop a three-phase training process to help DRAS effectively learn the scheduling environment (i.e., the system and its workloads) and rapidly converge to an optimal policy. Finally, we explore the problem of scheduling mixed workloads, i.e., rigid, malleable, and on-demand workloads, on a single HPC system. Traditionally, rigid jobs are the main tenants of HPC systems. In recent years, malleable applications, i.e., jobs that can change size before and during execution, have been emerging on HPC systems. In addition, dedicated clusters were once the main platforms for running on-demand jobs, i.e., jobs that need to be completed in the shortest time possible; as the sizes of on-demand jobs grow, HPC systems become more cost-efficient platforms for them. However, existing studies do not consider the problem of scheduling all three types of workloads together. In our work, we propose six mechanisms, combining checkpointing, shrinking, and expansion techniques, to schedule the mixed workloads on one HPC system.
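The Tobit model mentioned above treats jobs killed at their walltime limit as right-censored observations of the true runtime. A minimal maximum-likelihood sketch of that idea on synthetic data (this is not the thesis's TRIP implementation; all names and values are hypothetical):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def tobit_nll(params, X, y, censored):
    """Negative log-likelihood for right-censored (Tobit) regression:
    y* = X @ beta + eps, eps ~ N(0, sigma^2); we observe y = min(y*, limit),
    with `censored` marking jobs killed at their walltime limit."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)                        # keep sigma positive
    z = (y - X @ beta) / sigma
    ll_obs = norm.logpdf(z[~censored]) - log_sigma   # exact runtimes observed
    ll_cen = norm.logsf(z[censored])                 # runtime exceeded the limit
    return -(ll_obs.sum() + ll_cen.sum())

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(2000), rng.standard_normal((2000, 2))])
y_true = X @ np.array([5.0, 2.0, -1.0]) + rng.standard_normal(2000)
limit = 6.0                                          # hypothetical walltime limit
censored = y_true > limit
y = np.minimum(y_true, limit)

res = minimize(tobit_nll, x0=np.zeros(4), args=(X, y, censored), method="BFGS")
print(res.x[:-1])   # beta estimates, close to [5, 2, -1] despite censoring
```

Fitting plain least squares to the censored runtimes would bias predictions low; modeling the censoring explicitly is what lets the estimator reduce underestimation.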
- Title
- Towards Trustworthy Multiagent and Machine Learning Systems
- Creator
- Xie, Shangyu
- Date
- 2022
- Description
This dissertation systematically studies trustworthy multiagent and machine learning systems in the context of the Internet of Things (IoT), focusing on two aspects: data privacy and robustness. Specifically, data privacy concerns the protection of data in a given system, i.e., data identified as sensitive or private cannot be disclosed directly to others; robustness refers to a system's ability to defend against or mitigate potential attacks and threats, i.e., to maintain the system's stable and normal operation.

Starting from the smart grid, a representative multiagent system in the IoT, I present two works on improving data privacy and robustness for different applications, load balancing and energy trading, which integrate secure multiparty computation (SMC) protocols into normal computation to ensure data privacy. More significantly, the schemes can be readily extended to other IoT applications, e.g., connected vehicles and mobile sensing systems.

For machine learning, I have studied two main areas, computer vision and natural language processing, with respect to privacy and robustness. I first present a comprehensive robustness evaluation of DNN-based video recognition systems with two novel attacks covering both the test and training phases, i.e., adversarial and poisoning attacks. I also propose adaptive defenses to fully evaluate these two attacks, which can further advance the robustness of such systems. I then propose a privacy evaluation for language systems and show in practice how to reveal and address the privacy risks in language models. Finally, I demonstrate a private and efficient data computation framework built on cloud computing technology to provide more robust and private IoT systems.
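The thesis's video-recognition attacks are novel and not specified above; the fast gradient sign method (FGSM) is the standard single-step formulation of the test-phase (adversarial) genre and illustrates the idea. A minimal PyTorch sketch with a hypothetical toy classifier:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, label, eps=0.03):
    """One-step FGSM: perturb x in the direction that increases the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    x_adv = x + eps * x.grad.sign()        # move along the loss-gradient sign
    return x_adv.clamp(0.0, 1.0).detach()  # keep pixels in a valid range

# Toy usage with a hypothetical linear "classifier" over flattened frames.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 10))
x = torch.rand(4, 3, 8, 8)                 # batch of 4 tiny "frames"
label = torch.randint(0, 10, (4,))
x_adv = fgsm_attack(model, x, label)
print((x_adv - x).abs().max())             # perturbation bounded by eps
```

Poisoning attacks follow the same gradient-driven logic but modify training data instead of test inputs.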
- Title
- Deep Learning Methods For Wireless Networks Optimization
- Creator
- Zhang, Shuai
- Date
- 2022
- Description
The resurgence of deep learning techniques has brought forth fundamental changes to how hard problems can be solved. It used to be held that solving complex wireless network problems required accurate mathematical modeling of the network operation, but the success of deep learning has shown that a data-driven method can generate powerful and useful representations such that the problem can be solved efficiently with surprisingly competent performance. Network researchers have recognized this and started to capitalize on the prowess of learning methods, but most works follow existing black-box learning paradigms without much accommodation to the nature and essence of the underlying network problems. This thesis focuses on a particular classical problem: multi-commodity flow (MCF) scheduling in an interference-limited environment. Though it admits no efficient exact algorithm due to its NP-hardness, we use it as an entry point to demonstrate, from three angles, how learning-based methods can help improve network performance.

In the first part, we leverage graph neural network (GNN) techniques and propose a two-stage topology-aware machine learning framework, which jointly trains a graph embedding unit and a link-usage prediction module to discover links that are likely to be used in an optimal schedule. The second part of the thesis seeks a learning method with a closer algorithmic affinity to the traditional DCG method: we use reinforcement learning to incrementally generate better partial solutions, so that a high-quality solution may be found more efficiently. In the third part, we revisit the MCF problem from a novel viewpoint: instead of relying on neural networks to directly generate good solutions, we use them to match the current problem instance with historical instances that are similar in structure. These matched instances' solutions offer a highly useful starting point that allows efficient discovery of the new instance's solution.
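The first part's framework pairs a graph embedding unit with a per-link usage predictor; the thesis's exact architecture is not given here. A minimal GCN-style layer plus endpoint-based link scorer in plain PyTorch, with all shapes and names hypothetical:

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One GCN-style propagation step: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out)

    def forward(self, A_hat, H):
        return torch.relu(A_hat @ self.lin(H))

def normalized_adj(A):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A+I) D^-1/2."""
    A = A + torch.eye(A.size(0))
    d = A.sum(1).rsqrt()
    return d[:, None] * A * d[None, :]

A = torch.tensor([[0., 1., 1.], [1., 0., 0.], [1., 0., 0.]])  # toy 3-node topology
H = torch.rand(3, 4)                        # initial node features
embed = GCNLayer(4, 8)
scorer = nn.Linear(16, 1)                   # concat(u, v) embeddings -> usage logit

Z = embed(normalized_adj(A), H)
u, v = torch.tensor([0, 0]), torch.tensor([1, 2])   # candidate links (u, v)
logits = scorer(torch.cat([Z[u], Z[v]], dim=1)).squeeze(-1)
print(torch.sigmoid(logits))                # predicted link-usage probabilities
```

Pruning the MCF instance to the high-probability links is what shrinks the search space for the downstream optimizer.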
- Title
- Systematic Serendipity: A Study in Discovering Anomalous Astrophysics
- Creator
- Giles, Daniel K
- Date
- 2020
- Description
In the present era of large-scale surveys, big data presents new challenges to the discovery of anomalous data. Advances in astronomy are often driven by serendipitous discoveries, and anomalous data can be indicative of systematic errors, extreme (or rare) forms of known phenomena, or, most interestingly, truly novel phenomena exhibiting as-yet-unobserved behaviors. As survey astronomy continues to grow, the size and complexity of astronomical databases will increase, and the ability of astronomers to manually scour the data and make such discoveries decreases. In this work, we introduce a machine-learning-based method to identify anomalies in large datasets to facilitate such discoveries, and we apply it to long cadence light curves from NASA's Kepler mission. Our method clusters data based on density, identifying anomalies as data that lie outside of dense regions in a derived feature space. First, we present a proof-of-concept case study and test our method on four quarters of the Kepler long cadence light curves. We use Kepler's most notorious anomaly, Boyajian's Star (KIC 8462852), as a rare `ground truth' for testing outlier identification, verifying that objects of genuine scientific interest are included among the identified anomalies. Additionally, we report the full list of identified anomalies for these quarters and present a sample subset of identified outliers that includes unusual phenomena, objects that are rare in the Kepler field, and data artifacts. By identifying less than 4% of each quarter as outlying data (under 6,000 individual targets for the dataset used), we demonstrate that this anomaly detection method enables a more targeted search for rare and novel phenomena.

We further present an outlier scoring methodology that provides a framework for prioritizing the most potentially interesting anomalies. We have developed a data mining method based on k-Nearest Neighbor distance in feature space to efficiently identify the most anomalous light curves, and we test variations of this method, including using principal components of the feature space, removing select features, varying the choice of k, and scoring against subset samples. We evaluate the performance of our scoring on known object classes and find that it consistently scores rare object classes (fewer than 1,000 members) higher than common classes, meaning that rarer objects are successfully prioritized over common objects. The most common class (miscellaneous stars without any major variability) and rotational variables together compose well over two-thirds of the KIC, yet are considerably underrepresented among the top outliers. We have applied our scoring to all long cadence light curves of quarters 1 to 17 of Kepler's prime mission and present outlier scores for all 2.8 million light curves of the roughly 200,000 objects.
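The k-Nearest Neighbor scoring described above has a compact core: score each light curve by its distance to its k-th nearest neighbor in feature space, so points in sparse regions score high. A minimal scikit-learn sketch on synthetic features (the feature set and k are hypothetical, not the paper's):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

def knn_outlier_scores(features, k=20):
    """Score each sample by the distance to its k-th nearest neighbor;
    large distances mean the sample sits in a sparse region of feature space."""
    X = StandardScaler().fit_transform(features)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)  # +1: each point finds itself
    dists, _ = nn.kneighbors(X)
    return dists[:, -1]                              # distance to k-th true neighbor

rng = np.random.default_rng(2)
dense = rng.normal(0, 1, size=(1000, 5))   # common objects
rare = rng.normal(8, 1, size=(5, 5))       # a handful of anomalies
scores = knn_outlier_scores(np.vstack([dense, rare]))
print(np.argsort(scores)[-5:])             # top outliers: indices 1000-1004
```

Ranking by this score is what turns a raw anomaly list into a prioritized follow-up queue.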
- Title
- Defense-in-Depth for Cyber-Secure Network Architectures of Industrial Control Systems
- Creator
- Arnold, David James
- Date
- 2024
- Description
Digitization and modernization efforts have yielded greater efficiency, safety, and cost savings for Industrial Control Systems (ICS). To achieve these gains, the Internet of Things (IoT) has become an integral component of network infrastructures. However, integrating embedded devices expands the network footprint and weakens cyberattack resilience. Additionally, legacy devices and improper security configurations are weak points in ICS networks. As a result, ICSs are a valuable target for hackers seeking monetary gain or planning to cause destruction and chaos. Furthermore, recent attacks demonstrate a heightened understanding of ICS network configurations within hacking communities. A Defense-in-Depth strategy is the solution to these threats, applying multiple security layers to detect, interrupt, and prevent cyber threats before they cause damage.

Our solution detects threats by deploying an Enhanced Data Historian for Detecting Cyberattacks. By introducing Machine Learning (ML), we enhance cyberattack detection by fusing network traffic and sensor data. Two computing models are examined: 1) a distributed computing model and 2) a localized computing model. The distributed computing model is powered by Apache Spark, introducing redundancy for detecting cyberattacks. In contrast, the localized computing model relies on a network traffic visualization methodology for efficiently detecting cyberattacks with a Convolutional Neural Network. These applications are effective in detecting cyberattacks, with nearly 100% accuracy. Next, we prevent eavesdropping by applying Homomorphic Encryption (HE) for Secure Computing. HE cryptosystems are a unique family of public-key algorithms that permit operations on encrypted data without revealing the underlying information. Through the Microsoft SEAL implementation of the CKKS algorithm, we explore the challenges of introducing Homomorphic Encryption to real-world applications. Despite these challenges, we implemented two ML models: 1) a Neural Network and 2) Principal Component Analysis. Finally, we hinder attackers by integrating a Cyberattack Lockdown Network with Secure Ultrasonic Communication. When a cyberattack is detected, communication for safety-critical elements is redirected through an ultrasonic communication channel, establishing physical network segmentation from compromised devices. We present proof-of-concept work transmitting video via ultrasonic communication over an Aluminum Rectangular Bar. Within industrial environments, existing piping infrastructure presents an optimal medium for cost-effectively preventing eavesdropping. The effectiveness of these solutions is discussed within the scope of the nuclear industry.
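The thesis uses Microsoft SEAL's CKKS scheme directly; a minimal sketch of CKKS arithmetic via TenSEAL, a Python wrapper around SEAL, illustrates the compute-on-ciphertext idea. The parameters and the sensor-reading scenario are illustrative, not the thesis's configuration:

```python
import tenseal as ts

# CKKS context: polynomial degree and coefficient-modulus sizes trade
# security and noise budget against ciphertext size (illustrative values).
ctx = ts.context(ts.SCHEME_TYPE.CKKS,
                 poly_modulus_degree=8192,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])
ctx.global_scale = 2 ** 40
ctx.generate_galois_keys()          # rotation keys, needed for dot products

readings = [3.1, 4.7, 2.2]          # e.g., plaintext sensor values (hypothetical)
weights = [0.2, 0.5, 0.3]

enc_r = ts.ckks_vector(ctx, readings)   # encrypted on the device
enc_w = ts.ckks_vector(ctx, weights)
score = enc_r.dot(enc_w)                # server computes on ciphertexts only
print(score.decrypt())                  # ~[3.63], recovered by the key holder
```

CKKS is approximate arithmetic over real vectors, which is why it suits ML workloads like the neural network and PCA models mentioned above.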
- Title
- Large Language Model Based Machine Learning Techniques for Fake News Detection
- Creator
- Chen, Pin-Chien
- Date
- 2024
- Description
With advanced technology, it is widely recognized that nearly everyone owns one or more personal devices. Consequently, people are evolving into content creators, sharing their personal ideas on social media and streaming platforms regardless of their education or expertise, and distinguishing fake news is becoming increasingly crucial. However, recent research only presents comparisons of fake news detection between one or more models across different datasets. In this work, we applied Natural Language Processing (NLP) techniques with Naïve Bayes and DistilBERT machine learning methods, combining and augmenting four datasets. The results show a balanced accuracy higher than the average reported in recent studies, suggesting that our approach holds promise for improving fake news detection in the era of widespread content creation.
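Of the two models named, the Naïve Bayes baseline has a particularly compact form. A minimal TF-IDF plus multinomial Naïve Bayes sketch with balanced-accuracy evaluation, on toy stand-in data rather than the four combined datasets:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Toy stand-in for the combined, augmented datasets (labels: 1 = fake).
texts = ["shocking miracle cure doctors hate", "senate passes budget bill",
         "aliens endorse candidate", "central bank raises interest rates"] * 50
labels = [1, 0, 1, 0] * 50

X_tr, X_te, y_tr, y_te = train_test_split(texts, labels, test_size=0.25,
                                          stratify=labels, random_state=0)
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
clf.fit(X_tr, y_tr)
print(balanced_accuracy_score(y_te, clf.predict(X_te)))
```

Balanced accuracy is the right headline metric here because combined fake-news corpora are rarely class-balanced.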
- Title
- Machine Learning (ML) for Extreme Weather Power Outage Forecasting in Power Distribution Networks
- Creator
- Bahrami, Anahita
- Date
- 2023
- Description
The Midwest region experiences a diverse range of severe weather conditions throughout the year. During the warmer months, thunderstorms, heavy rain, lightning, tornadoes, and high winds pose a threat, while the colder season brings ice storms, snowstorms, high winds, and sleet storms, all of which can cause significant damage to the environment, properties, transportation systems, and power grids. The average climate in the Midwest is influenced by factors such as latitude, solar input, the typical positions and movements of weather systems, topography, the Great Lakes, and human activities, and the combination of these conditions during different seasons contributes to the development of various types of storms. Therefore, it is crucial to predict the impacts of such atmospheric events on distribution and transmission lines, enabling utilities to assess and implement preventive measures and strategies that minimize the economic losses associated with these disasters. Additionally, the accurate classification of storm modes through an automated system allows operators to study trends in relation to climate change and implement the strategies necessary to ensure grid reliability and resilience.

In recent years, a significant number of power outages have occurred due to extreme ice formation on transmission and distribution networks, threatening the power grid's resilience and reliability. To prepare power providers for snowstorms, extensive research has been conducted on snow accretion on power lines. Over the past two decades, many scientists have turned to machine learning (ML) algorithms for predicting ice accretion on overhead conductors, as ML models demonstrate superior accuracy over statistical forecasting models on challenging, fine-grained forecasting problems. However, most existing models focus primarily on predicting ice formation on the power lines themselves and fail to forecast the resulting damage to the distribution network. Therefore, this project proposes a model for predicting power outages caused by snow and ice storms in the distribution network, with the goal of aiding disaster-response planning and ensuring the resilience and reliability of the power grid. The proposed outage prediction model incorporates statistical and machine learning techniques, taking into account features related to weather conditions, storm events, and the power network's feeders.
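The abstract names the feature families (weather, storm events, feeder information) but not the estimator. A minimal gradient-boosting sketch over hypothetical features of that kind, with entirely synthetic ground truth:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(3)
n = 5000
# Hypothetical per-feeder, per-storm features: ice accretion (in), wind gust
# (mph), snowfall rate (in/hr), feeder age (yr), tree-cover fraction.
X = np.column_stack([rng.gamma(2, 0.1, n), rng.gamma(5, 6, n),
                     rng.gamma(2, 0.4, n), rng.uniform(0, 50, n),
                     rng.uniform(0, 1, n)])
# Toy ground truth: outage risk driven by ice, gusts, and tree cover.
risk = 1.5 * X[:, 0] + 0.03 * X[:, 1] + 2.0 * X[:, 4] - 2.5
y = (risk + rng.normal(0, 0.5, n) > 0).astype(int)   # 1 = outage

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier().fit(X_tr, y_tr)
print(classification_report(y_te, model.predict(X_te)))
```

In practice the labels would come from historical outage records joined to storm-event and feeder data, which is exactly the data fusion the proposed model performs.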
- Title
- Adaptive Learning Approach of a Domain-Aware CNN-Based Model Observer
- Creator
- Bogdanovic, Nebojsa
- Date
- 2023
- Description
The application of convolutional neural networks (CNNs) to defect detection tasks and their use as model observers (MOs) has become increasingly popular in the medical imaging field. Building upon this use of CNN MOs, we have trained CNNs to discern between the data they were trained on and previously unseen images, an ability we term domain awareness. To achieve domain awareness, we simultaneously train a new variation of the U-Net CNN to perform a defect detection task and to reconstruct a noisy input image. We have shown that the reconstruction mean squared error is a good indicator of how well the algorithm performs in the defect localization task, a major step toward a domain-aware CNN MO. Additionally, we have proposed an adaptive learning approach for training these algorithms and compared it to a non-adaptive learning approach. Our main results were obtained for ideal observers, but we also extended them to human observer data. We compared CNN architectures with different numbers and sizes of layers, and introduced data augmentation to further improve our results. Finally, our results show that the proposed adaptive learning approach with data augmentation drastically improves upon a non-adaptive approach for both human and ideal observers.
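The domain-awareness idea hinges on training one network for two objectives at once: defect detection and reconstruction of the noisy input, with per-image reconstruction MSE later flagging out-of-domain data. A minimal two-headed loss sketch in PyTorch, where the tiny trunk and heads are placeholders rather than the thesis's U-Net variant:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualHeadNet(nn.Module):
    """Shared trunk with a defect-detection head and a reconstruction head."""
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU())
        self.detect = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                    nn.Linear(8, 1))    # defect present?
        self.recon = nn.Conv2d(8, 1, 3, padding=1)      # reconstructed image

    def forward(self, x):
        h = self.trunk(x)
        return self.detect(h), self.recon(h)

net = DualHeadNet()
x = torch.rand(4, 1, 32, 32)                 # noisy input images
has_defect = torch.randint(0, 2, (4, 1)).float()

logit, x_hat = net(x)
loss = (F.binary_cross_entropy_with_logits(logit, has_defect)
        + F.mse_loss(x_hat, x))              # joint detection + reconstruction
loss.backward()
# At test time, high per-image reconstruction MSE flags out-of-domain inputs.
print(((x_hat - x) ** 2).mean(dim=(1, 2, 3)))
```

Because the reconstruction head only learns the training domain's image statistics, its error naturally rises on unseen domains, which is what makes it a usable awareness signal.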
- Title
- Utilizing Concurrent Data Accesses for Data-Driven and AI Applications
- Creator
- Lu, Xiaoyang
- Date
- 2024
- Description
In the evolving landscape of data-driven and AI applications, reducing data access delay has never been more critical, especially as these applications increasingly underpin modern daily life. Traditionally, architectural optimizations in computing systems have concentrated on data locality, exploiting temporal and spatial locality to enhance data access performance by maximizing the reuse of data and data blocks. However, as poor locality is a common characteristic of data-driven and AI applications, exploiting data access concurrency emerges as a promising avenue for optimizing the performance of these workloads.

This dissertation advocates utilizing concurrent data accesses to enhance performance in data-driven and AI applications, addressing a significant research gap in the integration of data concurrency for performance improvement. It introduces a suite of case studies: a prefetching framework that dynamically adjusts its aggressiveness based on data concurrency, a cache partitioning framework that balances application demands with concurrency, a concurrency-aware cache management framework that reduces costly cache misses, a holistic cache management framework that considers both data locality and concurrency to fine-tune decisions, and an accelerator design for sparse matrix multiplication that optimizes adaptive execution flow and incorporates concurrency-aware cache optimizations.

Our comprehensive evaluations demonstrate that the implemented concurrency-aware frameworks significantly enhance the performance of data-driven and AI applications by leveraging data access concurrency. Specifically, our prefetching framework boosts performance by 17.3%, our cache partitioning framework surpasses locality-based approaches by 15.5%, and our cache management framework achieves a 10.3% performance increase over prior work. Our holistic cache management framework improves performance further, achieving a 13.7% speedup, and our sparse matrix multiplication accelerator outperforms existing accelerators by a factor of 2.1.

As optimizing data locality in data-driven and AI applications becomes increasingly challenging, this dissertation demonstrates that exploiting concurrency can still yield significant performance gains, offering new insights and actionable examples for the field. It not only bridges the identified research gap but also establishes a foundation for further exploration of the full potential of concurrency in data-driven and AI applications and architectures, aimed at fulfilling the evolving performance demands of modern and future computing systems.
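The prefetching framework adjusts aggressiveness based on observed concurrency; the abstract does not give the policy, so the following is only a toy feedback controller illustrating the idea, with hypothetical thresholds and counters:

```python
class ConcurrencyAwarePrefetcher:
    """Toy controller: raise prefetch degree while outstanding-miss
    concurrency is low, back off when the memory system saturates."""
    def __init__(self, max_degree=8, low=4, high=12):
        self.degree, self.max_degree = 1, max_degree
        self.low, self.high = low, high   # bounds on outstanding-miss count

    def on_epoch(self, avg_outstanding_misses):
        if avg_outstanding_misses < self.low:
            # Memory bandwidth is underused: prefetch more aggressively.
            self.degree = min(self.degree * 2, self.max_degree)
        elif avg_outstanding_misses > self.high:
            # Queues are saturating: extra prefetches only add contention.
            self.degree = max(self.degree // 2, 1)
        return self.degree

pf = ConcurrencyAwarePrefetcher()
for mlp in [2, 3, 6, 14, 13, 5]:          # sampled per-epoch concurrency
    print(pf.on_epoch(mlp), end=" ")      # prints: 2 4 4 2 1 1
```

The point of conditioning on concurrency rather than on prefetch accuracy alone is that even accurate prefetches hurt once outstanding misses saturate the memory queues.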
- Title
- Measurement and Control of Beam Energy at the Fermilab 400 MeV Transfer Line
- Creator
- Mwaniki, Matilda W.
- Date
- 2023
- Description
The Linac is the first machine in the accelerator chain at Fermilab: particles are accelerated from 35 keV to 400 MeV and travel to the Booster, where they are stripped of their extra electrons to become protons. Linac tuning is performed using diagnostics to ensure stable intensity and energy while minimizing uncontrolled particle loss. I have been revisiting the Linac diagnostics in order to understand their signals and to ensure their data are reliable. I revisited the Beam Loss Monitors (BLMs) to establish confidence in the loss data. For confidence in the energy data there were two approaches: the first was time-of-flight measurement using Beam Position Monitors (BPMs) and a beam velocity stripline pick-up that provides beam phase data; the second used the relation between beam position data from the BPMs and dispersion values from a MAD-X simulation to calculate the energy. Once the data from the Linac diagnostics are understood and found reliable, our goal is to control the Linac parameters using Machine Learning techniques to increase the reliability and quality of the beam delivered from the Linac.
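The time-of-flight approach recovers kinetic energy from the transit time between two pickups via special relativity. A minimal worked sketch, where the path length and timing values are illustrative, not the Fermilab transfer line's actual geometry:

```python
import math

C = 299_792_458.0          # speed of light, m/s
M_P = 938.272              # proton rest energy, MeV

def kinetic_energy_mev(flight_path_m, time_of_flight_s):
    """Kinetic energy from time of flight between two pickups:
    beta = L / (c * dt), gamma = 1 / sqrt(1 - beta^2), KE = (gamma - 1) m c^2."""
    beta = flight_path_m / (C * time_of_flight_s)
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return (gamma - 1.0) * M_P

# A 400 MeV beam has beta ~ 0.713, so a 10 m path takes roughly 46.8 ns.
print(kinetic_energy_mev(10.0, 46.8e-9))   # ~400 MeV
```

The dispersion-based cross-check works the other way around: a momentum offset shows up as a transverse position shift proportional to the local dispersion from the MAD-X model.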
- Title
- Multimodal Learning and Generation Toward a Multisensory and Creative AI System
- Creator
- Zhu, Ye
- Date
- 2023
- Description
We perceive and communicate with the world in a multisensory manner, where different information sources are processed and interpreted by separate parts of the human brain to constitute a complex, yet harmonious and unified, intelligent system. To endow machines with true intelligence, multimodal machine learning, which incorporates data from various modalities including vision, audio, and text, has become an increasingly popular research area with emerging technical advances in recent years. In the context of multimodal learning, the creativity to generate and synthesize novel and meaningful data is a critical criterion for assessing machine intelligence.

As a step toward a multisensory and creative AI system, this thesis studies the problem of multimodal generation from multiple perspectives. First, we analyze different data modalities comprehensively, comparing their natures, their semantics, and their corresponding mainstream technical designs. We then investigate three multimodal generation scenarios, namely text generation from visual data, audio generation from visual data, and visual generation from textual data, with diverse approaches that give an overview of the field. For text generation from visual data, we study a novel multimodal task in which the model is expected to summarize a given video with textual descriptions under a challenging condition: the video can only be partially seen. We propose to supplement the missing visual information via dialogue interaction and introduce the QA-Cooperative network with a dynamic dialogue history update learning mechanism to tackle the challenge. For audio generation from visual data, we present a new multimodal task that aims to generate music for a given silent dance video clip. Unlike most existing conditional music generation works, which generate specific types of mono-instrumental sounds using symbolic audio representations (e.g., MIDI) and rely heavily on pre-defined musical synthesizers, we generate dance music in complex styles (e.g., pop, breaking) by employing a Vector-Quantized (VQ) audio representation via our proposed Dance2Music-GAN (D2M-GAN) framework. For visual generation from textual data, we tackle a key desideratum of conditional synthesis: achieving high correspondence between the conditioning input and the generated output using the state-of-the-art generative model, the Diffusion Probabilistic Model. Most existing methods learn such relationships implicitly, by incorporating the prior into the variational lower bound during training; we take a different route and explicitly enhance input-output connections by maximizing their mutual information, achieved through our proposed Conditional Discrete Contrastive Diffusion (CDCD) framework. For each direction, we conduct extensive experiments on multiple multimodal datasets and demonstrate that all of our proposed frameworks effectively and substantially improve task performance in their corresponding contexts.
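Mutual-information maximization between condition and output is commonly instantiated as a contrastive, InfoNCE-style bound; the sketch below illustrates that general mechanism, not CDCD's exact objective, and all shapes are hypothetical:

```python
import torch
import torch.nn.functional as F

def info_nce(cond_emb, out_emb, temperature=0.07):
    """InfoNCE over a batch: matched (condition, output) pairs are positives,
    all other pairings in the batch serve as negatives. Minimizing this loss
    maximizes a lower bound on the mutual information I(condition; output)."""
    c = F.normalize(cond_emb, dim=1)
    o = F.normalize(out_emb, dim=1)
    logits = c @ o.t() / temperature           # pairwise similarities
    targets = torch.arange(c.size(0))          # i-th condition matches i-th output
    return F.cross_entropy(logits, targets)

cond = torch.randn(16, 128)                    # e.g., text-condition embeddings
out = cond + 0.1 * torch.randn(16, 128)        # toy "generated output" embeddings
print(info_nce(cond, out))                     # small loss: pairs already align
```

Adding such a term to the diffusion training objective explicitly rewards outputs that are identifiable from their conditioning input, which is the correspondence property the thesis targets.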