Search results
(81 - 100 of 124)
Pages
- Title
- A FRAMEWORK FOR MANAGING UNSPECIFIED ASSUMPTIONS IN SAFETY-CRITICAL CYBER-PHYSICAL SYSTEMS
- Creator
- Fu, Zhicheng
- Date
- 2020
- Description
-
For a cyber-physical system, execution behavior is often impacted by the operating environment. However, the assumptions about a cyber-physical system's expected environment are often informally documented, or even left unspecified, during the system development process. Unfortunately, such unspecified assumptions in cyber-physical systems, such as medical cyber-physical systems, can result in patient injuries and loss of life. Based on U.S. Food and Drug Administration (FDA) data, from 2006 to 2011 there were 5,294 recalls and 1,154,451 adverse events, resulting in 92,600 patient injuries and 25,800 deaths. One of the most critical reasons for these medical device recalls is the violation of unspecified assumptions. These compelling data motivated us to research unspecified-assumption issues in safety-critical cyber-physical systems and to develop approaches that reduce the failures caused by unspecified assumptions.

In particular, this thesis studies the issues of unspecified assumptions in the cyber-physical system design process and develops an unspecified assumption management framework to (1) identify unspecified assumptions in system design models; (2) help domain experts perform impact analysis on the failures caused by violating unspecified assumptions; and (3) explicitly model unspecified assumptions in system design models for system safety validation and verification.

Before developing the framework, we first needed to study how unspecified assumptions may be introduced into cyber-physical systems. We took cases from the FDA medical device recall database and analyzed the root causes of medical device failures. This analysis revealed two important facts: (1) one of the major causes of medical device recalls is the violation of unspecified assumptions; and (2) unspecified assumptions are often introduced into system design models through syntactic carriers. Based on these two findings, we propose a framework for managing unspecified assumptions in the cyber-physical system development process. The framework has three components. The first component, the Unspecified Assumption Carrier Finder (UACFinder), identifies unspecified assumptions in system design models by automatically extracting the syntactic carriers associated with them. However, the number of unspecified assumptions identified from system design models can be large, and it may not always be feasible for domain experts to validate and address the most safety-critical assumptions at different system development phases. Therefore, the second component of the framework is a methodology that uses a Failure Mode and Effects Analysis (FMEA) based prioritization approach to help domain experts perform impact analysis on the unspecified assumptions identified by the UACFinder and assess their safety-criticality. The third component describes a model architecture and corresponding algorithms to model and integrate assumptions into system design models, so that system safety associated with these unspecified assumptions can be validated and formally verified by existing tools.

We have also conducted case studies on representative system models to demonstrate how UACFinder can identify unspecified assumptions from system design models, how the FMEA-based prioritization approach can help domain experts verify the appropriateness of identified assumptions, and how system safety properties can be improved by modeling and integrating unspecified assumptions into system models. The results of the case studies indicate that the unspecified assumption management framework can identify unspecified assumptions, help domain experts validate and verify their appropriateness, and explicitly specify assumptions that would otherwise cause defects in these systems.
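The abstract describes the FMEA-based prioritization only at a high level. As an illustration, here is a minimal sketch of how identified assumptions might be ranked by the standard FMEA risk priority number (RPN = severity × occurrence × detection); the fields, 1-10 scales, and example assumptions are assumptions of this sketch, not the thesis's actual scheme.

```python
# Illustrative only: rank identified assumptions by the standard FMEA
# risk priority number (RPN = severity * occurrence * detection).
# The 1-10 scales and example entries are assumed, not from the thesis.
from dataclasses import dataclass

@dataclass
class Assumption:
    description: str
    severity: int    # 1 (negligible) .. 10 (catastrophic)
    occurrence: int  # 1 (rare)       .. 10 (frequent)
    detection: int   # 1 (easily detected) .. 10 (undetectable)

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection

assumptions = [
    Assumption("Sensor reports are always fresh (< 100 ms old)", 9, 4, 7),
    Assumption("Operator confirms dose before infusion starts", 8, 2, 3),
    Assumption("Network latency to the monitor is bounded", 6, 5, 6),
]

# Domain experts would review the highest-RPN assumptions first.
for a in sorted(assumptions, key=lambda a: a.rpn, reverse=True):
    print(f"RPN {a.rpn:4d}  {a.description}")
```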
- Title
- Integrity based landmark generation: A method to generate landmark configurations that guarantee mobile robot localization safety
- Creator
- Chen, Yihe
- Date
- 2020
- Description
-
From the bronze-age city of Nineveh to a modern metropolis like Tokyo, traffic shapes cities and profoundly affects people's lives. Just as the widespread adoption of the automobile transformed cities in the early 20th century, we now stand on the eve of another traffic revolution. With the rapid spread of autonomous and semi-autonomous robotics applications, it is important for urban designers to design or retrofit urban environments that are safe and friendly to autonomous robots. As more robots are deployed in life-critical situations, such as autonomous passenger vehicles, it is imperative to consider their safety, and in particular, their localization safety. While it would be ideal to guarantee safety in any environment without having to physically modify said environment, this is not always possible, and one may have to add landmarks or active beacons to reach an acceptable level of safety for landmark-based localization. Localization safety is assessed using integrity, the primary safety metric used in open-sky aviation applications; it has recently been applied to mobile robots and can account for the impact of rarely occurring, undetected faults. Conventional integrity monitoring methods depend heavily on GPS, but traditional Global Navigation Satellite System - Inertial Measurement Unit (GNSS-IMU) based localization does not apply in metropolitan areas because of the signal blocking and multipath problems caused by high-rise structures. Thus, this dissertation concentrates on feature-based integrity monitoring.

This dissertation formulates the environmental localization safety problem as a systematic optimization problem: given the robot's trajectory and the current landmark map, add the minimal number of new landmarks at certain locations such that the integrity risk along the trajectory is below a given safety threshold. It proposes two algorithms to solve the problem: the Integrity-based Landmark Generator (I-LaG) and Fast I-LaG. I-LaG adds fewer landmarks but is relatively computationally expensive; Fast I-LaG is less computationally intensive at the expense of more landmarks. Both simulation and experimental results are presented.
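The abstract does not spell out how I-LaG works, but the problem statement suggests an obvious greedy baseline: repeatedly add the candidate landmark that most reduces the worst-case risk along the trajectory until the threshold is met. The sketch below uses a stand-in risk model (risk decays with nearby landmark count); the real integrity-risk computation in the dissertation is entirely different and not reproduced here.

```python
# Greedy baseline for the landmark-placement problem described above.
# `risk` is a placeholder; the dissertation evaluates integrity risk
# from landmark geometry and fault hypotheses, which is not shown here.
import math

def risk(pose, landmarks):
    # Stand-in model: risk decays with the number of nearby landmarks.
    near = sum(1 for lm in landmarks if math.dist(pose, lm) < 10.0)
    return 1.0 / (1.0 + near)

def greedy_placement(trajectory, existing, candidates, threshold):
    placed = list(existing)
    def worst(lms):
        return max(risk(p, lms) for p in trajectory)
    while worst(placed) > threshold and candidates:
        # Pick the candidate that most reduces worst-case risk.
        best = min(candidates, key=lambda c: worst(placed + [c]))
        candidates.remove(best)
        placed.append(best)
    return placed[len(existing):]   # newly added landmarks

trajectory = [(x, 0.0) for x in range(0, 50, 5)]
new = greedy_placement(trajectory, existing=[(0, 3)],
                       candidates=[(10, 2), (25, 2), (40, 2)],
                       threshold=0.6)
print("added landmarks:", new)
```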
- Title
- WHY AND WHY-NOT PROVENANCE FOR QUERIES WITH NEGATION
- Creator
- Lee, Seokki
- Date
- 2020
- Description
-
Explaining why an answer is in the result of a query or why it is missing from the result is important for many applications, including auditing, debugging data and queries, hypothetical reasoning about data, and data exploration. Both types of questions, i.e., why and why-not provenance, have been studied extensively, but mostly in isolation. A recent study shows that why and why-not provenance can be unified by developing a provenance model for queries with negation. In many complex queries, negation is natural and yields more expressive power. Thus, supporting both types of provenance together with negation can be useful for, e.g., debugging (missing) data over complex queries with negation. However, why-not provenance and — to a lesser degree — why provenance can be very large, resulting in severe scalability and usability challenges.

In this thesis, we introduce a framework that unifies why and why-not provenance. We develop a graph-based provenance model that is powerful enough to encode the evaluation of queries with negation (first-order queries). We demonstrate that our model generalizes a wide range of provenance models from the literature. Using our model, we present the first practical approach that efficiently generates explanations, i.e., the parts of the provenance that are relevant to the query outputs of interest. Furthermore, we present a novel approximate summarization technique to address the scalability and usability challenges. Our technique efficiently computes pattern-based provenance summaries that balance informativeness, conciseness, and completeness. To achieve scalability, we integrate sampling techniques into provenance capture and summarization. We implement these techniques in our PUG (Provenance Unification through Graphs) system, which runs on top of a relational database. We demonstrate through extensive experiments that our approach scales to large datasets and produces comprehensive and meaningful (summaries of) provenance.
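For readers new to the terminology, a toy illustration of why-provenance (not PUG's graph model, which additionally handles negation and why-not questions): each output tuple of a join can be annotated with the sets of input tuples (witnesses) that derive it.

```python
# Toy why-provenance for a join: each output tuple is annotated with
# the input tuples (witnesses) that derive it. Illustration only; the
# PUG model is graph-based and also covers negation and why-not.
R = [("alice", "nyc"), ("bob", "chicago")]
S = [("nyc", "USA"), ("chicago", "USA")]

result = {}
for i, (name, city) in enumerate(R):
    for j, (city2, country) in enumerate(S):
        if city == city2:
            out = (name, country)
            result.setdefault(out, set()).add((("R", i), ("S", j)))

for out, witnesses in result.items():
    print(out, "because of", witnesses)
```

A why-not question ("why is ("carol", "USA") missing?") would instead be explained by the absence of any such witness, which is what requires a model expressive enough for negation.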
- Title
- DATA SHARING WITH PRIVACY AND SECURITY
- Creator
- Qian, Jianwei
- Date
- 2019
- Description
-
Data is a non-exclusive resource and has synergistic effects. Open data sharing will enhance the utilization of big data's value and tremendously boost economic growth and transparency. Data sharing platforms have emerged worldwide, but with very limited services. Security is one of the main reasons why most data are not commonly shared. This dissertation aims to tackle several security issues in building a trustworthy data sharing ecosystem. First, I reveal the privacy risks in data sharing by designing de-anonymization and privacy inference attacks. Second, I analyze the relationship between the attacker's knowledge and the privacy risk of data sharing, and attempt to quantify and estimate that risk. Then, I propose anonymization algorithms to protect the privacy of participants in data sharing. Finally, I survey the status quo, privacy and security concerns, and opportunities in data trading. This dissertation covers various data types, with a focus on graph data and speech data, as well as various forms of data sharing, including collection, publishing, query, and trading.
- Title
- A SCALABLE SIMULATION AND MODELING FRAMEWORK FOR EVALUATION OF SOFTWARE-DEFINED NETWORKING DESIGN AND SECURITY APPLICATIONS
- Creator
- Yan, Jiaqi
- Date
- 2019
- Description
-
The world today is densely connected by many large-scale computer networks, supporting military applications, social communications, power grid facilities, cloud services, and other critical infrastructures. However, a gap has grown between the complexity of these systems and the increasing need for security and resilience. We believe this gap is now reaching a tipping point, resulting in a dramatic change in the way networks and applications are architected, developed, monitored, and protected. This trend calls for a scalable and high-fidelity network testing and evaluation platform to facilitate the transformation of in-house research ideas into real-world working solutions. With this objective, we investigate means to build a scalable and high-fidelity network testbed using container-based emulation and parallel simulation; our study focuses on the emerging software-defined networking (SDN) technology.

Existing evaluation platforms facilitate the adoption of the SDN architecture and applications in production systems. However, their performance is highly dependent on the underlying physical hardware resources. Insufficient resources lead to undesired results, such as low experimental fidelity or slow execution, especially with large-scale network settings. To improve testbed fidelity, we first develop a lightweight virtual time system for Linux containers and integrate it into a widely used SDN emulator. A key issue with an ordinary container-based emulator is that it uses the system clock across all containers even when a container is not scheduled to run, which hurts both performance and temporal fidelity, especially under high workloads. We investigate virtual time approaches that precisely scale the time of interactions between containers and physical devices. Our evaluation results indicate a definite improvement in fidelity and scalability. To improve testbed scalability, we investigate how the centralized paradigm of SDN can be utilized to reduce the simulation workload. We explore a model abstraction technique that effectively transforms the SDN network devices into one virtualized switch model. While significantly reducing model execution time and enabling real-time simulation, our abstracted model also preserves the end-to-end forwarding behavior of the original network.

With enhanced fidelity and scalability, it is realistic to use our network testbed for security evaluation of various SDN applications. Communication networks generate and process a huge amount of data. The logically centralized SDN control plane, on the one hand, has to process both critical control traffic and potentially big data traffic, and on the other hand, enables many efficient security solutions, such as intrusion detection, mitigation, and prevention. Recently, deep neural networks have achieved state-of-the-art results across a range of hard problems. We study how to utilize big data and deep learning to secure communication networks and host entities. For classifying malicious network traffic, we perform a feasibility study of offline deep-learning-based intrusion detection, constructing the detection engine from multiple advanced deep learning models. For malware classification on individual hosts, another necessity for securing computer systems, existing machine-learning-based methods rely on handcrafted features extracted from raw binary files or disassembled code. The diversity of such features makes it hard to build generic malware classification systems that work effectively across different operational environments. To strike a balance between generality and performance, we explore new graph convolutional neural network techniques to effectively yet efficiently classify malware programs represented as their control flow graphs.
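The core idea behind virtual time for containers can be sketched simply: elapsed real time is mapped through a time dilation factor (TDF), so a container under TDF k perceives time k times slower and can faithfully emulate hardware faster than the host. This user-space sketch shows only the mapping; the actual system described in the abstract applies it transparently to containers.

```python
# Sketch of the virtual-time mapping used in time-dilated emulation:
# a container under a time dilation factor (TDF) of k perceives
# elapsed time k times slower. Illustration of the mapping only.
import time

class VirtualClock:
    def __init__(self, tdf: float):
        self.tdf = tdf
        self.real_start = time.monotonic()

    def now(self) -> float:
        elapsed_real = time.monotonic() - self.real_start
        return elapsed_real / self.tdf   # virtual seconds

clock = VirtualClock(tdf=4.0)   # container runs 4x slower than real time
time.sleep(0.2)                 # 0.2 real seconds pass...
print(f"virtual elapsed: {clock.now():.3f}s")  # ...~0.05 virtual seconds
```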
- Title
- ENHANCING PRIVACY AND SECURITY IN IOT-BASED SMART HOME
- Creator
- Du, Haohua
- Date
- 2019
- Description
-
The IoT-based smart home is envisioned as a system that augments everyone's daily life. In the past few years, the smart home has attracted immense attention from industry and has been considered one of the principal pillars of the fourth industrial revolution. However, while the rapidly increasing number of Internet-connected smart devices expands the functionality of smart homes, it also raises substantial security and privacy concerns.

A smart home system is commonly composed of three major components: smart devices, communication among devices, and smart applications connecting the devices. This dissertation therefore aims to enhance the security and privacy of the smart home system, without weakening its functionality, from the perspective of each of these three components. First, I improve the security of smart devices within the smart home by monitoring their behaviors based on the contextual environment. Then, I enhance the security of communications among devices through visible light communication, whose receivers must be physically visible to senders, avoiding possible eavesdropping. Finally, I study two popular smart applications, the augmented reality assistant and the cloud-based surveillance system, to discuss how to define privacy, how to reduce leakage, and how to balance privacy and security in the smart home. This dissertation proposes mechanisms for each component and implements the designs in the real world to evaluate their effectiveness and efficiency.
- Title
- Semantics and further Use-Cases and Evaluation of the C-Saw language
- Creator
- Zhu, Henry, Zhao, Junyong, Sultana, Nik
- Date
- 2023-03-09
- Description
-
This report provides supplementary technical details to the conference paper that introduced C-Saw, a language for expressing software...
Show moreThis report provides supplementary technical details to the conference paper that introduced C-Saw, a language for expressing software architecture patterns. This report provides additional examples of using C-Saw, supplementary evaluation details, and it defines the formal semantics of the language.
Show less
- Title
- Continuous Generalization of 2’s Complement Arithmetic
- Creator
- Patel, Shivam
- Date
- 2022-11-26
- Title
- Towards In-Network Semantic Analysis: A Case Study involving Spam Classification
- Creator
- Gueyraud, Cyprien, Sultana, Nik
- Date
- 2023-03-06
- Description
-
Analyzing free-form natural language expressions "in the network"—that is, on programmable switches and smart NICs—would enable packet-handling decisions that are based on the textual content of flows. This analysis would support richer, latency-critical data services that depend on language analysis—such as emergency response, misinformation classification, customer support, and query-answering applications. But packet forwarding and processing decisions usually rely on simple analyses based on table look-ups that are keyed on well-defined (and usually fixed-size) header fields. P4 is the state-of-the-art domain-specific language for programming network equipment, but, to the best of our knowledge, analyzing free-form text using P4 has not yet been investigated. Although an increasing variety of P4-programmable commodity network hardware is available, using P4 presents considerable technical challenges for text analysis, since the language lacks loops and fractional datatypes. This paper presents the first Bayesian spam classifier written in P4 and evaluates it on a standard dataset. The paper contributes techniques for the tokenization, analysis, and classification of free-form text using P4, and investigates trade-offs between classification accuracy and resource usage. It shows how classification accuracy can be tuned between 69.1% and 90.4%, and how resource usage can be reduced to 6% by trading off accuracy. It uses the spam-filtering use case to motivate the need for more research into in-network text analysis to enable future "semantic analysis" applications in programmable networks.
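The paper's own P4 techniques are not reproduced here, but the stated constraints (no loops or fractional datatypes) suggest the classic workaround: precompute log-probabilities offline as scaled integers and classify with a fixed number of table lookups and integer additions. A minimal Python sketch under those assumed constraints, with made-up table contents:

```python
# Naive Bayes spam scoring under P4-like constraints: no floating
# point and no unbounded loops at classification time. A P4 program
# would unroll the lookup over a fixed max token count; the table
# contents below are invented for illustration.
SCALE = 1000  # fixed-point scale for log-probabilities

# token -> round(SCALE * (log P(token|spam) - log P(token|ham)))
token_score = {"win": 2100, "free": 1800, "meeting": -1500, "prize": 2400}
PRIOR = -400  # round(SCALE * (log P(spam) - log P(ham)))

def classify(tokens):
    # Fixed bound on tokens examined, mirroring an unrolled P4 pipeline.
    score = PRIOR + sum(token_score.get(t, 0) for t in tokens[:8])
    return "spam" if score > 0 else "ham"

print(classify(["win", "a", "free", "prize"]))   # spam
print(classify(["meeting", "at", "noon"]))       # ham
```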
- Title
- Efficient management of uncertain data
- Creator
- Feng, Su
- Date
- 2023
- Description
-
Uncertainty arises naturally in many application domains. It can be caused by uncertain data sources (sensor errors, noise, etc.), and data preprocessing techniques (data curation, data integration, etc.) can also introduce uncertainty into the data. Analyzing uncertain data without accounting for its uncertainty can create hard-to-trace errors with severe real-world implications. Certain answers are a principled method for coping with the uncertainty that arises in many practical data management tasks. Unfortunately, this method is expensive and may exclude useful (if uncertain) answers. Other techniques from incomplete databases record and propagate more detailed uncertainty information. However, most of these approaches are either too expensive to be practical, or focus only on a narrow class of queries and work only for a specific representation. In this thesis, we investigate models and query semantics for uncertain data management and present a framework that is general and practically efficient, backed by fundamental theoretical foundations and formally proven correctness guarantees. We first propose Uncertainty Annotated Databases (UA-DBs), which pair an under-approximation with an over-approximation of certain answers, combining the reliability of certain answers with the performance of a classical database system. We then introduce attribute-annotated uncertain databases (AU-DBs), which extend the UA-DB model with attribute-level annotations that record bounds on the values of an attribute across all possible worlds. AU-DBs encode a compact over-approximation of possible answers, which is necessary to support non-monotone queries including aggregation and set difference. With a further extension that supports ranking and windowed aggregation queries using a native implementation on a modern DBMS, our approaches scale to complex queries and large datasets, produce accurate results, and significantly outperform alternative methods for uncertain data management.
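The attribute-level annotations described above can be pictured as triples: a lower bound, a selected-guess value, and an upper bound over all possible worlds. A toy sketch of how a SUM aggregate might propagate such triples component-wise (deliberately simplified; real AU-DBs also bound tuple multiplicities):

```python
# Toy AU-DB-style attribute annotation: each value is a triple
# (lower bound, selected guess, upper bound) over all possible worlds.
# SUM propagates bounds component-wise. Simplified illustration only:
# real AU-DBs also track bounds on tuple multiplicities.
rows = [
    (18, 20, 25),   # e.g. a noisy sensor reading
    (40, 40, 40),   # a certain value
    (0, 10, 15),    # a value from a disputed source
]

lo = sum(r[0] for r in rows)
sg = sum(r[1] for r in rows)
hi = sum(r[2] for r in rows)
print(f"SUM is between {lo} and {hi}; best guess {sg}")  # 58..80, guess 70
```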
- Title
- Designs and Optimizations of Oblivious Data Access for Mitigating Access Pattern Leakage
- Creator
- Che, Yuezhi
- Date
- 2023
- Description
-
In today’s data-driven world, data outsourcing has grown, increasing the importance of data security and privacy. Data encryption, while providing some protection, is insufficient against side-channel attacks such as access pattern leakage. This thesis focuses on designing and optimizing efficient oblivious access methods to enhance data security and privacy. Traditional solutions, like Oblivious RAM (ORAM), often impose significant overheads, limiting their adoption. Our research proposes novel oblivious data access schemes tailored to specific applications, systems, and contexts. This approach enables us to identify critical vulnerabilities and performance bottlenecks, and to balance performance, security, and other relevant parameters. In this thesis, I present four published works in Chapters 3 to 6 that demonstrate the effectiveness of the proposed methods: (1) optimizing Ring ORAM for multi-channel memory systems, (2) introducing a multi-range-supported ORAM for locality-aware applications, (3) proposing an oblivious data access solution for NVM hybrid memory systems, and (4) developing an oblivious access method for deep neural networks (DNNs) that ensures privacy without sacrificing performance. These contributions address unique challenges across application domains and contribute more effective solutions for mitigating access pattern leakage, ultimately improving data security and privacy in contemporary computing systems.
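For context on why ORAM overheads matter: the simplest oblivious access scheme touches every block on every request, hiding the access pattern at O(n) cost per access. Tree-based schemes like Path ORAM and Ring ORAM reduce this to polylogarithmic cost, which is exactly where the optimizations above apply. A sketch of the trivial baseline:

```python
# The trivial oblivious baseline: read every block for each request,
# so an observer of memory traffic learns nothing about which index
# was wanted. O(n) per access; Ring/Path ORAM bring this down to a
# logarithmic number of blocks per access.
def oblivious_read(store: list, want: int):
    value = None
    for i, block in enumerate(store):   # access pattern independent of `want`
        if i == want:
            value = block
    return value

store = ["blk0", "blk1", "blk2", "blk3"]
print(oblivious_read(store, 2))   # every block touched, returns "blk2"
```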
- Title
- Towards Utility-Driven Data Analytics with Differential Privacy
- Creator
- Wang, Han
- Date
- 2023
- Description
-
The widespread use of personal devices and dedicated recording facilities has led to the generation of massive amounts of personal data. Some of it is high-dimensional and unstructured, such as video and location data. Analyzing these data can provide significant benefits in real-world scenarios, such as video for monitoring and location data for traffic analysis. However, these complex data also raise serious privacy concerns, since all of them involve personal information. Existing privacy protection methods often fail to provide adequate utility in practical applications because of the complexity of high-dimensional and unstructured data. For example, most video sanitization techniques merely obscure the video by detecting and blurring sensitive regions, such as faces, vehicle plates, locations, and timestamps. Unfortunately, privacy breaches in blurred videos cannot be effectively contained, especially against unknown background knowledge.

In this thesis, we propose three differentially private frameworks that preserve the utility of video and location data (both high-dimensional and unstructured) while meeting privacy requirements under different well-known privacy settings. First, to the best of our knowledge, we propose the first differentially private video analytics platform (VideoDP), which flexibly supports different video queries or query-based analyses with a rigorous privacy guarantee. Given the input video, VideoDP randomly generates a utility-driven private video in which adding or removing any sensitive visual element (e.g., a human or an object) does not significantly affect the output video. Different video analyses requested by untrusted video analysts can then be flexibly performed over the sanitized video with differential privacy. Second, we define a novel privacy notion, ε-object indistinguishability, for all predefined sensitive objects (e.g., humans, vehicles) in a video, and propose a video sanitization technique, VERRO, that randomly generates utility-driven synthetic videos with indistinguishable objects. All objects are thus well protected in the generated synthetic videos, which can be disclosed to any untrusted video recipient. Third, we propose the first strict local differential privacy (LDP) framework for location-based services (LBS), "L-SRR", which privately collects and analyzes user locations or trajectories with ε-LDP guarantees. Specifically, we design a novel LDP mechanism, "staircase randomized response" (SRR), and extend empirical estimation to further boost utility for a diverse set of LBS applications (e.g., traffic density estimation, k-nearest-neighbors search, origin-destination analysis, and traffic-aware GPS navigation). Finally, we conduct experiments on real video and location datasets; the results demonstrate that all frameworks perform well.
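The SRR mechanism itself is not specified in this abstract. For context, here is the standard k-ary randomized response mechanism, the ε-LDP building block that staircase-style mechanisms generalize (SRR uses several reporting-probability levels rather than the two used here); the location domain is invented for the example.

```python
# Standard k-ary randomized response, an eps-LDP primitive: report the
# true value with probability e^eps / (e^eps + k - 1), otherwise a
# uniformly random other value. Staircase mechanisms such as SRR
# generalize this two-level scheme. Sketch for context only.
import math, random

def k_rr(value, domain, eps):
    k = len(domain)
    p_true = math.exp(eps) / (math.exp(eps) + k - 1)
    if random.random() < p_true:
        return value
    others = [v for v in domain if v != value]
    return random.choice(others)

domain = ["cell_A", "cell_B", "cell_C", "cell_D"]
reports = [k_rr("cell_A", domain, eps=1.0) for _ in range(10000)]
print("fraction reporting truth:", reports.count("cell_A") / len(reports))
```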
- Title
- Approximation Algorithms for Selected Network and Graph Problems
- Creator
- Wang, Xiaolang
- Date
- 2023
- Description
-
This dissertation proposes new polynomial-time approximation algorithms for selected optimization problems, including network and classic graph problems, employing distinct strategies and techniques for each. In Chapter 1, we consider a problem we term FCSA, which seeks an optimal assignment of clients to servers such that the largest latency on an interactivity path between two clients (client 1 to server 1, server 1 to server 2, then server 2 to client 2) is minimized. We present a (3/2)-approximation algorithm for FCSA, and a (3/2)-approximation algorithm for the variant with server capacity constraints. In Chapter 2, we focus on two variants of the Steiner Tree Problem and obtain better approximation ratios using known algorithms. For the Steiner Tree problem with a minimum number of Steiner points and bounded edge length, we provide a polynomial-time algorithm with ratio 2.277. For the Steiner Tree problem in quasi-bipartite graphs, we improve the best-known approximation ratio to 298/245. In Chapter 3, we address the problem of finding a maximum-weight series-parallel subgraph of a given graph and present a (1/2 + 1/60)-approximation for this problem. Although there is currently no known real-life application of this problem, it remains an important and challenging open question in the field.
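The FCSA objective stated above is concrete enough to sketch a brute-force reference: for an assignment of clients to servers, the cost is the largest latency among the three hops of any client pair's interaction path. This exponential-time evaluator (the dissertation's contribution is the 3/2-approximation, not shown) uses an invented latency matrix:

```python
# Brute-force reference for the FCSA objective: assign each client to
# a server so that the largest latency along any interaction path
# (client -> server -> server -> client) is minimized. Exponential in
# the number of clients; the thesis gives a (3/2)-approximation.
from itertools import product

client_server = [[3, 7], [6, 2], [5, 4]]   # latency client i -> server j
server_server = [[0, 5], [5, 0]]           # latency between servers

def bottleneck(assign):
    worst = 0
    for a in range(len(assign)):
        for b in range(a + 1, len(assign)):
            sa, sb = assign[a], assign[b]
            path = max(client_server[a][sa],
                       server_server[sa][sb],
                       client_server[b][sb])
            worst = max(worst, path)
    return worst

best = min(product(range(2), repeat=3), key=bottleneck)
print(best, "->", bottleneck(best))
```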
- Title
- Understanding Location Bias in Fake News Datasets of Twitter
- Creator
- Patil, Kayenat Kailas
- Date
- 2023
- Description
-
Fake news tends to spread faster and wider than real news. It has a greater impact and can lead to negative and dangerous outcomes. As the world spends an increasing amount of time on mobile devices, people tend to get more of their news from their preferred social media platform. Social media has become part of our daily lives, whether to keep in touch with friends and family, to follow celebrity gossip, or even to shop. In 2022, the average time a person spent per day on social media was about 147 minutes [1], indicating an increase in time spent scrolling through information online. Fake news has become a widespread phenomenon in recent years, thanks in part to the rapid spread of information through social media and other online channels. It is increasingly important to explore and understand fake news and its impact on society, and to develop effective tools and methods for detecting and combating it.

Several factors can interfere with the successful detection of fake news. Machine learning models often fall prey to biases that result in inaccurate predictions; biases involving age, gender, sex, and many other attributes have been identified. In this thesis, we explore location as a form of bias and ask whether it hinders prediction. We look at location from two perspectives. First, we take location as coordinates, in the form of latitude and longitude, and analyze the likelihood that a tweet coming from a given location is fake. Second, we treat location as an entity and use a natural language processing model to see whether it can predict if a given tweet is fake, masking the location mentioned in the tweet and analyzing how the model's performance changes. Machine learning models can play an important role in fake news detection by analyzing large amounts of data and identifying patterns and indicators that suggest a piece of information may be false or misleading, but they are often susceptible to bias. By studying biases of machine learning models on fake news datasets, we can develop more effective tools for identifying fake news and take steps toward mitigating it, ultimately helping to protect the integrity of information and promote informed decision-making in society.
- Title
- Effect of Pre-Processing Data on Fairness and Fairness Debugging using GOPHER
- Creator
- Sarkar, Mousam
- Date
- 2023
- Description
-
At present, artificial intelligence contributes heavily to decision-making. Bias in machine learning models has existed throughout, and recent studies apply eXplainable Artificial Intelligence (XAI) approaches to identify and study it. Gopher [1] addresses the problem of locating bias and then mitigating it: it generates interpretable top-k explanations for the unfairness of a model and identifies the subsets of training data that are the root cause of the unfair behavior. We utilize this system to study the effect of pre-processing on bias through provenance. We implement data lineage by tagging data points during and after the pre-processing stage. Our methodology and results provide a useful point of reference for studying the relation between pre-processing data and the unfairness of a machine learning model.
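The tagging idea is straightforward to sketch: attach a stable identifier to each raw row before preprocessing, carry it through every transformation, and use it afterwards to map a problematic training subset back to the raw data. A minimal illustration; the tagging scheme and example data are assumptions of this sketch, not Gopher's actual implementation.

```python
# Minimal data-lineage tagging: give each raw row a stable id before
# preprocessing, so rows surviving filtering/transformation can be
# traced back to their origin. Illustrative scheme only.
raw = [{"age": 25, "income": 30000}, {"age": -1, "income": 52000},
       {"age": 41, "income": 78000}]

# Tag before any preprocessing.
tagged = [dict(row, _lineage=i) for i, row in enumerate(raw)]

# Example preprocessing: drop invalid ages, rescale income.
clean = [dict(r, income=r["income"] / 1000) for r in tagged if r["age"] >= 0]

# Later, map a problematic training subset back to the raw rows.
suspect = [clean[1]]
print("root-cause raw rows:", [raw[r["_lineage"]] for r in suspect])
```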
- Title
- Image Synthesis with Generative Adversarial Networks
- Creator
- Ouyang, Xu
- Date
- 2023
- Description
-
Image synthesis refers to the process of generating new images from an existing dataset, with the objective of creating images that closely resemble the target images, learned from the source data distribution. The technique has a wide range of applications, including transforming captions into images, deblurring blurred images, and enhancing low-resolution images. In recent years, deep learning techniques, particularly Generative Adversarial Networks (GANs), have achieved significant success in this field. A GAN consists of a generator (G) and a discriminator (D) and employs adversarial learning to synthesize images. Researchers have developed various strategies to improve GAN performance, such as controlling the learning rates of the different models and modifying the loss functions. This thesis focuses on image synthesis from captions using GANs and aims to improve the quality of the generated images. The study is divided into four main parts.

In the first part, we investigate an LSTM conditional GAN that generates images from captions. We use word2vec caption features, combine the features' information with an LSTM, and generate images via a conditional GAN. In the second part, to improve the quality of generated images, we address convergence speed and enhance GAN performance using an adaptive WGAN update strategy. We demonstrate that this update strategy is applicable to the Wasserstein GAN (WGAN) and other GANs that use WGAN-related loss functions. The proposed strategy is based on comparing a loss change ratio between G and D. In the third part, to further enhance the quality of synthesized images, we investigate a transformer-based Uformer GAN for image restoration and propose a two-step refinement strategy: we first train a Uformer model until convergence, then train a Uformer GAN on the restoration results obtained in the first step. In the fourth part, to generate fine-grained images from captions, we study the Recurrent Affine Transformation (RAT) GAN for fine-grained text-to-image synthesis. By incorporating an auxiliary classifier in the discriminator and employing a contrastive learning method, we improve the accuracy and fine-grained detail of the synthesized images. Throughout this thesis, we strive to enhance the capabilities of GANs in various image synthesis applications and to contribute valuable insights to the field of deep learning and image processing.
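The abstract says the adaptive strategy compares a loss change ratio between G and D; the exact rule is not given, so the skeleton below assumes one plausible policy (give the extra update to whichever network's loss is changing more slowly) purely to show the shape of the control logic. `train_g`/`train_d` are stand-ins for real training steps.

```python
# Skeleton of an adaptive G/D update schedule driven by loss change
# ratios. The concrete rule (favor the network whose loss changes more
# slowly) and the stand-in losses are assumptions for illustration.
import random

def train_g(): return random.uniform(0.5, 1.5)   # stand-in: returns G loss
def train_d(): return random.uniform(0.5, 1.5)   # stand-in: returns D loss

g_prev, d_prev = train_g(), train_d()
for step in range(5):
    g_loss, d_loss = train_g(), train_d()
    # Loss change ratio of each network relative to its previous loss.
    r_g = abs(g_loss - g_prev) / g_prev
    r_d = abs(d_loss - d_prev) / d_prev
    who = "G" if r_g < r_d else "D"   # extra update for the slower learner
    print(f"step {step}: r_g={r_g:.2f} r_d={r_d:.2f} -> extra update for {who}")
    g_prev, d_prev = g_loss, d_loss
```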
- Title
- A Novel Explainability Approach For Spectrum Measurement Insight
- Creator
- Nagpure, Vaishali
- Date
- 2023
- Description
-
Spectrum is an extremely valuable natural resource in high demand. Although the spectrum has been fully allocated, there is no comprehensive method for understanding how it is being used. Spectrum measurements are highly complex spatiotemporal data sets that play a key role in understanding spectrum use, and interpreting them requires very specialized domain knowledge. To leverage existing and future spectrum measurements to the fullest extent, a systematic way is needed to connect them to the contextual information that gives the data meaning. To analyze and interpret the measurements, a variety of contextual information is needed. This research develops a novel approach to spectrum measurement understanding that unifies five years of wideband spectrum measurement summary data with relevant contextual information from a variety of sources in a spectrum knowledge graph. Both quantitative and qualitative information is modeled and implemented on the Neo4j graph database platform. This modeling formalizes the relationships that help spectrum stakeholders "connect the dots" and provides a deeper understanding of RF spectrum utilization. The knowledge graph can be queried to extract a wide variety of insights, making spectrum knowledge accessible to a broader range of stakeholders.
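Since the graph is implemented on Neo4j, queries would go through Cypher; here is a sketch via the official Python driver. The node labels, relationship, and properties (Measurement, Band, OBSERVED_IN, occupancy) are invented for illustration; the dissertation's actual schema is not reproduced here.

```python
# Sketch of querying a spectrum knowledge graph on Neo4j through the
# official Python driver. Labels, relationships, and properties are
# assumed for illustration; the real schema is the dissertation's.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))

query = """
MATCH (m:Measurement)-[:OBSERVED_IN]->(b:Band)
WHERE b.low_mhz >= $low AND b.high_mhz <= $high
RETURN b.name AS band, avg(m.occupancy) AS avg_occupancy
ORDER BY avg_occupancy DESC
"""

with driver.session() as session:
    for record in session.run(query, low=2400, high=2500):
        print(record["band"], record["avg_occupancy"])
driver.close()
```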
- Title
- A SCALABLE AND CUSTOMIZABLE SIMULATION PLATFORM FOR ACCURATE QUANTUM NETWORK DESIGN AND EVALUATION
- Creator
- Wu, Xiaoliang
- Date
- 2021
- Description
-
Recent advances in quantum information science have enabled the development of quantum communication network prototypes and created an opportunity to study full-stack quantum network architectures. The scale and complexity of quantum networks require cost-efficient means of testing and evaluation. Simulators allow hardware, protocols, and applications to be tested cost-effectively before experimental networks are constructed. This work develops SeQUeNCe, a comprehensive, customizable quantum network simulator, and explores its use for quantum communication network evaluation. We use SeQUeNCe to study the performance of quantum networks with different hardware and applications. Additionally, we extend SeQUeNCe to a parallel discrete-event simulator using the Message Passing Interface (MPI) and comprehensively analyze the benefits and overhead of parallelization, which significantly increases the scalability of SeQUeNCe. In the future, we would like to improve SeQUeNCe in three respects. First, we plan to continue reducing parallelization overhead and increasing scalability. Second, we plan to investigate means of modeling quantum memory, entanglement protocols, and control protocols to enrich the simulation models in the SeQUeNCe library. Third, we plan to integrate hardware with SeQUeNCe to enable high-fidelity analysis.
- Title
- AI IN MEDICINE: ENABLING INTELLIGENT IMAGING, PROGNOSIS, AND MINIMALLY INVASIVE SURGERY
- Creator
- Getty, Neil
- Date
- 2022
- Description
-
While an extremely rich research field, AI in medicine has been much slower to reach real-world clinical settings than other applications of AI, such as natural language processing (NLP) and image processing/generation. Often the stakes of failure are more dire, access to private and proprietary data is more costly, and the burden of proof required by expert clinicians is much higher. Beyond these barriers, the typical data-driven approach to validation is interrupted by a need for expertise to analyze results. Whereas the results of a trained ImageNet or machine translation model are easily verified by a computational researcher, analysis in medicine can be far more demanding across disciplines. AI in medicine is motivated by a great demand for progress in health care, but an even greater responsibility for high accuracy, model transparency, and expert validation.

This thesis develops machine and deep learning techniques for medical image enhancement, patient outcome prognosis, and minimally invasive robotic surgery awareness and augmentation. Each of the works presented was undertaken in direct collaboration with medical domain experts, and the efforts could not have been completed without them. For medical image enhancement we worked with radiologists, neuroscientists, and a neurosurgeon; for patient outcome prognosis, with clinical neuropsychologists and a cardiovascular surgeon; and for robotic surgery, with surgical residents and a surgeon expert in minimally invasive surgery. These collaborations guided the priorities for problem and model design, analysis, and long-term objectives that ground this thesis as a concerted effort toward clinically actionable medical AI.

The contributions of this thesis focus on three medical domains. (1) Deep learning for medical brain scans: we developed processing pipelines and deep learning models for image annotation, registration, segmentation, and diagnosis in both traumatic brain injury (TBI) and brain tumor cohorts. A major focus of these works is the efficacy of low-data methods and techniques for validating results without any ground truth annotations. (2) Outcome prognosis for TBI and risk prediction for cardiovascular disease (CVD): we developed feature extraction pipelines and models for TBI and CVD patient clinical outcome prognosis and risk assessment, designing risk prediction models for CVD patients using traditional Cox modeling, machine learning, and deep learning techniques. In these works we conduct exhaustive data and model ablation studies, with a focus on feature saliency analysis, model transparency, and the use of multi-modal data. (3) AI for enhanced and automated robotic surgery: we developed computer vision and deep learning techniques for understanding and augmenting minimally invasive robotic surgery scenes, including models that recognize surgical actions from vision and kinematic data. Beyond models and techniques, we also curated novel datasets and prediction benchmarks from simulated and real endoscopic surgeries, and we show the potential of self-supervised techniques in surgery, as well as multi-input and multi-task models.
- Title
- Integrating Provenance Management and Query Optimization
- Creator
- Niu, Xing
- Date
- 2021
- Description
-
Provenance, information about the origin of data and the queries and/or updates that produced it, is critical for debugging queries and transactions, auditing, establishing trust in data, and many other use cases. While how to model and capture the provenance of database queries has been studied extensively, optimization has been recognized as an important problem in provenance management, which includes storing, capturing, and querying provenance. However, previous work has focused almost exclusively on compressing provenance to reduce storage costs; little work has addressed optimizing the provenance capture process. Many approaches for capturing database provenance use SQL and represent provenance information as a standard relation. However, even sophisticated query optimizers often fail to produce efficient execution plans for such queries because of their complexity and uncommon structure. To address this problem, we study algebraic equivalences and alternative ways of generating queries for provenance capture, and we present an extensible heuristic and cost-based optimization framework utilizing these optimizations.

While provenance has been well studied, no database optimizer exploits provenance information to optimize query processing. Intuitively, provenance records exactly what data is relevant to a query. We can use this to identify and filter out irrelevant input data early, speeding up query processing: instead of fully scanning the input dataset, we run the query only on the relevant input data. In this work, we develop provenance-based data skipping (PBDS), a novel approach that generates provenance sketches, concise encodings of what data is relevant for a query. A provenance sketch captured for one query is used to speed up subsequent queries, possibly by utilizing physical design artifacts such as indexes and zone maps. The work presented in this thesis demonstrates that a tight integration of provenance management and query optimization can lead to significant performance improvements in query processing as well as in traditional database management tasks.
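The core of provenance-based data skipping can be illustrated concretely: record which horizontal partitions of a table contributed to one query's answer (the sketch), then restrict a subsequent similar query to those partitions. A toy version under assumed structure, not the thesis's implementation:

```python
# Toy provenance-based data skipping: record which horizontal ranges
# of a table contributed to one query's result (the "provenance
# sketch"), then evaluate a later query over only those ranges.
rows = [(i, i) for i in range(10_000)]            # (id, value), clustered
RANGE = 1000                                      # partition size

def run_and_sketch(pred):
    sketch, result = set(), []
    for rid, val in rows:
        if pred(val):
            result.append((rid, val))
            sketch.add(rid // RANGE)              # contributing partition
    return result, sketch

def run_with_sketch(pred, sketch):
    return [(rid, val) for rid, val in rows
            if rid // RANGE in sketch and pred(val)]

_, sketch = run_and_sketch(lambda v: 2500 <= v < 2600)
print("partitions kept:", sorted(sketch))         # [2]
fast = run_with_sketch(lambda v: 2500 <= v < 2600, sketch)
print("rows found with skipping:", len(fast))     # 100
```

On clustered data like this, only one of ten partitions is scanned for the second query, which is the effect the abstract describes amplifying through indexes and zone maps.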