Search results
(1 - 20 of 70)
Pages
- Title
- DEEP LEARNING FOR IMAGE PROCESSING WITH APPLICATIONS TO MEDICAL IMAGING
- Creator
- Zarshenas, Amin
- Date
- 2019
- Description
-
Deep learning is a subfield of machine learning concerned with algorithms that learn hierarchical data representations. Deep learning has proven extremely successful in many computer vision tasks, including object detection and recognition. In this thesis, we aim to develop and design deep-learning models to better perform image processing and tackle three important problems: natural image denoising, computed tomography (CT) dose reduction, and bone suppression in chest radiography ("chest x-ray": CXR).
As the first contribution of this thesis, we aimed to answer some of the most critical design questions for the task of natural image denoising. To this end, we defined a class of deep-learning models, called neural network convolution (NNC), and investigated several design modules for building NNC models for image processing. Based on our analysis, we designed a deep residual NNC (R-NNC) for this task. One of the important challenges in image denoising concerns scenarios in which images have varying noise levels. Our analysis showed that training a single R-NNC on images at multiple noise levels results in a network that cannot handle very high noise levels and sometimes blurs the high-frequency information in less noisy areas. To address this problem, we designed and developed two new deep-learning structures, namely noise-specific NNC (NS-NNC) and a DeepFloat model, for image denoising at varying noise levels. Our models achieved the highest denoising performance compared to state-of-the-art techniques.
As the second contribution of the thesis, we aimed to tackle the task of CT dose reduction by means of our NNC. Studies have shown that high-dose CT scans can dramatically increase the risk of radiation-induced cancer in patients; it is therefore very important to reduce the radiation dose as much as possible. For this problem, we introduced a mixture of anatomy-specific (AS) NNC experts. The basic idea is to train multiple NNC models for anatomic segments with different characteristics and to merge the predictions based on the segmentations. Our phantom and clinical analyses showed that more than 90% dose reduction can be achieved with our AS NNC model.
We exploited our findings from image denoising and CT dose reduction to tackle the challenging task of bone suppression in CXRs. Most lung nodules that are missed by radiologists, as well as by computer-aided detection systems, overlap with bones in CXRs. Our purpose was to develop an imaging system to virtually separate ribs and clavicles from lung nodules and soft tissue in CXRs. To achieve this, we developed a mixture of anatomy-specific, orientation-frequency-specific (ASOFS) expert deep NNC models. While this model was able to decompose CXRs, to achieve even higher bone suppression performance we employed our deep R-NNC for the bone suppression application. Our model was able to create bone and soft-tissue images from single CXRs, without requiring specialized equipment or increasing the radiation dose.
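The deep residual NNC described above learns to predict the noise component rather than the clean image. A minimal sketch of that residual idea, with a toy stand-in predictor in place of the trained network (the lambda and the constant offset are illustrative assumptions, not the thesis model):

```python
def residual_denoise(noisy, predict_noise):
    """Residual denoising: the model predicts the noise component,
    and the clean estimate is the input minus that prediction."""
    noise_estimate = predict_noise(noisy)
    return [x - n for x, n in zip(noisy, noise_estimate)]

# Toy stand-in "network": assumes a known constant noise offset of 0.1.
clean = [0.2, 0.5, 0.9]
noisy = [x + 0.1 for x in clean]
denoised = residual_denoise(noisy, lambda img: [0.1] * len(img))
```

The appeal of the residual formulation is that the network only has to model the (often simpler) noise statistics, not the full image content.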
- Title
- Inefficiencies in resource allocation games
- Creator
- Tota, Praneeth
- Date
- 2019
- Description
-
This thesis addresses a problem that has been debated by the academic community, the government, and the industry at large: how unfair is a tiered Internet compared to an open Internet? On one hand, we have an open Internet in which all data is treated equally and Internet service providers (ISPs) have no say when it comes to pricing differentiation; on the other hand, we have a tiered Internet in which ISPs can charge different amounts based on certain constraints, such as the type of data or the content provider. The architecture of the Internet imposes constraints that require mechanisms to efficiently allocate resources among competing participants, each of whom pursues their own best interest without considering the social benefit as a whole. We consider one such mechanism, known as proportional sharing, in which the resource (bandwidth) is divided among the participants in proportion to their bids. An efficient allocation is one that maximizes the aggregate utility of the users. We consider inelastic demand, model the participants as price anticipating, and ensure market clearing.
We examine a tiered Internet in which ISPs can partition the bandwidth based on certain constraints and charge a premium for better service. The participants come from all economic classes, so they have different amounts of wealth at their disposal. We quantify the relative loss incurred by participants in lower economic classes compared to those in higher economic classes. We also calculate the loss of efficiency caused by competition among the participants, compared to the socially optimal allocation.
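The proportional-sharing mechanism described above divides the resource in proportion to bids, with a market-clearing price. A minimal sketch in the standard textbook form (the function name and variable layout are ours, not code from the thesis):

```python
def proportional_share(capacity, bids):
    """Proportional sharing: each participant receives the fraction of the
    resource equal to their bid's share of the total bid, and the
    market-clearing price per unit is total_bids / capacity."""
    total = sum(bids)
    if total == 0:
        raise ValueError("at least one positive bid is required")
    allocations = [capacity * b / total for b in bids]
    price = total / capacity
    return allocations, price

# Three participants bidding 10, 30, and 60 for 100 units of bandwidth:
allocs, price = proportional_share(100.0, [10.0, 30.0, 60.0])
```

Market clearing holds by construction: the allocations always sum to the full capacity, and each participant's payment (bid) equals price times allocation.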
- Title
- Feasibility and Properness in Linear Interference Alignment: Flow Tests, Sufficient Conditions, and Approximation Algorithms.
- Creator
- Al-Dhelaan, Fahad Abdullah
- Date
- 2019
- Description
-
Interference is a major challenge in our understanding of the capacity of wireless networks and our ability to achieve this capacity. Rather than scheduling transmissions to avoid interference, recent techniques allow interference to be neutralized and messages to be transmitted simultaneously.
Linear interference alignment in MIMO networks is the technique of aligning messages, by the transmitters through the use of precoding matrices, so that the undesired messages occupy some minimal subspace upon their arrival at an unintended receiver. The overlapping of the subspaces where these interfering messages fall allows the receiver to neutralize them, with minimal dedication of its resources, through the application of a decoding matrix. The linear interference alignment problem is to design these precoding and decoding matrices; it has been shown to be NP-hard in the literature. A network is called feasible if such a solution exists. Even deciding whether a given network instance is feasible is non-trivial: the feasibility decision problem was shown to be NP-hard in the literature for constant channel coefficients.
We focus on finding efficient and robust feasibility tests in the case of generic channels, where the computational complexity is unknown. We provide efficient and robust tests for the necessary condition of properness, which had previously been identified in the literature but given no efficient tests in the general case. We identify several conditions, each sufficient for feasibility, and study their relationships and the computational complexity of testing for them. We provide a polynomial-time maximum-flow test for one sufficient condition in the case of uniform demands. In the case of uniform demands that divide the number of antennas at all receivers or all transmitters, we show that these sufficient and necessary conditions are equivalent to feasibility, thereby admitting efficient maximum-flow tests. We identify a subset of feasible instances where the decoding and precoding matrices can be designed in polynomial time. Furthermore, we show that any proper instance is within a constant factor of one of these instances, and we provide efficient constant-factor approximation algorithms for the problems of maximizing demand and minimizing antennas such that an instance is feasible.
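For the symmetric special case, the properness condition mentioned above has a well-known closed form in the literature: a (M x N, d)^K system is proper iff M + N >= (K + 1) d. A sketch of that check (the thesis addresses the general, non-symmetric case; this covers only the symmetric one):

```python
def is_proper_symmetric(M, N, d, K):
    """Properness test for a symmetric (M x N, d)^K interference channel:
    every transmitter has M antennas, every receiver has N, and each of
    the K user pairs demands d streams.  The system is proper iff the
    number of free variables is at least the number of alignment
    constraints, which for this case reduces to M + N >= (K + 1) * d."""
    return M + N >= (K + 1) * d

# The classic 3-user system with 2 antennas per node and 1 stream per
# user is proper (2 + 2 >= 4); doubling the demand makes it improper.
```

Properness is necessary but not, in general, sufficient for feasibility, which is why the thesis pairs it with separate sufficient conditions.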
- Title
- PROVENANCE FOR TRANSACTIONAL UPDATES
- Creator
- Arab, Bahareh Sadat
- Date
- 2019
- Description
-
Database provenance explains how results are derived by queries. However, many use cases, such as auditing and debugging of transactions, require understanding how the current state of a database was derived by a transactional history. We introduce an approach for capturing the provenance of transactions. Our approach works not only for serializable concurrency control protocols but also for non-serializable protocols, including snapshot isolation. The main drivers of our approach are a provenance model for queries, updates, and transactions, and reenactment, a novel technique for retroactively capturing the provenance of tuple versions. We introduce the MV-semirings provenance model for updates and transactions as an extension of the existing semiring provenance model for queries. Our reenactment technique exploits the time travel and audit logging capabilities of modern DBMSs to replay parts of a transactional history using queries. Importantly, our technique requires no changes to the transactional workload or underlying DBMS and incurs only moderate runtime overhead for transactions. We discuss how our MV-semirings model and reenactment approach can serve a wide variety of applications and use cases, including answering historical what-if queries that determine the effect of hypothetical changes to past operations of a business, post-mortem debugging of transactions, and creating private data workspaces for exploration. We have implemented our approach on top of a commercial DBMS, and our experiments confirm that, by applying novel optimizations, we can efficiently capture provenance for complex transactions over large data sets.
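The MV-semirings model extends the semiring provenance model for queries, in which joint use of tuples multiplies their annotations and alternative derivations of the same tuple adds them. A toy illustration of that base model only (the relation encoding and helper names are assumptions for illustration; the thesis' model additionally handles updates, transactions, and tuple versions):

```python
# Each tuple is paired with a provenance annotation (here, a string
# naming the source tuple or a polynomial built from source names).

def join(r, s, key):
    """Natural join on position `key` of r against position 0 of s;
    annotations of joined tuples are multiplied."""
    return [(a + b[1:], f"({pa} * {pb})")
            for a, pa in r for b, pb in s if a[key] == b[0]]

def project(rel, cols):
    """Projection; annotations of alternative derivations of the same
    output tuple are added."""
    out = {}
    for t, p in rel:
        k = tuple(t[c] for c in cols)
        out[k] = p if k not in out else f"({out[k]} + {p})"
    return list(out.items())

R = [(("a", "b"), "r1"), (("a", "c"), "r2")]
S = [(("b", "d"), "s1"), (("c", "d"), "s2")]
joined = join(R, S, key=1)        # products of joined annotations
result = project(joined, [0, 2])  # both derivations of (a, d) are summed
```

The resulting annotation is a provenance polynomial: it records that the output tuple can be derived either from r1 joined with s1 or from r2 joined with s2.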
- Title
- INFLUENCE OF TIE STRENGTH ON HOSTILITY IN SOCIAL MEDIA
- Creator
- Radfar, Bahar
- Date
- 2019
- Description
-
Online anti-social behavior, such as cyberbullying, harassment, and trolling, is a widespread problem that threatens free discussion and has negative physical and mental health consequences for victims and communities. While prior work has proposed automated methods to identify toxic situations such as hostility, these methods focus on individual words. A bag of keywords is not enough to detect hostility, as words may have different meanings depending on the relationship between the participants in the discussion. In this paper, we consider the friendship between the sender and the target of a hostile conversation. First, we study the characteristics of different types of relationships. Then, we set our goal to be more accurate hostility detection with fewer false red flags. Thus, we aim to detect both the presence and intensity of hostile comments based on linguistic and social features derived from our well-defined relationships. To evaluate our approach, we introduce a corpus of over 12,000 annotated Twitter tweets drawn from more than 170,000 tweets. Next, we extract useful features, such as relationship type and tweet length, to feed into our Long Short-Term Memory (LSTM) and Logistic Regression (LR) classifiers. By considering the relationship type in the classifier model, we improve the hostility detection AUC by close to 5% compared to the baseline method; the F1 score increases by 4% as well.
- Title
- CYBER PHYSICAL SYSTEM WITH COUPLED NETWORKS: SECURITY AND PRIVACY
- Creator
- Zhao, Jing
- Date
- 2019
- Description
-
With the development of cyber physical systems, people and electronic devices are connected via various networks. In many scenarios, different networks are strongly coupled with each other; for example, the power grid is strongly coupled with the communication network in the smart grid. On one hand, such coupling brings benefits such as improved efficiency and quick response to system service exceptions. On the other hand, the coupling of different networks also brings security and privacy problems. In this thesis we study two scenarios: the secure coupling of visual connection with short-range pairwise communication, and the privacy-aware coupling of the smart home with the smart grid. For the first scenario, we propose SCsec, a secure screen-camera communication system, which achieves secure one-way communication; the throughput of SCsec is comparable to that of current screen communication systems. For the second scenario, we propose a novel randomized battery load hiding algorithm which ensures differential privacy for smart homes with smart meters.
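Battery-based load hiding, as in the randomized algorithm mentioned above, masks the true household load by charging or discharging a battery so the meter sees a perturbed value. A simplified sketch of the masking mechanism only (the uniform noise and clipping scheme here are illustrative assumptions; the thesis' algorithm is a specific randomized scheme with a differential-privacy guarantee):

```python
import random

def report_load(true_load, battery, capacity, max_rate):
    """Sketch of battery-based load hiding: perturb the reported load by a
    random amount absorbed by the battery.  The perturbation is clipped so
    the battery charge stays within [0, capacity]; charging (noise > 0)
    draws extra power from the grid, discharging supplies power."""
    noise = random.uniform(-max_rate, max_rate)
    noise = max(min(noise, capacity - battery), -battery)
    return true_load + noise, battery + noise

# One metering interval: true load 2.0 kW, battery at 1.0 of 3.0 kWh.
reported, new_battery = report_load(2.0, battery=1.0, capacity=3.0,
                                    max_rate=0.5)
```

The key property illustrated is conservation: the difference between reported and true load is exactly the energy moved into or out of the battery, so the grid-side reading no longer reveals the appliance-level load directly.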
- Title
- ACCELERATING I/O USING DATA LABELS: A CONTENTION-AWARE, MULTI-TIERED, SCALABLE, AND DISTRIBUTED I/O PLATFORM
- Creator
- Kougkas, Antonios
- Date
- 2019
- Description
-
Parallel file systems (PFS) have been the dominant storage solution in High-Performance Computing (HPC) for several years. However, as we move towards the exascale era, PFS face several limitations, such as scalability, complexity, metadata and data synchronization, and access latency, which can seriously affect storage performance. These challenges, along with the unprecedented data explosion, have accentuated the research conundrum known as the I/O bottleneck. Moreover, the extreme computing scale that exascale machines promise brings forward another important limitation of the existing I/O path: multiple large scientific applications will access shared storage resources at the same time and thus compete with each other. This phenomenon is known as cross-application I/O interference and is one of the most challenging performance degradation factors, even at today's petascale. To address some of the above issues, modern system designs have introduced a new memory and storage hierarchy, filled with novel special-purpose hardware technologies, that aims to ease the I/O bottleneck. However, software for management, I/O scheduling, and efficient data movement in this new, complicated landscape of multi-tiered I/O infrastructure is limited at best. The added complexity of data access using buffering resources needs to be addressed and is of the utmost priority for several scientific sites and communities.
This study takes steps towards I/O acceleration in HPC by proposing: a) a new subsystem for I/O convergence between the HPC and BigData storage ecosystems; b) a new subsystem equipped with several advanced I/O buffering techniques for the deep memory and storage hierarchy; c) a new subsystem that implements several I/O scheduling algorithms to prevent the negative effects of I/O contention; and d) a new storage system that relies on a novel abstract notion of a data label, which allows the I/O system to provide storage flexibility, versatility, agility, and malleability. The proposed work has been evaluated, and the results suggest that substantial improvements in I/O performance have been achieved.
- Title
- ACTIVE INFERENCE FOR PREDICTIVE MODELS OF SPATIO-TEMPORAL DOMAINS
- Creator
- Komurlu, Caner
- Date
- 2019
- Description
-
Active inference is the method of selectively gathering information during prediction in order to increase a predictive machine learning model's prediction performance. Unlike active learning, active inference does not update the model; rather, it provides the model with useful information during prediction to boost prediction performance. To benefit from active inference, a predictive model needs to exploit correlations among the variables to be predicted: when provided with true values for some of the variables, the model can then make more accurate predictions for the remaining variables.
In this dissertation, I propose active inference methods for predictive models of spatio-temporal domains. I formulate and investigate active inference in two different domains, tissue engineering and wireless sensor networks, and develop active inference for dynamic Bayesian networks (DBNs) and feed-forward neural networks (FFNNs).
First, I explore the effect of active inference in the tissue engineering domain. I design a dynamic Bayesian network (DBN) model for vascularization of a tissue development site. The DBN model predicts probabilities of blood vessel invasion at a regional scale through time. Then, utilizing spatio-temporal correlations between regions represented as variables in the DBN model, I develop an active inference technique to detect the optimal time to stop a wet-lab experiment. The empirical study shows that active inference is able to detect the optimal time, and the results are coherent with domain simulations and lab experiments.
In the second phase of my research, I develop variance-based active inference techniques for dynamic Bayesian networks for the purpose of battery saving in wireless sensor networks (WSNs). I propose the expected variance reduction active inference method to detect the variables whose observation reduces the overall variance the most. I first propose a DBN model of a WSN. I then compare the prediction performance of the DBN with Gaussian processes and linear-chain graphical models on three different WSN data sets using several baseline active inference methods. After showing that DBNs perform better than the baseline predictive models, I compare the performance of the expected variance reduction active inference method with the performance of baseline methods on the DBN, and show the superiority of expected variance reduction on the three WSN data sets.
Finally, to address the inference complexity and the limitation of representing only linear correlations due to the Gaussian assumption, I replace the DBN representation with a feed-forward neural network (FFNN) model. I first explore techniques to integrate observed values into predictions on neural networks, adopting the input optimization technique. In doing so, I identify two problems: model error and optimization overfitting. I show that input optimization can mitigate the model error, and I propose a validation-based regularization approach to solve the overfitting problem.
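The expected variance reduction criterion above selects the variable whose observation most reduces overall predictive variance. A minimal Gaussian illustration (for jointly Gaussian variables, observing i reduces the variance of j by cov(i,j)^2 / var(i); the dissertation defines the criterion over the DBN, so this covariance-matrix form is a simplification):

```python
def expected_variance_reduction(cov):
    """Given a covariance matrix for jointly Gaussian variables, return
    the index of the variable whose observation maximizes the total
    expected variance reduction across all variables: observing i
    reduces the variance of each j by cov[i][j]**2 / cov[i][i]."""
    def total_reduction(i):
        return sum(cov[i][j] ** 2 / cov[i][i] for j in range(len(cov)))
    return max(range(len(cov)), key=total_reduction)

# Variable 0 is strongly correlated with both of the others, so
# observing it removes the most uncertainty overall.
cov = [[1.0, 0.8, 0.8],
       [0.8, 1.0, 0.1],
       [0.8, 0.1, 1.0]]
best = expected_variance_reduction(cov)
```

In the WSN setting, the selected variable corresponds to the sensor worth waking up: its reading buys the largest drop in uncertainty for the sensors left asleep.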
- Title
- Concurrency and Locality Aware GPGPU Thread Group Scheduling
- Creator
- Nosek, Janusz M
- Date
- 2018
- Description
-
Graphics Processing Units (GPUs) once served a limited function for rending of graphics. With technological advances, these devices gained new...
Show moreGraphics Processing Units (GPUs) once served a limited function for rending of graphics. With technological advances, these devices gained new purposes beyond graphics. Most modern GPUs have exposed their APIs to allow processing of data beyond the display, thus leading to a revolution in computing where instructions and intensive tasks can be offloaded to these now General Purpose Graphical Processing Units (GPGPUs). Many compute and memory intensive tasks have utilized GPGPUs for acceleration and these devices are especially prevalent in the financial, pharmaceutical and automotive industries. As computing resources have increased exponentially, memory resources have not and now create a limiting factor known as the memory wall. GPUs have been designed as an application specific processing unit for the streaming data access patterns found in graphical applications. They are successful at their original purpose, but when extended to general purpose problems, they meet the same memory wall data access problem as their CPU counterparts; they can be more susceptible to the effects latency due to the locality and concurrency of instructions beside data. This thesis reviews the current GPGPU landscape, including the design of current scheduling systems, GPGPU architecture, as well as a way of computing and describing the memory access penalty with Concurrent Average Memory Access Time (C-AMAT). We will also demonstrate the current GPGPU landscape, including design of schedulers, simulators as well as how Concurrent Average Memory Access Time (C-AMAT) can be computed. We have devised a solution to manipulate the number of scheduled thread groups to allow a GPGPU’s processing units to match their current memory states defined by C-AMAT. Our solution results in the increase in IPC, the reduction in C-AMAT and decrease in memory misses. 
The solution also has different effects on different types of computing problems, with highest improvements achieved in compute intensive memory patterns with as much as a 12% improvement in the instructions per cycle and a 14% reduction in C-AMAT.
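C-AMAT extends the classic AMAT metric by dividing the hit and pure-miss terms by their respective concurrencies. A sketch of the computation (parameter names are ours; the formula follows the published C-AMAT definition, C-AMAT = H/C_H + pMR * pAMP / C_M):

```python
def c_amat(hit_cycles, hit_concurrency, pure_miss_rate,
           avg_pure_miss_penalty, miss_concurrency):
    """Concurrent Average Memory Access Time: the hit time is divided by
    the average number of concurrent hits, and the pure-miss term
    (misses whose latency cannot be hidden by concurrent accesses) is
    divided by the average number of concurrent outstanding misses."""
    return (hit_cycles / hit_concurrency
            + pure_miss_rate * avg_pure_miss_penalty / miss_concurrency)

# Example: 1-cycle hits with 2 hits in flight, a 5% pure-miss rate
# costing 100 cycles with 4 outstanding misses.
value = c_amat(1.0, 2.0, 0.05, 100.0, 4.0)
```

Compared with plain AMAT (which here would be 1 + 0.05 * 100 = 6 cycles), the concurrency terms capture how overlapped accesses hide latency, which is exactly the signal the thread-group scheduler exploits.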
- Title
- PRIVACY PRESERVING BAG PREPARATION FOR LEARNING FROM LABEL PROPORTION
- Creator
- Yan, Xinzhou
- Date
- 2018
- Description
-
We apply privacy-preserving data mining (PPDM) standards to the learning from label proportions (LLP) model to create a privacy-preserving machine learning framework. We design the data preparation step for the LLP framework to meet the PPDM standards. In the data preparation step, we develop a bag selection method that boosts the accuracy of the LLP model by more than 7%. In addition, we propose three K-anonymous aggregation methods for the data sets that have almost zero accuracy loss and are very robust. After the K-anonymization step, we apply differential privacy to the LLP model and ensure a low accuracy loss. Because of the LLP model's special loss function, not only is it possible to replace all the feature vectors with the mean feature vector within each bag, but the accuracy loss caused by differential privacy can also be bounded by a small number. The loss function ensures low accuracy loss when training the LLP model on a PPDM data set. We evaluate the PPDM LLP model on two data sets, the Adult data set and an Instagram comment data set; both give empirical evidence of low accuracy loss after applying the PPDM LLP model.
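The bag-mean replacement mentioned above works because, for the LLP loss functions considered, a bag enters the training objective only through its label proportion and mean feature vector. A minimal sketch of that aggregation step (the data layout and function name are illustrative assumptions):

```python
def replace_with_bag_mean(bags):
    """Replace every feature vector in each bag with that bag's mean
    vector.  For LLP losses that depend on a bag only through its mean,
    this anonymization leaves the training objective unchanged while
    hiding individual records."""
    out = []
    for bag in bags:
        dim = len(bag[0])
        mean = [sum(x[d] for x in bag) / len(bag) for d in range(dim)]
        out.append([mean[:] for _ in bag])
    return out

# Two bags of 2-dimensional feature vectors; the first bag collapses
# to two copies of its mean.
bags = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0]]]
anon = replace_with_bag_mean(bags)
```

Releasing only bag means is what makes the K-anonymous aggregation possible: no individual's feature vector appears in the prepared data.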
- Title
- DEVELOPMENT OF BIOMARKERS OF SMALL VESSEL DISEASE IN AGING
- Creator
- Makkinejad, Nazanin
- Date
- 2021
- Description
-
Age-related neuropathologies, including cerebrovascular and neurodegenerative diseases, play a critical role in cognitive dysfunction and the development of dementia. Designing methodologies for early prediction of these diseases is much needed. Since multiple pathologies commonly coexist in the brains of older adults, clinical diagnosis lacks the specificity to isolate the pathology of interest, and the gold standard is determined only at autopsy. Magnetic resonance imaging (MRI) provides a non-invasive tool to study abnormalities in brain characteristics that are unique to each pathology. Utilizing ex-vivo MRI for brain imaging proves useful, as it eliminates two important biases of in-vivo MRI: first, no additional pathology develops between imaging and pathologic examination, and second, frail older adults are not excluded from MRI.
Hence, the aims of this dissertation were two-fold: to study brain correlates of age-related neuropathologies, and to develop and validate classifiers of small vessel diseases by combining ex-vivo MRI and pathology in a large community cohort of older adults. The structure of the project is as follows.
First, the association of amygdala volume and shape with transactive response DNA-binding protein 43 (TDP-43) pathology was investigated. Using a regularized regression technique, higher TDP-43 burden was associated with lower amygdala volume. Shape analysis of the amygdala also showed unique patterns of spatial atrophy associated with TDP-43, independent of other pathologies. Lastly, using linear mixed-effects models, amygdala volume was shown to explain an additional portion of the variance in cognitive decline above and beyond what was explained by the neuropathologies and demographics.
Second, the previous study was extended to analyze other subcortical regions, including the hippocampus, thalamus, nucleus accumbens, caudate, and putamen, and was conducted in a larger data set. The results showed unique contributions of TDP-43, neurofibrillary tangles (the hallmark characteristic of Alzheimer's disease pathology), and atherosclerosis (a cerebrovascular pathology) to atrophy on the surface of subcortical structures. Understanding the independent effects of each pathology on the volume and shape of different brain regions can form a basis for the development of classifiers of age-related neuropathologies.
Third, an in-vivo classifier of arteriolosclerosis was developed and validated. Arteriolosclerosis, one of the main pathologies of small vessel disease, is associated with cognitive decline and dementia and currently has no standard biomarker. In this work, the classifier was developed ex-vivo using machine learning (ML) techniques and was then translated to in-vivo use. The in-vivo classifier was packaged as a software tool called ARTS, which outputs a score representing the likelihood of arteriolosclerosis given the required input. It was tested and validated in various cohorts and showed high performance in predicting the pathology. It was also shown that a higher ARTS score was associated with greater cognitive decline in domains that are specific to small vessel disease.
Fourth, motivated by current trends and the superiority of deep learning (DL) techniques in classification tasks in computer vision and medical imaging, a preliminary study was designed to use DL to train an ex-vivo classifier of arteriolosclerosis. Specifically, convolutional neural networks (CNNs) were applied to 3 Tesla ex-vivo MR images directly, without providing prior information on brain correlates of arteriolosclerosis. One interesting aspect of the results was that the network learned that white matter hyperintense lesions contributed the most to the classification of arteriolosclerosis. These results were encouraging, and future work will exploit the capability of DL techniques alongside traditional ML approaches for more automation and possibly better performance.
Finally, a preliminary classifier of combined arteriolosclerosis and small vessel atherosclerosis was developed, since the existence of both pathologies in the brain has devastating effects on cognition. The methodology was similar to that used for the arteriolosclerosis classifier, with minor differences. The classifier showed good performance in-vivo, although it needs to be assessed in more cohorts.
The comprehensive study of age-related neuropathologies and their contribution to abnormalities of subcortical brain structures offers great potential for developing a biomarker of each pathology. Also, the finding that the MR-based classifier of arteriolosclerosis showed high performance in-vivo demonstrates the potential of ex-vivo studies for the development of biomarkers that are precise (because they are based on autopsy, the gold standard) and are expected to work well in-vivo. The implications of this study include the development of biomarkers that could potentially be used in refined participant selection and enhanced monitoring of treatment response in clinical drug and prevention trials.
- Title
- A Reasoning System Architecture for Spectrum Decision-making
- Creator
- Das, Udayan D.
- Date
- 2021
- Description
-
Spectrum is a public resource, yet understanding how spectrum is allocated and used is a daunting task. Usable spectrum is already fully allocated, but the demand for spectrum continues to grow, and there are opportunities for utilizing spectrum in more efficient ways. Understanding how spectrum is allocated and how it is utilized in time and space is necessary to take advantage of these emerging opportunities. A combination of fragmented information from varied sources, a complex regulatory environment, variability of regulations and physics by band, real-time spectrum usage dynamics, and a status quo with knowledge concentrated among a few makes understanding spectrum a considerable challenge for all stakeholders, including researchers, students, policymakers, and new telecom operators. After considerable study of spectrum, its allocation, regulation, and usage, we have developed a system architecture that is a significant step towards easing the burden of understanding spectrum information. Our system architecture connects information from disparate sources and leads to a richer understanding of spectrum usage, how it is governed, and its potential for future use. Classes of information are modeled as knowledge graphs, and the interplay of these knowledge graphs produces a richer set of insights and can lead to more informed decision-making. Further, we show mechanisms for connecting spectrum information with real-time observations to get a comprehensive view of spectrum usage dynamics. While focused on the United States, this work should be applicable to other spectrum contexts worldwide. This work, of considerable technical value, also has democratic value in making complex information accessible and allowing the public to determine whether spectrum, a natural resource, is being used for the public good.
- Title
- Towards a Self-Programmable Storage Solution in Extreme-Scale Environments
- Creator
- Devarajan, Hariharan
- Date
- 2021
- Description
-
Traditional compute-centric scientific discovery has led to a growing gap between computation power and storage capabilities. However, in the...
Show moreTraditional compute-centric scientific discovery has led to a growing gap between computation power and storage capabilities. However, in the data explosion era, where data analysis is essential for scientific discovery, slow storage systems led to the research conundrum known as the I/O bottleneck. Scientists have proposed several optimizations to address the I/O bottleneck. However, selecting and applying the appropriate optimization is a complex task, often left to the users. Additionally, the explosion of data has led to the proliferation of applications as well as storage technologies. This has created a complex matching problem between diverse application requirements and heterogeneous storage resources for the users. We need to move towards a Self-Programmable storage system that can automatically understand the I/O requirements of applications, transparently leverage the heterogeneity of storage, and reconfigures itself dynamically by utilizing application and storage information. In this work, we present the Jal System for building Self-Programmable storage. The Jal System consists of three layers: the application layer, the transfer layer, and the storage layer. The application layer uses automatic extraction of I/O requirements from applications using a source-code-based profiler. The storage layer defines a data abstraction, using a shared log store, to efficiently unify heterogeneous storage resources under a single platform. Finally, the transfer layer defines data management algorithms that consider multi-application and multi-storage information to optimize data operations. Additionally, we illustrate the benefits of utilizing the technologies within the Jal System on modern scientific AI applications. Our evaluations have demonstrated that each technology within the Jal System can accelerate I/O for modern scientific workflows. We have implemented software, tools, and system libraries for modern HPC systems. 
In the future, we envision building a fully integrated system that efficiently utilizes all the Jal System technologies. We also plan to extend the strategies and techniques in the Jal System to other scientific domains such as AI and IoT.
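The shared-log abstraction described for the storage layer can be illustrated with a minimal sketch. All class and method names below are hypothetical illustrations, not the Jal System's actual API: heterogeneous backends are unified behind a single append/read interface, with placement chosen by tier latency.

```python
# Minimal sketch of a shared log unifying heterogeneous storage tiers.
# All names here are hypothetical illustrations, not the Jal System's API.

class Tier:
    """A storage backend (e.g., NVMe, burst buffer, parallel FS)."""
    def __init__(self, name, latency_us):
        self.name = name
        self.latency_us = latency_us
        self.blocks = {}

class SharedLog:
    """Single append-only address space spanning all tiers."""
    def __init__(self, tiers):
        self.tiers = sorted(tiers, key=lambda t: t.latency_us)
        self.tail = 0     # next log sequence number (LSN)
        self.index = {}   # LSN -> tier holding that block

    def append(self, data, hot=True):
        # Hot data lands on the fastest tier; cold data on the slowest.
        tier = self.tiers[0] if hot else self.tiers[-1]
        lsn = self.tail
        tier.blocks[lsn] = data
        self.index[lsn] = tier
        self.tail += 1
        return lsn

    def read(self, lsn):
        # Callers address the log, not a specific device.
        return self.index[lsn].blocks[lsn]

log = SharedLog([Tier("nvme", 10), Tier("pfs", 500)])
lsn = log.append(b"checkpoint-0", hot=True)
assert log.read(lsn) == b"checkpoint-0"
assert log.index[lsn].name == "nvme"
```

The point of the design is that applications see one log address space, while a transfer layer is free to decide (and later change) which physical tier actually holds each block.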
- Title
- DATA-DRIVEN OPTIMIZATION OF NEXT GENERATION HIGH-DENSITY WIRELESS NETWORKS
- Creator
- Khairy, Sami
- Date
- 2021
- Description
-
The Internet of Things (IoT) paradigm is poised to advance all aspects of modern society by enabling ubiquitous communication and computation. In the IoT era, an enormous number of devices will be connected wirelessly to the internet to enable advanced data-centric applications. The projected growth in the number of connected wireless devices poses new challenges to the design and optimization of future wireless networks. For a wireless network to support a massive number of devices, advanced physical-layer and channel-access techniques must be designed, and high-dimensional decision variables must be optimized to manage network resources. However, the increased scale, complexity, and heterogeneity of such networks render them unamenable to traditional closed-form mathematical analysis and optimization, making future high-density wireless networks seem unmanageable. In this thesis, we study the design and data-driven optimization of future high-density wireless networks operating over the unlicensed band, including Radio Frequency (RF)-powered wireless networks, solar-powered Unmanned Aerial Vehicle (UAV)-based wireless networks, and random Non-Orthogonal Multiple Access (NOMA) wireless networks. For each networking scenario, we first analyze the network dynamics and identify performance trade-offs. Next, we design adaptive network controllers, formulated as high-dimensional multi-objective optimization problems, that exploit the heterogeneity in users' wireless propagation channels and energy harvesting to maximize network capacity, manage battery energy resources, and achieve good user capacity fairness.
To solve the high-dimensional optimization problems and learn the optimal network control policy, we propose novel, cross-layer, scalable, model-based and model-free data-driven network optimization and resource management algorithms that integrate domain-specific analyses with advanced machine-learning techniques from deep learning, reinforcement learning, and uncertainty quantification. Furthermore, the convergence of the proposed algorithms to the optimal solution is analyzed theoretically using mathematical results from metric spaces, convex optimization, and game theory. Finally, extensive simulations demonstrate the efficacy and superiority of our network optimization and resource management techniques compared with existing methods. Our research contributions provide practical insights for the design and data-driven optimization of next-generation high-density wireless networks.
- Title
- Combining Simulation and Emulation for Planning and Evaluation of Smart Grid Security, Resilience, and Operations
- Creator
- Hannon, Christopher
- Date
- 2020
- Description
-
The modern power grid is a complex, large-scale cyber-physical system comprising generation, transmission, and distribution elements. However, the legacy operational technology used in the electric power system has not yet caught up with advancements in information technology. Coupled with the proliferation of renewable energy sources, the electric power grid is in transition to a smarter grid: operators are now equipped with tools to make real-time operational changes and with the ability to monitor the system and maintain situational awareness. This shift in electric power grid priorities requires an expansive and reliable communication network to enhance the efficiency and resilience of the Smart Grid, and it calls for a simulation-based platform that provides sufficient flexibility and controllability for evaluating network application designs and facilitating the transition of in-house research ideas into production systems. In this thesis, I present techniques to efficiently combine simulation systems, emulation systems, and real hardware into testbeds for evaluating the security, resilience, and operations of the electric power grid. While the dynamics of the grid's physical components are simulated, the cyber components, including devices, applications, and networking functions, can be emulated or even implemented on real hardware. In addition to novel synchronization algorithms between simulation and emulation systems, I present multiple test cases that apply software-defined networking, an emerging networking paradigm, to the power grid for security and resilience, as well as phasor measurement unit analytics for grid operations; together these motivate the need for a simulation-based testbed.
The contributions of this work lie in the design of a virtual time system with tight controllability over the execution of the emulation systems, i.e., pausing and resuming any specified container processes in the perception of their own virtual clocks, and in distributed virtual-time-based synchronization across embedded Linux devices.
- Title
- DATA PRIVACY AND DEEP LEARNING IN THE MOBILE ERA: TRACEABILITY AND PROTECTION
- Creator
- Chen, Linlin
- Date
- 2020
- Description
-
Privacy and deep learning have been two of the most exciting research trends in both academia and industry. On the one hand, big data has rapidly expedited many data-oriented applications, especially deep-learning services. With the tremendous value exhibited by the data, the privacy of the data subjects who generate it has also drawn much attention. Meanwhile, more regulations and legislation have been enacted or enforced, compelling companies and organizations to comply strictly with personal privacy protection when collecting or utilizing data. All these moves will substantially change the ways deep-learning models are trained and AI services are provided, and might in some ways hinder the development of deep learning unless sophisticated mechanisms are devised. On the other hand, deep learning has shown incredibly promising performance in a variety of areas such as face recognition, voice recognition, recommendation and advertising, autonomous driving, and medical imaging. This raises the question of whether deep learning will in turn influence privacy and be leveraged to compromise it. Meanwhile, we also observe that mobile devices have become so ubiquitous that a growing share of data is generated on them, and such data is both extremely sensitive for data subjects and extremely valuable for developing deep learning. We should not neglect the impact of mobile devices on both privacy and deep learning. In this thesis, I explore the interactions between privacy and deep learning, especially where mobile devices are involved. Specifically, I study: (1) how privacy changes the way we use data when building deep-learning models, and present a mechanism for privacy protection in deep learning; and (2) how deep learning in turn makes privacy more vulnerable to compromise, demonstrating such a compromise by using deep learning to trace source mobile devices and link personal identities.
- Title
- A FRAMEWORK FOR MANAGING UNSPECIFIED ASSUMPTIONS IN SAFETY-CRITICAL CYBER-PHYSICAL SYSTEMS
- Creator
- Fu, Zhicheng
- Date
- 2020
- Description
-
For a cyber-physical system, execution behaviors are often impacted by the operating environment. However, the assumptions about a cyber-physical system's expected environment are often informally documented, or even left unspecified, during the system development process. Unfortunately, such unspecified assumptions in cyber-physical systems, such as medical cyber-physical systems, can result in patient injuries and loss of lives. Based on U.S. Food and Drug Administration (FDA) data, from 2006 to 2011 there were 5,294 recalls and 1,154,451 adverse events, resulting in 92,600 patient injuries and 25,800 deaths. One of the most critical reasons for these medical device recalls is the violation of unspecified assumptions. These compelling data motivated us to research unspecified-assumption issues in safety-critical cyber-physical systems and to develop approaches that reduce the failures caused by unspecified assumptions. In particular, this thesis studies the issues of unspecified assumptions in the cyber-physical system design process and develops an unspecified-assumption management framework to (1) identify unspecified assumptions in system design models; (2) help domain experts perform impact analysis on the failures caused by violating unspecified assumptions; and (3) explicitly model unspecified assumptions in system design models for system safety validation and verification. Before developing the framework, we first needed to study how unspecified assumptions may be introduced into cyber-physical systems. We took cases from the FDA medical device recall database and analyzed the root causes of medical device failures.
By analyzing these cases, we found two important facts: (1) one of the major causes of medical device recalls is the violation of unspecified assumptions; and (2) unspecified assumptions are often introduced into system design models through syntactic carriers. Based on these two findings, we propose a framework for managing unspecified assumptions in the cyber-physical system development process. The framework has three components. The first is the Unspecified Assumption Carrier Finder (UACFinder), which identifies unspecified assumptions in system design models by automatically extracting the syntactic carriers associated with them. However, the number of unspecified assumptions identified from system design models can be large, and it may not always be feasible for domain experts to validate and address the most safety-critical assumptions at different system development phases. Therefore, the second component of the framework is a methodology that uses a Failure Mode and Effects Analysis (FMEA) based prioritization approach to help domain experts perform impact analysis on the unspecified assumptions identified by the UACFinder and assess their safety-critical level. The third component describes a model architecture and corresponding algorithms for modeling and integrating assumptions into system design models, so that the system safety associated with these unspecified assumptions can be validated and formally verified by existing tools. We have also conducted case studies on representative system models to demonstrate how UACFinder can identify unspecified assumptions from system design models, and how the FMEA-based prioritization approach can help domain experts verify the appropriateness of the identified assumptions.
In addition, case studies demonstrate how system safety properties can be improved by modeling and integrating unspecified assumptions into system models. The results of the case studies indicate that the unspecified-assumption management framework can identify unspecified assumptions, help domain experts validate and verify their appropriateness, and explicitly specify assumptions that would otherwise cause defects in these systems.
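The FMEA-based prioritization described above can be sketched with the standard FMEA Risk Priority Number, RPN = severity x occurrence x detection (each rated 1-10, higher detection meaning harder to detect). The assumptions and ratings below are hypothetical examples, not data from the thesis:

```python
# Sketch of FMEA-style prioritization of identified assumptions.
# RPN = severity * occurrence * detection, each rated 1-10.
# The assumptions and ratings are hypothetical examples.

def rpn(a):
    return a["severity"] * a["occurrence"] * a["detection"]

assumptions = [
    {"id": "A1", "text": "pump occlusion alarm latency < 5 s",
     "severity": 9, "occurrence": 3, "detection": 7},
    {"id": "A2", "text": "sensor sampling period is 10 ms",
     "severity": 4, "occurrence": 6, "detection": 2},
    {"id": "A3", "text": "operator acknowledges alerts in time",
     "severity": 8, "occurrence": 5, "detection": 8},
]

# Domain experts review the highest-RPN assumptions first.
ranked = sorted(assumptions, key=rpn, reverse=True)
for a in ranked:
    print(a["id"], rpn(a))
```

With these example ratings, A3 (RPN 320) would be validated before A1 (189) and A2 (48), which is the kind of triage that makes a large list of extracted assumptions tractable for experts.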
- Title
- Integrity based landmark generation: A method to generate landmark configurations that guarantee mobile robot localization safety
- Creator
- Chen, Yihe
- Date
- 2020
- Description
-
From the bronze-age city of Nineveh to modern metropolises like Tokyo, traffic shapes cities and profoundly affects people's lives. Just as the spread of the automobile reshaped cities in the early 20th century, we now stand on the eve of another traffic revolution. With the rapid spread of autonomous and semi-autonomous robotic applications, it is important for urban designers to design or retrofit urban environments that are safe and friendly to autonomous robots. As more robots are deployed in life-critical situations, such as autonomous passenger vehicles, it is imperative to consider their safety, and in particular their localization safety. While it would be ideal to guarantee safety in any environment without having to physically modify said environment, this is not always possible, and one may have to add landmarks or active beacons to reach an acceptable level of safety for landmark-based localization. Localization safety is assessed using integrity, the primary safety metric used in open-sky aviation applications, which has recently been applied to mobile robots and can account for the impact of rarely occurring, undetected faults. Conventional integrity monitoring methods depend heavily on GPS, while traditional Global Navigation Satellite System - Inertial Measurement Unit (GNSS-IMU) based localization is not applicable in metropolitan areas because of the signal blocking and multipath problems caused by high-rise structures. Thus, this dissertation concentrates on feature-based integrity monitoring. This dissertation formulates the environmental localization safety problem as a systematic optimization problem: given the robot's trajectory and the current landmark map, add the minimal number of new landmarks at certain locations such that the integrity risk along the trajectory is below a given safety threshold.
This dissertation proposes two algorithms to solve the problem: the Integrity-based Landmark Generator (I-LaG) and Fast I-LaG. I-LaG adds fewer landmarks but is relatively computationally expensive; Fast I-LaG is less computationally intensive at the expense of more landmarks. Both simulation and experimental results are presented.
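The optimization problem stated above (add the fewest landmarks so that risk stays below a threshold along the whole trajectory) can be caricatured as a greedy loop. The risk model below is a deliberately crude stand-in (risk falls with proximity to the nearest landmark), not I-LaG's actual integrity-risk evaluation:

```python
# Toy greedy sketch of landmark placement: repeatedly add the candidate
# landmark that most reduces the worst-case "risk" over the trajectory,
# until every trajectory point is below the threshold. The risk model
# is a crude illustration, not I-LaG's integrity-risk computation.
import math

def risk(point, landmarks):
    if not landmarks:
        return 1.0
    d = min(math.dist(point, lm) for lm in landmarks)
    return min(1.0, d / 10.0)   # closer landmarks -> lower risk

def place_landmarks(trajectory, candidates, threshold):
    placed = []
    while max(risk(p, placed) for p in trajectory) > threshold:
        # Greedily pick the candidate minimizing the worst-case risk.
        best = min(candidates,
                   key=lambda c: max(risk(p, placed + [c])
                                     for p in trajectory))
        placed.append(best)
        candidates = [c for c in candidates if c != best]
        if not candidates:
            break   # candidate pool exhausted
    return placed

traj = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
cands = [(0.0, 1.0), (5.0, 1.0), (10.0, 1.0)]
placed = place_landmarks(traj, cands, threshold=0.3)
```

On this toy instance the loop first picks the central candidate (it covers the most trajectory), then fills in the ends. A greedy heuristic like this gives no minimality guarantee, which is precisely the gap the I-LaG and Fast I-LaG algorithms are designed around.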
- Title
- WHY AND WHY-NOT PROVENANCE FOR QUERIES WITH NEGATION
- Creator
- Lee, Seokki
- Date
- 2020
- Description
-
Explaining why an answer is in the result of a query, or why it is missing from the result, is important for many applications including auditing, debugging data and queries, hypothetical reasoning about data, and data exploration. Both types of questions, i.e., why and why-not provenance, have been studied extensively, but mostly in isolation. A recent study shows that unification of why and why-not provenance can be achieved by developing a provenance model for queries with negation. In many complex queries, negation is natural and yields more expressive power. Thus, supporting both types of provenance together with negation can be useful for, e.g., debugging (missing) data over complex queries with negation. However, why-not provenance, and to a lesser degree why provenance, can be very large, resulting in severe scalability and usability challenges. In this thesis, we introduce a framework that unifies why and why-not provenance. We develop a graph-based provenance model that is powerful enough to encode the evaluation of queries with negation (First-Order queries). We demonstrate that our model generalizes a wide range of provenance models from the literature. Using our model, we present the first practical approach that efficiently generates explanations, i.e., the parts of the provenance that are relevant to the query outputs of interest. Furthermore, we present a novel approximate summarization technique to address the scalability and usability challenges. Our technique efficiently computes pattern-based provenance summaries that balance informativeness, conciseness, and completeness. To achieve scalability, we integrate sampling techniques into provenance capture and summarization. We implement these techniques in our PUG (Provenance Unification through Graphs) system, which runs on top of a relational database.
We demonstrate through extensive experiments that our approach scales to large datasets and produces comprehensive and meaningful (summaries of) provenance.
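The flavor of why and why-not provenance for a query with negation can be shown on a toy Datalog-style query Q(x) :- R(x), not S(x). The relations and the explanation code below are purely illustrative; PUG's graph-based model is far more general:

```python
# Toy why / why-not provenance for Q(x) :- R(x), not S(x).
# Illustrative only; PUG's provenance graphs are far more general.

R = {"a", "b"}
S = {"b"}
Q = {x for x in R if x not in S}   # query result

def why(x):
    """Explain why x is in the result, or why it is missing."""
    if x in Q:
        # Why-provenance: the positive goal holds, the negated one fails.
        return f"R({x}) holds and S({x}) does not"
    reasons = []
    if x not in R:
        reasons.append(f"R({x}) is missing")
    if x in S:
        reasons.append(f"S({x}) holds, so the negation fails")
    return "; ".join(reasons)

print(why("a"))   # present answer
print(why("b"))   # missing: the negated goal fails
print(why("c"))   # missing: the positive goal fails
```

Note how negation unifies the two question types: the why-not explanation for "b" is a why explanation for S(b), which is exactly the symmetry a provenance model for First-Order queries has to capture.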
- Title
- DATA SHARING WITH PRIVACY AND SECURITY
- Creator
- Qian, Jianwei
- Date
- 2019
- Description
-
Data is a non-exclusive resource and has synergistic effects. Open data sharing will enhance the utilization of big data's value and tremendously boost economic growth and transparency. Data sharing platforms have emerged worldwide, but with very limited services; security is one of the main reasons why most data are not commonly shared. This dissertation aims to tackle several security issues in building a trustworthy data sharing ecosystem. First, I reveal the privacy risks in data sharing by designing de-anonymization and privacy-inference attacks. Second, I analyze the relationship between an attacker's knowledge and the privacy risk of data sharing, and quantify and estimate that risk. Then, I propose anonymization algorithms to protect the privacy of participants in data sharing. Finally, I survey the status quo, the privacy and security concerns, and the opportunities in data trading. This dissertation involves various data types, with a focus on graph data and speech data, and various forms of data sharing, including collection, publishing, query, and trading.