Search results
(1 - 20 of 42)
Pages
- Title
- MULTI-DIMENSIONAL BATCH SCHEDULING FRAMEWORK FOR HIGH-END SUPERCOMPUTERS
- Creator
- Zhou, Zhou
- Date
- 2016, 2016-05
- Description
-
In the field of high-performance computing (HPC), batch schedulers play a critical role: they determine when and how to process the various jobs waiting for service. Conventional batch schedulers allocate user jobs solely based on their CPU footprints. However, a job requires many different resources during its execution, such as power, network, and I/O bandwidth. Today's job schedulers rarely take these resource requirements into account, which sometimes turns out to be the Achilles' heel of system-wide performance. In this research, we propose a multi-dimensional batch scheduling framework for high-end supercomputers. Our research aims to treat these common but often ignored resources (e.g., power, network bandwidth) as schedulable resources and to transform each scheduling decision into a multi-objective optimization process. Our main contributions consist of a set of scheduling models and policies aimed at addressing the issues in batch scheduling for large-scale production supercomputers. We evaluate our design by means of trace-based simulations using real workload and performance traces from production systems. Experimental results show our methods can effectively improve batch scheduling with regard to user satisfaction, system performance, and operating cost.
Ph.D. in Computer Science, May 2016
- Title
- BIG DATA SYSTEM INFRASTRUCTURE AT EXTREME SCALES
- Creator
- Zhao, Dongfang
- Date
- 2015, 2015-07
- Description
-
Rapid advances in digital sensors, networks, storage, and computation, along with their availability at low cost, are leading to the creation of huge collections of data, dubbed Big Data. This data has the potential to enable new insights that can change the way businesses, science, and governments deliver services to their consumers, and can impact society as a whole. This has led to the emergence of the Big Data Computing paradigm, focusing on the sensing, collection, storage, management, and analysis of data from a variety of sources to enable new value and insights. To realize the full potential of Big Data Computing, we need to address several challenges and develop suitable conceptual and technological solutions for dealing with them. Today's and tomorrow's extreme-scale computing systems, such as the world's fastest supercomputers, are generating orders of magnitude more data from a variety of scientific computing applications across all disciplines. This dissertation addresses several big data challenges at extreme scales. First, we quantitatively studied through simulations the predicted performance of existing systems at future scales (for example, exascale: 10^18 ops). Simulation results suggested that current systems would likely fail to deliver the needed performance at exascale. Then, we proposed a new system architecture and implemented a prototype that was evaluated on tens of thousands of nodes, on par with the scale of today's largest supercomputers. Micro-benchmarks and real-world applications demonstrated the effectiveness of the proposed architecture: the prototype achieved up to two orders of magnitude higher data movement rates than existing approaches. Moreover, the system prototype incorporated features that are not well supported in conventional systems, such as distributed metadata management, distributed caching, lightweight provenance, transparent compression, acceleration through GPU encoding, and parallel serialization.
Towards exploring the proposed architecture at million-node scales, simulations were conducted and evaluated with a variety of workloads, showing near-linear scalability and orders of magnitude better performance than today's state-of-the-art storage systems.
Ph.D. in Computer Science, July 2015
- Title
- AUTOMATIC SUMMARIZATION OF CLINICAL ABSTRACTS FOR EVIDENCE-BASED MEDICINE
- Creator
- Summerscales, Rodney L.
- Date
- 2013, 2013-12
- Description
-
The practice of evidence-based medicine (EBM) encourages health professionals to make informed treatment decisions based on a careful analysis of current research. However, after caring for their patients, medical practitioners have little time to spend reading even a small fraction of the rapidly growing body of medical research literature. As a result, physicians must often rely on potentially outdated knowledge acquired in medical school. Systematic reviews of the literature exist for specific clinical questions, but these must be manually created and updated as new research is published. Abstracts from well-written clinical research papers contain key information regarding the design and results of clinical trials. Unfortunately, the free-text nature of abstracts makes them difficult for computer systems to use and time-consuming for humans to read. I present a software system that reads abstracts from randomized controlled trials, extracts key clinical entities, computes the effectiveness of the proposed interventions, and compiles this information into machine-readable and human-readable summaries. This system uses machine learning and natural language processing techniques to extract the key clinical information describing the trial and its results. It extracts the names and sizes of treatment groups, population demographics, the outcomes measured in the trial, and the outcome results for each treatment group. Using the extracted outcome measurements, the system calculates key summary measures used by physicians when evaluating the effectiveness of treatments. It computes absolute risk reduction (ARR) and number needed to treat (NNT) values, complete with confidence intervals. The extracted information and computed statistics are automatically compiled into XML and HTML summaries that describe the details and results of the trial. Extracting the information needed to calculate these measures is not trivial.
While there have been various approaches to generating summaries of medical research, that work has mostly focused on extracting trial characteristics (e.g., population demographics and intervention/outcome information). No one has attempted to extract all of the information needed, nor has anyone attempted to solve many of the tasks required to reliably calculate the summary statistics.
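The summary measures named in this abstract have standard textbook forms. As a hedged sketch (the function name and trial counts below are invented for illustration, not taken from the described system), ARR with a normal-approximation confidence interval and NNT can be computed as:

```python
import math

def arr_nnt(control_events, control_n, treat_events, treat_n, z=1.96):
    """Absolute risk reduction (ARR) with a normal-approximation
    confidence interval, and number needed to treat (NNT = 1/ARR)."""
    cer = control_events / control_n  # control event rate
    eer = treat_events / treat_n     # experimental event rate
    arr = cer - eer
    se = math.sqrt(cer * (1 - cer) / control_n + eer * (1 - eer) / treat_n)
    ci = (arr - z * se, arr + z * se)
    nnt = 1 / arr if arr != 0 else math.inf
    return arr, ci, nnt

# Hypothetical trial: 20/100 events in control vs. 10/100 under treatment
arr, (lo, hi), nnt = arr_nnt(20, 100, 10, 100)
print(round(arr, 3), round(nnt, 1))  # prints: 0.1 10.0
```

An NNT of 10 reads as "treat ten patients to prevent one additional adverse event," which is why physicians favor it as a summary measure.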
Ph.D. in Computer Science, December 2013
- Title
- COVERAGE AND CONNECTIVITY IN WIRELESS NETWORKS
- Creator
- Xu, Xiaohua
- Date
- 2012-04-25, 2012-05
- Description
-
The limited energy resources, instability, and lack of central control in wireless networks motivate the study of the connected dominating set (CDS), which serves as a routing backbone to support service discovery, area monitoring, and broadcasting. The construction of a CDS involves both coverage and connectivity. We first study several problems related to coverage. Given a set of nodes and targets in a plane, the Minimum Wireless Cover (MWC) problem seeks the fewest nodes to cover the targets. If all nodes are associated with positive prices, the Cheapest Wireless Cover (CWC) problem seeks a cheapest set of nodes to cover the targets. If all nodes have bounded lifetimes, the Max-Life Wireless Cover (MLWC) problem seeks a wireless coverage schedule of maximum lifetime subject to the lifetime constraints of individual nodes. We present a polynomial-time approximation scheme (PTAS) for MWC, and two randomized approximation algorithms for CWC and MLWC, respectively. Given a node-weighted graph, the Minimum-Weighted Dominating Set (MWDS) problem is to find a minimum-weight vertex subset such that every vertex is either contained in this subset or has a neighbor contained in it. We propose a (4+ε)-approximation algorithm for MWDS in unit disk graphs. For the connectivity part, given a node-weighted connected graph and a subset of terminals, the Node-Weighted Steiner Tree (NWST) problem seeks a lightest tree connecting the given terminals. We present three approximation algorithms for NWST restricted to unit disk graphs (UDGs). This dissertation also explores applications of the CDS and develops efficient algorithms for applications such as real-time aggregation scheduling in wireless networks.
Given a set of periodic aggregation queries, each with its own period and a subset of source nodes Si containing the data, we first propose a family of efficient and effective real-time scheduling protocols that can answer every job of each query task within a relative delay under resource constraints, by addressing the following tightly coupled tasks: routing, transmission plan construction, node activity scheduling, and packet scheduling. Based on our protocol design, we further propose schedulability test schemes to efficiently and effectively test whether, for a set of queries, each query job can be finished within a finite delay. We also conduct extensive simulations to validate the proposed protocol and evaluate its practical performance. The simulations corroborate our theoretical analysis.
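For intuition on the dominating-set objective described above, a plain greedy heuristic (best coverage-per-weight ratio) can be sketched as follows. This is only an illustration of the problem statement on an invented graph, not the (4+ε)-approximation algorithm for unit disk graphs proposed in the dissertation:

```python
def greedy_dominating_set(adj, weight):
    """Greedy heuristic for node-weighted dominating set: repeatedly
    pick the vertex covering the most still-undominated vertices per
    unit of weight, until every vertex is dominated."""
    undominated = set(adj)
    chosen = []
    while undominated:
        # a vertex dominates itself and all of its neighbors
        v = max(adj, key=lambda u: len(({u} | adj[u]) & undominated) / weight[u])
        chosen.append(v)
        undominated -= {v} | adj[v]
    return chosen

# Tiny path graph a - b - c with unit weights: {b} dominates everything
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(greedy_dominating_set(adj, {"a": 1, "b": 1, "c": 1}))  # ['b']
```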
Ph.D. in Computer Science, May 2012
- Title
- AN INTEGRATED DATA ACCESS SYSTEM FOR BIG COMPUTING
- Creator
- Yang, Xi
- Date
- 2016, 2016-07
- Description
-
Big data has entered every corner of science and engineering and has become a part of human society. Scientific research and commercial practice increasingly depend on the combined power of high-performance computing (HPC) and high-performance data analytics. Owing to its importance, several commercial computing environments have been developed in recent years to support big data applications. MapReduce is a popular mainstream paradigm for large-scale data analytics. MapReduce-based data analytic tools commonly rely on underlying MapReduce file systems (MRFS), such as the Hadoop Distributed File System (HDFS), to manage massive amounts of data. At the same time, conventional scientific applications usually run in HPC environments, such as the Message Passing Interface (MPI), and their data are kept in parallel file systems (PFS), such as Lustre and GPFS, for high-speed computing and data consistency. As scientific applications become data-intensive and big data applications become computing-hungry, there is a surging interest in and need to integrate HPC power and data processing power to support HPC on big data, the so-called big computing. A fundamental issue of big computing is the integration of data management and interoperability between the conventional HPC ecosystem and the newly emerged data processing/analytics ecosystem. However, data sharing between PFS and MRFS is currently limited, due to semantic mismatches, the lack of communication middleware, and divergent design philosophies and goals. Challenges also exist in cross-platform task scheduling and parallelism. At the application layer, the data model mismatch between the raw data kept on file systems and the data management software of an application impedes cross-platform data processing as well. To support cross-platform integration, we propose and develop the Integrated Data Access System (IDAS) for big computing.
IDAS extends the accessibility of programming models and integrates the HPC environment with the MapReduce/Hadoop data processing environment. Under IDAS, MPI applications and MapReduce applications can share and exchange data across PFS and MRFS transparently and efficiently. Through this sharing and exchange, MPI and MapReduce applications can collaboratively provide both high-performance computing and data processing power for a given application. IDAS achieves its goal in several steps. First, IDAS enhances MPI-IO so that MPI-based applications can access data stored in HDFS efficiently; for instance, we have enhanced HDFS to transparently support N-to-1 file writes for better write concurrency. Second, IDAS enhances the Hadoop framework to enable MapReduce-based applications to process data that resides on PFS transparently. Note that we have carefully chosen the term "enhance" here: MPI-based applications can not only access data stored on HDFS but also continue to access data stored on PFS, and the same holds for MapReduce-based applications. Through these enhancements, we achieve seamless data sharing. In addition, we have integrated data access with several application tools; in particular, we have integrated image plotting, querying, and data subsetting within one application for Earth Science data analysis. Many data centers prefer erasure coding to triplication to achieve data durability, trading data availability for lower storage cost. To this end, we have also investigated performance optimization of the erasure-coded Hadoop system, to enhance the Hadoop system in IDAS.
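The storage trade-off behind preferring erasure coding to triplication is simple arithmetic. As a sketch with hypothetical parameters (the RS(6, 3) layout is a common example, not necessarily the coding scheme studied in IDAS):

```python
def storage_overhead(data_blocks, parity_blocks):
    """Raw-to-logical storage ratio of a Reed-Solomon RS(k, m) stripe:
    k data blocks plus m parity blocks, tolerating the loss of any m blocks."""
    return (data_blocks + parity_blocks) / data_blocks

# 3-way replication stores every byte 3 times (overhead 3.0, tolerates 2 losses);
# RS(6, 3) tolerates 3 losses per stripe at half the raw storage cost.
print(storage_overhead(6, 3))  # 1.5
```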
Ph.D. in Computer Science, July 2016
- Title
- QUALITY-OF-SERVICE AWARE SCHEDULING AND DEFECT TOLERANCE IN REAL-TIME EMBEDDED SYSTEMS
- Creator
- Li, Zheng
- Date
- 2015, 2015-05
- Description
-
For real-time embedded systems, such as control systems used in the medical, automotive, and avionics industries, deployed tasks often have stringent real-time, reliability, and energy-consumption constraints. How to schedule real-time tasks under various QoS constraints is a challenging issue that has drawn attention from the research community for decades. In this thesis, we study task execution strategies that not only minimize system energy consumption but also guarantee that task deadlines and reliability requirements are satisfied. We first consider the scenario in which all tasks are of the same criticality. For this case, two task execution strategies are developed: a checkpointing-based strategy and a task re-execution-based strategy. Second, for the scenario in which tasks are of different criticalities, a heuristic-search-based energy minimization strategy is proposed. When tasks are of different criticalities, a commonly used approach to guaranteeing high-criticality task deadlines is to remove low-criticality tasks whenever the system is overloaded. With such an approach, the QoS provided to low-criticality tasks is rather poor: they can suffer a high deadline miss rate and receive less accumulated execution time. To overcome this shortcoming, we develop a time-reservation-based scheduling algorithm and a two-step optimization algorithm to meet high-criticality task deadlines while minimizing the low-criticality task deadline miss rate and maximizing their accumulated execution time, respectively. As many-core techniques mature, many real-time embedded systems are built upon many-core platforms. However, many-core platforms have a high wear-out failure rate. Hence, the last issue addressed in the thesis is how to replace defective cores on many-core platforms so that deployed applications' real-time properties can be maintained.
We develop an offline and an online application-aware system reconfiguration strategy to minimize the impact of physical-layer changes on deployed real-time applications. All the developed approaches are evaluated through extensive simulations. The results indicate that they address the identified problems more effectively than existing approaches in the literature.
Ph.D. in Computer Science, May 2015
- Title
- COOPERATIVE BATCH SCHEDULING FOR HPC SYSTEMS
- Creator
- Yang, Xu
- Date
- 2017, 2017-05
- Description
-
The batch scheduler is an important piece of system software serving as the interface between users and HPC systems. Users submit their jobs via a batch scheduling portal, and the batch scheduler makes a scheduling decision for each job based on its request for system resources and on system availability. Jobs submitted to HPC systems are usually parallel applications whose lifecycle consists of multiple running phases, such as computation, communication, and input/output. Running such parallel applications thus involves various system resources, such as power, network bandwidth, I/O bandwidth, and storage, and most of these resources are shared among concurrently running jobs. However, today's batch schedulers do not take the contention and interference between jobs over these resources into consideration when making scheduling decisions, which has been identified as one of the major culprits behind both system and application performance variability. In this work, we propose a cooperative batch scheduling framework for HPC systems. The motivation of our work is to take important factors about jobs and the system, such as job power, job communication characteristics, and network topology, into account in order to make orchestrated scheduling decisions that reduce contention between concurrently running jobs and alleviate performance variability. Our contributions are the design and implementation of several coordinated scheduling models and algorithms that address some chronic issues in HPC systems. The proposed models and algorithms have been evaluated by means of simulation using workload traces and application communication traces collected from production HPC systems. Preliminary experimental results show that our models and algorithms can effectively improve application and overall system performance, reduce HPC facilities' operating costs, and alleviate the performance variability caused by job interference.
Ph.D. in Computer Science, May 2017
- Title
- SYSTEM SUPPORT FOR RESILIENCE IN LARGE-SCALE PARALLEL SYSTEMS: FROM CHECKPOINTING TO MAPREDUCE
- Creator
- Jin, Hui
- Date
- 2012-05-31, 2012-05
- Description
-
High-performance computing (HPC) has passed the petascale mark and is moving toward exascale. As system ensemble sizes continue to grow, failures are the norm rather than the exception during the execution of parallel applications. Resilience is widely recognized as one of the key obstacles on the road to exascale computing. Checkpointing is currently the de facto fault-tolerance mechanism for parallel applications. However, parallel checkpointing at scale usually generates bursts of concurrent I/O requests, imposes considerable overhead on I/O subsystems, and limits the scalability of parallel applications. Although doubts about the feasibility of checkpointing continue to grow, no promising alternative is yet on the horizon to replace it. MapReduce is a new programming model for massive data processing. It has demonstrated compelling potential to reshape the landscape of HPC from various perspectives. The resilience of MapReduce applications, and its potential to benefit HPC fault tolerance, are active research topics that require extensive investigation. This thesis work targets building a systematic framework to support resilience in large-scale parallel systems. We address the identified checkpointing performance issue through a three-fold approach: reduce the I/O overhead, exploit storage alternatives, and determine the optimal checkpointing frequency. This three-fold approach is achieved with three different mechanisms, namely system coordination and scheduling, the utilization of the MapReduce framework, and stochastic modeling. To address the increasing concerns about MapReduce resilience, we also strive to improve the reliability of MapReduce applications, and we investigate the trade-offs in programming model selection (e.g., MPI vs. MapReduce) from the perspective of resilience.
This thesis provides a thorough study and a practical solution to the outstanding resilience problem of large-scale MPI-based HPC applications and beyond. It makes a noticeable contribution to the state of the art and opens a new research direction for many to follow.
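A classical way to determine a checkpointing frequency from a stochastic failure model is Young's first-order approximation, τ ≈ √(2·C·MTBF), where C is the checkpoint cost. This is a well-known textbook formula offered here only as a hedged illustration; it is not necessarily the model derived in the thesis:

```python
import math

def young_checkpoint_interval(checkpoint_cost_s, mtbf_s):
    """Young's first-order optimal checkpoint interval (seconds):
    tau = sqrt(2 * C * MTBF), where C is the time to write one
    checkpoint and MTBF is the system mean time between failures."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)

# Hypothetical system: 10-minute checkpoints, 24-hour system MTBF
tau = young_checkpoint_interval(600, 24 * 3600)
print(tau / 3600)  # roughly 2.8 hours between checkpoints
```

As the MTBF shrinks with growing ensemble size, the optimal interval shrinks only as its square root, which is one way to see why checkpoint I/O bursts dominate at scale.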
Ph.D. in Computer Science, May 2012
- Title
- WIRELESS SCHEDULING IN MULTI-CHANNEL MULTI-RADIO MULTIHOP WIRELESS NETWORKS
- Creator
- Wang, Zhu
- Date
- 2014, 2014-07
- Description
-
Maximum multiflow (MMF) and maximum concurrent multiflow (MCMF) in multi-channel multi-radio (MC-MR) wireless networks have been well studied in the literature. They are NP-hard even in single-channel single-radio (SC-SR) wireless networks when all nodes have uniform (and fixed) interference radii and the positions of all nodes are available. This dissertation studies MMF and MCMF in multi-channel multi-radio multihop wireless networks under the protocol interference model in the bidirectional or unidirectional mode. We introduce a fine-grained network representation of multi-channel multi-radio multihop wireless networks and present some essential topological properties of its associated conflict graph. It was proved that if the number of channels is bounded by a constant (which is typical in practical networks), both MMF and MCMF admit a polynomial-time approximation scheme under the protocol interference model in the bidirectional mode, or in the unidirectional mode under some additional mild conditions. However, the running time of these algorithms grows quickly with the number of radios per node (at least in the sixth order) and the number of channels (at least in the cubic order). Such poor scalability stems intrinsically from the exploding size of the fine-grained network representation upon which those algorithms are built. In Chapter 2 of this dissertation, we introduce a new structure, termed the concise conflict graph, defined directly on the node-level links. This structure succinctly captures the essential advantage of multiple radios and multiple channels. By exploring and exploiting the rich structural properties of concise conflict graphs, we are able to develop fast and scalable link scheduling algorithms for either minimizing the communication latency or maximizing the (concurrent) multiflow.
These algorithms have running times growing linearly in both the number of radios per node and the number of channels, while not sacrificing the approximation bounds. While the algorithms we develop in Chapter 2 admit a polynomial-time approximation scheme (PTAS) when the number of channels is bounded by a constant, such a PTAS is quite infeasible in practice. Other than the PTAS, all other known approximation algorithms, in both SC-SR and MC-MR wireless networks, resort to solving a polynomial-sized linear program (LP) exactly. The scalability of their running time is fundamentally limited by general-purpose LP solvers. In Chapter 3 of this dissertation, we first introduce the concepts of interference costs and prices of a path and explore their relations with the maximum (concurrent) multiflow. Then we develop purely combinatorial approximation algorithms that compute a sequence of least-interference-cost routing paths along which the flows are routed. These algorithms are faster and simpler, and they achieve nearly the same approximation bounds known in the literature. This dissertation also explores the stability analysis of two link schedulings in MC-MR wireless networks under the protocol interference model in the bidirectional or unidirectional mode. Longest-queue-first (LQF) link scheduling is a greedy link scheduling for multihop wireless networks. Its stability performance in SC-SR wireless networks has been well studied recently, but its stability performance in MC-MR wireless networks is largely under-explored. We present a closed-form stability subregion of LQF scheduling in MC-MR wireless networks, which is within a constant factor of the network stability region. We also obtain constant lower bounds on the efficiency ratio of LQF scheduling in MC-MR wireless networks under the protocol interference model in the bidirectional or unidirectional mode.
Static greedy link schedulings have much simpler implementations than dynamic greedy link schedulings such as longest-queue-first (LQF) link scheduling. However, their stability performance in MC-MR wireless networks is likewise largely under-explored. In this dissertation, we present a closed-form stability subregion of a static greedy link scheduling in MC-MR wireless networks under the protocol interference model in the bidirectional mode. By adopting certain special static link orderings, the stability subregion is within a constant factor of the stable capacity region of the network. We also obtain constant lower bounds on the throughput efficiency ratios of static greedy link schedulings under these special static link orderings.
Ph.D. in Computer Science, July 2014
- Title
- APPLICATION-AWARE OPTIMIZATIONS FOR BIG DATA ACCESS
- Creator
- Yin, Yanlong
- Date
- 2014, 2014-07
- Description
-
Many high-performance computing (HPC) applications spend a significant portion of their execution time accessing data in files, and they are becoming increasingly data-intensive. For them, I/O performance is a significant bottleneck, leading to wasted CPU cycles and correspondingly wasted energy. Various optimization techniques exist to improve data access performance. However, existing general-purpose optimization techniques cannot satisfy diverse applications' demands. On the other hand, application-specific optimization is usually a difficult task, owing to the complexity involved in understanding the parallel I/O system and the applications' I/O behaviors. To address these challenges, this thesis proposes an application-aware data access optimization framework and claims that it is feasible and useful to utilize applications' characteristics to improve the performance and efficiency of the parallel I/O system. Under this framework, an optimization may consist of several basic but challenging steps, including capturing the application's characteristics, identifying the causes of I/O performance degradation, and delivering optimization solutions. To make these steps easier, we design and implement the IOSIG toolkit as essential system support for the default parallel I/O system. The toolkit is able to profile applications' I/O behaviors and then generate comprehensive characteristics through trace analysis. With the help of IOSIG, we design several optimization techniques for data layout optimization, data reorganization, and I/O scheduling. The proposed framework has significant potential to boost application-aware I/O optimization. The results show that the proposed optimization techniques can significantly improve data access performance.
Ph.D. in Computer Science, July 2014
- Title
- THE EUML-ARC PROGRAMMING MODEL
- Creator
- Marth, Kevin
- Date
- 2014, 2014-07
- Description
-
The EUML-ARC programming model shows that the increasing parallelism available on multi-core processors requires evolutionary (not revolutionary) changes in software design. The EUML-ARC programming model combines and extends software technology available even before the introduction of multi-core processors to provide software engineers with the ability to specify software systems that expose abstract, platform-independent parallelism. The EUML-ARC programming model is a synthesis of Executable UML, the Actor model, role-based modeling, split objects, and aspect-based coordination. Computation in the EUML-ARC programming model is structured in terms of semantic entities composed of actor-based agents whose behaviors are expressed in hierarchical state machines. An entity is composed of a base intrinsic agent and multiple extrinsic role agents, all with dedicated conceptual threads of control. Entities interact through their role agents in the context of feature-oriented collaborations orchestrated by coordinator agents. The conceptual threads of control associated with the agents in a software system expose both intra-entity and inter-entity parallelism, which is mapped by the EUML-ARC model compiler to the hardware threads available on the target multi-core processor. The hardware and software efficiency achieved with representative benchmark systems shows that the EUML-ARC programming model and its compiler can exploit multi-core parallelism while providing a productive model-driven approach to software development.
Ph.D. in Computer Science, July 2014
- Title
- CAPACITY BOUNDS FOR LARGE SCALE WIRELESS SENSOR NETWORKS
- Creator
- Tang, Shaojie
- Date
- 2012-11-20, 2012-12
- Description
-
We study the network capacity of large-scale wireless sensor networks under both the Gaussian channel model and the protocol interference model. To study network capacity under the Gaussian channel model, we assume n wireless nodes {v1, v2, · · · , vn} are randomly or arbitrarily distributed in a square region Ba with side length a. We randomly choose ns multicast sessions. For each source node vi, we randomly select k points pi,j (1 ≤ j ≤ k) in Ba, and the node closest to pi,j serves as a destination node of vi. The per-flow unicast (multicast) capacity is defined as the minimum data rate over all unicast (multicast) sessions in the network. We derive achievable upper bounds on the unicast capacity and a (partially achievable) upper bound on the multicast capacity of wireless networks under the Gaussian channel model. We find that the unicast (multicast) capacity of wireless networks under both models has three regimes. Under the protocol interference model, we assume that n wireless nodes are randomly deployed in a square region with side length a and that all nodes have a uniform transmission range r and a uniform interference range R > r. We further assume that each wireless node can transmit/receive at W bits/second over a common wireless channel. For each node vi, we randomly pick k − 1 nodes from the other n − 1 nodes as the receivers of the multicast session rooted at node vi. The aggregated multicast capacity is defined as the total data rate of all multicast sessions in the network. In this work we derive matching asymptotic upper and lower bounds on the multicast capacity of large-scale random wireless networks under the protocol interference model.
Ph.D. in Computer Science, December 2012
- Title
- SCALABLE RESOURCE MANAGEMENT SYSTEM SOFTWARE FOR EXTREME-SCALE DISTRIBUTED SYSTEMS
- Creator
- Wang, Ke
- Date
- 2015, 2015-07
- Description
-
Distributed systems are growing exponentially in computing capacity. On the high-performance computing (HPC) side, supercomputers are predicted to reach exascale, with billion-way parallelism, around the end of this decade. Scientific applications running on supercomputers are becoming more diverse, including traditional large-scale HPC jobs, small-scale HPC ensemble runs, and fine-grained many-task computing (MTC) workloads. Similar challenges are cropping up in cloud computing, as data centers host an ever-growing number of servers, exceeding many of the top HPC systems in production today. The applications commonly found in the cloud are ushering in the era of big data, resulting in billions of tasks that involve processing increasingly large amounts of data. However, the resource management system (RMS) software of distributed systems is still designed around the decades-old centralized paradigm, which is far from satisfying the ever-growing performance and scalability needs at extreme scales, due to the limited capacity of a centralized server. This huge gap between the processing capacity and the performance needs has driven us to develop next-generation RMSs that are orders of magnitude more scalable. In this dissertation, we first devise a general system software taxonomy to explore the design choices of system software, and propose that key-value stores could serve as a building block. We then design distributed RMSs on top of key-value stores. We propose a fully distributed architecture and a data-aware work stealing technique for MTC resource management, and develop the SimMatrix simulator to explore the distributed designs, which informs the real implementation of the MATRIX task execution framework. We also propose a partition-based architecture and resource sharing techniques for HPC resource management, and implement them by building the Slurm++ workload manager and the SimSlurm++ simulator.
We study the distributed designs through real systems at up to thousands of nodes, and through simulations at up to millions of nodes. Results show that the distributed paradigm has significant advantages over the centralized one. We envision that the contributions of this dissertation will be both evolutionary and revolutionary for the extreme-scale computing community, and will lead to a plethora of follow-on research and innovations toward tomorrow's extreme-scale systems.
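The core mechanism behind work stealing in distributed task scheduling can be sketched briefly. This is not the dissertation's data-aware algorithm; the queues, parameters, and victim-selection rule below are hypothetical, illustrating only the basic idea: an idle scheduler samples a few neighbors and steals a batch of tasks from the most heavily loaded one.

```python
# Illustrative work-stealing sketch (hypothetical, NOT the MATRIX algorithm):
# an idle scheduler steals half the tasks of the most loaded sampled neighbor.
import random

random.seed(1)

def steal(queues, idle, num_neighbors=2, steal_ratio=0.5):
    """Move a fraction of tasks from the most loaded sampled neighbor to `idle`."""
    candidates = random.sample(
        [i for i in range(len(queues)) if i != idle], num_neighbors)
    victim = max(candidates, key=lambda i: len(queues[i]))
    n = int(len(queues[victim]) * steal_ratio)
    stolen, queues[victim] = queues[victim][:n], queues[victim][n:]
    queues[idle].extend(stolen)
    return victim, len(stolen)

queues = [[f"t{i}" for i in range(8)], [], ["t8", "t9"]]  # scheduler 1 is idle
victim, count = steal(queues, idle=1)  # scheduler 1 steals from scheduler 0
```

Randomized victim sampling keeps the protocol fully decentralized: no scheduler needs a global view of all queues.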
Ph.D. in Computer Science, July 2015
- Title
- A STEP TOWARD SUPPORTING LONG-RUNNING APPLICATIONS WITH REAL-TIME CONSTRAINTS ON HYBRID CLOUDS
- Creator
- Wu, Hao
- Date
- 2017, 2017-05
- Description
-
The advancement of computer and network technology has brought the world into a new cloud computing era. The "pay-as-you-go" business model and service-oriented models allow users to have "unlimited" resources when needed, and to be free from infrastructure maintenance and software upgrades. Cloud services are currently among the top-ranked high-growth areas in computing and are seeing an acceleration in enterprise adoption, with the worldwide market predicted to reach more than $270b in 2020. According to Google, currently more than 95% of web services are deployed on the cloud. Many different types of applications are deployed on computer clouds. However, due to the inherent performance uncertainty within computer clouds, as of today, applications with real-time and high QoS constraints still operate on traditional computer systems and are not able to benefit from elastic computer clouds. This thesis focuses on both theoretical analysis and real-system implementation of the problem of guaranteeing a real-time application's deadline requirements while minimizing its execution cost on hybrid clouds. Four major problems have been addressed toward moving applications with real-time constraints onto hybrid computer clouds. 1) A minimal slack time and minimal distance (MSMD) scheduling algorithm is developed to minimize the resources needed to guarantee an application's end-to-end deadline requirement using computer clouds. 2) A VM Instance Hour Minimization (IHM) algorithm is developed to reduce the application's execution cost for given schedules. The proposed IHM algorithm can be integrated with common scheduling algorithms used in the literature. In addition, we also evaluated the feasibility of utilizing spot instances to further reduce the application's execution cost without sacrificing QoS guarantees. 3) A reference model for virtual machine launching overhead is developed to predict both system utilization and timing overhead during the VM launching process. 4) A hybrid cloud management tool that integrates the developed algorithms and reference model is developed to support running long-running applications with real-time constraints on hybrid clouds.
Ph.D. in Computer Science, May 2017
- Title
- PERFORMANCE ANALYSIS AND OPTIMIZATION OF LARGE-SCALE SCIENTIFIC APPLICATIONS
- Creator
- Wu, Jingjin
- Date
- 2013, 2013-07
- Description
-
Scientific applications are critical for solving complex problems in many areas of research, and often require a large amount of computing resources in terms of both runtime and memory. Massively parallel supercomputers with ever-increasing computing power are being built to satisfy the needs of large-scale scientific applications. With the advent of the petascale era, there is a widening gap between the computing power of supercomputers and the parallel scalability of many applications. To take full advantage of the massive parallelism of supercomputers, it is indispensable to improve the scalability of large-scale scientific applications through performance analysis and optimization. This thesis work is motivated by cell-based AMR (Adaptive Mesh Refinement) cosmology simulations, in particular the Adaptive Refinement Tree (ART) application. Performance analysis is performed to identify its scaling bottleneck, a performance emulator is designed for efficient evaluation of different load balancing schemes, and topology mapping strategies are explored for performance improvements. More importantly, the exploration of topology mapping mechanisms leads to a generic methodology for network- and multicore-aware topology mapping, and a set of efficient mapping algorithms for popular topologies. These have been implemented in a topology mapping library, TOPOMap, which can be used to support MPI topology functions.
Ph.D. in Computer Science, July 2013
- Title
- TOWARDS COMPREHENSIVE COUNTERMEASURES AGAINST CYBER ATTACKS TO IMPROVE SYSTEM SURVIVABILITY
- Creator
- Wang, Li
- Date
- 2012-11-20, 2012-12
- Description
-
Survivability refers to the capability of a system to fulfill its mission, in a timely manner, in the presence of attacks, failures, or accidents. For many distributed systems, ensuring their survivability under directed attacks is critical. Traffic analysis, conducted by the attacker, could reveal the protocol being carried out by the components. Furthermore, having inferred the protocol, the attacker can use the pattern of the messages as a guide to the most critical components. In this thesis, we first thwart these directed attacks by using message forwarding to reduce traffic differences, thereby diverting attackers from directed attacks to random attacks, which probabilistically prolongs the availability of important components in the system. Then, we investigate how to improve system availability when the system is under random attack. Although the attackers cannot differentiate between critical and non-critical components, they can intelligently decide how to invest their resources by rationally selecting the number of components to attack. In this case, how to maintain system reliability is another challenging issue. This thesis further discusses the attacker-defender problem and analyzes how to maximize system reliability under rational attacks. When one or more system processing elements are compromised by attackers, how to select applications and deploy their tasks to the remaining processing elements so that system availability is maximized is also investigated in this thesis. To be more specific, we assume the applications may have different values toward system availability and may or may not share the same composing tasks, and we present two different approaches, a Genetic Algorithm (GA) based approach and a Max-Min-Min based approach, to solve this problem. The GA-based approach produces near-optimal solutions and can be used off-line when performance is important and timing complexity is not the primary concern, while the Max-Min-Min based approach is computationally efficient and is used when timing is critical.
Ph.D. in Computer Science, December 2012
- Title
- RELIABILITY AND ENERGY ANALYSIS FOR EXTREME SCALE SYSTEMS
- Creator
- Yu, Li
- Date
- 2015, 2015-12
- Description
-
Reliability and energy are two of the top concerns in the development of today's supercomputers. To build a powerful machine while at the same time satisfying reliability requirements and energy constraints, HPC scientists continue to seek a better understanding of system and component behaviors. Toward this end, modern systems are deployed with various monitoring and logging tools to track reliability and energy data during system operations. Since these data contain important information about system reliability and energy, they are valuable resources for understanding system behaviors. However, as system scale and complexity continue to grow, the process from collecting system data to extracting meaningful knowledge out of the overwhelming reliability and energy data faces a number of key challenges. To address these challenges, this work consists of three parts: data preprocessing, data analysis, and advanced modeling.
Ph.D. in Computer Science, December 2015
- Title
- POWER PROFILING, ANALYSIS, LEARNING, AND MANAGEMENT FOR HIGH-PERFORMANCE COMPUTING
- Creator
- Wallace, Sean
- Date
- 2017, 2017-05
- Description
-
As the field of supercomputing continues its relentless push toward greater speeds and higher levels of parallelism, the power consumption of these large-scale systems is steadily transitioning from a burden to a serious problem. While the machines are highly scalable, the buildings, power supplies, etc. are not. Even the most power-efficient systems today consume one to two megawatts per petaflop/s. Multiply that by 1,000 to reach the next generation of supercomputers (i.e., exascale), and the power necessary just to turn the machine on becomes simply impractical. Thus, power has become a primary design constraint for future supercomputing system designs. As such, it has become a matter of paramount importance to understand exactly how current-generation systems utilize power and what implications this has for future systems. As the saying goes, you can't manage what you don't measure. This work addresses several large hurdles in fully understanding the power consumption of current systems and making actionable decisions based on this understanding. First, by leveraging environmental data collected from runs of real leadership-class applications, we analyze power consumption and temperature as they pertain to scale on a production IBM Blue Gene/Q supercomputer. Then, through development of a new power monitoring library, MonEQ, we quantitatively studied how power is consumed in major portions of the system (e.g., CPU, memory, etc.) through profiling of microbenchmarks. Expanding on this, we then studied how scale and network topology affect power consumption for several well-known benchmarks. Wanting to increase the effectiveness of our power monitoring library, we extended it to work with many of the most common classes of hardware available in today's HPC landscape. In doing so, we provided an in-depth analysis of what data is obtainable, what the process of obtaining it is like, and how data from different systems compares.
Next, utilizing the knowledge gained from these experiences, we developed a new scheduling approach that uses power data to effectively keep a production system's power consumption under a user-specified power cap without modification to the applications running on the system. Finally, we extend this scheduling approach to be applicable to more than one objective: the scheduler can now optimize on multiple criteria instead of simply considering system utilization.
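The essence of power-capped scheduling can be sketched in a few lines. This is not the dissertation's scheduler; the names, FIFO policy, and per-job power estimates below are hypothetical, showing only the core admission test: a queued job is launched only while the system's current draw plus the job's estimated draw stays under the cap.

```python
# Illustrative power-cap scheduling sketch (hypothetical, NOT the actual
# scheduler): start queued jobs only while the estimated total power
# consumption stays under a user-specified cap.
from collections import deque

POWER_CAP_W = 1000.0  # user-specified system-wide power cap (watts)

def schedule(queue, running, current_power):
    """Start queued jobs in FIFO order while staying under POWER_CAP_W.

    queue: deque of (job_id, estimated_power_w) pairs.
    running: list of job ids that have been started.
    Returns the updated current power draw.
    """
    while queue and current_power + queue[0][1] <= POWER_CAP_W:
        job_id, est_power = queue.popleft()
        running.append(job_id)
        current_power += est_power
    return current_power

queue = deque([("A", 400.0), ("B", 500.0), ("C", 300.0)])
running = []
power = schedule(queue, running, current_power=0.0)
# "A" and "B" start (900 W); "C" would push the total past the cap and waits.
```

A real scheduler would also release power headroom as jobs finish and could reorder the queue, but the cap check above is the central invariant.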
Ph.D. in Computer Science, May 2017
- Title
- TOWARD A NATURAL GENETIC/EVOLUTIONARY ALGORITHM FOR MULTIOBJECTIVE OPTIMIZATION
- Creator
- Ramasamy, Hariharane
- Date
- 2013, 2013-05
- Description
-
Practical optimization problems often have multiple objectives, which are likely to conflict with each other, and have more than one optimal solution representing the best trade-offs among the competing objectives. Genetic algorithms, which optimize by repeatedly applying genetic operators to a population of possible solutions, have recently been used in multiobjective optimization, but often converge to a single solution that is not necessarily optimal due to a lack of diversity in the population. Current multiobjective genetic and other evolutionary methods prevent this premature convergence by promoting new members that are dissimilar in parameter or objective space. A distance measure, which calculates similarities among the members in either objective or parameter space, is used to degrade the fitness of solutions when they are crowded in a small region. This process forces the algorithm to find new but distinct trade-off points in the objective or parameter space, but is computationally expensive. As the number of objectives or parameters increases, these methods fail to scale up, and they deviate from the motivating concept of the genetic algorithm: natural evolution. We extend the standard genetic algorithm through two simple, yet powerful, changes motivated by natural evolution. In the first method, the algorithm, at each step, randomly or sequentially chooses one of the objectives for optimization; hence the method is called the sequential extended genetic algorithm (SEGA). In the second method, a population is maintained for each objective, and crossover is performed by selecting parents from across populations; this method is called the parallel extended genetic algorithm (PEGA). We applied these methods to test problems from the literature, and to two well-known problems: protein folding and the multiple knapsack problem. We found that our methods discover better trade-off solutions than current multiobjective methods, without increasing the computational complexity of genetic algorithms.
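The SEGA idea described above can be sketched as a standard genetic algorithm whose selection step uses a single, sequentially chosen objective each generation. The toy problem, operators, and parameters below are hypothetical, not taken from the dissertation.

```python
# Minimal SEGA-style sketch (hypothetical operators and test problem):
# each generation, one objective is chosen in round-robin order and used
# alone for tournament selection.
import random

random.seed(0)

OBJECTIVES = [lambda x: x * x, lambda x: (x - 2) ** 2]  # both minimized

def mutate(x):
    return x + random.uniform(-0.1, 0.1)

def crossover(a, b):
    return (a + b) / 2.0

def sega(pop_size=20, generations=200):
    pop = [random.uniform(-5, 5) for _ in range(pop_size)]
    for gen in range(generations):
        f = OBJECTIVES[gen % len(OBJECTIVES)]  # sequential objective choice
        # tournament selection on the single chosen objective
        parents = [min(random.sample(pop, 3), key=f) for _ in range(pop_size)]
        pop = [mutate(crossover(random.choice(parents), random.choice(parents)))
               for _ in range(pop_size)]
    return pop

final = sega()
# With alternating selection pressure, survivors tend toward the region
# between the two single-objective optima (0 and 2).
```

PEGA would instead keep one population per objective and pick crossover parents from different populations; the alternation above is what distinguishes SEGA from a single-objective GA.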
Ph.D. in Computer Science, May 2013
- Title
- Polymorphic Network-on-Chip Datapath Architecture for Reconfigurable Computing Machines
- Creator
- Weber, Joshua
- Date
- 2012-04-18, 2012-05
- Description
-
Polymorphic processors have considerable performance advantages over existing reconfigurable designs: they combine the flexibility and ease of programming of a general-purpose processor with the significant performance gains that can be realized through reconfigurable arrays. Polymorphic processors can be categorized by the level of integration between the general-purpose processor and the reconfigurable array. At coarse levels of integration, the processor and reconfigurable array execute independently and exchange data using bus structures. These systems perform robustly for high-level, data-driven optimizations, allowing large segments of processing to be quickly performed on fast reconfigurable resources; however, the overhead of data transfer between the processor and array limits the benefit of fine-grained optimizations. Other architectures attempt a tight coupling of reconfigurable arrays, placing them within the processor as reconfigurable coprocessors and functional units. This technique allows fine-grained optimization of small-scale, highly repeated computations, but finds it difficult to replicate the gains made in large coarse-grained optimizations. To achieve an even more tightly coupled design than any prior work, the fundamental architecture of the processor is changed: the datapath of the processor is eliminated and replaced with a network-on-chip (NoC) communications framework. This framework connects a system of reconfigurable arrays. Some of these reconfigurable blocks are tasked with executing standard, general-purpose processor computations, emulating the standard pipeline stages of a SPARC processor. Additional reconfigurable blocks are available to the end user to incorporate custom application-specific optimizations.
This new polymorphic NoC datapath (PolyNoC) processor provides a more tightly integrated architecture with significant performance advantages. The PolyNoC processor incorporates both fine- and coarse-grained optimizations, producing a polymorphic processor able to deliver performance improvements for a wide range of target applications. This thesis presents the architectural design of the PolyNoC processor. The unique design constraints resulting from the use of the NoC as a datapath are fully explored, and their impact is incorporated into the design of a suitable NoC for the PolyNoC processor. A cycle-accurate simulator of the PolyNoC processor has been constructed and is used to examine the performance of the PolyNoC processor when executing unmodified, industry-standard benchmark programs. To demonstrate the advantages of application-specific extensions to the processor, accelerators are added for each benchmark. The performance of the PolyNoC processor is promising.
Ph.D. in Computer Science, May 2012