Scheduling large numbers of jobs/tasks over large-scale distributed systems plays a significant role in achieving high system utilization and throughput. Today’s state-of-the-art job management/scheduling systems predominantly use master/slave architectures, which have inherent limitations, such as scalability issues at extreme scales (e.g. petascale and beyond) and single points of failure. In designing a next-generation job management system that addresses both of these limitations, we argue that job scheduling and management must be distributed; however, distributed job management introduces new challenges, such as non-trivial load balancing. This thesis proposes an adaptive work stealing technique to achieve distributed load balancing at extreme scales, from those found in today’s petascale systems to those of tomorrow’s exascale systems. This thesis also presents the design, analysis, and implementation of a distributed execution fabric called MATRIX (MAny-Task computing execution fabRIc at eXascales). MATRIX uses the adaptive work stealing algorithm for distributed load balancing and distributed hash tables for managing task metadata. MATRIX supports both high-performance computing (HPC) and many-task computing (MTC) workloads. We have validated it using synthetic workloads on up to 4K cores of an IBM BlueGene/P supercomputer. Results show that high efficiencies (e.g. 90%+) are possible with certain workloads. We study the performance of MATRIX in depth, including the network traffic generated by the work stealing algorithm. Simulation results at scales of up to 1M nodes show that work stealing is a scalable and efficient load balancing approach for systems ranging from many-core architectures to extreme-scale distributed systems.

M.S. in Computer Science, May 2013