Artificial Intelligence (AI) holds great promise in healthcare. It provides a variety of advantages in clinical diagnosis, disease prediction, and treatment, with interest intensifying in the medical imaging field. AI can automate various cumbersome data processing techniques in medical imaging, such as segmentation of left ventricular chambers and image-based classification of diseases. However, full clinical implementation and adoption of emerging AI-based tools face challenges due to the inherently opaque nature of AI algorithms based on Deep Neural Networks (DNN), for which computer-trained bias is not only difficult for physician users to detect but is also difficult to safely design for in software development. In this work, we examine AI application in Cardiac Magnetic Resonance (CMR) using an automated image classification task, and propose an AI quality control framework design that differentially evaluates the black-box DNN via carefully prepared input data with shape and fidelity variations to probe system responses to these variations. Two variants of the Visual Geometry Group network with 19 layers (VGG19) were used for classification, with a total of 60,000 CMR images. Findings from this work provide insights into the importance of quality training data preparation and demonstrate the importance of data shape variability. They also provide a gateway for computational performance optimization in training and validation time.
This study aimed to verify whether a low-coverage genome can work as an effective approach to isolating Lepidopteran microsatellites. Microsatellites are a useful tool for studying population genetics, and many Lepidopteran agricultural pests cause huge economic damage every year. In addition, Lepidoptera have abundant similar flanking sequences, which makes it difficult to develop reliable microsatellites. However, there are not enough published genomes of Lepidoptera species. If low-coverage Lepidopteran genomes can be used to isolate reliable microsatellites, they would be an effective and economical approach for microsatellite isolation, because low-coverage genome sequencing is much cheaper and less time-consuming than the sequencing behind published genomes.
Photograph of the Aaron Galleries Booth at the Art 20 exhibition, at Park Place Armory in 2006, including Mary Henry's painting The Chelsea Way visible at center. Inscription on verso: "Art 20 - Park Ave. Armory 2006 Mary Henry 'The Chelsea Way' on the aisle Aaron Galleries Booth."
Photograph of the Aaron Galleries Booth at the Art 20 exhibition, at Park Place Armory in 2006, including Mary Henry's painting The Chelsea Way visible at center right. Inscription on verso: "Art 20 - Park Ave. Armory 2006 Mary Henry 'The Chelsea Way' on the aisle Aaron Galleries Booth."
Photograph of the Aaron Galleries Booth at the Art 20 exhibition, at Park Place Armory in 2006, including Mary Henry's painting The Chelsea Way visible at right. Inscription on verso: "Art 20 - Park Ave. Armory 2006 Mary Henry 'The Chelsea Way' on the aisle Aaron Galleries Booth."
Cluster scheduling plays a crucial role in high-performance computing (HPC). It is responsible for allocating resources and determining the order in which jobs are executed. Existing HPC job schedulers typically leverage simple heuristics to schedule jobs, but such scheduling policies struggle to keep pace with modern changes and technology trends. This dissertation is motivated by two new trends in the HPC community: the rapid growth of heterogeneous system infrastructure and the emergence of artificial intelligence (AI) technologies. First, existing scheduling policies are solely CPU-centric. In contrast, systems are becoming more complex and heterogeneous, and emerging workloads have diverse resource requirements, such as CPU, burst buffer, power, and network bandwidth. Second, previous heuristic scheduling approaches are manually designed. Such a manual design process prevents adaptive and informative scheduling decisions. A recent trend in HPC is to intertwine AI to better leverage the investment in supercomputers. This embrace of AI provides opportunities to design more intelligent scheduling methods.
In this dissertation, we propose an efficient and practical cluster scheduling framework for HPC systems. Our framework leverages AI technologies and considers system heterogeneity. The framework comprises four major components. First, shared network systems such as dragonfly-based systems are vulnerable to performance variability due to network sharing. To mitigate workload interference on these shared network systems, we explore a dedicated scheduling policy. Next, emerging workloads in HPC have diverse resource requirements instead of being CPU-centric. To cater to this, we design an intelligent scheduling agent for multi-resource scheduling in HPC, leveraging an advanced multi-objective reinforcement learning (MORL) algorithm. Subsequently, we address the issues with existing state encoding approaches in RL-driven scheduling, which either lack critical scheduling information or suffer from poor scalability. To this end, we present an efficient and scalable encoding model. Lastly, the lack of interpretability of RL methods poses a significant challenge to deploying RL-driven scheduling in production systems. In response, we provide a simple, deterministic, and easily understandable model for interpreting RL-driven scheduling. The proposed models and algorithms are evaluated with real job traces from production supercomputers. Experimental results show that our schemes can effectively improve job scheduling in terms of both user satisfaction and system utilization.