Search results
- Title
- Utilizing Concurrent Data Accesses for Data-Driven and AI Applications
- Creator
- Lu, Xiaoyang
- Date
- 2024
- Description
-
In the evolving landscape of data-driven and AI applications, the imperative to reduce data access delay has never been more critical, especially as these applications increasingly underpin modern daily life. Traditionally, architectural optimizations in computing systems have concentrated on data locality, exploiting temporal and spatial locality to improve data access performance by maximizing the reuse of data and data blocks. However, because poor locality is a common characteristic of data-driven and AI applications, exploiting data access concurrency emerges as a promising avenue for optimizing the performance of these evolving workloads.

This dissertation advocates utilizing concurrent data accesses to enhance the performance of data-driven and AI applications, addressing a significant research gap in the integration of data concurrency for performance improvement. It introduces a suite of innovative case studies: a prefetching framework that dynamically adjusts its aggressiveness based on data concurrency; a cache partitioning framework that balances application demands with concurrency; a concurrency-aware cache management framework that reduces costly cache misses; a holistic cache management framework that considers both data locality and concurrency to fine-tune its decisions; and an accelerator design for sparse matrix multiplication that optimizes adaptive execution flow and incorporates concurrency-aware cache optimizations.

Our comprehensive evaluations demonstrate that the implemented concurrency-aware frameworks significantly enhance the performance of data-driven and AI applications by leveraging data access concurrency. Specifically, our prefetching framework boosts performance by 17.3%, our cache partitioning framework surpasses locality-based approaches by 15.5%, and our cache management framework achieves a 10.3% performance increase over prior work. Furthermore, our holistic cache management framework improves performance further, achieving a 13.7% speedup, and our sparse matrix multiplication accelerator outperforms existing accelerators by a factor of 2.1.

As optimizing data locality in data-driven and AI applications becomes increasingly challenging, this dissertation demonstrates that exploiting concurrency can still yield significant performance gains, offering new insights and actionable examples for the field. It not only bridges the identified research gap but also establishes a foundation for further exploration of the full potential of concurrency in data-driven and AI applications and architectures, aiming to meet the evolving performance demands of modern and future computing systems.
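To make the idea of concurrency-driven prefetch aggressiveness concrete, here is a minimal sketch of one plausible policy: raise the prefetch degree when few misses are in flight (latency-bound, spare memory-level parallelism) and throttle it when many are outstanding (memory system saturated). This is an illustrative assumption, not the dissertation's actual framework; the class name, watermarks, and update rule are all hypothetical.

```python
# Hypothetical sketch of a concurrency-aware prefetcher policy.
# All names and thresholds are illustrative, not from the dissertation.

class ConcurrencyAwarePrefetcher:
    """Adjust prefetch degree from observed data-access concurrency
    (approximated here by the number of outstanding cache misses)."""

    def __init__(self, min_degree=1, max_degree=8,
                 high_watermark=12, low_watermark=4):
        self.degree = min_degree      # current prefetch aggressiveness
        self.min_degree = min_degree
        self.max_degree = max_degree
        self.high = high_watermark    # misses above this -> throttle
        self.low = low_watermark      # misses below this -> go deeper

    def update(self, outstanding_misses):
        """Called once per epoch with the observed in-flight miss count."""
        if outstanding_misses >= self.high and self.degree > self.min_degree:
            self.degree -= 1          # bandwidth saturated: back off
        elif outstanding_misses <= self.low and self.degree < self.max_degree:
            self.degree += 1          # spare concurrency: prefetch deeper
        return self.degree

    def prefetch_addresses(self, miss_addr, stride=64):
        """Issue `degree` sequential prefetches past the missing line."""
        return [miss_addr + stride * i for i in range(1, self.degree + 1)]
```

Under this sketch, a phase with few outstanding misses gradually deepens prefetching, while a burst of concurrent misses pulls the degree back down, which is the qualitative behavior the abstract attributes to adjusting aggressiveness by data concurrency.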