Abstract
GPUs support only simple coherence operations, so heavyweight synchronization operations must be used explicitly to ensure data integrity. Scoped synchronization (Hower et al., 2014; Gaster et al., 2015) was introduced to enable low-overhead communication when the communication is local to a subset of threads. Scoped synchronization can be used when the sharing is static. When the sharing is dynamic, however, heavyweight global-scoped synchronization cannot be avoided. Asymmetric sharing is one such dynamic sharing pattern, in which shared data is accessed frequently by a single agent and only rarely by remote agents. Without special support, it requires global-scoped synchronization that encompasses all possible synchronization agents on a GPU device. Remote scope promotion (RSP) (Orr et al., 2015) allows lightweight local-scoped synchronization to be used for the frequent accesses and defers most of the synchronization overhead to the rare remote accesses. We propose sRSP, an efficient and scalable implementation of RSP semantics. sRSP tracks local-scoped synchronizations at local caches. When performing remote-scoped synchronization, it applies the costly cache operations selectively. We thoroughly evaluate our work and show that it significantly reduces remote synchronization overhead and scales better. On average, it improves performance by around 25% for a GPU device with 64 compute units.
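The selective idea described in the abstract can be illustrated with a toy cost model. This is a minimal sketch under stated assumptions, not the hardware design evaluated in the paper: the names `ComputeUnit`, `local_sync`, `remote_sync_broadcast`, `remote_sync_selective`, and `demo` are hypothetical, a "flush" stands in for whatever costly cache operation the real mechanism performs, and the broadcast baseline is only an assumed point of comparison.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeUnit:
    """Toy model of one compute unit with a private local cache (assumption)."""
    cu_id: int
    # Addresses on which this unit has performed local-scoped synchronization
    # since its last flush.
    tracked: set = field(default_factory=set)
    flushes: int = 0  # count of costly cache operations applied to this unit

    def local_sync(self, addr: int) -> None:
        # Cheap local-scoped synchronization: only record the address.
        self.tracked.add(addr)

    def flush(self) -> None:
        # Costly cache operation; afterwards nothing remains to be tracked.
        self.flushes += 1
        self.tracked.clear()


def remote_sync_broadcast(cus, requester_id, addr):
    """Assumed baseline: apply the costly operation to every other unit."""
    targets = [cu for cu in cus if cu.cu_id != requester_id]
    for cu in targets:
        cu.flush()
    return len(targets)


def remote_sync_selective(cus, requester_id, addr):
    """Selective variant: touch only units that tracked a local sync on addr."""
    targets = [cu for cu in cus
               if cu.cu_id != requester_id and addr in cu.tracked]
    for cu in targets:
        cu.flush()
    return len(targets)


def demo(num_cus=64):
    """Asymmetric sharing: unit 0 syncs locally many times, unit 5 once remotely."""
    addr = 0x100
    results = {}
    for name, remote_sync in (("broadcast", remote_sync_broadcast),
                              ("selective", remote_sync_selective)):
        cus = [ComputeUnit(i) for i in range(num_cus)]
        for _ in range(1000):
            cus[0].local_sync(addr)  # frequent, cheap owner accesses
        results[name] = remote_sync(cus, requester_id=5, addr=addr)
    return results


if __name__ == "__main__":
    print(demo())
```

In this model the number of caches touched by a remote synchronization depends on how many units actually synchronized locally on the address, rather than on the device size, which is the intuition behind the scalability claim. The model says nothing about real latencies or the reported 25% figure.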
Original language | English |
---|---|
Article number | e6483 |
Journal | Concurrency and Computation: Practice and Experience |
Volume | 34 |
Issue number | 9 |
DOIs | |
Publication status | Published - 25 Apr 2022 |
Bibliographical note
Publisher Copyright: © 2021 John Wiley & Sons Ltd.
Funding
This research is supported by The Scientific and Technological Research Council of Turkey (TUBITAK) under the Career Development Program (CAREER), grant no. 117E053. A preliminary version of this work was presented at the sixth High Performance Computing Conference (BASARIM 2020).
Funders | Funder number |
---|---|
TUBITAK | 117E053 |
Türkiye Bilimsel ve Teknolojik Araştırma Kurumu | |
Keywords
- asymmetric synchronization
- graphics processing units
- remote scope promotion
- work stealing