Data Parallel Algorithmic Skeletons with Accelerator Support

Ernsting, S; Kuchen, H

Abstract
Hardware accelerators such as GPUs or the Intel Xeon Phi comprise hundreds or thousands of cores on a single chip and promise to deliver high performance. They are widely used to boost the performance of highly parallel applications. However, because their architectures diverge, programmers face diverging programming paradigms. They also have to deal with low-level concepts of parallel programming, which makes development cumbersome. To assist programmers in developing parallel applications, algorithmic skeletons have been proposed. They encapsulate well-defined, frequently recurring parallel programming patterns, thereby shielding programmers from low-level aspects of parallel programming. The main contribution of this paper is a comparison of two skeleton library implementations, one in C++ and one in Java, in terms of library design and programmability. In addition, we evaluate the performance of both implementations with four benchmark applications on two test systems, a GPU cluster and a Xeon Phi system. The two implementations achieve comparable performance, with a slight advantage for the C++ implementation. Xeon Phi performance ranges between CPU and GPU performance.

Publication type

Research article (journal)

Peer-reviewed

Yes

Publication status

Published

Year

2016

Journal

International Journal of Parallel Programming

Volume

2016

First page

1

Last page

17

Language

English

ISSN

1573-7640

DOI

https://doi.org/10.1007/s10766-016-0416-7

Full text

http://dx.doi.org/