Scalable runtime support for data-intensive applications on the single-chip cloud computer

Published in Many-core Applications Research Community (MARC) Symposium, 2011

Link: https://pure.qub.ac.uk/portal/files/3644926/marc11.pdf

Abstract

Many-core processors, due to their complexity and diversity, necessitate high-productivity, domain-specific approaches to parallel programming. These approaches should hide architectural details and low-level parallelization constructs, while enabling scalability and performance portability. This paper presents a scalable implementation of MapReduce, a runtime system used widely by domain-specific languages for large-scale data processing, on the Intel Single-Chip Cloud Computer (SCC). We address the scalability bottlenecks of MapReduce with data partitioning, combining and sorting algorithms that we customize for the SCC network-on-chip architecture. We achieve linear or superlinear speedups for representative MapReduce workloads with data sets that fit on a single SCC node. We also show that the SCC node outperforms the IBM Cell QS22 Blade when the latter uses the fastest implementation of MapReduce available for the Cell processor.
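For readers unfamiliar with the three stages the abstract names (partitioning, combining and sorting), the following is a minimal, generic word-count sketch in Python. It is not the paper's SCC runtime: the function names (`partition`, `map_and_combine`, `reduce_and_sort`, `word_count`) are hypothetical, workers are simulated sequentially rather than run on separate cores, and no SCC-specific message passing is modeled.

```python
import zlib
from collections import Counter


def partition(key: str, num_workers: int) -> int:
    """Assign a key to a worker with a deterministic hash (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % num_workers


def map_and_combine(chunk: str, num_workers: int) -> list:
    """Map one input chunk to (word, 1) pairs and combine them locally.

    Combining before the exchange shrinks the data each worker would
    have to send: one count per distinct word instead of one per occurrence.
    """
    buckets = [Counter() for _ in range(num_workers)]
    for word in chunk.split():
        buckets[partition(word, num_workers)][word] += 1
    return buckets


def reduce_and_sort(incoming) -> list:
    """Merge the partial counts destined for one worker and sort by key."""
    total = Counter()
    for partial in incoming:
        total.update(partial)  # Counter.update adds counts
    return sorted(total.items())


def word_count(chunks: list, num_workers: int) -> list:
    """Run map/combine on every chunk, then reduce/sort per partition."""
    mapped = [map_and_combine(chunk, num_workers) for chunk in chunks]
    return [
        reduce_and_sort(buckets[w] for buckets in mapped)
        for w in range(num_workers)
    ]


if __name__ == "__main__":
    for worker_id, part in enumerate(word_count(["a b a", "b c"], 2)):
        print(worker_id, part)
```

Because every key hashes to exactly one partition, each simulated worker can reduce and sort its share independently; concatenating the per-worker lists gives the complete result.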

Recommended citation: Papagiannis, Anastasios, and Dimitrios S. Nikolopoulos. “Scalable runtime support for data-intensive applications on the single-chip cloud computer.” In 3rd Many-core Applications Research Community (MARC) Symposium, vol. 7598. KIT Scientific Publishing, 2011.