Lookup NU author(s): Mohammed Al-Hayanni, Professor Rishad Shafik, Dr Ashur Rafiev, Dr Fei Xia, Professor Alex Yakovlev
This is the authors' accepted manuscript of a conference proceedings (inc. abstract) that has been published in its final definitive form by IEEE, 2017.
For re-use rights please refer to the publisher's terms and conditions.
Traditional speedup models, such as Amdahl's law, facilitate the study of the impact of running parallel workloads on many-core systems. However, these models are typically based on software characteristics and assume ideal hardware behavior. As such, their applicability to energy- and/or performance-driven system optimization is limited by two factors. Firstly, speedup cannot be measured without instrumenting the original software code, and secondly, the parallelization factor of an application running on specific hardware is generally unknown. In this paper, we propose a novel method whereby the standard performance counters found in modern many-core platforms can be used to derive speedup without instrumenting applications for time measurements. We postulate that speedup can be accurately estimated as the ratio of the instructions per cycle of a parallel many-core system to the instructions per cycle of a single-core system. By studying both application instructions and system instructions for the first time, our method leads to the determination of the parallelization factor and the optimal system configuration for energy and/or performance. The method is extensively demonstrated through experiments on three different platforms with core counts ranging from 4 to 61, running parallel benchmark applications (including synthetic and PARSEC benchmarks) on the Linux operating system. Speedup and parallelization estimates obtained using our method, together with their extensive cross-validation, show low errors (up to 8%) on these systems. Additionally, we demonstrate the effectiveness of our method in exploring parallelization-aware, energy-efficient configurations for many-core systems using formulations based on the energy-delay product.
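The relationship described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the parallel IPC is the aggregate across all active cores (total instructions retired divided by elapsed cycles), that the parallelization factor is recovered by inverting the textbook form of Amdahl's law, and that the energy-delay product is energy multiplied by execution time. All function names are hypothetical, and the sketch does not model the paper's distinction between application and system instructions.

```python
def speedup_from_ipc(ipc_parallel: float, ipc_single: float) -> float:
    """Estimate speedup as parallel IPC divided by single-core IPC.

    Assumption: ipc_parallel is aggregated over all active cores.
    """
    if ipc_single <= 0:
        raise ValueError("single-core IPC must be positive")
    return ipc_parallel / ipc_single


def amdahl_speedup(p: float, n_cores: int) -> float:
    """Textbook Amdahl's law: S = 1 / ((1 - p) + p / n)."""
    return 1.0 / ((1.0 - p) + p / n_cores)


def parallel_fraction(speedup: float, n_cores: int) -> float:
    """Invert Amdahl's law for p: p = (1 - 1/S) / (1 - 1/n)."""
    if n_cores < 2:
        raise ValueError("need at least two cores to infer p")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_cores)


def energy_delay_product(power_watts: float, time_s: float) -> float:
    """EDP = energy * delay, with energy taken as average power * time."""
    energy_joules = power_watts * time_s
    return energy_joules * time_s


if __name__ == "__main__":
    # Illustrative counter-derived values only, not measurements from the paper.
    s = speedup_from_ipc(ipc_parallel=3.2, ipc_single=1.0)
    p = parallel_fraction(s, n_cores=4)
    print(f"estimated speedup: {s:.2f}, parallelization factor: {p:.3f}")
```

In practice the two IPC values would come from hardware performance counters (instructions retired and cycles) sampled on a single-core run and on an n-core run of the same workload; the numbers in the usage example above are placeholders.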
Author(s): Al-Hayanni M, Shafik R, Rafiev A, Xia F, Yakovlev A
Publication type: Conference Proceedings (inc. Abstract)
Publication status: Published
Conference Name: 2017 International Conference on High Performance Computing & Simulation
Year of Conference: 2017
Pages: 410-417
Online publication date: 14/09/2017
Acceptance date: 02/06/2017
Date deposited: 18/07/2017
Publisher: IEEE
URL: https://doi.org/10.1109/HPCS.2017.68
DOI: 10.1109/HPCS.2017.68
ISBN: 9781538632505