Holistic Workload Scaling: A New Approach to Compute Acceleration in the Cloud

Perez, JF; Chen, LY; Villari, M; Ranjan, R

doi:10.1109/MCC.2018.011791711

Lookup NU author(s): Professor Raj Ranjan

Licence

This is the final published version of an article that has been published in its final definitive form by IEEE, 2018.

For re-use rights please refer to the publisher's terms and conditions.

Abstract

© 2014 IEEE. Workload scaling accelerates computation, and thus improves response times, by replicating the same request, processing the replicas in parallel on multiple nodes, and accepting the result from the first node to finish. This is not unlike a TV game show, where the same question is given to multiple contestants and the (correct) answer is accepted from the first to respond. This differs from traditional parallelization strategies, as used in, say, MapReduce workloads, where each node runs a subset of the overall workload. A variety of strategies trade off metrics such as cost, utilization, performance, and interprocessor communication requirements. Performance modeling can help determine optimal approaches for different environments and goals. This is important because poor performance can lead to application- and domain-specific losses, such as lost e-commerce conversions and sales. Performance modeling and analysis play an important role in designing and driving the selection of resource scaling mechanisms. Such modeling and analysis are complex due to time-varying workload arrival rates and request sizes, and even more complex in cloud environments due to the additional stochastic variation caused by performance interference from resource sharing across co-located tenants. Moreover, little is known about how to multi-scale, i.e., dynamically and simultaneously scale resources vertically, horizontally, and through workload scaling. In this article, we first demonstrate the effectiveness of multi-scaling in reducing latency, and then discuss the performance modeling challenges, particularly for workload scaling.
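
The replication idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `serve` and `replicate`, the use of threads to stand in for nodes, and the fixed per-node delays are all assumptions made purely for illustration.

```python
import concurrent.futures as cf
import time


def serve(node_id, request, delay):
    """Hypothetical node: takes `delay` seconds, then answers the request."""
    time.sleep(delay)
    return node_id, request.upper()


def replicate(request, delays):
    """Send the same request to one simulated node per entry in `delays`
    and return the (node_id, answer) pair of the first node to finish."""
    pool = cf.ThreadPoolExecutor(max_workers=len(delays))
    futures = [pool.submit(serve, i, request, d) for i, d in enumerate(delays)]
    # Block only until the fastest replica has produced a result.
    done, _ = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
    # Return without waiting for the slower replicas; their work is wasted,
    # which is the cost side of the trade-off.
    pool.shutdown(wait=False)
    return next(iter(done)).result()
```

Calling `replicate("ping", [0.3, 0.05, 0.2])` returns the answer from the node with the 0.05 s delay, so the response time is set by the fastest replica rather than by any single node. Every replica still consumes capacity, which is why the choice of replication level is a trade-off between latency and cost or utilization.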

Publication metadata

Author(s): Perez JF, Chen LY, Villari M, Ranjan R

Publication type: Article

Publication status: Published

Journal: IEEE Cloud Computing

Year: 2018

Volume: 5

Issue: 1

Pages: 20-30

Online publication date: 28/03/2018

Acceptance date: 02/04/2016

Date deposited: 07/06/2018

ISSN (electronic): 2325-6095

Publisher: IEEE

URL: https://doi.org/10.1109/MCC.2018.011791711

DOI: 10.1109/MCC.2018.011791711

Funding

Funder reference	Funder name
167266
407540
