Energy-Aware Fast Scheduling Heuristics in Heterogeneous Computing Systems

Cesar O. Diaz, Mateusz Guzek, Johnatan E. Pecero, Gregoire Danoy and Pascal Bouvry
CSC Research Unit, University of Luxembourg, Luxembourg
{firstname.lastname}@uni.lu
Samee U. Khan
Department of Electrical and Computer Engineering, North Dakota State University, Fargo, ND 58108
samee.khan@ndsu.edu
ABSTRACT
In heterogeneous computing systems it is crucial to schedule tasks in a manner that exploits the heterogeneity of the resources and applications to optimize system performance. Moreover, the energy efficiency of these systems is of great interest due to concerns such as operational costs and environmental issues associated with carbon emissions. In this paper, we present a series of original low-complexity, energy-efficient algorithms for scheduling. The main idea is to map a task to the machine that executes it fastest while the energy consumption is minimum. On the practical side, the set of experimental results showed that the proposed heuristics perform as efficiently as related approaches, demonstrating their applicability for the considered problem and their good scalability.
KEYWORDS: Heterogeneous computing systems, energy efficiency, scheduling, optimization
978-1-61284-383-4/11/$26.00 ©2011 IEEE
1. INTRODUCTION
Modern-day computing platforms such as grid or cloud computing are composed of many new features that enable sharing, selection, and aggregation of highly heterogeneous resources for solving large-scale and complex real problems. These heterogeneous computing systems (HCS) are widely used as a cheap way of obtaining powerful parallel and distributed systems. However, the electrical power required to run and cool these systems is a major concern: it results in extremely large electricity bills, reduced system reliability, and environmental issues due to carbon emissions [1]. Therefore, energy efficiency in HCS is of great interest.
An HCS comprises different hardware architectures, operating systems, and computing power. In this paper, heterogeneity refers to the processing power of computing resources and to the different requirements of the applications. To take advantage of the different capabilities of a suite of heterogeneous resources, a scheduler commonly allocates the tasks to the resources and determines a date to start the execution of the tasks. In this paper, we assume that energy is the amount of power used over a specific time interval [2]. For each task, the information on its processing time and the voltage rate of the processor to execute one unit of time is sufficient to measure the energy consumption for that task. In this context, we additionally promote the heterogeneity capability of the computing system to efficiently use the energy of the system [3–5]. The main idea is to match each task with the best resource to execute it, that is, the resource that optimizes the completion time of the task and executes it fastest with minimum energy.
The main objective of our work is to contribute to efficient energy consumption in HCS. This target is achieved by providing a new set of scheduling algorithms. These algorithms take advantage of the resource capabilities and feature a very low overhead. The algorithms are based on list scheduling approaches and are considered batch mode dynamic scheduling heuristics [6]. In batch mode, the applications are scheduled after predefined time intervals.
As a part of this work, we compare the proposed algorithms by analyzing the results of numerous simulations featuring high heterogeneity of resources and/or high heterogeneity of applications. Simulation studies are performed to compare these algorithms with the min-min algorithm [7, 8]. We used min-min as a basis of comparison because it is one of the most used algorithms in the literature in the context of HCS, and it performs well [3, 8]. We have considered the minimization of the makespan (i.e., the maximum completion time) and energy as a basis of comparison. Most related work considers only the makespan as a performance criterion. The goal has been to find a feasible schedule such that the total energy consumption over the entire time horizon is as small as possible. This gives insight into effective energy conservation; however, it ignores the important aspect that users typically expect good response times for their jobs [9]. In this context, we also compare these heuristics based on the flowtime. The flowtime of a task is the length of the time interval between the release time and the completion time of the task. Flowtime is commonly used as a quality of service measure that allows guaranteeing good response times. The large set of experimental results shows that the investigated heuristics perform as efficiently as the related approach for most of the studied instances despite their low running time, showing their applicability for the considered scheduling problem.
The remainder of this paper is organized as follows. The system, energy, and scheduling models are introduced in Section 2. Section 3 briefly reviews some related approaches. We provide the resource allocation and scheduling heuristics in Section 4. Experimental results are given in Section 5. Section 6 concludes the paper.
2. MODELS
2.1. System and Application Models
We consider an HCS composed of a set of machines $M = \{m_1, \ldots, m_m\}$. We assume that the machines incorporate an effective energy-saving mechanism for idle time slots [10]. The energy consumption of an idle resource at any given time is set using a minimum voltage based on the processor's architecture. In this paper, we consider two voltage levels: maximum, when the processor is performing work or is in an active state, and idle, when the processor is in an idle state. We consider a set of independent tasks $T = \{t_1, \ldots, t_n\}$ to be executed on the system. Each task is an indivisible unit of workload and has to be processed completely on a single machine.
The computational model we consider in this work is the expected time to compute (ETC) model. In this model, it is assumed that estimates or predictions are available for the computational load of each task, the computing capacity of each resource, and the prior load of the resources. Moreover, we assume that the ETC matrix of size $t \times m$ is known. Each position $ETC[t_i][m_j]$ in the matrix indicates the expected time to compute task $t_i$ on machine $m_j$. This model makes it possible to represent the heterogeneity among tasks and machines.
Machine heterogeneity evaluates the variation of execution times for a given task across the computing resources. Low machine heterogeneity represents computing systems composed of similar computing resources (almost homogeneous). On the contrary, high machine heterogeneity represents computing systems integrated by resources of different types and capacities. Task heterogeneity represents the degree of variation among the execution times of tasks for a given machine. Low task heterogeneity models the case when tasks are quasi-homogeneous (i.e., when the complexity and the computational requirements of the tasks are quite similar), so that they have similar execution times for a given machine. High task heterogeneity describes those scenarios in which different types of applications are submitted for execution in the heterogeneous computing system, ranging from simple applications to complex programs that require a large computational time.
Additionally, the ETC model also tries to reflect the characteristics of different scenarios using different ETC matrix consistencies, defined by the relation between a task and how it is executed on the machines according to the heterogeneity of each one [8]. The scenarios are consistent, semi-consistent, and inconsistent. The consistent scenario models SPMD applications executing with local input data, that is, if a given machine $m_j$ executes any task $t_i$ faster than machine $m_k$, then machine $m_j$ executes all tasks faster than machine $m_k$. The inconsistent scenario represents the most generic scenario for an HCS that receives different tasks, from easy applications to complex parallel programs. Finally, the semi-consistent scenario models those inconsistent systems that include a consistent subsystem.
2.2. Energy Model
The energy model used in this work is derived from the power consumption model in digital complementary metal-oxide semiconductor (CMOS) logic circuitry. The power consumption of a CMOS-based microprocessor is defined to be the summation of capacitive power, which is dissipated whenever active computations are carried out, short-circuit power, and leakage power (static power dissipation). The capacitive power $P_c$ (dynamic power dissipation) is the most significant factor of the power consumption. It is directly related to frequency and supply voltage, and it is defined as [11]:
$$P_c = A \, C_{eff} \, V^2 f, \tag{1}$$
where $A$ is the number of switches per clock cycle, $C_{eff}$ denotes the effective charged capacitance, $V$ is the supply voltage, and $f$ denotes the operational frequency. The energy consumption of any machine in this paper is defined as:
$$E_c = \sum_{i=1}^{n} A \, C_{eff} \, V_i^2 \, f \, ETC[i][M[i]], \tag{2}$$
where $M$ is a vector whose entry $M[i]$ is the machine $m_j$ to which task $t_i$ is allocated, and $V_i$ is the supply voltage of the machine $m_j$. The energy consumption during idle time is defined as:
$$E_i = \sum_{j=1}^{m} \sum_{idle_{jk} \in IDLES_j} A \, C_{eff} \, V_{min_j}^2 \, I_{jk}, \tag{3}$$
where $IDLES_j$ is the set of idling slots on machine $m_j$, $V_{min_j}$ is the lowest supply voltage on $m_j$, and $I_{jk}$ is the amount of idling time for $idle_{jk}$. Then the total energy consumption is defined as:
$$E_t = E_c + E_i. \tag{4}$$
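Equations (2) to (4) can be evaluated directly once a task-to-machine allocation is known. The sketch below is one possible reading, assuming each machine has a single active voltage and a single idle voltage and that idle slots are given as lists of durations; the function name `energy` and the default values $A = C_{eff} = f = 1$ are placeholders chosen for illustration only.

```python
from typing import List, Sequence, Tuple


def energy(
    etc: Sequence[Sequence[float]],
    allocation: Sequence[int],
    v_active: Sequence[float],
    v_idle: Sequence[float],
    idle_slots: Sequence[Sequence[float]],
    a: float = 1.0,      # switches per clock cycle (placeholder value)
    c_eff: float = 1.0,  # effective charged capacitance (placeholder value)
    f: float = 1.0,      # operational frequency (placeholder value)
) -> Tuple[float, float, float]:
    """Return (E_c, E_i, E_t) following Eqs. (2), (3) and (4).

    allocation[i] is the machine index M[i] of task i; idle_slots[j]
    holds the durations I_jk of the idle slots on machine j.
    """
    # Eq. (2): energy spent executing every task on its allocated machine.
    e_c = sum(
        a * c_eff * v_active[j] ** 2 * f * etc[i][j]
        for i, j in enumerate(allocation)
    )
    # Eq. (3): energy spent in idle slots at the lowest supply voltage.
    e_i = sum(
        a * c_eff * v_idle[j] ** 2 * duration
        for j, slots in enumerate(idle_slots)
        for duration in slots
    )
    # Eq. (4): total energy.
    return e_c, e_i, e_c + e_i
```

Because the constants multiply every term, they do not affect comparisons between schedules on the same system; only the voltages, execution times, and idle durations do.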
2.3. Scheduling Model
The scheduling problem is formulated as follows. Formally, we are given a heterogeneous computing system composed of the set of $m$ machines, and the set of $n$ tasks. Any task $t_i$ is scheduled without preemption from time $\sigma(t_i)$ on machine $m_j$, with an execution time $ETC[t_i][m_j]$. The task $t_i$ completes at time $C_i = \sigma(t_i) + ETC[t_i][m_j]$. The objective is to minimize the maximum completion time ($C_{max} = \max(C_i)$), or makespan, with minimum energy $E_t$ used to execute the tasks. Additionally, in this paper we also aim to guarantee good response times. In this context, response time is modeled as flowtime. As already mentioned, the flowtime of a task is the length of the time interval between its release time and its completion time. We consider that the release time is 0 for all the tasks. Hence, the flowtime represents the sum of the completion times of the jobs, that is, $\sum_{i=1}^{n} C_i$, and the aim is to minimize $\sum_{i=1}^{n} C_i$. Note that the tasks considered in our study are not associated with deadlines, which is the case for many computational systems.
3. RELATED WORK
The job scheduling problem in heterogeneous computing systems without energy consideration has been shown to be NP-complete [7]. Therefore, a large number of heuristics have been developed. One of the most widely used batch mode dynamic heuristics for scheduling independent tasks in heterogeneous computing systems is the min-min algorithm. It begins by considering all tasks as unmapped and works in two phases. In the first phase, the algorithm establishes the minimum completion time for every unscheduled job. In the second phase, the task with the overall minimum expected completion time is selected and assigned to the corresponding machine. The task is then removed from the set and the process is repeated until all tasks are mapped. The run time of min-min is $O(t^2 m)$.
Some strategies for energy optimization in HCS by exploiting heterogeneity have been proposed and investigated. In [12], the authors investigated the tradeoff between energy and performance. The authors proposed a method for finding the best match of the number of cluster nodes and their uniform frequency. However, the authors did not examine the effect of scheduling algorithms in depth. The authors in [5] introduced an online dynamic power management strategy with multiple power-saving states. Then, they proposed an energy-aware scheduling algorithm to reduce energy consumption in heterogeneous computing systems. The proposed algorithm is based on min-min. In [3] the authors considered the problem of scheduling tasks with different priorities and deadline constraints in an ad hoc grid environment with limited battery capacity that used DVS for power management. In this context, the resource manager needed to exploit the heterogeneity of the tasks and resources while managing the energy. The authors introduced several online and batch mode dynamic heuristics and showed by simulation that batch mode heuristics performed the best. These heuristics were based on min-min. However, they required significantly more time.
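The two-phase min-min procedure described at the start of this section can be sketched as follows. This is a straightforward illustration rather than the reference implementation from [7, 8]; the function name `min_min`, the assumption that all machines are free at time zero, and the tie-breaking rule (lowest task index, then lowest machine index) are choices made for this sketch.

```python
from typing import List, Optional, Sequence, Tuple


def min_min(etc: Sequence[Sequence[float]]) -> Tuple[List[int], float]:
    """Return (allocation, makespan) produced by min-min.

    etc[i][j] is the expected time to compute task i on machine j.
    """
    n_tasks, n_machines = len(etc), len(etc[0])
    ready = [0.0] * n_machines          # time at which each machine is free
    unmapped = list(range(n_tasks))
    allocation = [-1] * n_tasks
    while unmapped:
        best: Optional[Tuple[float, int, int]] = None
        # Phases 1 and 2 fused: scan every (task, machine) pair and keep
        # the pair with the overall minimum completion time.
        for i in unmapped:
            for j in range(n_machines):
                finish = ready[j] + etc[i][j]
                if best is None or finish < best[0]:
                    best = (finish, i, j)
        finish, i, j = best
        allocation[i] = j
        ready[j] = finish
        unmapped.remove(i)
    return allocation, max(ready)
```

The nested scan inside the `while` loop is what yields the $O(t^2 m)$ run time quoted above: each of the $t$ iterations inspects up to $t \cdot m$ pairs.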
4. PROPOSED ALGORITHMS
In the scheduling problem on heterogeneous computing systems, near-optimal solutions suffice for most practical applications, rather than a search for optimality. Therefore, we investigate low-cost heuristics with good-quality schedules and low energy consumption. These heuristics are based on the min-min algorithm. However, we took special care to decrease the computational complexity. The main idea is to avoid the loop over all pairs of machines and tasks that corresponds to the first phase of min-min. One alternative is to consider one task at a time, namely the task that should be scheduled next. For that, we propose to order the tasks by a predefined priority, so that they can be selected in constant time. Once the order of tasks is determined, in the second phase, which we call the mapping event, we assign the task to the machine that minimizes its expected completion time as well as its execution time. This modification lies essentially in the calculation of a mapping event of an application to a machine. We propose a weighted function, which we name the score function $SF(t_i, m_j)$ (see Eq. 5), that tries to balance both objectives. The rationale is to minimize the workload of the machines and intrinsically minimize the energy used to carry out the work. This is the main principle of the scheduling heuristics studied in this work. The main difference among them is the priority used to construct the list. To optimize the flowtime, we apply the classical shortest processing time rule on each machine after the schedule is constructed.
4.1. Low-cost heuristics
Algorithm 1 depicts the general structure of the proposed heuristics. It is based on classical list scheduling algorithms, for which well-founded theoretical performance guarantees have been proven [7]. The heuristics start by computing the Priority of each task according to some objective (line 1). Then, we compute the sorted list of tasks (line 2). The order of the list is not modified during the execution of the heuristics. Next, the heuristics proceed to allocate the tasks to the machines and determine their starting dates (main loop, line 3). One task at a time is scheduled. The heuristics always consider the task $t_i$ at the top of the list (highest priority) and remove it from the list (line 4). A score function $SF(t_i, m_j)$ for the selected task is evaluated on all the machines (lines 5 and 6). Then each heuristic selects the machine for which the value of the score function is optimized for task $t_i$, and the task is scheduled on that machine (line 8). This corresponds to the second phase of the min-min heuristic, with a different evaluation function. In the case of min-min, the evaluation function is based only on the completion time of the selected task on all the machines; the algorithm then selects the machine that gives the minimum completion time for that task and assigns the task to that machine. The list is updated (line 9) and we restart the main loop. Once all tasks have been scheduled, we apply the shortest processing time rule on all machines to optimize the flowtime (lines 11 and 12).
Algorithm 1 Pseudo-code for the low-cost heuristics
1: Compute Priority of each task $t_i \in T$ according to some predefined objective;
2: Build the list $L$ of the tasks sorted in decreasing order of Priority;
3: while $L \neq \emptyset$ do
4:   Remove the first task $t_i$ from $L$;
5:   for each machine $m_j$ do
6:     Evaluate Score Function $SF(t_i, m_j)$;
7:   end for
8:   Assign $t_i$ to the machine $m_j$ that optimizes the Score Function;
9:   Update the list $L$;
10: end while
11: for all machines $m_j$ do
12:   Sort the tasks $t_k$ on $m_j$ in increasing $ETC[t_k][m_j]$;
13: end for
The score of each mapping event is calculated as in Eq. 5. For each machine $m_j$,
$$SF(t_i, m_j) = \lambda \cdot \frac{C_i}{\sum_{k=1}^{m} C_{ik}} + (1 - \lambda) \cdot \frac{ETC[t_i][m_j]}{\sum_{k=1}^{m} ETC[t_i][m_k]}, \tag{5}$$
where $C_i$ is the completion time of task $t_i$ on the candidate machine $m_j$, $\sum_{k=1}^{m} C_{ik}$ is the sum of the completion times of the task $t_i$ over all machines, and $\sum_{k=1}^{m} ETC[t_i][m_k]$ is the sum of the expected times to compute task $t_i$ over all machines. The first term of Eq. 5 aims to minimize the completion time of the task $t_i$, while the second term aims to assign the task to the fastest machine, that is, the machine on which the task takes the minimum expected time to complete.
The heuristics differ in the objective used to compute the priorities. For that, the maximum (Algorithm 2), minimum (Algorithm 3), and average (Algorithm 4) completion time of the task are used, computed as if it were the only task to be scheduled on the computing system. Note that this corresponds to the execution time ($ETC[t_i][m_j]$) of task $t_i$ on machine $m_j$. The heuristics are named as follows. MaxMax min (Algorithm 2) uses the maximum completion time of the tasks: tasks are sorted in decreasing order of their maximum completion time and scheduled based on the minimum completion time. MinMax min (Algorithm 3) uses the minimum completion time of the tasks: tasks are sorted in decreasing order of their minimum completion time and scheduled based on the minimum completion time. MinMean min (Algorithm 4) uses the average completion time of the tasks: tasks are sorted in decreasing order of their average completion time and scheduled based on the minimum completion time.
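Algorithm 1 combined with the score function of Eq. 5 can be sketched as below. This is an illustrative reading, not the authors' code: it assumes that "optimizing" the score means minimizing it, that machines are free at time zero, that ETC values are strictly positive, and that the priority vector is supplied by the caller. The name `low_cost_schedule` and the default `lam = 0.8` (the value highlighted in Section 5.2) are choices of this sketch.

```python
from typing import List, Sequence


def low_cost_schedule(
    etc: Sequence[Sequence[float]],
    priority: Sequence[float],
    lam: float = 0.8,
) -> List[List[int]]:
    """Return per-machine task queues built by the list heuristic.

    etc[i][j]: expected time to compute task i on machine j.
    priority[i]: priority of task i (higher is scheduled first).
    lam: weight lambda of the completion-time term in Eq. 5.
    """
    n_tasks, n_machines = len(etc), len(etc[0])
    # Lines 1-2: tasks sorted in decreasing order of priority.
    order = sorted(range(n_tasks), key=lambda i: priority[i], reverse=True)
    ready = [0.0] * n_machines
    queues: List[List[int]] = [[] for _ in range(n_machines)]
    for i in order:  # lines 3-4: take the head of the list
        completion = [ready[j] + etc[i][j] for j in range(n_machines)]
        sum_completion = sum(completion)
        sum_etc = sum(etc[i])
        # Lines 5-7: evaluate Eq. 5 on every machine.
        scores = [
            lam * completion[j] / sum_completion
            + (1.0 - lam) * etc[i][j] / sum_etc
            for j in range(n_machines)
        ]
        # Line 8: assign to the machine with the smallest score.
        best = min(range(n_machines), key=scores.__getitem__)
        queues[best].append(i)
        ready[best] = completion[best]
    # Lines 11-13: shortest processing time rule on each machine.
    for j in range(n_machines):
        queues[j].sort(key=lambda i: etc[i][j])
    return queues
```

With `lam = 1` the score reduces to the normalized completion time, so the mapping step picks the same machine as min-min would for that task; with `lam = 0` it always picks the machine with the smallest ETC entry.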
Algorithm 2 Pseudo-code for heuristic (MaxMax min)
1: for all tasks $t_i$ do
2:   for each machine $m_j$ do
3:     Evaluate CompletionTime($t_i$, $m_j$);
4:   end for
5:   Select the maximum completion time for each task $t_i$;
6: end for
Algorithm 3 Pseudo-code for heuristic (MinMax min)
1: for all tasks $t_i$ do
2:   for each machine $m_j$ do
3:     Evaluate CompletionTime($t_i$, $m_j$);
4:   end for
5:   Select the minimum completion time for each task $t_i$;
6: end for
Algorithm 4 Pseudo-code for heuristic (MinMean min)
1: for all tasks $t_i$ do
2:   for each machine $m_j$ do
3:     Evaluate CompletionTime($t_i$, $m_j$);
4:   end for
5:   Compute the average completion time for each task $t_i$;
6: end for
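Since the priority of a task is computed as if it were the only task in the system, its completion time on a machine equals the corresponding ETC entry, and Algorithms 2 to 4 reduce to a row-wise maximum, minimum, or mean of the ETC matrix. A minimal sketch under that assumption follows; the function name `task_priorities` and the rule labels `"max"`, `"min"`, and `"mean"` are hypothetical names for the MaxMax min, MinMax min, and MinMean min priorities respectively.

```python
from statistics import mean
from typing import List, Sequence


def task_priorities(etc: Sequence[Sequence[float]], rule: str) -> List[float]:
    """Priority of each task from its ETC row.

    rule: "max" (Algorithm 2), "min" (Algorithm 3) or "mean" (Algorithm 4).
    """
    reducers = {"max": max, "min": min, "mean": mean}
    if rule not in reducers:
        raise ValueError(f"unknown priority rule: {rule!r}")
    reduce_row = reducers[rule]
    # With an empty system, CompletionTime(t_i, m_j) is just ETC[t_i][m_j].
    return [reduce_row(row) for row in etc]
```

The resulting vector is what the list-building step of Algorithm 1 sorts in decreasing order.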
4.2. Computational complexity of the heuristics
The computational complexity of the algorithms is as follows. Computing the priorities of the tasks and constructing the sorted list have an overall cost of $O(tm \log t)$. The execution of the main loop (line 3) in Algorithm 1 has an overall cost of $O(tm)$. Sorting the tasks for all the machines takes $O(km \log k)$ (line 12), where $k \leq t$. Therefore, the asymptotic overall cost of the heuristics is $O(tm \log t)$, which is lower than that of the related approaches such as min-min, whose run time is $O(t^2 m)$.
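As a rough illustration of the gap between the two bounds, consider the instance size used in the experiments of Section 5.1 ($t = 512$ tasks, $m = 16$ machines). The figures below ignore constant factors and assume a base-2 logarithm, so they indicate the order of growth only, not measured running times.

```latex
\begin{align*}
t^2 m        &= 512^2 \cdot 16 = 4\,194\,304, \\
t m \log_2 t &= 512 \cdot 16 \cdot 9 = 73\,728, \\
\frac{t^2 m}{t m \log_2 t} &= \frac{t}{\log_2 t} = \frac{512}{9} \approx 56.9 .
\end{align*}
```

The ratio $t / \log_2 t$ does not depend on $m$ and grows with the number of tasks.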
5. EXPERIMENTAL EVALUATION
In this section, we compare by simulation the proposed algorithms and min-min on a set of randomly built ETC matrices. Table 1 shows the twelve combinations of heterogeneity types (tasks and machines) and consistency classifications in the ETC model that we use in this paper. The consistency categories are named by their initial letter (c stands for consistent, i for inconsistent, s for semi-consistent), while lo stands for low heterogeneity and hi for high heterogeneity. Hence, a matrix named c_hihi corresponds to a consistent scenario with hi task heterogeneity and hi machine heterogeneity.
Table 1. Consistency and heterogeneity combinations in the ETC model
Consistency
Consistent    Semi-consistent    Inconsistent
c_lolo        s_lolo             i_lolo
c_lohi        s_lohi             i_lohi
c_hilo        s_hilo             i_hilo
c_hihi        s_hihi             i_hihi
5.1. Experiments
For the generation of these ETC matrices we have used the coefficient of variation based (COV) method introduced in [13]. To simulate different heterogeneous computing environments we have changed the parameters $\mu_{task}$, $V_{task}$, and $V_{machine}$, which represent the mean task execution time, the task heterogeneity, and the machine heterogeneity, respectively. We have used the following parameters: $V_{task}$ and $V_{machine}$ equal to 0.1 for the low case and 0.6 for the high case, and $\mu_{task} = 100$. The heterogeneity ranges were chosen to reflect the fact that in real situations there is more variability across the execution times of different tasks on a given machine than across the execution times of a single task on different machines [14].
As we are considering batch mode algorithms, we assume in both cases that all tasks have arrived in the system before the scheduling event. Furthermore, we consider that all the machines are idle or available at time zero, which is possible with advance reservation. We have generated 1200 instances, 100 for each of the twelve cases, to evaluate the performance of the heuristics. Each instance has 512 tasks to be scheduled on 16 machines. Additionally, we have considered different voltages for the machines. We randomly assigned these voltages to machines by choosing among three different sets. The first set considers 1.95 Volts for the active state and 0.8 Volts for the idle state. The second set is 1.75 Volts at the maximum state and 0.9 Volts at the idle state. Finally, the last set considers 1.6 Volts for the active level and 0.7 Volts at the idle level.
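The instance generation described above can be approximated as follows. The text only cites [13] for the COV method, so this sketch relies on the commonly used gamma-distribution formulation of that method (shape $1/V^2$, scale equal to mean divided by shape), which is an assumption here, as are the function names and the use of a seeded `random.Random` generator. It produces inconsistent matrices; sorting each row is one common way to derive a consistent variant.

```python
import random
from typing import List, Tuple

# (active voltage, idle voltage) pairs listed in the text, in Volts.
VOLTAGE_SETS: List[Tuple[float, float]] = [(1.95, 0.8), (1.75, 0.9), (1.6, 0.7)]


def generate_etc(
    n_tasks: int,
    n_machines: int,
    mu_task: float,
    v_task: float,
    v_machine: float,
    rng: random.Random,
) -> List[List[float]]:
    """Draw an ETC matrix with the given mean and coefficients of variation."""
    alpha_task = 1.0 / v_task ** 2        # gamma shape for task heterogeneity
    alpha_machine = 1.0 / v_machine ** 2  # gamma shape for machine heterogeneity
    beta_task = mu_task / alpha_task      # gamma scale so the mean is mu_task
    etc = []
    for _ in range(n_tasks):
        # Mean execution time of this task across machines.
        task_mean = rng.gammavariate(alpha_task, beta_task)
        beta_machine = task_mean / alpha_machine
        etc.append(
            [rng.gammavariate(alpha_machine, beta_machine) for _ in range(n_machines)]
        )
    return etc


def assign_voltages(n_machines: int, rng: random.Random) -> List[Tuple[float, float]]:
    """Pick one (active, idle) voltage pair at random for every machine."""
    return [rng.choice(VOLTAGE_SETS) for _ in range(n_machines)]
```

A hihi instance of the size used here would be drawn with `generate_etc(512, 16, 100.0, 0.6, 0.6, random.Random(seed))` for some chosen `seed`.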
5.2. Results
The results for the algorithms are depicted in Figures 1 to 3. We show normalized values of makespan, flowtime, and energy for each heuristic against min-min for $\lambda$ values in the interval [0, 1]. The normalized data were generated by dividing the results for each heuristic by the maximum result computed by these heuristics. We only show the curves for high task and high machine heterogeneity for the three different scenarios, which are the most significant results. The legends m-m n mksp, m-m n flow, and m-m n energy in the figures stand for the makespan, flowtime, and energy of min-min.
We can observe from these figures that the proposed heuristics follow the same performance behavior across the scenarios. The range of relative values is larger for the consistent instances than for the semi-consistent and inconsistent ones. The results clearly demonstrate that energy efficiency is best for the consistent instances. This may be related to the fact that the makespan has worse results there. However, for $\lambda = 0.8$ the proposed heuristics can perform as well as min-min for all three considered metrics. We can also observe that the proposed algorithms can improve makespan and flowtime results for certain values of $\lambda$ for semi-consistent and inconsistent instances. Interestingly, the more inconsistent the instance, the better the new algorithms perform. The benefit of exploiting the heterogeneity of the applications and resources to maximize the performance of the system and its energy efficiency is then more apparent, mainly because these instances are the ones presenting the highest inconsistency and heterogeneity. In terms of flowtime, all the heuristics are as efficient as min-min; however, the proposed heuristics have lower complexity.
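The normalization used for the figures, dividing each result by the maximum result obtained, can be sketched as follows. The function name `normalize` and the dictionary layout (one entry per heuristic for a single metric and a single $\lambda$ value) are assumptions of this sketch, and any values passed to it in an example are made up rather than taken from the experiments.

```python
from typing import Dict


def normalize(results: Dict[str, float]) -> Dict[str, float]:
    """Divide every result by the largest one, so the worst value maps to 1.0.

    results maps a heuristic name to its raw value for one metric
    (makespan, flowtime or energy).
    """
    peak = max(results.values())
    return {name: value / peak for name, value in results.items()}
```

Because all three metrics are to be minimized, a normalized value below that of min-min indicates that the heuristic did better than min-min on that metric.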