A Partitioning-based method according to budget distribution for task scheduling in Computational Grids

Journal of Computing, eISSN 2151-9617,
Mostafa Ghobaei Arani, Sam Jabbehdari and Nasser Modiri

Abstract: The goal of computational grids is to aggregate heterogeneous distributed resources for solving large-scale problems in science, engineering and commerce. The dynamism and heterogeneity of grid resources, together with the varied demands of grid applications, make grid scheduling complex, so effective resource scheduling is necessary to achieve high performance in grid systems. Most Quality of Service (QoS) constraint-based workflow scheduling algorithms are based on either budget or deadline constraints. In this paper, we solve the budget-constrained scheduling problem by dividing the workflow into several partitions and distributing the overall budget among them. After budget distribution, we find a locally optimal schedule for each partition based on its sub-budget. We evaluate the proposed algorithm against the Back-tracking and BTO scheduling algorithms in terms of execution time and cost. Simulation results show that the proposed algorithm performs better at low budget levels.

Index Terms: Computational Grids, Workflow Scheduling, QoS Constraints, Partition.

1 INTRODUCTION

A computational Grid is a software and hardware infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities [1]. In recent years, Grid technology has provided the basis for a service-oriented paradigm that enables a new way of service provisioning based on utility computing models. Typically, service providers charge higher prices for higher QoS. Users are charged for consuming services based on their usage and the QoS level required.
Workflow scheduling algorithms must therefore be able to analyze users' QoS requirements and map workflows onto suitable resources so that workflow execution completes within the users' QoS constraints [2].

Processing time and execution cost are two typical QoS constraints for workflow execution on utility Grids. Let B be the cost constraint (budget) and D be the time constraint (deadline) specified by a user for workflow execution. The budget-constrained scheduling problem is to map every task $T_i$ onto a suitable service so as to minimize the execution time of the workflow while completing it with a total cost less than B. Similarly, the deadline-constrained scheduling problem is to map every $T_i$ onto a suitable service so as to minimize the execution cost of the workflow while completing it with a total time less than D [3].

Several strategies have been proposed to address scheduling problems based on users' deadline and budget constraints. Buyya Time Optimization (BTO) and Buyya Cost Optimization (BCO) are derived from the cost and deadline optimization algorithms in Nimrod-G [4,5,6], which was initially designed for scheduling independent tasks on Grids. BTO solves the time optimization problem within a budget: it sorts services by their processing times and assigns as many tasks as possible to the fastest services without exceeding the budget. BCO solves the cost optimization problem within a deadline: it sorts services by their processing prices and assigns as many tasks as possible to the cheapest services without exceeding the deadline.

More recently, iterative heuristics such as Back-tracking [7] and LOSS and GAIN [8] have been proposed to solve constrained optimization problems. They iteratively amend a schedule optimized for one factor to satisfy the other factor, in such a way as to gain the maximum benefit or incur the minimum loss.
However, they need many iterations to modify and recompute the current schedule until it meets the constraint, which results in a large scheduling computation time.

In this paper, we present a budget-constrained scheduling algorithm that follows the divide-and-conquer technique: it divides the workflow tasks into several partitions, distributes the overall budget among the partitions, and then finds a locally optimal schedule for each partition based on its sub-budget. We then compare the proposed algorithm with the BTO and Back-tracking algorithms in terms of execution cost and time.

The remainder of the paper is organized as follows. Section 2 describes the workflow scheduling problem. Section 3 describes the proposed scheduling algorithm. Section 4 presents the experimental details and simulation results comparing the proposed algorithm with the Back-tracking and BTO algorithms. Finally, Section 5 concludes the paper with directions for further work.

Mostafa Ghobaei Arani is with the Computer Engineering Department, Kashan Branch, Islamic Azad University, Kashan, Iran.
Sam Jabbehdari is with the Computer Engineering Department, North Tehran Branch, Islamic Azad University, Tehran, Iran.
Nasser Modiri is with the Computer Engineering Department, Zanjan Branch, Islamic Azad University, Zanjan, Iran.

JOURNAL OF COMPUTING, VOLUME 3, ISSUE 9, SEPTEMBER 2011, ISSN 2151-9617, HTTPS://SITES.GOOGLE.COM/SITE/JOURNALOFCOMPUTING, WWW.JOURNALOFCOMPUTING.ORG. © 2011 Journal of Computing Press, NY, USA.

2 PROBLEM DESCRIPTION

Before presenting the proposed algorithm, we give a more precise description of the workflow scheduling problem. A workflow application can be modeled as a Directed Acyclic Graph (DAG). Let $\Gamma$ be the finite set of tasks $T_i$ $(1 \le i \le n)$.
Let $\Lambda$ be the set of directed edges. Each edge is denoted by $(T_i, T_j)$, where $T_i$ is called an immediate parent task of $T_j$, and $T_j$ the immediate child task of $T_i$. A child task cannot be executed until all of its parent tasks have been completed. There is a transmission time and cost associated with each edge. The workflow application can then be described as a tuple $\Omega(\Gamma, \Lambda)$.

In a workflow graph, a task without any parent task is called an entry task, denoted $T_{entry}$, and a task without any child task is called an exit task, denoted $T_{exit}$. In this paper, we assume there is only one $T_{entry}$ and one $T_{exit}$ in the workflow graph. If a workflow has multiple entry or exit tasks, we can connect them to a zero-cost pseudo entry or exit task [8].

The execution requirements of the tasks in a workflow may be heterogeneous, and a given service may be able to execute only some of the workflow tasks. Let m be the total number of services available. For each task $T_i$ there is a set of services $S_i^j$ capable of executing it, but only one service can be assigned to execute a task.

Services have varied processing capabilities delivered at different prices. We denote by $t_i^j$ the sum of the processing time and data transmission time, and by $c_i^j$ the sum of the service price and data transmission cost, for processing $T_i$ on service $S_i^j$ [3].

3 PROPOSED ALGORITHM

In this section, we adapt the deadline distribution scheduling procedure introduced in [9] to budget-constrained scheduling.
We solve the scheduling problem by following the divide-and-conquer technique in three phases:

Phase 1: Partition the workflow tasks.
Phase 2: Distribute the overall budget among the partitions.
Phase 3: Make advance reservations based on the locally optimal solution of each partition.

We describe the details of phases 1 to 3 in the following subsections.

3.1 Workflow Tasks Partitioning Phase

Workflow tasks are categorized as either synchronization tasks or simple tasks. A synchronization task is a task that has more than one parent or child task. For example, in Figure 1, $T_1$, $T_{10}$ and $T_{14}$ are synchronization tasks. Other tasks, which have only one parent task and one child task, are simple tasks. In the example of Figure 1, $T_2$ to $T_9$ and $T_{11}$ to $T_{13}$ are simple tasks.

Fig. 1. Before partitioning [9]

A simple partition is a set of interdependent simple tasks that are executed sequentially between two synchronization tasks; the simple tasks on such a path are grouped into one partition. For example, the simple partitions in Figure 2 are $\{T_2, T_3, T_4\}$, $\{T_5, T_6\}$, $\{T_7\}$, $\{T_8, T_9\}$, $\{T_{11}\}$ and $\{T_{12}, T_{13}\}$.

Fig. 2. After partitioning [9]

The set of services available to the tasks, introduced in Section 2, is:

(1) $\{ S_i^j \mid 1 \le i \le n;\ 1 \le j \le m_i,\ m_i \le m \}$

We partition the workflow tasks $\Gamma$ into independent partitions $P_i$ $(1 \le i \le k)$ and synchronization tasks $Y_i$ $(1 \le i \le l)$, where k and l are the total numbers of partitions and synchronization tasks in the workflow, respectively.

Let V be the set of nodes of a DAG corresponding to the set of partitions $V_i$ $(1 \le i \le k + l)$. Let E be the set of directed edges of the form $(V_i, V_j)$, where $V_i$ is a parent partition of $V_j$ and $V_j$ is a child partition of $V_i$. The partition graph for the budget-constrained problem is then denoted G(V, E, B).
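The partitioning phase of Section 3.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `partition_workflow` and the edge list `EXAMPLE_CHILDREN` are hypothetical, the edges are one layout that is consistent with the partitions listed for Figure 2 (the figure itself is not reproduced here), and every task that is not simple is treated as a synchronization task.

```python
from collections import defaultdict

# Hypothetical edge list: an assumed layout consistent with the partitions
# listed for Figure 2, not the actual edges of the figure.
EXAMPLE_CHILDREN = {
    1: [2, 5, 7, 8],
    2: [3], 3: [4], 4: [10],
    5: [6], 6: [10],
    7: [10],
    8: [9], 9: [14],
    10: [11, 12],
    11: [14],
    12: [13], 13: [14],
}


def partition_workflow(children):
    """Split a DAG into simple partitions and synchronization tasks.

    children maps each task id to the list of its immediate child tasks.
    """
    tasks = set(children)
    for kids in children.values():
        tasks.update(kids)

    parents = defaultdict(list)
    for task, kids in children.items():
        for kid in kids:
            parents[kid].append(task)

    def is_simple(task):
        # A simple task has exactly one parent and exactly one child.
        return len(parents[task]) == 1 and len(children.get(task, [])) == 1

    # Assumption: every task that is not simple counts as a synchronization task.
    sync_tasks = sorted(t for t in tasks if not is_simple(t))

    partitions = []
    for task in sorted(tasks):
        # A partition starts at a simple task whose parent is not simple.
        if not is_simple(task) or is_simple(parents[task][0]):
            continue
        chain = [task]
        nxt = children[task][0]
        while is_simple(nxt):  # follow the chain until a non-simple task
            chain.append(nxt)
            nxt = children[nxt][0]
        partitions.append(chain)
    return partitions, sync_tasks


partitions, sync_tasks = partition_workflow(EXAMPLE_CHILDREN)
print(partitions)  # [[2, 3, 4], [5, 6], [7], [8, 9], [11], [12, 13]]
print(sync_tasks)  # [1, 10, 14]
```

Each returned chain corresponds to one simple partition $P_i$, and each synchronization task forms its own node $Y_i$ of the partition graph.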
3.2 Budget Distribution Phase

After partitioning the workflow tasks, we distribute the overall budget among the nodes $V_i$ of G. The sub-budget $bdg[V_i]$ assigned to $V_i$ is a share of the overall budget B. Budget assignment follows these policies:

1. The sub-budget of any partition, $bdg[V_i]$, should not exceed the expected execution cost of that partition, $eec[V_i]$ (Eq. 2).
2. The sum of the sub-budgets assigned to all partitions equals the overall budget (Eq. 3).
3. The overall budget is distributed over the partitions in proportion to the average execution cost (processing cost and data transmission cost) of their tasks.

In the budget distribution phase, we assign a sub-budget to each partition based on the average execution cost and data transmission cost of the tasks in that partition. In a workflow, tasks may require various kinds of services with different prices, and their computational workload and required I/O data transmission vary between services. Therefore, the portion of the overall budget that each task obtains should be based on the proportion of its expense requirements. Since there are multiple possible services and data links for executing a task, their average cost values are used to measure its expense requirements. The expected execution cost $eec[V_i]$ of a partition is defined by Eq. (4), where the average cost of a task over its available services is given by Eq. (5).

3.3 Planning for Scheduling Phase

The planning phase produces an optimized schedule for advance reservation and run-time execution. The scheduler makes its decision by selecting the fastest services that can execute the related tasks within the assigned sub-budget. After budget distribution, we can find a locally optimal schedule for every partition according to its sub-budget. If each local schedule guarantees that its task execution can be completed within the sub-budget, the whole workflow execution will be completed within the overall budget.
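The proportional budget split and the local choice it enables can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the names `distribute_budget`, `schedule_partition`, `OPTIONS` and `PARTITIONS` and all numeric values are hypothetical, each task's candidate services are given as (time, cost) pairs, and the local schedule is found by exhaustive search, which is a stand-in because the text does not say how the local optimum is computed.

```python
from itertools import product

# Hypothetical data: task id -> list of (time, cost) pairs, one per candidate service.
OPTIONS = {
    1: [(8, 10), (4, 30)],
    2: [(10, 5), (4, 15)],
    3: [(12, 20), (5, 40)],
    4: [(9, 30), (3, 50)],
}
# Hypothetical partition graph nodes: two synchronization tasks and one simple partition.
PARTITIONS = {"Y1": [1], "P1": [2, 3], "Y2": [4]}


def distribute_budget(partitions, options, budget):
    """Give each partition a share of the budget proportional to the
    summed average cost of its tasks (policy 3 of Section 3.2)."""
    avg = {t: sum(c for _, c in opts) / len(opts) for t, opts in options.items()}
    total = sum(avg[t] for tasks in partitions.values() for t in tasks)
    return {name: budget * sum(avg[t] for t in tasks) / total
            for name, tasks in partitions.items()}


def schedule_partition(tasks, options, sub_budget):
    """Pick one service per task so that total time is minimal and total
    cost stays within the sub-budget. Returns (time, {task: service index})
    or None when no assignment fits. Exhaustive search, for illustration only."""
    best = None
    for combo in product(*(range(len(options[t])) for t in tasks)):
        time = sum(options[t][k][0] for t, k in zip(tasks, combo))
        cost = sum(options[t][k][1] for t, k in zip(tasks, combo))
        if cost <= sub_budget and (best is None or time < best[0]):
            best = (time, dict(zip(tasks, combo)))
    return best


sub_budgets = distribute_budget(PARTITIONS, OPTIONS, 100)
print(sub_budgets)  # {'Y1': 20.0, 'P1': 40.0, 'Y2': 40.0}
for name, tasks in PARTITIONS.items():
    print(name, schedule_partition(tasks, OPTIONS, sub_budgets[name]))
```

With these made-up numbers the sub-budgets sum to the overall budget of 100, and the three local schedules cost 10, 35 and 30, so the whole workflow stays within the overall budget, as the section argues. For a partition with a single task the search reduces to choosing the fastest affordable service.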
There are two types of partitions: simple partitions and synchronization partitions. A simple partition consists of several simple tasks, and a synchronization partition contains one synchronization task. The scheduling solutions for each type of partition are described as follows.

3.3.1 Synchronization Partition Scheduling

For a synchronization partition, the scheduler considers only one synchronization task. The optimal decision is to select the fastest service that can process the task within the assigned sub-budget. The objective function for scheduling one synchronization task $Y_i$ is:

(6) $\min(t_i^j)$, where $1 \le j \le m_i$ and $c_i^j \le eec[Y_i]$

3.3.2 Simple Partition Scheduling

If the partition contains only one simple task, simple partition scheduling is the same as synchronization partition scheduling. If the partition contains multiple tasks, the scheduler must assign one service to each task, to be executed after the completion of its parent task. The optimal decision is to minimize the total execution time of the partition while completing its tasks within the assigned sub-budget. The objective function for scheduling partition $P_j$ is:

(7) $\min \sum_{T_i \in P_j} t_i^k$, where $1 \le k \le m_i$ and $\sum_{T_i \in P_j} c_i^k \le eec[P_j]$

The budget distribution equations referenced in Section 3.2 are:

(2) $bdg[V_i] \le eec[V_i]$

(3) $\sum_{V_i \in G} bdg[V_i] = B$

(4) $eec[V_i] = \dfrac{\sum_{T_i \in V_i} \mathrm{avg}_{1 \le j \le |S_i|}\, c_i^j}{\sum_{T_i \in \Gamma} \mathrm{avg}_{1 \le j \le |S_i|}\, c_i^j} \times B$

(5) $\mathrm{avg}_{1 \le j \le |S_i|}\, c_i^j = \dfrac{\sum_{1 \le j \le |S_i|} c_i^j}{|S_i|}$

4 RESULT EVALUATION

We use GridSim [10], [11] to simulate a Grid testbed for our experiments. Because the execution requirements of tasks in scientific workflows are heterogeneous, we use a service type to represent each kind of service. Every task in our experimental workflow applications requires a certain type of service.
We model 4 types of services with different prices to simulate a heterogeneous environment, each of which is supported by 10 different service providers with different processing capabilities. The topology of the Grid system is given by the way the services are connected to each other. The available network bandwidths between services are 1000 Mbps, 200 Mbps, 512 Mbps and 1024 Mbps.

In the experiments, the cost that a user pays for workflow execution comprises two parts: processing cost and data transmission cost. Table 1 shows an example of processing cost, and Table 2 shows an example of data transmission cost. The length of each task is measured in MI (Million Instructions), and we use MIPS (Million Instructions per Second) to represent the processing capability of services. The processing cost and transmission cost are inversely proportional to the processing time and transmission time, respectively.

Table 1. Service speed and corresponding price.
Table 2. Transmission bandwidth and corresponding price.

Because applications may have different structures, we use a common and useful workflow structure from scientific applications for the simulation experiments, shown in Figure 3.

Fig. 3. A part of a workflow in an application with parallel structure [3]

We compared the proposed algorithm with the BTO and Back-tracking algorithms on the workflow shown in Figure 3. The BTO algorithm sorts services by their processing time and assigns as many tasks as possible to the fastest services without exceeding the budget constraint, while the Back-tracking algorithm assigns ready tasks to the fastest computing resources. If the execution cost exceeds the budget, it back-tracks to the previous step, removes the fastest computing resources from its resource list and reassigns tasks with the reduced resource set.

We use two metrics, execution time and execution cost, to evaluate the scheduling algorithms.
Execution cost indicates whether the schedule produced by a scheduling algorithm can be executed at a cost below the specified budget, and execution time indicates how much time was consumed executing the workflow tasks on the testbed.

Figure 4 and Figure 5 compare the execution time and cost of the budget-constrained scheduling algorithms BTO, Back-tracking and the proposed algorithm for user budgets of 500, 1000, 1500, 2000 and 2500 Grid dollars (G$).

As Figure 4 shows, the BTO algorithm cannot meet the user-specified budget in all cases: for budgets of 500, 1000 and 1500 G$, the execution cost of BTO exceeds the budget. The other two algorithms meet the budget constraint in all cases; that is, they complete execution at a cost lower than the user-specified budget. In addition, at lower budgets the proposed algorithm meets the budget constraint at a lower cost than the Back-tracking algorithm. Therefore, in terms of execution cost, the proposed algorithm is better than Back-tracking, and Back-tracking is better than BTO.

Table 1
Service ID   Processing Time (MIPS)   Cost (G$/sec)
1            1200                     300
2            600                      600
3            400                      900
4            300                      1200

Table 2
Bandwidth (Mbps)   Cost (G$/sec)
100                1
200                2
512                5.12
1024               10.24
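The inverse proportionality stated in Section 4 can be checked directly against the values of Table 1 and Table 2. The short sketch below only re-uses the numbers listed in the two tables; the variable names are illustrative, and reading Table 2 as "cost proportional to bandwidth, hence inversely proportional to transmission time for a fixed data size" is an assumption about how transmission time is derived.

```python
import math

# (service id, processing time column, cost in G$/sec), copied from Table 1
table1 = [(1, 1200, 300), (2, 600, 600), (3, 400, 900), (4, 300, 1200)]
# (bandwidth in Mbps, cost in G$/sec), copied from Table 2
table2 = [(100, 1), (200, 2), (512, 5.12), (1024, 10.24)]

# Inverse proportionality means the product cost * time is constant.
products = [time * cost for _, time, cost in table1]
print(products)  # [360000, 360000, 360000, 360000]

# In Table 2 the price grows linearly with bandwidth (constant ratio);
# assuming transfer time = data size / bandwidth, the price is then
# inversely proportional to the transfer time of a fixed amount of data.
ratios = [cost / bandwidth for bandwidth, cost in table2]
print(all(math.isclose(r, 0.01) for r in ratios))  # True
```

Both checks hold for the listed values, which matches the statement that a faster service or link is priced proportionally higher.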