A PERF solution for distributed query optimization

Proceedings of the ISCA 15th International Conference on Computers and Their Applications (CATA-2000), New Orleans, Louisiana, USA, March 29-31, 2000

Ramzi A. Haraty and Roula Fany
Lebanese American University
P.O. Box 13-5053
Beirut, Lebanon

Abstract

Query optimization techniques aim to minimize the cost of transferring data across networks. Many techniques and algorithms have been proposed to optimize queries. One of these algorithms is the W algorithm, which uses semi-joins. A newer technique called PERF appears to bring some improvement over semi-joins [2]. PERF joins are two-way semi-joins that use a bit vector as their backward phase. Our research applies PERF joins to the W algorithm. Programs were designed to implement both the original and the enhanced algorithms. Several experiments were conducted, and the results showed a very considerable enhancement obtained by applying the PERF concept.

Keywords: Query Optimization, Semi-Joins, and PERF Joins

1. Introduction

Distributed query processing is the process of retrieving data from different sites. Accessing data from different sites involves transmission via communication links, which creates delays. The basic challenge is to design and develop efficient query processing techniques and strategies that minimize the communication cost. This is the main purpose of query optimization, which estimates the cost of alternative query plans in order to choose the best plan for answering complex and expensive queries quickly and efficiently [3].

The query optimization problem has been addressed many times, from different perspectives, and a lot of work has been done. Proposed algorithms and techniques can be categorized into two main approaches:

1- Minimize the cost of data transferred across the network by reducing the amount of transmitted information, and
2- Minimize the response time of the query by using parallel processing.

In this paper, we mainly focus on the first approach. One of the most important algorithms suggested for query optimization with minimum cost was algorithm GENERAL (total cost), presented by Apers, Hevner and Yao in 1983 [4]. The advent of AHY was a revolution in the query optimization domain because it introduced semi-joins as reducers in the query optimization process.

In 1995, Todd Bealor from Windsor University, Canada, presented a new algorithm called the W algorithm as an enhancement over AHY. At the same time, a new technique called PERF (Partially Encoded Record Filter) was presented by Kenneth Ross [2]. This method adds another dimension to semi-joins: a backward phase that uses bit vectors to eliminate unnecessary redundant semi-joins.

In this paper we present an improvement over the W algorithm obtained by applying PERF joins to W.

This paper is organized as follows: Section 2 presents the W algorithm. Section 3 discusses our contribution, the PERF W algorithm. Section 4 presents the experimental results. Section 5 concludes the paper.

2. The W Algorithm

The main aim of this algorithm is to minimize total time by using reducers to eliminate unnecessary data. The algorithm is characterized by two distinct phases:

Phase 1. Semi-join schedules for constructing each reducer are formed using a cost/benefit analysis based on estimated attribute selectivity and sizes of partial results.
Phase 2. The schedule is executed.

Algorithm W works as follows:

1. Establish schedules for the construction of reducers. For each join attribute $j$, construct a schedule for the reducer $d^*_{mj}$. It should be noted that at this level each schedule is considered independently. Hence, no semi-joins are executed yet. This is achieved in two phases:

Phase 1. Sort attributes by increasing size such that: $S(d_{aj}) \le S(d_{bj}) \le \dots \le S(d_{mj})$
Phase 2. Evaluate semi-joins in order, beginning with $d_{aj} \ltimes d_{bj}$. Append a semi-join to the schedule if:
a. It is profitable and marginally profitable: $P(d_{aj} \ltimes d_{bj}) > 0$ and $MP(d_{aj} \ltimes d_{bj}) > 0$, or
b. It is gainful but not profitable: $P(d_{aj} \ltimes d_{bj}) < 0$ but $G(d_{aj} \ltimes d_{bj}) > 0$.
If the semi-join is appended, then $d^*_{bj} \ltimes d_{cj}$ is evaluated next; otherwise $d^*_{aj} \ltimes d_{cj}$ is considered. Repeat this process until all semi-joins in the sequence are evaluated. The last attribute in the sequence is called the reducer.

2. Examine the effects of reducers. Consider the reduction effects of the reducers on all applicable relations by:
a. Sorting reducers from smallest to largest.
b. Estimating the cost and benefit of a semi-join with each admissible relation and for each reducer. Profitable semi-joins are appended to the schedule.

3. Review unused semi-joins. For non-profitable reducers, reexamine the possibility of having profitable semi-joins for that particular join attribute. This phase is done using the following sub-steps:
a. Sort attributes by increasing size.
b. Evaluate each semi-join and append profitable semi-joins to the final schedule. Note that marginal profit is not considered in this step.

4. Execute the schedule. During this phase, reducers are constructed and shipped to designated sites to reduce the corresponding relations. Then, the reduced relations are shipped to the assembly site.

This heuristic is simple and efficient. It aims to construct, in the cheapest possible way, reducers that are highly selective. Those reducers are then used to eliminate tuples from participating relations prior to shipment to the query site (assembly site).

It should be noted that algorithm W improves the choice of join attributes and their order but does not eliminate redundant transmissions, because schedules are still treated separately.

3. The PERF W Algorithm

When applying PERF to the W algorithm, the same concept is preserved but semi-joins are replaced by PERF joins. Our enhancement consists of the following two phases, which were added to the schedule construction:

a. Build a PERF list where $PERF_{i,i+1,j}$ is set to 1 when a transmission was done from $R_i$ to $R_{i+1}$ on join attribute $j$.
b. When calculating the transmission cost:
If $PERF_{i,i+1,j} = 1$ then $Cost = 0$
Else $Cost = C_0 + C_1 \cdot b_{ik} + (b_{ik} \cdot Y_{(i+1)k})/8$

where $C_0 + C_1 \cdot b_{ik}$ is the linear transmission cost function: a fixed cost $C_0$ plus the cost per byte transmitted ($C_1$) multiplied by the size in bytes of the projected join attribute. This is the usual cost of a semi-join, known as the forward cost. The term $(b_{ik} \cdot Y_{(i+1)k})/8$ is the backward cost, that is, the cost of transmitting back to $R_i$ the bit vector that marks only the matching values of the corresponding attribute. To keep this equation simple, we consider attribute $k$ to have a width of 1 byte.

As can be seen, the PERF version of algorithm W does not eliminate redundant transmissions from the schedules, but it makes their cost zero when they occur. This is made possible by adding a small overhead to the transmission cost, namely the backward cost. Using this fact, if a transmission was done from site $A$ to site $B$ using a join attribute $j$, then every other transmission from $A$ to $B$ using $j$ has zero cost, and every transmission from $B$ to $A$ using $j$ also has zero cost. From this point of view, a PERF join can be seen as a non-redundant symmetric function. This fundamental property allowed us to improve on the W algorithm.

4. Experimental Results

Different scenarios were conceived in order to evaluate the performance of the different algorithms, and for each scenario the programs were run 1500 times. Different kinds of results were collected, including the comparison of all algorithms versus the unoptimized method.

Note that all programs were developed using Visual C++ 4.0 under Windows 95. Experiments were conducted on a Pentium V PC with 64 MB RAM.

In the first test scenario the attribute width is taken as 1 byte for all attributes.

[Table: results by query type (2-2 through 5-4, plus a total row) with columns TYPE, W, PERF W, and PERF W / W; the numeric values are not recoverable from this transcript.]

Graphically, comparing PERF W to W, we notice that PERF W outperforms W in all cases.

[Figure: PERF W versus W by query type, attribute width 1 byte.]

In the second test scenario the attribute width is taken as 5 bytes for all attributes.

[Table: results by query type (2-2 through 5-4, plus a total row) with columns TYPE, W, PERF W, and PERF W / W; the numeric values are not recoverable from this transcript.]

Graphically, comparing PERF W to W, we notice that PERF W outperforms W in all cases.

[Figure: PERF W versus W by query type, attribute width 5 bytes.]

In the third test scenario the attribute width is taken as 50 bytes for all attributes.

[Table: results by query type (2-2 through 5-4, plus a total row) with columns TYPE, W, PERF W, and PERF W / W; the numeric values are not recoverable from this transcript.]

Graphically, comparing PERF W to W, we notice that PERF W again outperforms W in all cases.

[Figure: PERF W versus W by query type (2-2, 2-3, 2-4, 3-2, 3-3, 3-4, 4-2, 4-3, 4-4, 5-2, 5-3, 5-4), attribute width 50 bytes.]

We used many different scenarios in order to study the performance of the mentioned algorithms from different perspectives. For each scenario, we compared the performance of the algorithms with respect to each other. Using different scenarios allowed us to better study the behavior of all algorithms under a variety of circumstances. We noted that PERF W has the best performance for a field width of 50 bytes. This result was expected because of the overhead added by PERF in the backward phase. Recall that PERF returns to the original site a bit vector representing the matching tuples. This overhead is relatively more significant when the original field width is at most 1 byte, because it can sometimes be more profitable not to send this data back. With a width of 50 bytes, however, the backward cost becomes negligible compared to the forward cost.

Finally, we can conclude that the results of our experiments met expectations and demonstrated the power of PERF joins and their advantage in optimizing the total time of distributed queries.

5. Conclusion

In this paper, a PERF join algorithm has been presented as our contribution to the query optimization problem using semi-joins. We have described the concepts of both semi-joins and PERF joins, and we have then taken an optimization algorithm using semi-joins (W) and enhanced it by applying PERF joins (PERF W).

6. References

[1] Todd Bealor, "Semi-join Strategies For Total Cost Minimization In Distributed Query Processing", Master's Thesis, University of Windsor, Canada, 1995.
[2] Zhe Li, K. A. Ross, "PERF Join: An Alternative to Two-Way Semi-Join and Bloomjoin", Columbia University, New York, 1995.
[3] D. Barbara, W. DuMouchel, C. Faloutsos, P. J. Haas, J. M. Hellerstein, Y. Ioannidis, H. V. Jagadish, T. Johnson, R. Ng, V. Poosala, K. A. Ross and K. C. Sevcik, "The New Jersey Data Reduction Report", Bulletin Of The Technical Committee On Data Engineering, Pages: 3-45, December 1997.
[4] Peter M. G. Apers, Alan R. Hevner and S. Bing Yao, "Optimization Algorithms For Distributed Queries", IEEE Transactions On Software Engineering, Vol. SE-9, No. 1, Pages: 57-68, January 1983.
[5] Roula Fany, "PERF Solutions for Distributed Query Optimization", Master's Thesis, Lebanese American University, September 1999.
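
As an illustration of the PERF cost rule in Section 3, the following is a minimal Python sketch, not the authors' Visual C++ implementation. The class name `PerfCostModel`, the method `transmit`, and the sample values are hypothetical. The factor `y` stands for the Y term of the backward cost, which the text does not define, so it is treated here as a plain input. The PERF list is keyed on the unordered pair of relations plus the join attribute, which is one way to model the symmetric zero-cost property described in Section 3.

```python
class PerfCostModel:
    """Transmission cost with a PERF list (illustrative sketch only)."""

    def __init__(self, c0, c1):
        self.c0 = c0            # fixed cost term C0 of the linear cost function
        self.c1 = c1            # cost per byte transmitted, C1
        self.perf_list = set()  # transmissions already done, stored symmetrically

    def transmit(self, src, dst, attr, size_bytes, y):
        """Return the cost of sending `attr` from `src` to `dst` and record it.

        size_bytes: size in bytes of the projected join attribute.
        y: the Y factor of the backward-cost term (assumed to be supplied by the caller).
        """
        # An unordered pair makes A->B and B->A share the same PERF entry.
        key = (frozenset((src, dst)), attr)
        if key in self.perf_list:
            return 0.0          # PERF entry is set: the repeated transmission is free
        self.perf_list.add(key)
        forward = self.c0 + self.c1 * size_bytes   # usual semi-join (forward) cost
        backward = (size_bytes * y) / 8            # bit vector sent back, in bytes
        return forward + backward


# Hypothetical values chosen only to exercise the rule.
model = PerfCostModel(c0=20.0, c1=1.0)
first = model.transmit("R1", "R2", "j", size_bytes=80, y=0.5)
repeat = model.transmit("R1", "R2", "j", size_bytes=80, y=0.5)
reverse = model.transmit("R2", "R1", "j", size_bytes=80, y=0.5)
other_attr = model.transmit("R1", "R2", "k", size_bytes=80, y=0.5)
```

In this sketch only the first transmission on a given relation pair and attribute is charged; later transmissions in either direction on the same attribute cost nothing, while a different join attribute is charged in full again.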