Future Generation Computer Systems 28 (2012) 881–889
Contents lists available at SciVerse ScienceDirect
journal homepage: www.elsevier.com/locate/fgcs

Expanding the volunteer computing scenario: A novel approach to use parallel applications on volunteer computing

Alejandro Calderón*, Felix García-Carballeira, Borja Bergua, Luis Miguel Sánchez, Jesús Carretero
Computer Architecture Group, Computer Science Department, Universidad Carlos III de Madrid, Leganés, Madrid, Spain

Article history: Received 28 May 2010; Received in revised form 15 March 2011; Accepted 7 April 2011; Available online 16 April 2011

Keywords: Volunteer computing; High throughput computing; High performance computing; MPI; Parallel application

Abstract

Nowadays, high performance computing is being improved thanks to different platforms like clusters, grids, and volunteer computing environments. Volunteer computing is a type of distributed computing paradigm in which a large number of computers, volunteered by members of the general public, provide computing and storage resources for the execution of scientific projects.

Currently, volunteer computing is used for high throughput computing, but it is not used for parallel applications based on MPI due to the difficulty in communicating among computing peers. As Sarmenta and Hirano (1999) [2] said, there are several research issues to be solved in volunteer computing platforms, one of them being 'implementing other types of applications'.
In fact, the BOINC team (the well known volunteer computing development platform) is requesting 'MPI-like support' for volunteer computing as one of its hot topics.

This paper proposes an extension of the usual capabilities of volunteer computing and desktop grid systems by allowing the execution of parallel applications, based on MPI, on these types of platforms. The goal of this paper is to describe a method to transform and adapt MPI applications in order to allow the execution of this type of application on volunteer computing platforms and desktop grids. The proposed approach does not require an MPI installation on the client's side, which keeps the software to be installed on the client quite lightweight. The paper describes the prototype built, and also shows the evaluation made of this prototype, in order to test the viability of the idea proposed in this article, with promising results.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Nowadays, high performance computing can be found on different platforms: supercomputers, clusters, grids, clouds, volunteer computing environments, and desktop grids. Volunteer computing and desktop grids are interesting options for high performance computing due to the great amount of volunteer resources available in the world.

Volunteer computing [1] is a type of distributed computing paradigm in which a large number of computers, volunteered by members of the general public, provide computing and storage resources for the execution of scientific projects. Currently, volunteer computing is used for high throughput computing, but it is not used for parallel applications based on MPI, due to the difficulty in communicating among computing peers.

(* Corresponding author: A. Calderón. Tel.: +34 91 624 9497. E-mail address: acaldero@arcos.inf.uc3m.es.)

As Sarmenta [2] said, there are several research issues to be solved in volunteer computing platforms, one of them being 'implementing other types of applications'. In fact, the BOINC team (the well known volunteer computing development platform) is
requesting 'MPI-like support' for volunteer computing as one of its hot topics [3].

In a volunteer computing scenario, people (volunteers) provide computing resources to a project, which uses these resources to perform distributed computing. Volunteers are typically based on Internet-connected PCs [4], and many famous projects are worthy causes (medical research, environmental studies, etc.).

The most used platform for volunteer computing is BOINC [5], which hosts many well known projects, like SETI@home [6,1]. The global capacity of current volunteer computing based on BOINC projects is increasing day by day [7]. The average number of floating point operations per second of all BOINC projects exceeded 3,600,000 GigaFLOPS [7] in November 2010, which is larger than the 2,570,000 GigaFLOPS of the new fastest supercomputer in the world (see the November 2010 TOP500 list [8]), and this capacity is increasing day by day.

Sarmenta [2] identified some important research issues in volunteer computing: user interface design, adaptive parallelism, fault and sabotage tolerance, performance and scalability, and implementing other types of applications.

Since the work of Sarmenta [2], volunteer computing has been used in the same way. Input data to be processed are divided into small portions, and all the portions are stored in a server, together with the application binary for different platforms within a project. One application binary and one portion of the data are sent to a volunteer computer. Then, the application is executed, and the results are returned to the server. This process is repeated until all work portions are completed. This kind of behavior is appropriate for high throughput computing.

(0167-739X/$ – see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.future.2011.04.004)
However, the volunteer computing environment has not been used for parallel applications, such as MPI applications, due to the difficulty in communicating with other computing peers efficiently. The execution of parallel applications on volunteer platforms is a very important and hot topic. As is described in the BOINC 'help wanted' wiki page [3], one of the proposals for future versions of the BOINC platform (but still not solved) is to 'add support for MPI-type [9] applications'.

This paper proposes an extension of the usual capabilities of volunteer computing and desktop grid systems by allowing the execution of parallel applications based on MPI on these types of platforms. The paper describes a method to transform and adapt MPI applications in order to allow the execution of this type of application on volunteer computing platforms and desktop grids. The proposed approach, called MPIWS, does not require an MPI installation on the client's side, which keeps the software to be installed on the client quite lightweight. The paper describes the prototype built, and also shows the evaluation made of this prototype in order to test the viability of the idea proposed in this article, with promising results.

The rest of the paper is organized as follows: Section 2 describes the motivation and goals; Section 3 shows the main details of the proposed model; Section 4 describes the main aspects of the prototype developed; Section 5 presents the main results obtained in the evaluation; Section 6 shows the related work; and, finally, Section 7 summarizes the paper and shows the expected future work.

2. Motivation and goals

Typical parallel applications for clusters and supercomputers are written using the MPI standard [9] (Message Passing Interface) as the programming interface. An MPI implementation acts like a middleware, and provides an execution environment where several compute nodes run the processes of the parallel application.

Typical volunteer applications use some volunteer computing middleware API. The best known middleware is BOINC [4].
It also provides an execution environment where several volunteer computers offer their computational resources for a period of time to run applications. In this case, the execution environment is based on a pull system [10], where volunteer computers request the work to be done.

The input data to be processed are divided into small portions, and all these portions, together with the application (which could be executed on different platforms), are managed by a scheduler. The application and one portion of data are sent to the volunteer computer. Then, the application is executed, and the results are returned, so the scheduler tracks the work done and the work still to be done. This model of execution is appropriate for sequential applications, but is not used for the execution of parallel applications, such as MPI applications.

In this paper, we try to answer the following question: how can a general MPI application be executed using a set of volunteer computational resources?

One possible strategy would be to execute a virtual machine (for example, Oracle VirtualBox or VMware, which provide support for Windows, Linux and MacOS) on each volunteer computer. The virtual machine [11] provides the software found in a cluster node, so it is possible to use an MPI implementation and build a cluster using the virtual machines deployed on different volunteer computers. The main problem with this option is that not only does the BOINC middleware have to be installed on the volunteer computers, but also some virtual machine monitor (i.e. Oracle VirtualBox or VMware), which could demand a large amount of computational and memory resources on the volunteer computers. For example, the size of a virtual machine hard disk could be several hundred megabytes in the best case.
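The pull system described above can be sketched as a minimal work queue; this is an illustration only, not the BOINC API, and all class and method names are hypothetical:

```python
class PullScheduler:
    """Minimal sketch of a pull-based scheduler: the server never pushes
    work; each volunteer asks for a portion, computes it, and reports back,
    while the scheduler tracks work done and work still to be done."""

    def __init__(self, work_units):
        self.pending = list(work_units)   # portions not yet handed out
        self.in_progress = {}             # unit_id -> input portion
        self.results = {}                 # unit_id -> returned result

    def request_work(self):
        """Called by a volunteer computer: hand out one input portion."""
        if not self.pending:
            return None
        unit_id, portion = self.pending.pop(0)
        self.in_progress[unit_id] = portion
        return unit_id, portion

    def report_result(self, unit_id, result):
        """Called by a volunteer when its portion is completed."""
        self.in_progress.pop(unit_id, None)
        self.results[unit_id] = result

    def done(self):
        return not self.pending and not self.in_progress


# Usage: split the input data into portions and let "volunteers" pull them.
scheduler = PullScheduler([(i, list(range(i * 4, i * 4 + 4))) for i in range(3)])
while not scheduler.done():
    unit_id, portion = scheduler.request_work()
    scheduler.report_result(unit_id, sum(portion))  # stand-in "application"
```

The loop repeats until all work portions are completed, which is exactly the high-throughput behavior the paper contrasts with parallel MPI execution.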
Furthermore, the MPI implementation used would have to be adapted in order to allow the communication of the computational peers over the Internet.

Instead, the strategy proposed by the authors is to adapt the MPI application in order to allow the execution of the processes of a parallel application on different volunteer nodes. The proposed approach does not require an MPI installation on the client's side, which keeps the software to be installed on the client quite lightweight. The main goals of the proposed approach are the following.

- To take advantage of the power of volunteer computing and desktop grid platforms for the execution of MPI parallel applications, allowing the execution of an MPI parallel program across nodes located even in different administrative domains.
- To re-use the existing software with minimal changes. The idea is to provide a nearly transparent solution for users.
- To use web services to solve connectivity problems among computational resources, due to firewalls, filters, etc.
- To provide a mechanism to deploy the parallel application processes on different volunteer computers.

The next section introduces the general framework used, and the proposed model.

3. Proposed model

The objective of this paper is to describe a general framework to allow the execution of parallel applications on computational resources connected to the Internet. In this section we define, in a formal way, the general model behind this approach.

Let C = {C_1, ..., C_m} be a set of independent computational resources connected to the Internet, with |C_i| >= 1. If |C_i| = 1, then C_i represents a single node. If |C_i| > 1, then C_i represents a cluster with |C_i| nodes. Each cluster may have its own MPI or native implementation. MPI_k is the MPI implementation used in the cluster C_k, and it can be different from MPI_j.

Let P = {P_1, ..., P_n} be the processes of an MPI program.
The goal is to execute P on C using a mapping function π : P → C, where π(P_k) = C_j if P_k executes in C_j. In Fig. 1 we can see three computational resources. The first is a cluster with three nodes and one MPI implementation. The second is a cluster with a different MPI implementation. The third represents a single node. All these resources are connected to the Internet. The nodes inside a cluster can communicate through a local network. The figure represents a parallel application with 6 processes deployed on these resources.

Let M = (R, S) be a directed graph that represents a single MPI communication call, with R ⊂ P and S ⊆ R × R. For q = (P_i, P_j), where q ∈ S, P_i is the MPI source process and P_j is the MPI destination process of the communication. All MPI communication calls (synchronous, asynchronous, point to point, collective, etc.) can be represented by a directed graph. Any MPI collective primitive needs several arrows (directed arcs) in order to represent the communication that this collective primitive performs.

There are two types of communications. The MPI communication M = (R, S) can be:

- local: π(P_i) = π(P_k) for all P_i, P_k ∈ R;
- remote: there exist P_i, P_k ∈ R, with P_i ≠ P_k, such that π(P_i) ≠ π(P_k).

[Fig. 1. Example of an MPI application execution.]

Fig. 1 shows an MPI application with six processes. Local communication among the processes executing in the same computational resource C_k is made using the native MPI_k implementation installed on the C_k cluster, if this implementation is available. The remote communication uses web services.

This general model has been adapted and applied to volunteer computing platforms.
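The local/remote distinction above can be checked mechanically. The following sketch (process-to-resource assignment is hypothetical, loosely following the six-process layout of Fig. 1) represents the mapping π as a dictionary and classifies a communication M = (R, S) from its arcs:

```python
def classify(edges, pi):
    """Classify an MPI communication represented by its arcs S.
    edges: set of (source, destination) process pairs (the arcs S);
    pi: dict mapping each process to its computational resource.
    The communication is local iff pi places every participating
    process (the set R) on the same resource."""
    procs = {p for edge in edges for p in edge}       # R, the processes involved
    resources = {pi[p] for p in procs}                # images under pi
    return "local" if len(resources) == 1 else "remote"

# Six processes mapped onto three resources (assignment hypothetical):
pi = {"P1": "C1", "P2": "C1", "P3": "C1",
      "P4": "C2", "P5": "C2", "P6": "C3"}

a = classify({("P1", "P2")}, pi)                   # both processes on C1
b = classify({("P3", "P4")}, pi)                   # spans C1 and C2
c = classify({("P4", "P5"), ("P5", "P6")}, pi)     # collective spanning C2 and C3
print(a, b, c)
```

Note that a collective call contributes several arcs to S, as the text describes, but the classification rule is the same: one resource means local, more than one means remote.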
In this case, all computational resources C = {C_1, ..., C_m} are single volunteer computers, and therefore |C_i| = 1. In this scenario, all communications are remote, and no MPI implementation is available. The implementation developed does not take into account local communications among processes located in the same computational resource; all communications have been considered in this paper as remote.

3.1. MPIWS model

As we defined before, C = {C_1, ..., C_m} is a set of independent computational resources connected to the Internet, and P = {P_1, ..., P_n} is an MPI program associated with a volunteer project. In order to deploy P on C, a dedicated cluster associated with the volunteer project is required. We call this cluster a stub cluster. This cluster belongs to the volunteer project, and can be associated with the volunteer project server. On the stub cluster we define a stub MPI as SP = {SP_1, ..., SP_n}, where:

- SP is an MPI program too;
- |P| = |SP|;
- all processes execute on the stub cluster, and therefore π(SP_i) = π(SP_k) for all SP_i, SP_k ∈ SP, and they use the same MPI implementation available on this cluster;
- for each P_i ∈ P there exists SP_i ∈ SP, where each SP_i represents a remote stub process for P_i.

The remote communication is used for communication among processes of different computational resources, and for communication among processes of the same computational resource when an MPI implementation is not available. The remote communication uses Web services and the cooperation of the stub MPI.

Any MPI communication M = (R, S) can be transformed into an MPI remote communication M' = (R', S'), where R' ⊆ P ∪ SP, S' ⊆ R' × R', and for all q ∈ S with q = (P_x, P_y) there exist q_1, q_2, q_3 ∈ S', with q_1 = (P_x, SP_x), q_2 = (SP_x, SP_y), q_3 = (SP_y, P_y).
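The transformation from M to M' rewrites each arc into three hops. A minimal sketch of that rewrite (stub naming and transport labels are illustrative, not from the paper's prototype):

```python
def to_remote(edges):
    """Rewrite each MPI arc (P_x, P_y) into the three hops of the MPIWS model:
    q1 = (P_x, SP_x) and q3 = (SP_y, P_y) travel over Web services, while
    q2 = (SP_x, SP_y) uses the native MPI of the stub cluster.
    A process named 'Pn' gets the stub name 'SPn' (naming convention assumed)."""
    hops = []
    for src, dst in edges:
        hops.append((src, "S" + src, "web-service"))        # q1
        hops.append(("S" + src, "S" + dst, "stub-MPI"))     # q2
        hops.append(("S" + dst, dst, "web-service"))        # q3
    return hops

hops = to_remote([("P1", "P2")])
print(hops)
```

Only the middle hop q2 performs real MPI communication; the outer hops merely carry the call's parameters and result between a volunteer process and its stub.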
q_1 and q_3 are done by using Web services, whereas q_2 uses the MPI implementation available on the cluster where the stub MPI executes (see Fig. 2). Fig. 2 represents an example of a parallel program with four processes. In this example four volunteer computers are used. Each process of the parallel application is executed in a volunteer computer. The stub cluster executes all MPI stubs by using the MPI implementation available on this cluster.

3.2. Key aspects of the MPIWS model

Any parallel application consists of a computational part and a communication–synchronization part. The proposed model provides a two level structure in order to support both parts: volunteer computers provide the resources for the computational part, and the stub cluster provides the non-computational resources (communication, synchronization, file access, etc.).

Both the stub cluster and volunteer computers work together to provide a new kind of platform that makes use of the extra computational capacity provided by the volunteer computers. For example, a stub cluster that executes the MPI stub could have 100 nodes with 16 cores each. This cluster could hold the MPI stubs for an MPI program running on 1600 volunteer quad-core nodes. As can be seen, in this case we can use a computational capacity in the volunteer nodes that is not available in the stub cluster. In this way, any MPI application with great computational cost could benefit from this platform.

This two level configuration reduces costs (compared with having a cluster with 1600 nodes), and takes advantage of the existing infrastructure. From the administration point of view, no extra software is needed on the volunteer computers (for example, virtual machines, MPI implementations, etc.). In order to avoid the problems that firewalls and other security software may introduce, remote communication is based on Web services, so volunteer computers are able to communicate with the MPI stubs without additional changes.

3.3. Architecture for remote communication

Given an MPI parallel application source code, a simple preprocessor stage is needed to build the source code for two programs: the one to be executed on the volunteer nodes, and the one to be executed on the stub cluster.

Fig. 3 shows an example of an MPI application with two processes (process k and process i) executed on two volunteer nodes. One process requests an MPI_Send and the other one an MPI_Recv, respectively.

The preprocessor adds to each MPI process a local stub that intercepts the MPI calls of this process. Any call to the MPI API will be issued by the local stub (1 and 3). Furthermore, the preprocessor creates the MPI stub program to support the communication part of the program. Each process of the MPI stub has a remote stub that receives any communication request from the corresponding local stub (2 and 4), using Web services.

With these elements, a point to point communication between two processes (P_i executes an MPI_Send() operation, and process P_k executes an MPI_Recv() operation) is made as follows.

1. Process P_i executes an MPI_Send(), and this operation is processed by the local stub inside the same process. This is a local call because the application code and the local stub code belong to the same process.
2. The local stub in process P_i requests the MPI operation from the remote stub i. This communication is done by using Web services.
3. Process P_k executes an MPI_Recv(), and this operation is processed by the local stub inside the same process.
4. The local stub in process P_k requests the MPI operation from the remote stub k. This communication is also done by using Web services.
5. When the remote stub i receives a request, it also receives the parameters of the MPI operation. With these parameters the remote stub implements the communication by using the MPI implementation available in the stub cluster. In this case, the stub cluster executes an MPI_Send() call in order to communicate with the stub process k.
[Fig. 2. An MPI program and its stub MPI.]
[Fig. 3. Operations flow of MPI_Send() and MPI_Recv().]

6. The same process is performed in the stub process k. The communication part of the MPI application occurs in steps 5 and 6.
7. The stub process i returns the MPI_Send() result.
8. The stub process k returns the MPI_Recv() result in a similar way.
9. Finally, the local stub in process i returns the result to the application.
10. In a similar way, the local stub in process k returns the result to the application.

Collective communications are done in a similar way to the one described before. With this approach, volunteer nodes do not need to have an MPI implementation installed to run parallel applications. The stub MPI program can be executed on a cluster (the stub cluster) with a minimal number of nodes, because this program performs the MPI API calls of the parallel program with a minimal computation part, so many stub MPI processes can be executed on the same compute node.

There are some interesting advantages of this architecture.

- It is possible to use any MPI implementation on behalf of the communicating nodes of an MPI parallel application: among the existing MPI implementations, any one of them can be used for the MPI stubs. Moreover, we can use not only communication calls, but also other MPI API services, such as file services (MPI-IO).
- It does not require an MPI installation on the volunteer computer side, which keeps the software to be installed on these nodes quite lightweight.
- We can take advantage of current multicore architectures. The two level configuration lets us use the large base of existing multicore computers as volunteer computers, and a small number of many-core computers (e.g. 16 core CPUs) as the stub cluster, which reduces the cost of buying a large number of nodes with recent technology.

The main drawback of this architecture is the cost of remote communications.
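The ten-step flow above can be simulated in a few lines. In this single-threaded sketch the Web service hops are modelled as direct function calls into the remote stubs, and the stub cluster's native MPI channel is modelled as an in-memory queue; all names and the "MPI_SUCCESS" result string are illustrative, not the prototype's actual API:

```python
import queue

stub_channel = queue.Queue()   # stands in for the stub cluster's native MPI

def remote_stub_send(payload):
    """Remote stub i (step 5): receives the parameters of the MPI operation
    and performs the real MPI_Send() inside the stub cluster."""
    stub_channel.put(payload)
    return "MPI_SUCCESS"       # step 7: result returned to the local stub

def remote_stub_recv():
    """Remote stub k (step 6): performs the matching MPI_Recv()."""
    payload = stub_channel.get()
    return "MPI_SUCCESS", payload   # step 8

def local_stub_send(payload):
    """Steps 1-2: the local stub intercepts MPI_Send() in process P_i and
    forwards the request (the 'Web service' hop) to remote stub i."""
    return remote_stub_send(payload)

def local_stub_recv():
    """Steps 3-4: the local stub intercepts MPI_Recv() in process P_k and
    forwards the request to remote stub k."""
    return remote_stub_recv()

# Process P_i sends, process P_k receives; steps 9-10 return to the apps.
status = local_stub_send(b"hello")
recv_status, data = local_stub_recv()
```

The actual communication happens only between the two remote stubs (steps 5 and 6); the volunteer processes never talk MPI to each other directly.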
Parallel applications with a high number of communication calls in their execution are not appropriate for this approach. The proposed model is appropriate for parallel scientific applications that spend more time in their computational phases than in their communication phases.

However, as is described in [12], almost all successful MPI programs are written in a bulk synchronous style in which the computation proceeds in phases. Within each phase, processors first participate in a communication operation. For the remainder of the phase, each processor can perform computation without any communication. Virtually all of the highly scalable scientific kernels can be formulated in this way, including matrix–vector multiplication, explicit time integration, and particle simulation. Important kernels that do not fit this model, like sparse factorizations, typically do not scale beyond a few hundred processors.

4. MPIWS implementation

We have built a prototype in order to evaluate the viability of the proposed model. This prototype consists of a preprocessor and a library that implements the remote communication. Given an MPI application, the preprocessor replaces all MPI references with MPIWS library references, and builds a new application (the initial MPI code is kept without modification). The preprocessor also builds the associated stub MPI. This process, totally transparent to users, can be seen in Fig. 4.

[Fig. 4. Preprocessing and deployment.]

Later, the MPIWS application may be deployed on different volunteer nodes and the stub MPI application on the stub cluster. Communication among local and remote stubs uses Web services, implemented in the prototype using gSOAP 2.7.

In this prototype, an mpirun-like utility is provided in order to deploy the processes on the corresponding nodes. This mpirun-like utility receives the list of volunteer computers available at that time and the stub cluster to use.
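The source-to-source step that replaces MPI references with MPIWS library references could, in its simplest form, be a textual rewrite of identifiers. The sketch below is an assumption about how such a pass might look; the `MPIWS_` prefix is hypothetical, since the paper does not give the generated names:

```python
import re

def preprocess(source):
    """Sketch of the preprocessor step: rewrite every MPI API reference
    (functions and constants alike, since the paper says *all* MPI
    references are replaced) to an MPIWS library counterpart, leaving
    the rest of the application source untouched."""
    return re.sub(r"\bMPI_([A-Za-z_]\w*)", r"MPIWS_\1", source)

mpi_line = "MPI_Send(buf, n, MPI_INT, 1, 0, MPI_COMM_WORLD);"
print(preprocess(mpi_line))
```

A real implementation would work on the parsed source rather than raw text, but the effect is the same: the rebuilt application links against the MPIWS library (the local stubs) instead of an MPI installation.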
With this list and the stub cluster, the mpirun-like utility launches the stub MPI program in the stub cluster, and also launches the appropriate MPI process in each volunteer computer. This list could be obtained using, for instance, an existing volunteer project server like the one used in a BOINC project.

As the performance of Web services is poor for large messages (especially for binary messages), the prototype implements message compression. The compression has been implemented with two algorithms: GZIP and LZO. GZIP is provided by gSOAP while LZO is not, but the latter is the recommended algorithm for quick compression and decompression. The LZO algorithm [13] has been used by the authors before in a multithreaded MPI implementation called MiMPI [14,15] with successful results. Compression is activated when the message size is greater than or equal to two kilobytes (2 KB).

5. Evaluation

In this section we want to demonstrate that the proposed architecture can be used for high performance computing in a scenario composed of a set of volunteer computers without any MPI implementation installed on them. In order to perform the evaluation, the prototype described in the previous section has been used.

For this evaluation, two aspects have been measured: the maximum bandwidth that can be obtained, and the throughput obtained with a typical MPI application.

5.1. Platforms used

The evaluation has been made using two hardware environments.

- The first one is a typical cluster with 8 Intel dual core processors at 2.2 GHz, with 2 GB of RAM, interconnected by a Fast Ethernet network. This cluster uses OpenMPI [16] as the native MPI implementation.
- The second one consists of 40 independent PCs (the volunteer computers) located in laboratory rooms on two different campuses of the university (Colmenarejo and Leganes campuses), with an approximate distance between them of 45 km. These PCs are not configured as a cluster, and they do not have any MPI implementation installed.
All machines use Debian GNU/Linux 4.0, and they are interconnected by a Fast Ethernet network. All these PCs are different: some of them are dual core, some others are single core, and the frequency varies from 1.8 to 2.2 GHz. The memory varies from 1 to 2 GB.

We want to compare unmodified MPI programs against the same MPI programs transformed following the proposal of this paper. For this purpose we will use the hardware environments described above.

- When we evaluate unmodified MPI programs, we will use the first hardware environment (the cluster).
- When we evaluate MPI programs transformed following the proposal of this paper, we will use only one node of the first hardware environment to execute the stub MPI program, and the second hardware environment will execute the volunteer processes.

Communication between the stub node and the volunteer processes takes place over a Fast Ethernet network. As described before, the Web services communication among the processes in the second platform and the processes in the stub node (located in one node of the first platform) is made by using gSOAP version 2.7.

We know that this environment is not fully representative of a real-life situation in volunteer computing. However, this scenario is appropriate to obtain initial conclusions about the benefits of the proposal. The only main difference from a real-life scenario is the volatility of the nodes, which will be addressed in future work.

5.2. Evaluation software and results

For the evaluation, we have used a benchmark and a real parallel application. Both of them have been executed on the platforms described above. The results for the first scenario are shown as Cluster-OpenMPI, and the results for the second one are shown as PCs-MPIWS.

The results presented here are the average values obtained after executing the tests 25 times. The results obtained in every test were almost the same; that is the reason why the standard deviation is very low.
The standard deviation found in the second test (the one with more differences among the values obtained) was less than 5% of the average value.
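The reporting convention above (mean over 25 runs, with spread quoted as a fraction of the mean) amounts to checking the relative standard deviation. A small sketch with hypothetical timings, not the paper's measured data:

```python
import statistics

# Hypothetical timings (seconds) from repeated runs of one test; the paper
# averages 25 executions per test and reports very low spread.
timings = [12.1, 12.3, 12.0, 12.2, 12.2]

mean = statistics.mean(timings)
rel_stddev = statistics.pstdev(timings) / mean   # stddev as a fraction of the mean

print(round(mean, 2), rel_stddev < 0.05)
```

A relative standard deviation under 0.05 corresponds to the "less than 5% of the average value" criterion quoted for the noisiest test.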