A New Large-Scale Distributed System

María Eva M. Lijding
Computer Science Department, Universidad de Buenos Aires
mlijding@dc.uba.ar

Claudio E. Righetti (*)
Computer Science Department, Universidad de Buenos Aires
claudio@dc.uba.ar

Leandro Navarro Moldes
Computer Architecture Department, Universidad Politécnica de Cataluña
leandro@ac.upc.es

(*) Claudio E. Righetti - Instituto Nacional de Tecnología Industrial, Av. Gral. Paz y Constituyentes s/n, C.C. 157 - C.P. 1650 San Martín (Pcia. Bs. As.) - Argentina. Phone: 54-1-7526915 / Fax: 54-1-7545194

Abstract

We introduce in this work the Object Distribution System, a distributed system based on distribution models used in everyday life (e.g. food distribution chains, newspapers, etc.). This system is designed to scale correctly in a wide area network, using weak consistency replication mechanisms. It is formed by two independent virtual networks on top of the Internet, one for replicating objects and the other for building the distribution chains used by the first network.

As some Internet sites often become inaccessible due to latency, partitions and flash-crowds, objects in our system are accessed locally and updated off-line. The system also provides methods for the classification of objects. This allows selective distribution, and brings order to the chaos that reigns nowadays in the Internet. Distribution chains are built dynamically to provide end users with the objects they want to consume, while making good use of available resources.

1. Introduction

In the last few years the Internet has been growing exponentially in number of users and hosts, but the communication infrastructure has not kept pace with that growth.

This growth brings about an important rise in traffic, so users experience high latency when accessing resources. A resource is any object that can be accessed or referenced in a network [Deutsch 94] (e.g.
a document, a name server, etc.).

Berners-Lee argues that a reasonable latency is about 100 ms [Berners 95], but Viles and French have found that latency is about 500 ms [Villes 95]. Our measurements show that the situation nowadays is even more discouraging: we have found that average latencies between Spain and Argentina are over 2000 ms. Packet loss is considered common at about 40% or more [Goldin 92b], but we usually suffer losses between 60% and 70% (measurements taken in Jan/Feb. 97). Another problem in the Internet is network partitions, due to link or node failures.

We must take into account not only the growth in the number of users, but also their behavior. When a large number of users share a common interest at a certain moment, they flood a resource server and the network nearby, a phenomenon called flash-crowd [Nielsen 95]. This phenomenon can be seen with resources that thousands of users want to access simultaneously (e.g. 880,000 accesses to the NASA Web servers during the collision of comet Shoemaker-Levy 9). If the resource involved is a document, it is usually referred to as a hot document.

As a result, many resources are considered unreachable by huge numbers of users, who feel frustrated by this.

Another problem is the great volume of information available in the Internet. Generally there is no guarantee of the quality and reliability of such information (e.g. in News over 90% of the news is considered noise [Saltz 92]). The information available in the Internet is not classified, except in News and some catalogues in Web servers. This brings about yet another problem: how to find information relevant to the user. The excess of information and its associated problems lead to total misinformation. Sometimes the excess of information is as important a problem as the lack of it.

The IRTF (Internet Research Task Force) has stated that resource discovery tools (RDT) (e.g. WAIS, Gopher, Web, Archie, Prospero, etc.) have scalability failures.
To put these problems in a proper framework, Danzing et al. [Danzing 94] have described scalability in three dimensions: data volume, number of users and RDT diversity.

Our work intends to solve the lack of scalability in the first two dimensions and to assure the availability of resources even when network partitions occur.

2. Framework

Replicating resources improves performance and availability. By storing copies of shared data on processors where they are frequently accessed, the need for expensive remote read accesses is decreased. By storing copies of critical data on processors with independent failure modes, the probability that at least one copy of the data will be accessible increases. But when resources are replicated the correctness of each copy must be taken into account. Consistency is defined as all the copies of the same logical data item agreeing on exactly one value [Davidson 85].

Maintaining correctness and availability of data during network partitions are competing goals. Correctness may be guaranteed by allowing operations to take place on only one partition. This means first of all that the system must be able to determine the existence of a network partition. In very congested networks, such as the Internet, this cannot be achieved because frequently a partition cannot be distinguished from the latency of the network. This way of maintaining correctness also diminishes the system's availability. If we want to ensure the availability of resources at all times, we can allow the normal operation of the system even in the face of partitions. In that case replicas may not always be consistent, and some correcting mechanism must be applied once the partition has been resolved [Davidson 85].

The degree of consistency of a distributed system depends mainly on the constraints inherent to the application.
Keeping brokers informed about the stock exchange is not the same as providing new articles to a group of researchers.

Systems based on weak consistency replication allow the replicas to diverge in the face of a network partition, so that each replica can continue offering service. Once the partition is resolved, the replicas eventually converge to a consistent state. We consider that significant latencies must also be thought of as network partitions.

We believe weak consistency protocols are needed in order to make replication mechanisms scalable in wide area networks. These protocols have already been used in a great variety of systems, due to their high availability, good scalability and design simplicity (e.g. Grapevine [Birell 82], Clearinghouse [Oppen 83], Locus [Popek 85], Coda [Satya 92], GNS [Lampson 86], AFS [Satya 93], News, Refdbms [Goldin 92a], Harvest [Obraczka 94], OSCAR [Downing 90]).

Information tools deployed on the Internet may all be classified as using one of the following techniques to 'disseminate' their information [Weider 94]:

♦ Come get it: This is the most used technique (e.g. FTP, Gopher, Web, etc.). It wastes the user's time, through a combination of access latency and excess of information (relevant and noise). These systems are mainly based on a client-server model.

♦ Send it everywhere: We believe it is better to call this technique 'send it to all interested members'. News, Mbone [Eriksson 94] and Harvest may be considered to use this technique. They are generally distributed systems, where service agents cooperate in order to replicate information. An important aspect to take into account is the topology used to disseminate information and the way it is built.

Let us take the Web as an example of an information system used by a great number of users. The Web is currently responsible for most of the traffic on the Internet.
Like many other information systems, the Web was not designed to be used on such a large scale.

One solution to diminish access latency to a Web document is to use cache techniques. These techniques store documents requested by a user, so that they will be locally available when another user wants to access them. But the first user that accesses a document must bear the whole latency at that precise moment. Caches generally also show a lack of robustness when network partitions occur, because they must validate the local replica against the original document. Even if more efficient ways of caching have been proposed (e.g. geographical push caching [Gwertzman 94], demand based dissemination [Bestravros 95], cooperative cache [Malpani 95]), they all have the problems just mentioned.

Caching                       Distribution
- on demand (readers)         - on production (authors)
- partial (page)              - total (volume)
- synchronous (same time)     - asynchronous (diff. time)
- event: get                  - event: post
- 1 doc; N (old) fragments    - 1 doc replicated N times

Fig. 1: Cache techniques vs. Distribution techniques

Cache techniques fail to handle big information volumes and they also fail when information changes rapidly. Web documents are generally formed by several files. Cache techniques access documents using HTTP, but HTTP opens a TCP connection for each file in a document. Handling multiple TCP connections places an important load on busy servers.

Another widely used technique to diminish latency and the flash-crowd phenomenon is to make an exact copy of a remote FTP site and present it as a local FTP site. This is usually known as mirroring, or mirrored sites.

But using mirrored sites has various problems. Users usually do not trust the source of the information and doubt that it is kept up to date. Another problem is that users do not remember the mirrored sites, and even when they do, they are not able to decide which one is the closest.
Keeping mirrored sites does not prevent access to the original document, so most users go on accessing only one site. Even if all the problems we have just mentioned are brought about by the behavior of users, there is yet another underlying problem: replication is generally carried out in a centralized manner.

3. Object Distribution System (ODS)

Our idea is to introduce a system based on distribution models used in everyday life (e.g. food distribution chains, publications, etc.). Consumers do not go to the places where the goods are produced (e.g. factories, author's house, etc.); they buy them in the closest retail shop. Consumers purchase the goods that are already available in the shops; they need not wait for them to arrive. Even if the factory is constantly producing new goods, consumers purchase the goods already available in the shops. This system works because consumers trust their retail shops: they believe that the shops have products as fresh as possible at a reasonable price.

Consumers that do not want to accept these rules can try to obtain the goods straight from the producers, but the producers may not want to supply them directly. Even if the producers accept direct purchasing, it is not as easy for consumers to do this as to go to retail shops (distance, working hours, product presentation, service to the clients, etc.).

In the Internet community this concept does not exist. We would like to introduce it in a way that fits the infrastructure we have nowadays, without modifying the protocols and standards in use, and trying to take advantage of their wide utilization. We believe that models that propose a radical change cannot find immediate viability, because defining new models and protocols to replace the ones in use is not as fast as the Internet community demands.

The goal of the Object Distribution System is to provide this service in the Internet.
It is formed by two independent virtual networks: the Object Distribution Network (ODN) and the Object Routing Network (ORN). ODN brings objects close to end users according to their interests, and ORN builds the distribution chains that ODN needs to do this in an optimal way.

Fig. 2: Object Distribution System (diagram of ODN and ORN, with the labels 'group membership' and 'distribution chains' between them)

ODN handles objects (not with the precise meaning given in object programming) that are persistent and replicated in every interested service agent. ODN can handle different classes at the same time. Inside each class, objects can be further classified by their authors or by some classification authority.

Our objects are write-one/read-many (e.g. Web documents, FTP archives, etc.); this means that each object is only modified in the service agent where it is registered. In this way no information correction is needed, because the worst thing that may happen is that some users are accessing, in a read-only manner, a version of an object that may not be the latest one. It must be noted that the objects will be updated in finite time, even if this time is not bounded.

Accessing a version of an object different from the latest one is not acceptable in certain applications. This can be clearly seen in stock exchange information for real time decisions or in a videoconference. We have designed ODN for objects that do not undergo constant changes.

Documents are a good example of this kind of object, because they are persistent, can be classified and do not change often. We introduce the Document Distribution Network (DDN) as a special case of ODN, for which we have a reference model.
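
The single-writer rule described above (each object is modified only in the service agent where it is registered, and every other agent holds a read-only copy that may lag behind) can be illustrated with a minimal Python sketch. This is not the system's actual data model: the names `ObjectVersion`, `Replica`, `register`, `modify`, `receive` and `read`, and the use of an integer version counter, are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectVersion:
    """One immutable version of an object (hypothetical structure)."""
    object_id: str
    home_agent: str  # the only service agent allowed to modify the object
    version: int     # assumed integer counter, incremented by the home agent
    content: str


class Replica:
    """Local copies held by one service agent (hypothetical class)."""

    def __init__(self, agent_name: str) -> None:
        self.agent_name = agent_name
        self.store: dict[str, ObjectVersion] = {}

    def register(self, object_id: str, content: str) -> ObjectVersion:
        """Register a new object; this agent becomes its home agent."""
        first = ObjectVersion(object_id, self.agent_name, 1, content)
        self.store[object_id] = first
        return first

    def modify(self, object_id: str, content: str) -> ObjectVersion:
        """Create a new version; only allowed in the home agent."""
        current = self.store[object_id]
        if current.home_agent != self.agent_name:
            raise PermissionError("object can only be modified where it is registered")
        newer = ObjectVersion(object_id, self.agent_name, current.version + 1, content)
        self.store[object_id] = newer
        return newer

    def receive(self, incoming: ObjectVersion) -> bool:
        """Accept a propagated version only if it is newer than the local copy."""
        current = self.store.get(incoming.object_id)
        if current is None or incoming.version > current.version:
            self.store[incoming.object_id] = incoming
            return True
        return False

    def read(self, object_id: str) -> str:
        """Read the local copy; it may not be the latest version."""
        return self.store[object_id].content


home = Replica("sa-home")
remote = Replica("sa-remote")
v1 = home.register("doc-1", "first draft")
remote.receive(v1)
v2 = home.modify("doc-1", "second draft")
stale = remote.read("doc-1")   # still the first version until v2 arrives
remote.receive(v2)
fresh = remote.read("doc-1")
```

Because there is a single writer, a replica never has to merge conflicting updates: it only has to decide whether an incoming version is newer than the one it holds, which is why no correcting mechanism is needed.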
We state that documents generally do not change often based on the following statistics:

− Bestavros [Bestavros 95b] states that the amount of updates on documents that he calls 'globally popular' is at most 0.5% per day; moreover, these updates are restricted to a small subset of them.

− Blaze [Blaze 93] states that the probability that a document changes gets smaller as more time passes since its last update.

Service agents (members of ODN) join different groups according to the interests of their users. In each ODN group the service agents cooperate to obtain an efficient replication. Having groups allows selective replication. There is a group for each kind of object. In this way we also want to put some order in the chaos brought about by having information that is not classified.

ORN builds distribution chains dynamically for each group. To build the chains the routing agents (members of ORN) take into account the type of membership of each service agent in a group and the state of the underlying network. Even if News and GNS distribute objects in a hierarchical manner, they do not build distribution paths dynamically.

The routing mechanism used in ORN for building distribution chains is completely independent of the class of objects being handled by ODN. Both networks were designed to work independently, with a clear interface defined between them so that ORN can provide services to ODN in a transparent way.

ODN: Object Distribution Network
ORN: Object Routing Network
SA: Server Agent
UA: User Agent
RA: Routing Agent
SAP: Server Agent Protocol
UAP: User Agent Protocol
RAP: Routing Agent Protocol
ORP: Object Routing Protocol

Fig. 3: Object Distribution System's Protocols

Some authors propose the use of IP multicast [Deering 89] for massive replication (e.g. DPM [Donnelley 95]). Using IP multicast implies best-effort sending of datagrams to a group of hosts that share a single IP address.
This suits real time applications for audio and video, where the loss of a datagram is not an important problem and receiving datagrams too late or duplicated is much worse. But using IP multicast directly does not work when data consistency is needed. Message delivery is not reliable, and all the members of the group must be active to receive an update, because the distribution is done directly by the producer of the object.

Multicast routing algorithms (e.g. CBT [Ballardie 93], DVMRP [Waitzman 88], MOSPF [Moy 94], PIM [Deering 96], etc.) may be studied separately from the use of IP multicast. As ours is an application level problem, it is not right to depend on the network level to solve it. Our proposal is to adapt routing techniques used at the network level to the application level.

3.1 Object Distribution Network (ODN)

We define an Object Distribution Network (ODN) as a set of service agents (SA) that cooperate in order to replicate objects for users in different sites. The users need not know which is the original site of each object, much less whether it is reachable. Each user accesses a replica of the objects he subscribes to in the service agent that provides him with an access point to ODN, and also registers there the new objects he wants to disseminate through ODN.

Users actually access ODN through user agents (UA) that work as interfaces. A user agent communicates with the service agent using the User Agent Protocol (UAP). User agents are not part of the system, so the way users interact with them is not defined.
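
The behavior of service agents described in this section (users register objects at their own access point, agents replicate only the groups their users are interested in, and reads are always served locally) might be sketched as follows. This is an illustration under stated assumptions, not the protocol between service agents, which this section does not specify: the `ServiceAgent` class, its method names, the group names and the neighbour-to-neighbour `pull_from` exchange are all hypothetical.

```python
class ServiceAgent:
    """Hypothetical service agent holding local replicas for its users."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.groups: set[str] = set()
        # object_id -> (group, content); reads never leave this dictionary
        self.objects: dict[str, tuple[str, str]] = {}

    def subscribe(self, group: str) -> None:
        """Join a group on behalf of the users of this agent."""
        self.groups.add(group)

    def register(self, object_id: str, group: str, content: str) -> None:
        """A user registers a new object at this access point."""
        self.objects[object_id] = (group, content)

    def read(self, object_id: str) -> str:
        """Serve a read from the local replica only."""
        return self.objects[object_id][1]

    def pull_from(self, neighbour: "ServiceAgent") -> int:
        """Off-line exchange (assumed mechanism): copy objects of subscribed
        groups that are missing locally, and return how many were copied."""
        copied = 0
        for object_id, (group, content) in neighbour.objects.items():
            if group in self.groups and object_id not in self.objects:
                self.objects[object_id] = (group, content)
                copied += 1
        return copied


# A three-agent chain: producer -> relay -> consumer.
producer = ServiceAgent("sa-producer")
relay = ServiceAgent("sa-relay")
consumer = ServiceAgent("sa-consumer")

producer.register("paper-1", "distributed-systems", "A paper on replication")
producer.register("recipe-1", "cooking", "A recipe")

relay.subscribe("distributed-systems")
consumer.subscribe("distributed-systems")

relay.pull_from(producer)     # only the subscribed group is copied
consumer.pull_from(relay)     # the consumer never contacts the producer
text = consumer.read("paper-1")
```

In this sketch the consumer obtains the object without knowing which agent it was registered in, and the object of the other group never travels along the chain, which is the selective replication that groups are meant to provide.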