A Security Architecture for Computational Grids ∗

Ian Foster ¹, Carl Kesselman ², Gene Tsudik ², Steven Tuecke ¹

¹ Mathematics and Computer Science, Argonne National Laboratory, Argonne, IL 60439, {foster,tuecke}@mcs.anl.gov
² Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292, {carl,gts}@isi.edu

Abstract

State-of-the-art and emerging scientific applications require fast access to large quantities of data and commensurately fast computational resources. Both resources and data are often distributed in a wide-area network with components administered locally and independently. Computations may involve hundreds of processes that must be able to acquire resources dynamically and communicate efficiently. This paper analyzes the unique security requirements of large-scale distributed (grid) computing and develops a security policy and a corresponding security architecture. An implementation of the architecture within the Globus metacomputing toolkit is discussed.

1 Introduction

Large-scale distributed computing environments, or “computational grids” as they are sometimes termed [4], couple computers, storage systems, and other devices to enable advanced applications such as distributed supercomputing, teleimmersion, computer-enhanced instruments, and distributed data mining [2]. Grid applications are distinguished from traditional client-server applications by their simultaneous use of large numbers of resources, dynamic resource requirements, use of resources from multiple administrative domains, complex communication structures, and stringent performance requirements, among others.

While scalability, performance and heterogeneity are desirable goals for any distributed system, the characteristics of computational grids lead to security problems that are not addressed by existing security technologies for distributed systems.
For example, parallel computations that acquire multiple computational resources introduce the need to establish security relationships not simply between a client and a server, but among potentially hundreds of processes that collectively span many administrative domains. Furthermore, the dynamic nature of the grid can make it impossible to establish trust relationships between sites prior to application execution. Finally, the interdomain security solutions used for grids must be able to interoperate with, rather than replace, the diverse intradomain access control technologies inevitably encountered in individual domains.

∗ This work was supported in part by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Computational and Technology Research, U.S. Department of Energy, under Contract W-31-109-Eng-38; by the Defense Advanced Research Projects Agency under contract N66001-96-C-8523; and by the National Science Foundation. To appear in the 5th ACM Conference on Computer and Communication Security.

In this paper, we describe new techniques that overcome many of the cited difficulties. We propose a security policy for grid systems that addresses requirements for single sign-on, interoperability with local policies, and dynamically varying resource requirements. This policy focuses on authentication of users, resources, and processes and supports user-to-resource, resource-to-user, process-to-resource, and process-to-process authentication. We also describe a security architecture and associated protocols that implement this policy. Finally, we present a concrete implementation of this architecture and discuss our experiences deploying this architecture on a large grid testbed spanning a diverse collection of resources at some 20 sites around the world. This implementation is performed in the context of the Globus system [5], which provides a toolkit, testbed, and set of applications that can be used to evaluate our approach.
However, we believe that the proposed techniques are general enough to make them applicable outside the Globus context.

In summary, this paper makes four contributions to our understanding of distributed system security:

1. it provides an in-depth analysis of the security problem in computational grid systems and applications;
2. it includes the first detailed formulation of a security policy for grid systems;
3. it proposes solutions to specific technical issues raised by this policy, including local heterogeneity and scalability; and
4. it describes a security architecture that uses these solutions to implement the security policy, and it demonstrates, via large-scale deployment, that this architecture is workable.

2 The Grid Security Problem

We introduce the grid security problem with an example illustrated in Figure 1. This example, although somewhat contrived, captures important elements of real applications, such as those discussed in Chapters 2–5 of [4].

[Figure 1: Example of a large-scale distributed computation: user initiates a computation that accesses data and computing resources at multiple locations. The figure shows the user (A. Physicist), data stores, and Sites A through G, with the numbered steps 1. Request data analysis, 2. Contact resource broker, 3. Initiate task farm, 4. Access parameter values. Sites are labeled with their local access control mechanism and account name: Kerberos (physicist), ssh (ap), SSL (guest29), and plaintext (aphysicist, ap6, bcollab).]

We imagine a scientist, a member of a multi-institutional scientific collaboration, who receives e-mail from a colleague regarding a new data set. He starts an analysis program, which dispatches code to the remote location where the data is stored (site C). Once started, the analysis program determines that it needs to run a simulation in order to compare the experimental results with predictions. Hence, it contacts a resource broker service maintained by the collaboration (at site D), in order to locate idle resources that can be used for the simulation.
The resource broker in turn initiates computation on computers at two sites (E and G). These computers access parameter values stored on a file system at yet another site (F) and also communicate among themselves (perhaps using specialized protocols, such as multicast) and with the broker, the original site, and the user.

This example illustrates many of the distinctive characteristics of the grid computing environment:

• The user population is large and dynamic. Participants in such virtual organizations as this scientific collaboration will include members of many institutions and will change frequently.
• The resource pool is large and dynamic. Because individual institutions and users decide whether and when to contribute resources, the quantity and location of available resources can change rapidly.
• A computation (or processes created by a computation) may acquire, start processes on, and release resources dynamically during its execution. Even in our simple example, the computation acquired (and later released) resources at five sites. In other words, throughout its lifetime, a computation is composed of a dynamic group of processes running on different resources and sites.
• The processes constituting a computation may communicate by using a variety of mechanisms, including unicast and multicast. While these processes form a single, fully connected logical entity, low-level communication connections (e.g., TCP/IP sockets) may be created and destroyed dynamically during program execution.
• Resources may require different authentication and authorization mechanisms and policies, which we will have limited ability to change. In Figure 1, we indicate this situation by showing the local access control policies that apply at the different sites. These include Kerberos, plaintext passwords, Secure Sockets Layer (SSL), and secure shell.
• An individual user will be associated with different local name spaces, credentials, or accounts, at different sites, for the purposes of accounting and access control. At some sites, a user may have a regular account (“ap,” “physicist,” etc.). At others, the user may use a dynamically assigned guest account or simply an account created for the collaboration.
• Resources and users may be located in different countries.

To summarize, the problem we face is providing security solutions that can allow computations, such as the one just described, to coordinate diverse access control policies and to operate securely in heterogeneous environments.

3 Security Requirements

Grid systems and applications may require any or all of the standard security functions, including authentication, access control, integrity, privacy, and nonrepudiation. In this paper, we focus primarily on issues of authentication and access control. Specifically, we seek to (1) provide authentication solutions that allow a user, the processes that comprise a user’s computation, and the resources used by those processes, to verify each other’s identity; and (2) allow local access control mechanisms to be applied without change, whenever possible. As will be discussed in Section 4, authentication forms the foundation of a security policy that enables diverse local security policies to be integrated into a global framework.

In developing a security architecture that meets these requirements, we also choose to satisfy the following constraints derived from the characteristics of the grid environment and grid applications:

Single sign-on: A user should be able to authenticate once (e.g., when starting a computation) and initiate computations that acquire resources, use resources, release resources, and communicate internally, without further authentication of the user.

Protection of credentials: User credentials (passwords, private keys, etc.) must be protected.

Interoperability with local security solutions: While our security solutions may provide interdomain access mechanisms, access to local resources will typically be determined by a local security policy that is enforced by a local security mechanism. It is impractical to modify every local resource to accommodate interdomain access; instead, one or more entities in a domain (e.g., interdomain security servers) must act as agents of remote clients/users for local resources.

Exportability: We require that the code be (a) exportable and (b) executable in multinational testbeds. In short, the exportability issues mean that our security policy cannot directly or indirectly require the use of bulk encryption.

Uniform credentials/certification infrastructure: Interdomain access requires, at a minimum, a common way of expressing the identity of a security principal such as an actual user or a resource. Hence, it is imperative to employ a standard (such as X.509v3) for encoding credentials for security principals.

Support for secure group communication: A computation can comprise a number of processes that will need to coordinate their activities as a group. The composition of a process group can and will change during the lifetime of a computation. Hence, support is needed for secure (in this context, authenticated) communication for dynamic groups. No current security solution supports this feature; even GSS-API has no provisions for group security contexts.

Support for multiple implementations: The security policy should not dictate a specific implementation technology. Rather, it should be possible to implement the security policy with a range of security technologies, based on both public and shared key cryptography.

4 A Grid Security Policy

Before delving into the specifics of a security architecture, it is important to identify the security objectives, the participating entities, and the underlying assumptions.
In short, we must define a security policy, a set of rules that define the security subjects (e.g., users), security objects (e.g., resources) and relationships among them. While many different security policies are possible, we present a specific policy that addresses the issues introduced in the preceding section while reflecting the needs and expectations of applications, users, and resource owners. To our knowledge, the following discussion represents the first such grid security policy that has been defined to this level of detail.

In the following discussion, we use the following terminology from the security literature:

• A subject is a participant in a security operation. In grid systems, a subject is generally a user, a process operating on behalf of a user, a resource (such as a computer or a file), or a process acting on behalf of a resource.
• A credential is a piece of information that is used to prove the identity of a subject. Passwords and certificates are examples of credentials.
• Authentication is the process by which a subject proves its identity to a requestor, typically through the use of a credential. Authentication in which both parties (i.e., the requestor and the requestee) authenticate themselves to one another simultaneously is referred to as mutual authentication.
• An object is a resource that is being protected by the security policy.
• Authorization is the process by which we determine whether a subject is allowed to access or use an object.
• A trust domain is a logical, administrative structure within which a single, consistent local security policy holds. Put another way, a trust domain is a collection of both subjects and objects governed by a single administration and a single security policy.

With these terms in mind, we define our security policy as follows:

1. The grid environment consists of multiple trust domains.
Comment: This policy element states that the grid security policy must integrate a heterogeneous collection of locally administered users and resources. In general, the grid environment will have limited or no influence over local security policy. Thus, we can neither require that local solutions be replaced, nor are we allowed to override local policy decisions. Consequently, the grid security policy must focus on controlling the interdomain interactions and the mapping of interdomain operations into local security policy.

2. Operations that are confined to a single trust domain are subject to local security policy only.
Comment: No additional security operations or services are imposed on local operations by the grid security policy. The local security policy can be implemented by a variety of methods, including firewalls, Kerberos, and SSH.

3. Both global and local subjects exist. For each trust domain, there exists a partial mapping from global to local subjects.
Comment: In effect, each user of a resource will have two names, a global name and a potentially different local name on each resource. The mapping of a global name to a local name is site-specific. For example, a site might map global user names to: a predefined local name, a dynamically allocated local name, or a single “group” name. The existence of the global subject enables the policy to provide single sign-on.

4. Operations between entities located in different trust domains require mutual authentication.

5. An authenticated global subject mapped into a local subject is assumed to be equivalent to being locally authenticated as that local subject.
Comment: In other words, within a trust domain, the combination of the grid authentication policy and the local mapping meets the security objective of the host domain.

6. All access control decisions are made locally on the basis of the local subject.
Comment: This policy element requires that access control decisions remain in the hands of the local system administrators.

7. A program or process is allowed to act on behalf of a user and be delegated a subset of the user’s rights.
Comment: This policy element is necessary to support the execution of long-lived programs that may acquire resources dynamically without additional user interaction. It is also needed to support the creation of processes by other processes.

8. Processes running on behalf of the same subject within the same trust domain may share a single set of credentials.
Comment: Grid computations may involve hundreds of processes on a single resource. This policy component enables scalability of the security architecture to large-scale parallel applications, by avoiding the need to create a unique credential for each process.

[Figure 2: A computational grid security architecture. The figure shows a user and user proxy on a host computer, and at each of Site 1 and Site 2 a resource proxy, a global-to-local mapping table, processes, and local policy and mechanisms. Entities are labeled with the credentials C_U, C_UP, C_RP, and C_P, distinguished as long-lived or temporary. The labeled protocols are Protocol 1: Creation of a user proxy; Protocol 2: Allocation of a remote resource; Protocol 3: Resource allocation from a process; Protocol 4: Creation of a global-to-local mapping.]

We note that the security policy is structured so as not to require bulk privacy (i.e., encryption) for any reason. Export control laws regarding encryption technologies are complex, dynamic and vary from country to country. Consequently, these issues are best avoided as a matter of design. We also observe that the thrust of this policy is to enable the integration of diverse local security policies encountered in a computational grid environment.

5 Grid Security Architecture

The security policy defined in Section 4 provides a context within which we can construct a specific security architecture.
In doing so, we specify the set of subjects and objects that will be under the jurisdiction of the security policy and define the protocols that will govern interactions between these subjects and objects. Figure 2 shows an overview of our security architecture. The following components are depicted: entities, credentials, and protocols. The thick lines represent the protocols described later in the paper. The curved line separating the user from the rest of the figure signifies that the user may disconnect once the user proxy has been created; the dashed lines represent authenticated interprocess communication.

We are interested in computational environments. Consequently, the subjects and objects in our architecture must include those entities from which computation is formed. A computation consists of many processes, with each process acting on behalf of a user. Thus, the subjects are users and processes. The objects in the architecture must include the wide range of resources that are available in a grid environment: computers, data repositories, networks, display devices, and so forth.

Grid computations may grow and shrink dynamically, acquiring resources when required to solve a problem and releasing them when they are no longer needed. Each time a computation obtains a resource, it does so on behalf of a particular user. However, it is frequently impractical for that “user” to interact directly with each such resource for the purposes of authentication: the number of resources involved may be large, or, because some applications may run for extended periods of time (i.e., days or weeks), the user may wish to allow a computation to operate without intervention. Hence, we introduce the concept of a user proxy that can act on a user’s behalf without requiring user intervention.

Definition 5.1 A user proxy is a session manager process given permission to act on behalf of a user for a limited period of time.

The user proxy acts as a stand-in for the user.
It has its own credentials, eliminating the need to have the user on-line during a computation and eliminating the need to have the user’s credentials available for every security operation. Furthermore, because the lifetime of the proxy is under control of the user and can be limited to the duration of a computation, the consequences of its credentials being compromised are less dire than exposure of the user’s credentials.

Within the architecture, we also define an entity that represents a resource, serving as the interface between the grid security architecture and the local security architecture.

Definition 5.2 A resource proxy is an agent used to translate between interdomain security operations and local intradomain mechanisms.

Given a set of subjects and objects, the architecture is determined by specifying the protocols that are used when subjects and objects interact. In defining the protocols, we will use U, R, and P to refer to a user, resource, and process, respectively, while UP and RP will denote a user proxy and resource proxy, respectively. Many of the following protocols will rely on the ability to assert that a piece of data originated from a known source, X, without modification. We know these conditions to be true if the text is “signed” by X. We indicate signature of some text, text, by a subject X by Sig_X{text}. This notation is summarized in Table 1.

Table 1: Notation used in the rest of the paper

U, R, P        user, resource, process
UP, RP         user proxy, resource proxy
C_X            credential of subject X
Sig_X{text}    “text” signed by subject X

The range of interactions that can occur between entities in a computational grid is defined by the functionality of the underlying grid system.
However, based on experience and the grid systems that have been built to date, it is reasonable to assume that the grid system will include the following operations:

• allocation of a resource by a user (i.e., process creation),
• allocation of a resource by a process, and
• communication between processes located in different trust domains.

(We use the term allocation to denote the operations required to provide a user with access to a resource. On some systems, this will involve interaction with a scheduler to obtain a reservation [3].) We must define protocols that control UP-RP, P-RP, and P-P interactions. In addition, the introduction of the user proxy means that we must establish how the user and user proxy (U-UP) interact.

Within our architecture, we meet the above requirement by allowing a user to “log on” to the grid system, creating a user proxy using Protocol 1. The user proxy can then allocate resources (and hence create processes) using Protocol 2. Using Protocol 3, a process so created can allocate additional resources directly. Finally, Protocol 4 can be used to define a mapping from a global to a local subject.

We now describe each of these protocols in more detail. We note that to minimize problems with export controls, the protocols are all designed to rely on authentication and signature techniques, not encryption. Furthermore, our descriptions do not talk about specific cryptographic methods. In fact, as we shall see below, our implementation uses the Generic Security Services application programming interface to achieve independence from any specific security technology.

5.1 User Proxy Creation Protocol

Recall that a user proxy is an entity within our architecture that acts on behalf of a user. In practice, the user proxy is a special process started by the user which executes on some host local to that user.
The main issue in the user proxy creation protocol is the nature of credentials given to the proxy and how the proxy can obtain these credentials.

A user could enable a proxy to act on her behalf by giving the proxy the appropriate credentials (e.g., a password or private key). The proxy could then use those credentials directly. However, this approach has two significant disadvantages: it introduces an increased risk of the credentials being compromised and does not allow us to restrict the time duration for which a proxy can act on the user’s behalf. Instead, a temporary credential, C_UP, is generated for the user proxy; the user indicates her permission by signing this credential with a secret (e.g., private key). C_UP includes the validity interval as well as other restrictions imposed by the user, e.g., host names (where the proxy is allowed to operate from) and target sites (where the proxy is allowed to start processes and/or use resources).

The actual process of user proxy creation is summarized in Protocol 1. As a consequence of this protocol, the user proxy can use its temporary credential to authenticate with resource proxies.

1. The user gains access to the computer from which the user proxy is to be created, using whatever form of local authentication is in place on that computer.
2. The user produces the user proxy credential, C_UP, by using their credential, C_U, to sign a tuple containing the user’s id, the name of the local host, the validity interval for C_UP, and any other information that will be required by the authentication protocol used to implement the architecture (such as a public key if certificate-based authentication is used):
   C_UP = Sig_U{user-id, host, start-time, end-time, auth-info, ...}.
3. A user proxy process is created and provided with C_UP. It is up to the local security policy to protect the integrity of C_UP on the computer on which the user proxy is located.
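
The credential construction in step 2, and the check a verifier could later apply to it, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Globus implementation: the paper leaves the cryptographic method open, so an HMAC-SHA256 tag under a shared secret stands in for the signature Sig_U, and the function names, field names, and JSON encoding are all hypothetical.

```python
import hashlib
import hmac
import json
import time


def sign(key: bytes, text: bytes) -> str:
    """Stand-in for Sig_X{text}: an HMAC-SHA256 tag under X's secret key.

    Assumption: a shared-key scheme is used purely for illustration; a
    public-key signature would serve the same role.
    """
    return hmac.new(key, text, hashlib.sha256).hexdigest()


def create_user_proxy_credential(user_key, user_id, host, lifetime_s,
                                 auth_info="", now=None):
    """Step 2 of Protocol 1: sign the tuple that forms C_UP."""
    start = time.time() if now is None else now
    fields = {
        "user_id": user_id,
        "host": host,
        "start_time": start,
        "end_time": start + lifetime_s,
        "auth_info": auth_info,
    }
    # Serialize with sorted keys so signer and verifier hash identical bytes.
    payload = json.dumps(fields, sort_keys=True).encode()
    return {"fields": fields, "signature": sign(user_key, payload)}


def verify_user_proxy_credential(user_key, c_up, now=None):
    """Accept C_UP only if the tag matches and 'now' is inside the interval."""
    now = time.time() if now is None else now
    payload = json.dumps(c_up["fields"], sort_keys=True).encode()
    expected = sign(user_key, payload)
    if not hmac.compare_digest(expected, c_up["signature"]):
        return False
    return c_up["fields"]["start_time"] <= now <= c_up["fields"]["end_time"]
```

Under these assumptions, a credential created with a 60-second lifetime verifies inside that interval and is rejected once the interval has passed or if any signed field (such as the host name) is altered. With public-key signatures, the verifier would hold only the user's public key rather than the shared secret assumed here.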

Protocol 1: User proxy creation

The concept of a user proxy is not unique to our architecture. For example, Kerberos generates a limited-lifetime ticket to represent a user. Various public key systems [7, 12] use techniques similar to ours in which temporary credentials (i.e., a public and private key pair) are used to generate a limited-lifetime certificate which is then signed by the user to indicate that this certificate represents, or is a proxy for, the user. What distinguishes our architecture from these approaches is the way that a user proxy interacts with the resource proxy to achieve single sign-on and delegation, which is discussed in the next section.

5.2 Resource Allocation Protocol

In discussing resource allocation, we decompose the problem into two classes: allocation of resources by a user proxy and allocation of resources by a process. As process allocation is a generalization of user proxy allocation, we will start our discussion with allocation by a user proxy.

Recall that operations on resources are controlled by an entity, called a resource proxy, which is responsible for scheduling access to a resource and for mapping a computation onto that resource. The resource proxy is used as follows. A user proxy requiring access to a resource first determines the identity of the resource proxy for that resource. It then issues a request to the appropriate resource proxy. If the request is successful, the resource is allocated and a process created on that resource. (The procedure would be similar if our goal was simply to allocate a resource, such as network or storage, with which no process was to be associated.
However, for brevity, we assume here that process creation always follows resource allocation.)

The request can fail because the requested resource is not available (allocation failure), because the user is not a recognized user of the resource (authentication failure), or because the user is not entitled to use the resource in the requested mode (authorization failure). As discussed above, it is up to the resource proxy to enforce any local authorization requirements. Depending on the nature of the resource and local policy, authorization may be checked at resource allocation time or process creation time, or it may be implicit in authentication and not be checked at all.

We define as Protocol 2 the mechanism used to issue a request to a resource proxy from a user proxy. The verification in Step 3 may require mapping the user’s credentials into a local user id or account name if the policy of the resource proxy is to check for authorization at resource allocation time. Alternatively, authorization checks can be delayed until process creation time. The mechanism by which this mapping is performed is discussed in Section 5.4. Notice that the ability to have a resource proxy create credentials on behalf of the process it creates relies on a process and its resource proxy executing in the same trust domain.

The protocol creates a temporary credential for the newly created processes. This credential, C_P, gives the process both the ability to authenticate itself and the identity of the user on whose behalf the process was created. A single resource allocation request may result in the creation of multiple processes on the remote resource. We assign all such processes the same credential, as allowed by security policy element 8. An advantage of this decision is that in the situation when a user allocates resources on large parallel computers, scalability is enhanced.
A disadvantage is that it is not possible to use credentials to distinguish two processes started on the same resource by the same allocation request. However, we do not believe that this feature is often useful in practice.

The existence of process credentials enables us to implement a range of additional protocols that allow a process to control access to incoming communication operations on a