E-MIP: A New Economic Approach for Multi-robot Dynamic Coalition Regeneration in the Metaphor of Italian Politics

A. Chella, M. Gentile, F. P. Ponente, R. Sorbello
Dipartimento di Ingegneria Informatica, Universita’ di

Abstract

Hybrid multi-agent architectures support colonies of mobile robots moving in dynamic, unpredictable and time-varying environments, enabling distributed solving strategies that develop collective, team-oriented behaviors for complicated and difficult tasks. The MIP architecture (Chella et al., 2002), taking inspiration from the political organizations of democratic governments, provides a solution for coordinating a robot colony in dangerous environments. This paper outlines an evolution of the Metaphor of Italian Politics: an economic method for dynamically regenerating coalitions during a mission. The new approach proposes a mechanism that forms a new coalition when the government strategy fails or when the whole colony is generally inefficient in reaching the mission targets. The robot agency is able to adapt its behavior to highly dynamic environments, choosing from time to time the coalition best able to apply the most suitable strategy for the current situation. To validate the effectiveness of our approach we built a framework based on the MissionLab robot simulation software developed at the Mobile Robot Lab of the Georgia Institute of Technology.

1. Introduction

A robot colony can be used efficiently for many difficult tasks, because it can complete an assigned task more rapidly than a single agent can, by separating the task into sub-tasks and executing them simultaneously. Two main methods have been proposed in the literature: the centralized approach and the distributed approach (Carpin and Parker, 2002) (Stoytchev and Arkin, 2000).
If we adopt the second approach, we need to answer the following question: how can the robots organize themselves and, moreover, how can they regenerate this organization over time? In this paper we describe a method for coordinating robots based upon a metaphor of politics, using economic methods for coalition regeneration. We revisit the MIP architecture (Chella et al., 2002), a hybrid and dynamic architecture for coordinating a robot team that we have been developing in recent years. We introduce a new election mechanism, inspired by the political organization of democratic governments, in which the leader is constituted by a group of robots. We also introduce an economic approach for the regeneration of the coalitions according to the current external conditions and to the performance obtained under the previous strategies applied by the various governments, allowing each robot to dynamically choose the best one. Our economic approach took inspiration from the work of many researchers, such as (Dias and Stentz, 2000) (Brandt et al., 2000) (Toledo et al., 2001). The proposed solution is hybrid since it adopts a deliberative planner that is centralized, but at the same time distributed among the government members, together with several agents with reactive and deliberative capabilities.

2. Mathematical Model

The new approach adopted in the E-MIP architecture considers a colony composed of $H$ robots and $M$ political parties, with $M \le H$. A set of $n$ issues, expressing individual features, is associated with each robot. Every issue can take a limited set of values: 0 (don't care), -1 (absolutely not) and 1 (absolutely yes). Every party can be represented by the values of the issues that an ideal robot should have to belong to that party. For each robot $i$ and party $j$ there is a vector of $n$ issues:

$$I_{R_i},\; I_{P_j} \in M(n \times 1)$$

where $i = 1, \dots, H$, $j = 1, \dots, M$, $P$ = party and $R$ = robot.
To simplify our model description, we consider only 3 issues, identified by the following terms and meanings:

– WELFARE: Energy of the robot
– DEFENSE: Attitude to risk
– LABOR: Amount of work

Every issue is weighted through a non-negative coefficient (from 0 to $+\infty$), representing the intensity of the issue. In conclusion, each robot $R_i$ and party $P_j$ is represented by a vector with $n$ components as follows:

$$R_i = S_{R_i} \cdot I_{R_i}, \qquad P_j = S_{P_j} \cdot I_{P_j} \qquad (1)$$

where $S_{R_i}$, $S_{P_j}$ are diagonal $n \times n$ matrices containing the weights of the robot issues and of the party issues; $R_i$ and $P_j$ are the representations of a robot and of a party in a multi-dimensional space called the ROBOT ISSUES SPACE.

The heterogeneity of the robots inside the colony is described by a Roles Matrix ($RM$), which shows the capability of a robot to cover a role in the government; with $H$ robots and 4 roles the matrix has dimension $H \times 4$.

The architecture foresees, at the highest level of abstraction, the four macro-states shown in figure 1, which are explained in detail in the following sections.

Figure 1: The Robot Finite State Automata Diagram

2.1 Election

This first state can be divided into sub-states, each one representing a different phase of the entire process; we describe these states below:

1. Voting Process: The voting process is divided into two steps. The first step is Cluster Identification; the literature provides several criteria, and in particular we focus on Voronoi tessellation.
In the E-MIP architecture, cluster identification groups the robots of the colony on the basis of party membership; this choice is based on the consideration that every robot has a political orientation determined by the closest political party. A robot $R_i$ belongs to party $P_j$ if the following condition holds:

$$R_i \in P_j \iff d_{i,j} = \min_k \{ d_{i,k} \}, \qquad k = 1, 2, \dots, M \qquad (2)$$

with $d_{i,k}$ the Euclidean distance between robot $R_i$ and the generic party $P_k$ in the ROBOT ISSUES SPACE. Figure 2 shows the graphical result of the clustering process for an example made up of 11 robots and 3 parties.

Figure 2: Clustering formation

The second step is Vote Extraction: the vote cast by a robot is simulated by a Monte Carlo random extraction of a number in the interval [0,1], which is in turn divided into $M$ sub-intervals, each one associated with one of the $M$ parties. For each robot $i$ of the colony the subdivision of the interval [0,1] is made through the relative distances of the robot from each party $j$:

$$d^{REL}_{i,j} = \frac{\sum_{k \ne j} d_{i,k}}{\sum_{k} d_{i,k}}\,\%, \qquad k = 1, 2, \dots, M \qquad (3)$$

2. Coalition Formation: The political coalition which constitutes the new government is formed with the support of a linear space, called the POLITICAL IDEOLOGY SPACE. This space represents all the robots and the parties belonging to the ROBOT ISSUES SPACE. The mapping between the two spaces is performed by a mapping function $f(\cdot)$ which scales between the two spaces of representation and which groups the robots that cast the same vote during the voting process. The axis of coalitions is divided into 3 big strips which represent the ideologies of the left, center and right political trends.
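The two voting steps of item 1 (Eqs. 2 and 3) can be illustrated with a short Python sketch. The function names are hypothetical and not part of E-MIP or MissionLab. One assumption is made explicit: the relative distances of Eq. (3) sum to $M - 1$ over the parties, so the sketch divides them by $M - 1$ to obtain sub-intervals that exactly cover [0,1]; the paper does not say how this normalization is done. The sketch also assumes at least two parties and a robot that does not coincide with every party at once.

```python
import math
import random

def distances(robot, parties):
    """Euclidean distance from one robot to every party in the issues space."""
    return [math.dist(robot, p) for p in parties]

def nearest_party(robot, parties):
    """Eq. (2): index of the party closest to the robot."""
    d = distances(robot, parties)
    return d.index(min(d))

def vote_probabilities(robot, parties):
    """Eq. (3) relative distances, rescaled (assumption) so that they sum to 1."""
    d = distances(robot, parties)
    total = sum(d)
    m = len(parties)
    # Numerator of Eq. (3): sum over k != j, i.e. the total minus the j-th distance.
    rel = [(total - d_j) / total for d_j in d]
    return [r / (m - 1) for r in rel]  # assumed normalization

def extract_vote(robot, parties, rng=random):
    """Monte Carlo extraction: draw u in [0, 1) and find the sub-interval it falls in."""
    u = rng.random()
    probs = vote_probabilities(robot, parties)
    cumulative = 0.0
    for j, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return j
    return len(probs) - 1  # guard against floating-point round-off
```

With this reading, a party closer to the robot gets a wider sub-interval, so a robot usually, but not always, votes for the party of its own cluster.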
Two functions, $f_R(\cdot)$ and $f_L(\cdot)$, are introduced in order to highlight, when applied to a party $P_j$, the aspects which respectively characterize the right and left trend through a scalar value; the party considered has a political trend given by the balance between the right and the left one. If we represent the position of the party $P_j$ in the space of the coalitions with $p_j$, we consider the following relation:

$$p_j = f(P_j) = f_R(P_j) - f_L(P_j) \qquad (4)$$

A positive value identifies a right party while a negative one identifies a left party; values closer to zero identify center-right or center-left parties according to the sign. The above-mentioned functions are vectors of coefficients which suitably weight the individual components of the party vector; the values given to these coefficients are related to the meaning associated with the issues of the parties, in order to identify the ideology in the ROBOT ISSUES SPACE. The scaling function is:

$$p_j = M_R^T \cdot P_j - M_L^T \cdot P_j = (M_R - M_L)^T \cdot P_j \qquad (5)$$

where $M_R, M_L \in M(n \times 1)$. We introduce a scale factor for the positioning of the parties to avoid a dispersion of the parties themselves, keeping the relative distances unchanged. A graphical example of this representation is shown in figure 3. A relation analogous to the calculation of the relative distances leads to the determination of the political mass $m_{i,j}$ of each robot $i$, which represents its weight inside the voted party $j$:

$$m_{i,j} = \begin{cases} \dfrac{\sum_{k \ne i} d_{k,j}}{\sum_{k} d_{k,j}} & \text{if } i \text{ voted } j \\ 0 & \text{otherwise} \end{cases} \qquad (6)$$

where the index $k$ runs over all the robots of the colony which cast a vote for party $j$.
The parties represented in the POLITICAL IDEOLOGY SPACE are also characterized by a center of mass depending on the political masses associated with the robots which cast a vote for that party; this center of mass is obtained using the analogous concept from classical physics:

$$r^{(j)}_{CM} = \frac{\sum_i m_{i,j} \cdot r_i}{\sum_i m_{i,j}} \qquad (7)$$

where the index $i$ runs over all the robots of the colony which voted $j$. When the scaling process is finished, the coalition which will constitute the new government is formed by the winning party plus adjacent parties, added until more than 50% of the total votes is reached; adjacency is meant in terms of the distance between the center of mass of the winning party and the centers of mass of the remaining ones. The coalition formation is represented in figure 3 for a hypothetical situation with 3 parties and 11 robots.

Figure 3: Mass center and coalitions

2.2 Determine Roles

The next step is the identification of the robots which will assume the government roles, that is: Prime Minister (PM), Minister of Defence (MD), Minister of Communications (MC). First, the robots of the coalition are filtered through the Roles Matrix ($RM$), discarding those unable to execute the tasks of a given role. Second, the following rules are used to assign the political roles: the PM is chosen among the robots belonging to the winning party, while the MD and the MC are chosen among the robots belonging to the winning coalition which have not already assumed a government role.
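The coalition-formation rule that closes Section 2.1 (the center of mass of Eq. 7, then adjacent parties until more than 50% of the votes is reached) can be sketched in Python as below. The function names and the input layout are hypothetical. The sketch also assumes that the winning party is the one with the most votes and that ties are broken by party index, which the paper does not specify.

```python
def center_of_mass(voters):
    """Eq. (7): `voters` is a list of (mass, position) pairs for one party."""
    total_mass = sum(m for m, _ in voters)
    return sum(m * r for m, r in voters) / total_mass

def form_coalition(votes, centers):
    """Grow a coalition around the winning party.

    votes[j]   -- number of votes received by party j
    centers[j] -- center of mass of party j on the ideology axis
    Assumption: the winner is the party with the most votes.
    """
    total = sum(votes)
    winner = max(range(len(votes)), key=lambda j: votes[j])
    # Remaining parties, ordered by distance from the winner's center of mass.
    others = sorted(
        (j for j in range(len(votes)) if j != winner),
        key=lambda j: abs(centers[j] - centers[winner]),
    )
    coalition = [winner]
    share = votes[winner]
    for j in others:
        if share > total / 2:  # more than 50% of the total votes reached
            break
        coalition.append(j)
        share += votes[j]
    return coalition
```

For instance, with 11 votes split 3/5/3 and centers at -0.8, 0.1 and 0.6, the middle party wins and is joined by the right-hand party, its nearest neighbour, for a total of 8 votes.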
Denoting by $r^{(PM)}_k$, $r^{(MD)}_k$ and $r^{(MC)}_k$ the positions of the robots which satisfy these conditions, and by $r_{CM}$ the position of the center of mass of the winning party, the PM role is assigned to the robot $i$ closest to the center of mass $r_{CM}$, the MD role is assigned to the robot $i$ positioned at the right extremity of the coalition, and the MC role is assigned to the robot $j$ positioned at the left extremity of the coalition:

$$R_i = PM \iff | r^{(PM)}_i - r_{CM} | = \min_k | r^{(PM)}_k - r_{CM} | \qquad (8)$$

$$R_i = MD \iff r^{(MD)}_i - r_{CM} = \max_k ( r^{(MD)}_k - r_{CM} ) \qquad (9)$$

$$R_j = MC \iff r^{(MC)}_j - r_{CM} = \min_k ( r^{(MC)}_k - r_{CM} ) \qquad (10)$$

2.3 Conduct Business

The robots forming the new government adopt a behavior in the management of the mission according to the strategy of the winning coalition. This strategy is a combination of two extremes, representing pure left and pure right ideologies, and changes dynamically in accordance with the weight given to the two components in the formation of the government coalition. A strategy can be characterized by a set of parameters which identify the various aspects of the robot's behavior; for instance we considered the action radius for a robot's request for mine-defusing support, the probability that a robot will fight an enemy instead of running away, and the tax percentage that robots must pay to the government at the end of a mission. Each parameter takes values in a continuous interval whose extremes (lower and upper) are associated with the extreme left and right strategies. For every parameter $s$, a value is associated with each of the $M$ parties so that $s_1 \le s_2 \le \dots \le s_j \le \dots \le s_M$, where $j$ represents the generic party, while $s_1$ and $s_M$ identify the two extreme parties; the winning coalition is made up of $M'$ parties with $M' \le M$.
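As a side illustration of the role-assignment rules of Section 2.2, Eqs. (8)-(10) reduce to three selections on the ideology axis. The Python sketch below uses hypothetical names and a simplified input: both dictionaries are assumed to be already filtered through the Roles Matrix, and the coalition is assumed to contain at least three eligible robots.

```python
def assign_roles(party_positions, coalition_positions, r_cm):
    """Pick PM, MD and MC following Eqs. (8)-(10).

    party_positions     -- {robot_id: position} for robots of the winning party
    coalition_positions -- {robot_id: position} for robots of the whole coalition
    r_cm                -- center of mass of the winning party
    """
    # Eq. (8): the PM is the winning-party robot closest to the center of mass.
    pm = min(party_positions, key=lambda i: abs(party_positions[i] - r_cm))
    # MD and MC come from coalition robots that do not already hold a role.
    free = {i: r for i, r in coalition_positions.items() if i != pm}
    md = max(free, key=lambda i: free[i] - r_cm)  # Eq. (9): right extremity
    free.pop(md)
    mc = min(free, key=lambda i: free[i] - r_cm)  # Eq. (10): left extremity
    return {"PM": pm, "MD": md, "MC": mc}
```

Removing each chosen robot from the candidate set before the next selection is one way to honour the rule that a robot holds at most one government role.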
The parameter $s_c$ of the coalition is affected only by the parties involved, each of which acts according to its weight in the coalition. Its value is calculated as the weighted average of the parameters of the coalition parties:

$$s_c = \sum_k a_k \cdot s_k \qquad (11)$$

where $k$ ranges over the parties which form the coalition. The weight $a_k$ associated with the $k$-th party is obtained from the votes $V_k$ which it received with respect to the total votes of the coalition:

$$a_k = \frac{V_k}{\sum_h V_h} \qquad (12)$$

2.4 Coalition Regeneration

This macro-state contains a fundamental part of the process, since it allows the robots to change their political position in the ROBOT ISSUES SPACE, adapting to dynamic changes in the external conditions. It is composed of three sub-states, described below:

1. Trial Balance: As stated in the previous sections, we adopted an economic method to evaluate mission results; following this approach, we provided every robot citizen with a starting capital, called Equity, which it can increase or decrease as the mission develops. The government is also provided with an equity, incremented from mission to mission by means of taxes, as we will see. The first step for each robot is to draw up a balance of the activities carried out during the mission, in terms of costs and rewards. To achieve this, each robot converts the achieved goals into monetary units (MU), in order to compare them, by means of a series of reward functions that we developed for this purpose. In the following we illustrate two functions, which convert support operations, $f_S(\cdot)$, and enemy destructions, $f_E(\cdot)$, into MU rewards:
$$f_S(R_i) = \alpha \, \frac{SupNumb}{SupTime} \qquad (13)$$

$$f_E(R_i) = \beta * enemies \qquad (14)$$

where SupNumb and SupTime represent the number of support operations performed by robot $R_i$ and the average time spent to complete such operations, enemies represents the number of opponent robots destroyed, and $\alpha$ and $\beta$ are two conversion coefficients whose values (100 and 10 respectively) were chosen after several trials, according to the realism of the results obtained in the different tests. After completing this conversion, each robot citizen $R_i$ draws up an economic report, given by the difference between rewards and costs, yielding a profit or a loss named Operating Income:

$$O.I.(R_i) = \sum_k ( rew_{k,i} - cost_{k,i} ) \qquad (15)$$

In case of profit, each robot must pay a certain percentage of it to the government, in order to simulate what happens in real life. The result after this payment is the Net Income:

$$N.I.(R_i) = O.I.(R_i) - TaxRate * O.I.(R_i) \qquad (16)$$

This result is used by the robot to calculate an economic performance index; the one we chose is ROE (Return On Equity), which expresses the profit as a percentage of the invested capital:

$$ROE(R_i) = \frac{N.I.(R_i)}{Equity(R_i)} \qquad (17)$$

Finally, each robot sends its report to the government members, which collect the reports and add them up in order to obtain a total mission result, and then calculate the mission index, ROE(Gov):

$$ROE(Gov) = \frac{\sum_k N.I.(R_k)}{Equity(Gov)} \qquad (18)$$

2. Estimate Business: In this second step, robots must understand whether the index value just calculated represents a good result or not. Taking inspiration from economic index analysis, we propose a comparison on a historical basis; in other words, each robot keeps track of its previous performances, and then uses these stored values to carry out the comparison.
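The Trial Balance arithmetic of Eqs. (13)-(18) can be followed step by step in a small Python sketch. The values 100 and 10 for the conversion coefficients come from the text; the function names, the sample numbers and the tax rate are illustrative only. Two readings are assumptions: Eq. (13) is taken as $\alpha \cdot SupNumb / SupTime$, and tax is charged only on a positive operating income, following the phrase "in case of profit".

```python
ALPHA = 100.0  # MU conversion coefficient for support operations (from the text)
BETA = 10.0    # MU conversion coefficient per destroyed enemy (from the text)

def support_reward(sup_numb, sup_time):
    """Eq. (13), read here as alpha * SupNumb / SupTime (assumed form)."""
    return ALPHA * sup_numb / sup_time if sup_time > 0 else 0.0

def enemy_reward(enemies):
    """Eq. (14): reward for destroyed opponent robots."""
    return BETA * enemies

def operating_income(rewards, costs):
    """Eq. (15): total rewards minus total costs."""
    return sum(rewards) - sum(costs)

def net_income(oi, tax_rate):
    """Eq. (16); a loss is left untaxed (assumption)."""
    return oi - tax_rate * oi if oi > 0 else oi

def roe(ni, equity):
    """Eq. (17): net income as a fraction of the robot's equity."""
    return ni / equity

def government_roe(net_incomes, gov_equity):
    """Eq. (18): summed net incomes over the government's equity."""
    return sum(net_incomes) / gov_equity

# Illustrative numbers: 2 support operations averaging 4 time units, 3 enemies.
rewards = [support_reward(2, 4.0), enemy_reward(3)]
oi = operating_income(rewards, costs=[20.0])
ni = net_income(oi, tax_rate=0.2)
```

In this example the rewards are 50 MU and 30 MU, so the operating income is 60 MU and the net income after a 20% tax is 48 MU; with an equity of 200 MU the robot's ROE would be 0.24.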
In particular, it calculates an actualized average ROE; this actualization is needed because we saw that, in our simulations with the MissionLab software, indexes tended to decrease, because of the statistical decrease of factors like the number of bombs and enemies during the mission, and the consequent reduction of the reward that can be received by the robots. For this actualization process we took inspiration from Value Based Economy (Copeland et al., 2000):

$$AvROE(R_i, t) = \frac{1}{n} \sum_{k=1}^{n} \frac{ROE(R_i, t_k)}{(1 + \gamma)^{\delta}} \qquad (19)$$

where $\gamma$ is the interest rate applied for the actualization of the old values (which can be set by the government in accordance with its political trend) and $\delta$ represents, in our simulation, the number of legislatures passed since the beginning of the mission (to give more weight to the latest values). Finally, to dynamically create a merit range, each robot divides the ROE axis into a certain number of sub-intervals, centered around the average value obtained. The $M_i$ factors that divide the interval are represented in our simulation by a percentage shift from this central value. So, according to the current ROE value obtained, each robot knows which merit strip it belongs to. Analogously, government members follow the same procedure, using for that purpose the values of previous global performances, regardless of the political trend of those coalitions. They are then able to rank their work against that of the others, and to know their merit strip. According to this result, the government distributes a dividend to all the citizens, which could balance a possible individual loss. Note that the dividend is equal for each robot, and not proportional to its performance. In fact it is not a reward, but only a stimulus to convince the citizens to confirm their political idea (if they already belonged to the winning coalition) or to change it (if they belonged to the opposing party).

3. Update Political Position: This last state is responsible for the actual updating of the positions of the robots in the ROBOT ISSUES SPACE. As introduced in the first sections, each party represented in the POLITICAL IDEOLOGY SPACE is characterized by a center of mass depending on the political masses associated with the robots which cast a vote for that party. Therefore, every robot computes the difference between this center of mass and its own position, obtaining the political gap:

$$P_{gap}(R_i) = \frac{\sum_k m_{k,j} \cdot r_k}{\sum_k m_{k,j}} - Pos(R_i) \qquad (20)$$

where the index $k$ runs over all the robots of the colony, and the index $j$ represents the winning party from which the robots are calculating their distance. The robots finally update their position, approaching or moving away from the center of mass of the governing coalition in proportion to their degree of satisfaction or dissatisfaction:

$$Pos(R_i, t + 1) = Pos(R_i, t) + \chi * P_{gap}(R_i, t) \qquad (21)$$

where $Pos(R_i, t)$ indicates the position of robot $R_i$ in the POLITICAL IDEOLOGY SPACE during mission $t$, while the coefficient $\chi$ is chosen to weight its movement toward, or away from, the center of mass of the winning coalition, in proportion to its merit strip. The last step of this update is the scaling of these positions, which allows the robots' positions to change in the ROBOT ISSUES SPACE as well.

3. Experimental Results

In order to show the performance of E-MIP, we created two alternative architectures, named Dictatorship and Anarchy. These had a double aim: they are at the same time opponents and part of the E-MIP framework. In fact we used them both individually, to evaluate their own performance, and inside the E-MIP architecture. In other words, we treated them as the two extreme strategies (left and right) between which our model can dynamically choose, as we saw in the previous section.
In figure 4 we show the highest-level FSA by which we implemented the E-MIP architecture described in the previous sections inside the MissionLab tool.

Figure 4: High Level FSA in the Mission Lab Software

As illustrated in the previous section, coalition regeneration is a macro-state, representing the economic process of the model. Its internal structure is shown in figure 5.

Figure 5: Economic FSA in the Mission Lab Software

It is composed of the following states:

1. Trial Balance
2. Estimate Business
3. Update Political Pos

Both Trial Balance and Estimate Business are macro-states, whose details are not shown here for reasons of space. The first one is responsible for determining the balance of each robot individually and of the whole colony. The second one instead analyzes the performance obtained, comparing it with previous performances. Each robot thus obtains its current performance, and is able to evaluate its degree of satisfaction with the government's choices. The final step of the entire process is represented by the update political pos state of figure 5, in which each robot updates its political position in the three-dimensional space, ready for new elections.
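The last two states listed above, Estimate Business and Update Political Pos, can be sketched in Python as follows. This is only an illustration under stated assumptions, not the MissionLab implementation: all function names are hypothetical, Eq. (19) is read as the mean of the past ROE values each discounted by $(1 + \gamma)^{\delta}$, the percentage shifts that delimit the merit strips are made-up values, and the linear mapping from merit strip to $\chi$ is invented for the example since the paper does not give one.

```python
def actualized_average_roe(history, gamma, delta):
    """Eq. (19), read as the mean of past ROE values discounted by (1 + gamma) ** delta."""
    return sum(r / (1.0 + gamma) ** delta for r in history) / len(history)

def merit_strip(current_roe, average_roe, shifts=(0.1, 0.3)):
    """Signed strip index: 0 near the average, +/-1 and +/-2 further away.

    `shifts` are percentage shifts from the average (illustrative values).
    """
    if average_roe == 0:
        return 0
    deviation = (current_roe - average_roe) / abs(average_roe)
    strip = sum(1 for s in shifts if abs(deviation) > s)
    return strip if deviation >= 0 else -strip

def chi_from_strip(strip, step=0.1):
    """Hypothetical mapping: satisfied robots (positive strip) get a positive chi."""
    return step * strip

def update_position(pos, center_of_mass, chi):
    """Eqs. (20)-(21): move along the ideology axis by chi times the political gap."""
    gap = center_of_mass - pos
    return pos + chi * gap
```

A positive $\chi$ moves the robot toward the center of mass of the governing coalition and a negative one moves it away, which matches the satisfaction and dissatisfaction cases described in the text.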