A framework for retrieval in case-based reasoning systems

Ali Reza Montazemi and Kalyan Moy Gupta
School of Business, McMaster University, Hamilton, Ontario, Canada L8S 4M4
E-mail: Montazem@mcmaster.ca; kgupta@atlantis.com

A case-based reasoning (CBR) system supports decision makers when solving new decision problems (i.e., new cases) on the basis of past experience (i.e., previous cases). The effectiveness of a CBR system depends on its ability to retrieve useful previous cases. The usefulness of a previous case is determined by its similarity with the new case. Existing methodologies assess similarity by using a set of domain-specific production rules. However, production rules are brittle in ill-structured decision domains and their acquisition is complex and costly. We propose a framework of methodologies based on decision theory to assess the similarity of a new case with the previous case that allows amelioration of the deficiencies associated with the use of production rules. An empirical test of the framework in an ill-structured diagnostic decision environment shows that this framework significantly improves the retrieval performance of a CBR system.

1. Introduction

A CBR system supports problem solving based on past experience with similar decision problems. To assist a decision-maker (DM), the process followed by a CBR system is as follows [36,44,52] (see figure 1): a previous case (or cases) similar to the new decision problem (new case) is (are) retrieved; the solution of the previous case is mapped as a solution for the new case; the mapped solution is adapted to account for the differences between the new case and the previous case; and the adapted solution is then evaluated against hypothetical situations. To aid in future decision making, feedback of the success or failure of the evaluated solution is obtained from the DM.

Most frequently, recently developed CBR systems retrieve previous cases to provide decision support [1,32].
The retrieval of relevant previous cases is critical to the success of a CBR system [39]. Central to retrieval methodologies is the search for and the filtering of previous cases, and an assessment of the similarities of a new case with previous cases. Production rules are essential to this filtering and assessment process [29]. However, acquisition of production rules creates a bottleneck in CBR systems development. In this paper, to eliminate the need for production rules, decision theoretic techniques are used for similarity assessment, and constraint-based methodology is proposed for filtering previous cases. Explanation-based learning is used to acquire these constraints and the result is the elimination of the difficulty associated with the use of production rules in the filtering process.

© J.C. Baltzer AG, Science Publishers. Annals of Operations Research 72 (1997) 51–73. A.R. Montazemi, K.M. Gupta, Retrieval in case-based reasoning systems.

Figure 1. Processes in a CBR system.

The paper is structured as follows: section 2 provides an overview of retrieval methodologies; section 3 presents the methodologies for similarity assessment and filtering; and section 4 describes our investigation for assessing effectiveness of the proposed methodology. Section 5 closes the article.

2. Retrieval methodologies overview

The aim of case-based retrieval is to retrieve the most useful previous cases towards the optimal resolution of a new case [23,31] and to ignore those previous cases that are irrelevant [33]. Retrieval in a CBR system takes place as follows. Based on a description of the new case, the case-base is searched for previous cases that have the potential to provide decision support (see figure 2). Typically, the search is under-constrained and a large number of previous cases are retrieved [5]. It is, however, possible to filter the previous cases based on exclusion criteria [49]; this involves comparison and filtering [6].
The previous cases that remain after filtering are matched and ranked in order of decreasing degree of similarity. Matching is the process that assesses the degree of similarity of a potentially useful previous case with the new case.

Figure 2. Retrieval in CBR.

2.1. Matching

A case can be considered a schema that consists of a set of attribute value pairs (i.e., descriptors) [21,29]. For example, in a credit assessment decision scenario, a loan manager may assess several attribute value pairs (e.g., attribute “Character of the applicant” has a value of “average”). Matching involves establishing the similarity of the schema of the new case with the schema of previous cases. Matching involves two steps:
(1) assessment of similarity of the schemata of the new case and the previous case along the descriptors, and
(2) assessment of overall similarity of the schemata by a matching function.

Similarity of the schemata of two cases along descriptors has been assessed by domain-specific matching rules (e.g., [53], JULIA [25], and PROTOS [40]). For example, a matching rule can determine that the descriptor “color of the object” with the value orange is very similar to the same descriptor with the value red. However, large numbers of matching rules would be required to determine the similarity of all possible pairs of values for the descriptor “color of the object”. Acquisition of matching rules, therefore, can be an onerous task [44].

The overall similarity of a new case with a previous case is assessed by aggregation of similarity along descriptors by using a matching function. In this investigation, to assess the overall similarity, we used the nearest-neighbour (NN) matching function. Nearest-neighbour matching [17] is widely used in current CBR systems [11,16,26,29]. The overall similarity ($OS_{NN}$) of the new case “$n$” and the previous case $p_k$ using the NN matching function is as follows:

$$OS_{NN}(n, p_k) = \frac{\sum_{i=1}^{m_k} w_i^k \, sim(a_i^n, a_i^{p_k})}{\sum_{i=1}^{m_k} w_i^k}, \qquad (1)$$

where $sim(a_i^n, a_i^{p_k})$ is the similarity of the new case with a previous case $k$ along a descriptor pair, and $w_i^k$ is the importance of the $i$th descriptor.

The NN matching function assesses overall similarity by a weighted linear combination of similarities along descriptors. This is similar to the methods used in multi-attribute decision making. Weighting represents the degree of importance of the descriptors towards the goal of a decision problem. The NN matching function has been adopted from the pattern matching literature. In pattern matching, all previous cases are represented by the same set of descriptors and their importance is determined by means of an inductive machine learning technique which minimizes classification error [17]. However, this approach is not feasible in CBR systems because the number of descriptors that could be used to describe previous cases is large, and only a subset of descriptors can be used to describe a particular previous case.

The importance of a descriptor in CBR systems has been used at two levels of granularity: global and local [29]. At a global level, the importance of a descriptor is the same irrespective of the previous case in which it is used, whereas at a local level, the importance of a descriptor is specific to a previous case. The global level is coarse and context insensitive. In contrast, the local level is fine grained and context sensitive (e.g., MEDIATOR [28]). In some CBR systems, the degree of importance of a descriptor for the local level is acquired from the domain expert by a knowledge engineer [26,40]. However, assessments of the importance of a descriptor provided by a domain expert can be noisy [7,10]. Furthermore, the importance of descriptors acquired from domain experts is static and independent of the previous cases in the case base. An alternative approach is to determine the degree of importance, dynamically, during retrieval.
For example, domain-specific rules are used to determine a descriptor’s importance during retrieval in HYPO [3]. During retrieval, this approach takes into account the context of the new case. Nonetheless, the need to determine rules for this methodology limits its application.

New cases are matched with the previous case with the purpose of applying the solution of a previous case to the new case [48]. Matching in CBR systems is only partial, because CBR systems support ill-structured decision problems. This lack of structure in the decision problems could lead to instances in which, despite a high degree of overall similarity, the solution of a previous case is not applicable to the new case. To deal with such instances, filtering is necessary.

2.2. Comparison and filtering

Improper application of the solution of a previous case to a new case results in what is called over-generalization. Over-generalization can occur when the solution of a previous case is not applicable because of certain conditions existing in the new case [19]. For example, in the credit assessment decision environment, the decision rules used to assess the loan application of a middle-aged entrepreneur may not be applicable to an assessment of loan application from a young entrepreneur, despite a high degree of similarity along other descriptors. To prevent this type of over-generalization, production rules are used to assess the validity of a previous case toward the new case [13,19,40,49]. Production rules have two limitations. First, they assume well-defined domain knowledge; and second, their acquisition from a domain expert is fraught with difficulty [8]. This is why, instead of rules, we propose the use of constraints that take into account the imperfections of domain knowledge and provide an explanation-based learning method to acquire these constraints.
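
The retrieval steps described in section 2 (filter, match with the nearest-neighbour function, rank) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `PreviousCase`, `overall_similarity`, `retrieve`, `simple_sim` and `age_group_differs` are hypothetical, the per-descriptor similarity is a placeholder, a descriptor missing from the new case is assumed to contribute zero similarity, and the exclusion check merely stands in for the production rules or constraints discussed in section 2.2.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Descriptors = Dict[str, object]  # attribute name -> value


@dataclass
class PreviousCase:
    name: str
    descriptors: Descriptors
    weights: Dict[str, float]  # local importance of each descriptor in this case


def overall_similarity(new_case: Descriptors, prev: PreviousCase,
                       sim: Callable[[str, object, object], float]) -> float:
    """Weighted average of per-descriptor similarities over prev's descriptors."""
    total_weight = sum(prev.weights[d] for d in prev.descriptors)
    if total_weight == 0:
        return 0.0
    weighted = sum(
        prev.weights[d] * sim(d, new_case[d], value)
        for d, value in prev.descriptors.items()
        if d in new_case  # assumption: a missing descriptor contributes 0
    )
    return weighted / total_weight


Exclusion = Callable[[Descriptors, PreviousCase], bool]


def retrieve(new_case: Descriptors, case_base: List[PreviousCase],
             sim: Callable[[str, object, object], float],
             exclusions: List[Exclusion]) -> List[Tuple[PreviousCase, float]]:
    """Filter out excluded cases, score the rest, rank by decreasing similarity."""
    candidates = [p for p in case_base
                  if not any(rule(new_case, p) for rule in exclusions)]
    scored = [(p, overall_similarity(new_case, p, sim)) for p in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def simple_sim(descriptor: str, a: object, b: object) -> float:
    """Placeholder similarity: scaled distance for numbers (assumes a 0..5 scale),
    exact match otherwise."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return 1.0 - abs(a - b) / 5.0
    return 1.0 if a == b else 0.0


def age_group_differs(new_case: Descriptors, prev: PreviousCase) -> bool:
    """Hypothetical exclusion criterion echoing the young vs. middle-aged example."""
    return new_case.get("age_group") != prev.descriptors.get("age_group")


new_case = {"character": 3, "age_group": "young"}
case_base = [
    PreviousCase("p1", {"character": 3, "age_group": "middle-aged"},
                 {"character": 2.0, "age_group": 1.0}),
    PreviousCase("p2", {"character": 2, "age_group": "young"},
                 {"character": 2.0, "age_group": 1.0}),
]
ranked = retrieve(new_case, case_base, simple_sim, [age_group_differs])
for case, score in ranked:
    print(case.name, round(score, 3))
```

In this toy case base, p1 matches the new case exactly on the heavily weighted descriptor but is removed by the exclusion check before ranking, which is the kind of over-generalization that filtering is meant to prevent.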
3. Proposed retrieval methodologies

In response to the description of a new case that consists of a set of descriptors, the case memory is searched to determine a set of candidate previous cases potentially useful for providing decision support in the new case. Candidate previous cases are matched with the new case and rank ordered in decreasing degree of similarity. This is effected by the proposed retrieval methodologies which have the following components:
(1) Similarity assessment along descriptors: This is a multi-attribute decision making technique to determine the closeness of the new case and a previous case along descriptors.
(2) Contextual determination of importance of descriptors: This is a domain independent technique to determine the importance of descriptors in previous cases in the context of a new case during retrieval.
(3) Acquisition and application of validity constraints: This is an explanation-based learning method to acquire the validity constraints for preventing over-generalization.

3.1. Similarity assessment

The values of descriptors are depicted by a variety of scoring scales. These can be numeric, ordinal or nominal valued [29]. For example, in an assessment of credit worthiness, the descriptor “character of applicant” is measured on the following five point scale representing ordinal linguistic values:

Very poor   Poor   Average   Good   Very good   Excellent
    0         1       2        3        4           5

Each descriptor $i$ has an acceptable range $R_i$. Let $A^n$ be the set of descriptors in the problem-schema of the new case such that:

$$A^n = \{ a_i^n, \; i = 1, \ldots, m \}$$
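
As a rough illustration of how an ordinal linguistic scale with an acceptable range might feed a similarity score, the following Python sketch maps the “character of applicant” labels to their scale positions and scores two values by their distance relative to the width of the range. The distance-based measure and the names `CHARACTER_SCALE` and `ordinal_similarity` are assumptions made for illustration only; they are not the measure defined in the paper.

```python
from typing import Dict

# Scale positions as given for the descriptor "character of applicant".
CHARACTER_SCALE: Dict[str, int] = {
    "very poor": 0,
    "poor": 1,
    "average": 2,
    "good": 3,
    "very good": 4,
    "excellent": 5,
}


def ordinal_similarity(a: str, b: str, scale: Dict[str, int]) -> float:
    """Assumed measure: 1 minus the distance between the two scale positions,
    divided by the width of the descriptor's range."""
    low, high = min(scale.values()), max(scale.values())
    width = high - low
    if width == 0:
        return 1.0  # a one-point scale cannot distinguish values
    return 1.0 - abs(scale[a.lower()] - scale[b.lower()]) / width


print(ordinal_similarity("Average", "Good", CHARACTER_SCALE))
```

Normalizing by the range width keeps the score between 0 and 1 regardless of how many points a given scale has, so descriptors measured on different scales can be combined in a weighted sum.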