A STUDY OF INDEXING CONSISTENCY: CONSISTENCY BETWEEN THE LIBRARY OF CONGRESS AND THE BRITISH LIBRARY CATALOGERS

Yasar Tonta

School of Library and Information Studies, University of California, Berkeley, CA 94720

Abstract

This article compares the indexing consistency between Library of Congress (LC) and British Library (BL) catalogers in their use of the Library of Congress Subject Headings (LCSH). Eighty-two titles, published in 1987 in the field of Library and Information Science (LIS), were identified for comparison, and for each title the LC subject headings assigned by both LC and BL catalogers were compared. By applying Hooper's "consistency of a pair" equation, the average indexing consistency value was found for the 82 titles. The average indexing consistency value between LC and BL catalogers is 16 percent for exact matches, and 36 percent for partial matches. The major findings of the study are discussed, and the Appendix provides examples of LCSH assigned by both LC and BL catalogers for the same document, along with their consistency values.

Introduction

It has long been observed that different indexers tend to assign different index terms to the same document. In other words, "the indexers differ considerably in their judgment as to which terms reflect the contents of the document most adequately" (1). Essentially, indexing consistency is seen as "a measure of the similarity of reaction of different human beings processing the same information" (2).

Indexing consistency in a group of indexers is defined as "the degree of agreement in the representation of the essential information content of the document by certain sets of indexing terms selected individually and independently by each of the indexers in the group" (3).

Indexing consistency studies reported in the literature have shown that the consistency values vary a great deal between the indexers.
Hooper (4), Leonard (5), and Markey (6) reported the results of some 25 published and unpublished indexing consistency experiments in which the indexing consistency values ranged from 4 percent to 82 percent. However, as researchers rightly caution us, the indexing consistency scores of various studies should be considered separately and not compared, for it appears that consistency values depend on a number of factors under which the indexing was performed. Zunde and Dexter (7) listed 25 factors affecting indexing performance. (See also Tarr and Borko (8).) For instance, factors such as the use of classification schedules and other indexing aids, the employment of subject specialists as indexers, and indexer training, among others, have greatly improved the consistency values (9,10). Markey (11) offers a more detailed discussion of these factors, relating some of them to the findings of previous studies.

Indexing consistency also depends on the consistency measure used in the evaluation. Studies reported in the literature employed a variety of methods and different formulae to calculate indexing consistency values. In fact, as Cooper (12) puts it, "this circumstance makes generalization about their findings difficult". (For more information about various indexing consistency formulae and statistical techniques involved in consistency studies, see Zunde and Dexter (13), Hooper (14), Leonard (15), Markey (16), and Rolling (17).)

It is assumed that there is a relationship between indexing consistency and "indexing quality". That is to say, "an increase in consistency can be expected to cause an improvement in 'indexing quality'" (18).

For some authors what is more important, and needs to be thoroughly scrutinized, is the relationship between indexing consistency and the effectiveness of information retrieval.
Cooper (19) further suggests that "until this relationship [i.e., the relationship between indexing consistency and retrieval performance] has been investigated, there is little point in measuring interindexer consistency at all". Leonard attempted to investigate this relationship in his doctoral dissertation and found that "inter-indexer consistency and retrieval effectiveness exhibit a tendency toward a direct, positive relationship, i.e. high inter-indexer consistency in assignment of terms appears to be associated with a high retrieval effectiveness of the documents indexed" (20). However, he feels that "[c]onsiderably more research is needed before the relationship between inter-indexer consistency and retrieval effectiveness can satisfactorily be defined" (21).

Methodology

This study attempts to compare the indexing consistency between the Library of Congress (LC) and the British Library (BL) catalogers. For the comparison, books published in 1987 in the field of Library and Information Science (LIS) (020 in Dewey) were chosen. First, all the titles published in 1987 in the United Kingdom were identified using the BNB Subject Catalogue (Volume 1). There were 237 titles altogether. Secondly, using the ISBNs provided, all 237 titles were searched in the OCLC database. Of the 237, 217 titles were found on OCLC. (The rest were either serials, microform copies or local publications.)

Thirdly, of the 217, the titles that were cataloged and given Library of Congress Subject Headings (LCSH) by both LC and BL catalogers were identified. (In this study, the terms "indexing" and "cataloging" are used interchangeably.) The 040 field in the MARC format was used to identify the origin of cataloging information. For instance, UKM stands for UK MARC, i.e., cataloged by BL, and DLC stands for LC, i.e., cataloged by LC. (Items that were cataloged according to LC practices by libraries other than LC (by the National Library of Medicine, for example) are not included in the sample.)
By checking the 040 field for each record found on OCLC, it was possible to download all the records that were cataloged by both BL and LC. It turned out that there were 82 items. That is to say, all the subject headings (LCSH) that were assigned to the 82 items by BL and LC were compared with regard to consistency.

For each of the 82 items in my sample, subject headings assigned by LC and BL catalogers were identified from the downloaded data, and then compared. LC subject headings were readily available, as the 600 (personal name), 610 (corporate name), 611 (conference, congress, meeting), 630 (uniform title), 650 (topical LCSH) and 651 (geographical LCSH) fields in the MARC format are exclusively used for all kinds of LC subject headings.

Finally, the "consistency of a pair of indexers" formula, defined by Rodgers and developed by Hooper, was applied to find the indexing consistency value for each title cataloged by LC and BL catalogers. According to Hooper's equation, "the consistency of one indexer with respect to a second is based on the number of times the two indexers agree on the use of a term, divided by the total number of terms used by either indexer (based on the specific document)" (22).

Hooper's "consistency of a pair" formula is as follows:

CP(%) = A / (A + B + C)

where:

CP: is the consistency of term assignment between two indexers (consistency expressed as a percentage);

A: is the number of term agreements between indexers 'M' and 'N' for a specific document;

B: is the number of terms used by 'M' but not used by 'N'; and,

C: is the number of terms used by 'N' but not used by 'M'.

Having obtained the indexing consistency value for each title, I calculated the average indexing consistency value between BL and LC catalogers for the 82 titles.

Further explanation is due here. It was assumed that each individual cataloger at LC approaches the same document in the same way and assigns the same subject heading(s), which in fact may not be true.
This assumption was made for BL catalogers, too. I am aware that what I found is not the individual interindexer consistency value between two indexers but, rather, the indexing consistency value between LC and BL catalogers as two different groups.

Findings

The major findings of the study are as follows:

1. LC catalogers assigned 282 subject headings for 82 items while BL catalogers assigned 127. In other words, on average, LC assigned 3.44 subject headings per title. The same average is 1.55 for BL catalogers. The marked difference in the average number of subject headings between LC and BL is understandable, since BL relies on PRECIS (Preserved Context Index System) for subject access rather than LCSH, whereas LC depends completely on LCSH for subject retrieval.

It appears that BL catalogers tend to keep the number of LCSH assigned for each title to a minimum. Only for two titles (2.4 percent) did BL catalogers assign more subject headings than LC catalogers. For 17 titles (20.7 percent) both BL and LC catalogers assigned the same number of subject headings. For the remaining 63 titles (76.8 percent) LC catalogers assigned more LCSH than BL catalogers.

2. Every subject heading assigned by LC and BL catalogers for the same title was compared. It turned out that 49 out of 127 BL-assigned subject headings exactly matched the LC-assigned subject headings. "Exact matches" allowed for variants in spelling (e.g., catalog - catalogue) and punctuation (e.g., on-line - online), but not for synonyms (e.g., non-book - audio-visual). In the second run, I relaxed this rule and tried to identify partial matches, too. A further forty-four BL-assigned headings matched partially.
A synonym in a multiple-word subject heading was treated as a "match" as long as it was not the first word in that subject heading. The lack of a subdivision in a subject heading was also allowed for partial matches if the main part of the subject heading matched exactly. Seventeen BL-assigned subject headings for 12 titles were completely different from the LC-assigned ones. (Examples of subject headings assigned by LC and BL catalogers for the same titles are given in the Appendix.)

3. The average indexing consistency value for exact matches between BL and LC catalogers is 16 percent.

For both exact and partial matches, the average indexing consistency value between BL and LC catalogers is 36 percent. (Several examples of consistency values are given in the Appendix.)
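
As an illustration of how per-title values of this kind could be computed, the following Python sketch applies Hooper's "consistency of a pair" measure (agreements divided by all terms used by either indexer) and multiplies the ratio by 100 so that it reads as a percentage. It is not presented as the procedure used in the study. The function names, the small table of spelling variants, and the sample headings are hypothetical; the synonym rule for partial matches is not implemented; and agreements are counted with a simple greedy one-to-one pairing, which is an assumption of this sketch.

```python
from typing import Callable, Iterable, Sequence, Tuple

# Hypothetical spelling/punctuation equivalences, for illustration only.
SPELLING_VARIANTS = {"catalogue": "catalog", "on-line": "online"}


def normalize(heading: str) -> str:
    """Lower-case a heading, fold listed variants, and space out '--' subdivisions."""
    words = heading.lower().replace("--", " -- ").split()
    return " ".join(SPELLING_VARIANTS.get(word, word) for word in words)


def exact_match(x: str, y: str) -> bool:
    """Headings agree once spelling and punctuation variants are folded."""
    return normalize(x) == normalize(y)


def partial_match(x: str, y: str) -> bool:
    """Exact match, or same main heading where one side lacks a subdivision."""
    if exact_match(x, y):
        return True
    parts_x = normalize(x).split(" -- ")
    parts_y = normalize(y).split(" -- ")
    return parts_x[0] == parts_y[0] and (len(parts_x) == 1 or len(parts_y) == 1)


def consistency_of_pair(
    terms_m: Iterable[str],
    terms_n: Iterable[str],
    match: Callable[[str, str], bool] = exact_match,
) -> float:
    """Return 100 * A / (A + B + C) for one document; 0.0 if no terms at all."""
    m_terms = list(terms_m)
    unmatched_n = list(terms_n)
    agreements = 0  # A
    for term in m_terms:
        for candidate in unmatched_n:
            if match(term, candidate):
                unmatched_n.remove(candidate)  # each heading pairs at most once
                agreements += 1
                break
    only_m = len(m_terms) - agreements  # B
    only_n = len(unmatched_n)           # C
    total = agreements + only_m + only_n
    return 100.0 * agreements / total if total else 0.0


def average_consistency(
    pairs: Sequence[Tuple[Sequence[str], Sequence[str]]],
    match: Callable[[str, str], bool] = exact_match,
) -> float:
    """Mean of the per-title consistency values."""
    values = [consistency_of_pair(m, n, match) for m, n in pairs]
    return sum(values) / len(values) if values else 0.0


# Made-up headings for one title (not taken from the Appendix).
lc = [
    "Library science -- Great Britain",
    "Online bibliographic searching",
    "Libraries -- Automation",
]
bl = ["Library science", "On-line bibliographic searching"]

print(consistency_of_pair(lc, bl, exact_match))
print(consistency_of_pair(lc, bl, partial_match))
```

With these made-up headings, one of four distinct headings agrees exactly (25 percent), while accepting the missing subdivision raises agreement to two of three (about 67 percent). The average over all titles is then the mean of the per-title values, as in `average_consistency`.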
