Reliability of the interRAI suite of assessment instruments: a 12-country study of an integrated health information system

BioMed Central
BMC Health Services Research
Open Access
Research article

John P Hirdes* (1,2), Gunnar Ljunggren (3), John N Morris (4), Dinnus HM Frijters (5), Harriet Finne-Soveri (6), Len Gray (7), Magnus Björkgren (8) and Reudi Gilgen (9)

Address: (1) University of Waterloo, 200 University Ave W, Waterloo, ON N2L 3G1, Canada, (2) Homewood Research Institute, Guelph, Canada, (3) Stockholm County Council, Box 17533, S-11891, Stockholm, Sweden, (4) Hebrew Senior Life, 1200 Centre St, Boston, MA 02131, USA, (5) PRISMANT, Box 14006, Papendorpseweg 65, 3528 BJ Utrecht, The Netherlands, (6) STAKES, Lintulahdenkuja 4, Box 220, FIN-00531, Helsinki, Finland, (7) University of Queensland, Academic Unit in Geriatric Medicine, Princess Alexandra Hospital, Ipswich Road, Woolloongabba, Brisbane, Australia, (8) Chydenius Institute, Pitkänsillankatu 1-3, 67100 Kokkola, Finland and (9) Stadtspital Waid Zürich, Klinik für Akutgeriatrie, Tièchestraße 99, CH-8037, Zürich, Switzerland

* Corresponding author

Abstract

Background: A multi-domain suite of instruments has been developed by the interRAI research collaborative to support assessment and care planning in mental health, aged care and disability services. Each assessment instrument comprises items common to other instruments and specialized items exclusive to that instrument. This study examined the reliability of the items from five instruments supporting home care, long term care, mental health, palliative care and post-acute care.

Methods: Paired assessments on 783 individuals across 12 nations were completed within 72 hours of each other by trained assessors who were blinded to the others' assessment. Reliability was tested using weighted kappa coefficients.
Results: The overall kappa mean value for 161 items which are common to 2 or more instruments was 0.75. The kappa mean value for specialized items varied among instruments from 0.63 to 0.73. Over 60% of items scored greater than 0.70.

Conclusion: The vast majority of items exceeded standard cut-offs for acceptable reliability, with only modest variation among instruments. The overall performance of these instruments showed that the interRAI suite has substantial reliability according to conventional cut-offs for interpreting the kappa statistic. The results indicate that interRAI items retain reliability when used across care settings, paving the way for cross domain application of the instruments as part of an integrated health information system.

Published: 30 December 2008
BMC Health Services Research 2008, 8:277 doi:10.1186/1472-6963-8-277
Received: 26 March 2008
Accepted: 30 December 2008
© 2008 Hirdes et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

Population aging and the increased burden of disability in middle and high income nations pose unique challenges to health care systems. The lives of frail elderly individuals and persons with disability are affected by complex interactions of physical, social, medical and environmental factors that necessitate multidisciplinary approaches to care. Services tend to be provided by a variety of health and social service agencies including both community and facility-based settings. For example, persons who are experiencing cognitive loss or decline of functional ability may receive support from home care agencies, supportive housing, rehabilitation services, or nursing homes.
Similarly, persons with mental health problems may receive psychiatric services in primary care, community mental health programs, mental health group homes, or in-patient psychiatric units of hospitals. At the end of life, palliative care may be provided by community-based agencies or by residential hospices, but periodic contact with acute hospitals is also not uncommon. For each of these populations, health and social services are intended to be provided through an integrated system of care rather than through a single organization.

The need to receive support from multiple service agencies has important implications for persons with complex care needs. At the individual level, there may be a risk of discontinuity of care if information systems are not compatible or if clinically relevant information is not shared between agencies. This may mean that needs are not identified [1] when transitions are made between service providers, that longitudinal change in functional status goes undetected as the person moves between service settings, or that care plans are not followed through when the person receives care from another sector. The lack of coordination of information gathering can result in duplication of effort, increased assessment burden, and frustration among care recipients and their support network. For these reasons, there is a clear need for an integrated, multi-sectoral approach to assessment for persons with complex care needs.

The interRAI family of assessment instruments was designed to be used with a variety of vulnerable populations [2,3]. The first interRAI instrument was the Resident Assessment Instrument (RAI), developed in the United States in response to the Omnibus Reconciliation Act of 1987 [4]. The interRAI network was established initially based on the international collaborative efforts of clinicians and researchers to apply the RAI to nursing home residents in other countries [5].
By 1996, interRAI had released the RAI-Home Care with the aim of establishing a compatible assessment approach in community-based care settings that served populations at risk of nursing home placement or requiring post-acute or long term home care services [6-8]. The RAI-Mental Health instrument [9,10] was the first interRAI instrument designed to be used with a general adult population in psychiatric hospital settings including, but not limited to, geriatric psychiatry. Other interRAI instruments developed in the 1990s include the RAI-Acute Care [11,12], RAI-Post Acute Care [13], and RAI-Palliative Care [14].

The development of all these assessment instruments was guided by the design principles for the original RAI. The assessments were intended to use all sources of information available. Judgments were to be based on observable traits, have operational definitions and coding instructions that specified inclusion and exclusion criteria, and use clearly delimited time frames for observations anchored to a specific assessment reference date. In addition, each of these instruments was intended to support applications for multiple audiences including care planning, outcome measurement, quality improvement, and resource allocation. To this end, efforts were made to retain the capacity to derive or extend existing outcome measures (e.g., scales related to cognition, ADL, pain, depression, behaviour) and decision support algorithms (e.g., case mix algorithms, quality indicators).

The initial set of RAI instruments was developed in a serial process. For new instruments, this meant that lessons learned from the use of earlier instruments were taken into account in the development of subsequent instruments. However, the need to refine older instruments became apparent as innovations were identified.
Also, it was recognized that the family of instruments should be refined in parallel, treating the collective set of instruments as an integrated system rather than as complementary, but independent assessments for specialized care settings.

In the year 2000, interRAI launched a multinational effort to update the entire family of RAI instruments and to develop new instruments for sectors not yet addressed by the existing instruments. The result of this effort is an integrated suite of instruments providing compatible assessment approaches for nursing homes, home care, acute care, post acute care, palliative care, assisted living, supportive housing, services for persons with intellectual disabilities, community mental health, emergency psychiatry, and inpatient psychiatry. The initial focus of the redevelopment effort for the interRAI suite of assessment instruments was to identify a common core set of about 70 items that would be present in all instruments, with exceptions permitted only for specialized settings where prevalence rates for the item would be negligible (e.g., pressure ulcers in in-patient psychiatry). Examples include items such as cognitive skills for decision making, activities of daily living (e.g., personal hygiene, toilet use, eating), mood (e.g., negative statements, persistent anger, crying/tearfulness), behaviour problems (e.g., verbal abuse, resisting care), falls, and health symptoms (e.g., pain frequency and intensity, fatigue). The next step was to identify over 100 optional items that would appear in many, but not all, instruments. These items were expected to be relevant to several service settings, but not pervasive enough in all service settings to warrant inclusion in the list of core items.
Examples include long-term memory, situational memory, hearing aid use, family/close friends feeling overwhelmed by the person's illness, instrumental activities of daily living (e.g., meal preparation, financial management, phone use), stamina, additional health conditions (e.g., extrapyramidal symptoms, abnormal thought processes, delusions), medication adherence, and preventive interventions and screening (e.g., influenza vaccination, breast screening). Finally, specialized items that would only appear in specific instruments were also identified. For example, the in-patient psychiatry instrument has 170 unique measures (e.g., number of lifetime psychiatric admissions, command hallucinations, suicidality, illicit drug use, police intervention for criminal behaviour, history of sexual violence or assault as perpetrator, problem gambling) whose prevalences would be too low to warrant their use in non-mental health settings.

Once the initial item set was identified for the interRAI instruments, a 12-country effort was launched to evaluate the psychometric properties of the instruments in different health care settings. The present paper reports on the results of that cross-national effort with a particular emphasis on inter-rater reliability. There have been several studies of the reliability and validity of the early versions of interRAI instruments for nursing homes [15-20], home care [6,7,21], mental health [10], acute care [11], and palliative care [14]. The general trend of these studies has been to show improved reliability over time with newer versions of these instruments, and they provided consistent evidence of good psychometric properties across populations and service settings.
The multinational effort described here was launched in 2005 and included five instruments from the new suite designed for use in the following care settings: nursing homes, home care, rehabilitation, palliative care, and in-patient psychiatry. Results for the new interRAI Acute Care [22] and interRAI Intellectual Disability [23] have been reported elsewhere. Reliability and validity results for new instruments in the suite (e.g., interRAI Community Mental Health) will be reported in future publications.

Methods

Study Participants

interRAI Fellows from 12 countries (Australia, Canada, Czech Republic, France, Iceland, Italy, Japan, South Korea, Netherlands, Norway, Spain, and United States) volunteered to test one or more of five instruments available by 2005 for long-term care facilities, home care, palliative care, post-acute care, and mental health. Individual researchers selected instruments based on the availability of pilot sites in their countries, which was often dependent on patterns of use of the earlier versions of the instruments. For each instrument, interRAI created a detailed item-by-item instruction manual, with item definitions, process instructions, and examples. Field staff members were trained to do the assessments following this instructional set. These trained clinicians then completed dual assessments for 783 individuals. As shown in Table 1, the most widely tested instrument was the LTCF (8 countries, 31% of assessments) and the least common was MH (1 country, 11% of assessments). The largest number of assessments came from Canada (147 pairs), the United States (141 pairs), and Iceland (80 pairs), and the fewest were obtained from Spain (29 pairs) and Japan (28 pairs).

Data Collection

The five assessment instruments used in this study were the interRAI Long Term Care Facility (interRAI LTCF), interRAI Home Care (interRAI HC), interRAI Post Acute Care (interRAI PAC), interRAI Palliative Care (interRAI PC), and interRAI Mental Health (interRAI MH).
All items are coded using the same assessment approach; namely, the assessor uses all sources of information and then exercises clinical judgement as to the most appropriate answer based on standardized coding guidelines provided in the instrument's training manual. Most items permit the use of multiple information sources including personal interviews, review of the chart, direct observation of the person, communication with informal caregivers, and use of clinical communication between health care staff (e.g., tracking forms, clinical correspondence). However, a limited number of items are restricted to recording only the person's self report (e.g., self-rated health; self-rated mood items dealing with depression, anxiety and anhedonia; personal goals of care). All items include standardized response sets with item definitions, inclusions/exclusions, and observational time frames provided in the manual and on the assessment form. As noted earlier, all assessments include a set of common core data elements, which are the primary focus of the present analyses, as well as specialized items unique to that service setting. The items typically are rated based on the presence or absence of a condition, frequency of its occurrence in a standardized time frame (typically three days), or severity based on anchor terms defined in the assessment manual (e.g., pain severity). The numbers of common and unique items for each instrument examined in the present study are reported in Table 2.

In each participating site, two health professionals completed their assessment of the same individual independently at different times. Assessors were blinded to others' results and they were not permitted to discuss the case with each other, nor were they permitted to exchange information; however, they were both able to access the person's chart when completing their assessment.
The intent of this approach was to use a conservative methodology for evaluating inter-rater reliability in a manner that would mimic real-world assessment experiences. This methodology provides a more realistic appraisal of inter-rater reliability than would be obtained from use of artificial case examples or simultaneously completed assessments.

It was not the purpose of the study to make comparisons between sites or countries. Therefore, the protocol did not call for selection of randomized samples, and there was no requirement for countries to target common populations or settings. In order to reduce burden on staff and to simplify the approach for obtaining consent, convenience samples were used, but study coordinators were encouraged to obtain data for heterogeneous samples in order to test the applicability of the instruments in diverse clinical populations.

The study protocol required dual assessments to be done within 72 hours, but staff were encouraged to complete them in less time, particularly in settings with a higher risk of rapid clinical change (e.g., post-acute or palliative care). The actual number of hours between assessments was not recorded electronically, so it is not possible to estimate the average time between assessments.

Assessors were trained to use a variety of information sources, such as direct observation, interviews with the person under care, family, friends, or formal service providers, and review of clinical records, both medical and nursing. The assessors were ordinary clinical staff, external research staff, or a mixture of both. Most assessors were nurses, but other professionals were also used. In line with interRAI's standard approach to coding, they were all instructed to exercise their best clinical judgment in order to record observations based on their evaluation of the most accurate information source.
All assessments were recorded in paper form and either entered locally into a database or sent to the project team for transcribing and analysis. Data from all countries were then combined into a single analytic data set for the cross-national analyses.

Assessors were asked to track the time used to complete the assessment and to fill out a debriefing form on their experience in doing the assessment. This information was used by local project coordinators in participating countries to monitor how close the instruments were to meeting the target completion time and to collect comments that could be useful in future discussions regarding the refinement of the instruments. However, since the forms were considered an optional part of the protocol, this information was not always forwarded to the project coordinators and these data were not compiled electronically for further analysis. That said, no major problems with the study protocol or draft versions of the instruments were reported to the study team from any country.

Table 1: Reliability samples by country and interRAI instrument
Country rows give the sample for each instrument tested, in the column order Long-Term Care Facility (LTCF), Home Care (HC), Palliative Care (PC), Post-Acute Care (PAC), Mental Health (MH), with blank cells omitted.

Country | Samples for instruments tested | All instruments | Number of instruments evaluated
Australia | 18, 26 | 44 | 2
Canada | 58, 89 | 147 | 2
Czech Republic | 30, 30 | 60 | 2
France | 31, 16 | 47 | 2
Iceland | 30, 30 | 60 | 2
Italy | 23, 30, 30 | 83 | 3
Japan | 29 | 29 | 1
Korea | 30, 29 | 59 | 2
The Netherlands | 29, 16 | 45 | 2
Norway | 30, 10 | 40 | 2
Spain | 28 | 28 | 1
United States | 16, 97, 28 | 141 | 3

All countries (number of instruments evaluated: 5) | LTCF | HC | PC | PAC | MH | All instruments
Participants (n) | 246 | 220 | 126 | 102 | 89 | 783
Participants (%) | 31.4 | 28.
Study sites | 8 | 6 | 5 | 4 | 1 | 24

Analysis

The reliability of the various interRAI instruments was evaluated mainly with weighted kappa coefficients using Fleiss-Cohen weights [24]. For binary items, ordinary kappa coefficients were used.
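To make the statistic concrete, the following is a minimal Python sketch of a weighted kappa with quadratic (Fleiss-Cohen) agreement weights, w_ij = 1 - (i - j)^2 / (k - 1)^2, for two raters' codes on a single ordinal item with k response levels. The function name, argument layout, and the handling of the degenerate case are assumptions made for illustration; the software actually used in the study is not described in the text. With two categories the weights are 1 on the diagonal and 0 elsewhere, so the same function gives the ordinary kappa used for binary items.

```python
import math


def weighted_kappa(rater_a, rater_b, categories):
    """Weighted kappa with Fleiss-Cohen (quadratic) agreement weights.

    rater_a, rater_b: equal-length sequences of codes for the same persons.
    categories: the ordered list of possible codes (hypothetical argument).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")

    index = {code: i for i, code in enumerate(categories)}
    n = len(rater_a)

    # Cross-tabulate the paired ratings as counts.
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        counts[index[a]][index[b]] += 1

    # Marginal proportions for each rater.
    row = [sum(counts[i]) / n for i in range(k)]
    col = [sum(counts[i][j] for i in range(k)) / n for j in range(k)]

    observed = 0.0  # weighted observed agreement
    expected = 0.0  # weighted agreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = 1.0 - (i - j) ** 2 / (k - 1) ** 2
            observed += weight * counts[i][j] / n
            expected += weight * row[i] * col[j]

    if math.isclose(expected, 1.0):
        # Both raters used a single category: kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)
```

For example, `weighted_kappa([0, 1, 2, 2], [0, 1, 2, 1], [0, 1, 2])` treats the single one-level disagreement as partial agreement (weight 0.75) rather than as a complete miss.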
According to Landis and Koch [25], kappa values below 0.40 should be considered poor, between 0.41 and 0.60 should be considered moderate, 0.61 to 0.80 should be considered substantial, and above 0.81 should be considered almost perfect. However, the stability of kappa estimates is affected by the distributional properties of items. For example, binary items with less than five percent of cases in one of the two values will yield unstable estimates, even with sample sizes of several hundred cases.

Kappa values are generally preferred as an indicator of reliability over percentage agreement, because the latter may under-represent the reliability of multi-level items with modest disagreements regarding severity of a given problem. Percentage agreement will also over-represent reliability for binary items with highly skewed distributions.

Table 2: Distribution of selected characteristics by interRAI instrument

Variable | Long-Term Care Facility (LTCF) | Home Care (HC) | Palliative Care (PC) | Post-Acute Care (PAC) | Mental Health (MH) | All Instruments
Age group: <65 | 3.5 | 10.4 | 13.8 | 8.3 | - (1) | 9.9
Age group: 65–84 | 50.9 | 60.4 | 46.8 | 68.1 | - | 57.5
Age group: 85+ | 45.6 | 29.2 | 39.4 | 23.6 | - | 32.6
Female | 72.3 | 62.1 | 58.7 | 66.0 | - | 65.7
Widowed | 52.7 | 47.9 | 43.2 | 36.9 | 5.7 | 42.4
Impaired in decision making | 83.4 | 45.7 | 65.9 | 38.2 | 43.8 | 59.5
Rarely/never understands others | 4.9 | 1.5 | 18.0 | 0.0 | - | 5.5
Makes negative statements | 23.6 | 17.4 | 16.5 | 10.8 | 17.9 | 18.5
Sad/pained facial expressions | 22.4 | 12.5 | 22.9 | 27.4 | 47.2 | 23.5
Any aggressive behaviour (2) | 17.3 | 0.5 | - | - | 25.8 | 12.4
Hallucinations or delusions | 1.6 | 2.3 | - | 0.9 | 16.8 | 3.8
Early Loss ADL: Personal Hygiene Impaired | 65.8 | 32.9 | 77.8 | 30.1 | 19.1 | 48.5
Mid Loss ADL: Walking Impaired | 57.7 | 38.2 | 78.6 | 42.7 | 7.9 | 48.1
Late Loss ADL: Eating Impaired | 21.8 | 18.3 | 57.6 | 4.8 | 7.9 | 22.7
Pain not present | 61.9 | 34.2 | 42.4 | 41.6 | 0.0 | 41.3
Severe/excruciating pain present | 6.9 | 23.7 | 24.8 | 8.9 | - | 15.8
Non-smoker | 93.1 | 90.4 | 93.5 | 95.1 | 57.3 | 92.6
Falls | 10.8 | 15.7 | 9.4 | 30.3 | 0.0 | 13.2
Hospitalized in last 90 days (3) | 12.2 | 20.3 | - | - | - | 15.9
N | 246 | 220 | 126 | 102 | 89 | 783

(1) In some cases, items are not available because they were not gathered as part of the study protocol or they are not available for specific instruments.
(2) Any occurrence of verbal abuse, physical abuse, socially inappropriate behaviour, or resisting care.
(3) Excludes psychiatric admissions.
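The contrast drawn in the Analysis section between percentage agreement and kappa for skewed binary items can be illustrated with a short Python sketch. The data below are invented for illustration only (100 hypothetical pairs in which the condition is coded present about 4% of the time) and are not study data; the function names are likewise assumptions. The labelling helper follows the cut-offs quoted from Landis and Koch [25] in the text, and its treatment of values falling exactly on or between the quoted boundaries is an assumption.

```python
def percent_agreement(rater_a, rater_b):
    """Share of pairs in which both raters recorded the same code."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Ordinary (unweighted) kappa for two raters on a nominal or binary item."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    codes = set(rater_a) | set(rater_b)
    # Chance agreement from the product of the two raters' marginal proportions.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in codes
    )
    if expected == 1.0:
        return float("nan")  # undefined when both raters use one code only
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa):
    """Label a kappa value using the cut-offs quoted in the text.

    Assumption: boundary values are assigned to the lower category.
    """
    if kappa <= 0.40:
        return "poor"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


def skewed_example():
    """Hypothetical data: 94 pairs both absent, 2 both present, 4 discordant."""
    rater_a = [0] * 94 + [1] * 2 + [0] * 2 + [1] * 2
    rater_b = [0] * 94 + [1] * 2 + [1] * 2 + [0] * 2
    return rater_a, rater_b


if __name__ == "__main__":
    a, b = skewed_example()
    print(percent_agreement(a, b))            # 0.96
    print(round(cohen_kappa(a, b), 3))        # 0.479
    print(landis_koch_label(cohen_kappa(a, b)))  # moderate
```

In this invented example the raters agree on 96% of cases, yet kappa is only about 0.48 because nearly all of that agreement is expected by chance when one code dominates. With so few cases in the rarer category, moving a single pair from concordant to discordant would also shift kappa noticeably, which is the instability the text describes.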
