British norms for the NEO-FFI and problems with the scale

The NEO Five Factor Inventory (NEO-FFI) was given to 1025 British subjects as part of three independent research studies. Data from these studies were pooled and subjected to item-level analyses. Using standard scoring criteria from the measure, provisional British norms were produced which were broadly equivalent to those obtained in the USA. The individual subscales showed good internal consistency. However, the item-level principal components analysis using varimax and oblique rotation and confirmatory factor analysis revealed that only the Neuroticism, Agreeableness and Conscientiousness traits were coherently represented in the main factors derived by the analysis. Openness and Extraversion factors did not show such stability or consistency. It is argued that as a result of these difficulties, thoughtlessly embracing the NEO-FFI as a quick and efficient instrument for measuring the 'Big Five' personality traits is perhaps premature, as the instrument requires modification and improvement before it can truly be regarded as measuring five independent personality traits. © 2000 Elsevier Science Ltd. All rights reserved.

Keywords: 'Big Five'; Personality; Psychometrics; Item-analysis; NEO-FFI; Norms

Personality and Individual Differences 29 (2000) 907–920
0191-8869/00/$ - see front matter
PII: S0191-8869(99)00242-1
* Corresponding author. Fax: +44-116-246-0379. E-mail address: (V. Egan).

1. Introduction

The recognition of the 'Big Five' personality traits, Neuroticism (N), Extraversion (E), Openness (O), Agreeableness (A), and Conscientiousness (C), allegedly reflects a converging general consensus in differential psychology (Brand & Egan, 1989). The 'Big Five' model can incorporate a number of other personality theories, suggesting that it provides a useful general framework for viewing human behaviour (Costa & McCrae, 1995).
Evidence for similar 'Big Five' components has been found in other excursions into questionnaire and lexical research (Goldberg, 1993).

Measurement of the 'Big Five' is possible using a number of instruments, of which the most standard is the revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992). The NEO PI-R comprises 240 statements, to which the individual responds by stating whether they 'strongly disagree', 'disagree', are 'neutral', 'agree', or 'strongly agree' with a given proposition about themselves. Items are summed to provide an overall measure of the five broad traits. Each broad trait has six facet scores (i.e., elements of the trait which converge to give an overall trait description); for example, 'Anxiety', 'Angry Hostility', 'Depression', 'Self-Consciousness', 'Impulsiveness' and 'Vulnerability' all contribute to overall N.

In many clinical and research settings subjects are unable or unwilling to complete a lengthy questionnaire, and general information about personality is regarded as sufficient. For this reason, a short form of the NEO PI-R was developed: the NEO Five Factor Inventory (NEO-FFI; Costa & McCrae, 1992). The NEO-FFI comprises 60 items derived from a factor analysis of the 1986 administration of the NEO-PI. For each domain, the 12 items with the highest positive loading on the corresponding trait were taken. The resulting short scales correlated upwards of 0.68 (and mostly substantially higher) with the full NEO-PI trait scales, and demonstrated good internal reliabilities (McCrae & Costa, 1989). This brief personality instrument accounts for about 85% of the variance in convergent validity criteria, as derived from ratings of similar traits using adjective endorsement, and spouse and peer ratings (Costa & McCrae, 1992, p. 54).
Clearly, for a quick and effective general measure of personality the NEO-FFI appears to be more than adequate.

While the 'Big Five' model is useful conceptually, there continue to be rumblings of discontent and uncertainty about the NEO-PI's empirical and theoretical underpinnings (Eysenck, 1991). For example, NEO-PI Openness is not orthogonal to intelligence, as Openness appears to be a predictor of IQ as measured by the revised Wechsler Adult Intelligence Scale (McDonald, 1995), and Conscientiousness may be little more than a facet of Eysenck's 'Psychoticism' factor (Draycott & Kline, 1995). The psychometry behind the items and factor structure of the NEO-FFI also appears more ambiguous than one would perhaps desire; for example, despite the wealth of empirical information in the NEO PI-R Manual (Costa & McCrae, 1992), no original item-level analyses of their scales were provided. Subsequent studies have attempted to redress this shortcoming. For example, a study of Canadian female student volunteers that examined the NEO-FFI at an item level found that some items in the O and A scales did not load highly on their corresponding component, and that their observed scores deviated significantly from the norms for women in the manual (Holden & Fekken, 1994). In another study, confirmatory factor analysis seeking to confirm the purported structure of the NEO-FFI found that whilst 35% of the observed variance could be explained by five factors, the comparative fit index (an indication of how closely the theoretical model fits the observed data) was just 0.66, suggesting a weak factor structure (Mooradian & Nezlak, 1996). A further, and perhaps fundamental, problem with the NEO-FFI is that the scales are not independent; for example, Deary et al. (1996) found that the N scale of the NEO-FFI correlated with E at −0.42 (P < 0.001), and with C at −0.39 (also P < 0.001).
As the dimensional traits are correlated, a 5-factor solution may not be optimal, and analyses of the NEO-FFI at a trait dimension level have suggested that 5, 4, 3 and even 2-factor solutions are possible (Ackerman & Heggestad, 1997; Ferguson & Patterson, 1998). To the degree that the NEO-FFI derives from the most discriminating items of the NEO PI-R, these various problems, uncertainties and practical difficulties may be perpetuated amongst those researchers who continue to use either of these scales.

The current study sought to generate British norms for the trait scores which would enable researchers and clinicians working in the UK to consider individual records, and no longer have to rely on possibly invalid American norms. It also sought to examine the psychometric properties of the NEO-FFI within a British cohort in order to ascertain whether similar item ambiguity could be observed in a British sample.

2. Subjects and method

2.1. Subjects

The study cohort comprised 1025 subjects derived from the pooled information from studies conducted by Willock et al. (1999), Deary et al. (1996), and Egan et al. (1999). The Willock et al. study (1999) comprised 252 individuals being examined as part of a study of individual differences influencing decision-making in Scottish farmers. The farmers in the study had an income greater than £16,000 per annum. The Deary et al. study (1996) involved 454 consultant doctors in Scotland across a range of major specialities being studied as part of an investigation into work-related stress. The Egan et al. study (1999) comprised 301 individuals (of whom 112 were clinical referrals to the Regional Forensic Psychology Service for assessment and treatment) recruited as subjects in a study of sensational interests and personality traits. The control group in this latter study comprised cleaners, security men and fishermen, as well as more educated professionals.
Pooling the three studies provided a cohort with a good range of skills, mental ability and putative psychopathology, making it a more representative cross-section of British society than cohorts comprising one occupation, or student samples. The full sample comprised 803 males (78.1%), and 221 females (21.5%); four individuals did not provide information on their sex. Nine hundred and sixty-two individuals provided information about their age, their mean age being 44.9 years (SD = 13.2).

2.2. Method

The 60 items of the NEO-FFI were recorded for all participants. Dimensional scores derived from the standard NEO-FFI scoring system were calculated and used to produce mean, standard deviation, and alpha reliability values for a large and relatively representative British sample. T-scores based on British ranges for the individual scales broken down by sex were calculated. The NEO-FFI items were then subjected to principal components analysis with varimax and oblique rotation. A scree test was used to identify components to be retained. To vigorously test the structure of the NEO-FFI, we examined the factors obtained against those for other samples, and conducted a confirmatory factor analysis (CFA) using structural equation (SEQ) modelling of the items to see how well they fitted together in relation to the model proposed to underlie the measure.

3. Results

3.1. British norms

For the purposes of generating provisional British NEO-FFI norms, the items were summed according to standard scoring procedures in the instrument manual (Costa & McCrae, 1992). Table 1 presents a summary of trait means, standard deviations and alpha reliabilities for the full sample. All scales were highly reliable, with N being particularly so.
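The scale-level statistics named in Section 2.2 (alpha reliability and T-scores) can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names `cronbach_alpha` and `t_score` are hypothetical, the item data are simulated rather than taken from the study, and the 0 to 4 item scoring is an assumption. Only the male N mean and SD (19.04 and 8.3) come from Table 1.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) array of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)       # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)

def t_score(raw, norm_mean, norm_sd):
    """Convert raw scale scores to T-scores (mean 50, SD 10) against given norms."""
    return 50.0 + 10.0 * (np.asarray(raw, dtype=float) - norm_mean) / norm_sd

# Simulated illustration only (NOT study data): 200 respondents, 12 items,
# each item assumed to be scored 0-4 and driven by one shared latent trait.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
noise = rng.normal(size=(200, 12))
simulated_items = np.clip(np.round(2 + latent + noise), 0, 4)
alpha = cronbach_alpha(simulated_items)

# Male N norms from Table 1: mean 19.04, SD 8.3.
# A raw score at the mean maps to T = 50; one SD above the mean maps to T = 60.
example_t = t_score([19.04, 27.34], norm_mean=19.04, norm_sd=8.3)
```

Applying `t_score` separately with the male and female means and SDs would give sex-specific T-values of the kind the norm tables are described as providing, although the published tables may have been constructed differently.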
Table 1 also presents a comparison of male and female scores on the five NEO-FFI scales; all but C show significant sex differences, with males being lower on average in N, E, O and A than females; there was no difference for C. The large number of subjects in the sample allowed us to provide T-score norms for men and women in the UK. These are presented in a format similar to the profile printed on the inside of Form S of the NEO-FFI, simplifying interpretation of the raw scores by converting them to T-values, i.e., they have a mean of 50 and an SD of 10. These tables are presented in Appendices A and B.

Table 1
Means, standard deviations, and alpha reliabilities for a large British sample tested using the NEO-FFI (n = 1025), and broken down by sex (raw scores)

         All subjects (n = 1025)     Men (n = 802)     Women (n = 221)
Trait    Mean    SD     Alpha        Mean     SD       Mean     SD        t        P <
N        19.5    8.6    0.87         19.04    8.3      21.36    9.3      −3.36     0.001
E        27.1    5.9    0.74         26.83    5.9      28.3     6.0      −3.21     0.001
O        26.5    6.5    0.72         26.02    6.4      28.6     6.3      −5.28     0.001
A        29.7    5.9    0.74         29.25    5.8      31.5     5.6      −5.10     0.001
C        32.1    6.6    0.84         32.09    6.4      31.9     7.5       0.36     n.s.

3.2. Intercorrelations between the trait scores

Table 2 presents the intercorrelations between the five trait scores derived from the NEO-FFI. These indicate that the traits are indeed correlated, with N being substantially associated with lower E, lower A, and lower C, and E being associated with higher levels of C. Exploratory factor analysis of these five 'orthogonal' dimensions revealed, after varimax rotation, a two-factor solution which converged in three iterations and explained 59.9% of the variance, the eigenvalues being 1.84 and 1.15. Factor 1 had high positive loadings for E (0.74), C (0.67), and A (0.50), and a high negative loading for N (−0.77).
This dimension thus encompasses outgoing, orderly, good-natured and emotionally stable features into a single continuum representing optimally non-psychopathological features. The second factor was primarily defined by O, which loaded at 0.90 with the underlying dimensional construct. This factor had lower loadings for C (−0.38) and A (0.32), and could be interpreted as representing something other than general adjustment of personality.

3.3. Item-level analysis of the NEO-FFI

To examine the sources of variance in the scale from which the observed factor solution derived, principal components analysis of the 60 test items was conducted. This extracted 14 factors with eigenvalues over 1, which explained a total of 55.8% of the observed variance in the NEO-FFI. These factors underwent varimax rotation and converged in 35 iterations. A scree test of these factors suggested that the first five factors extracted (explaining a total of 36.9% of the variance) represented the main sources of variance in the NEO-FFI data matrix (Fig. 1).

3.4. Varimax rotation of the NEO-FFI items

Table 3 presents the factor loadings between the NEO-FFI items (reordered and labelled for ease of reading) and the first five varimax factors (also re-ordered for ease of reading). Factor 1 is clearly and unequivocally N, with all items loading highly and positively on the factor; it also contains two items from the E dimension (E9 and E12), and individual items from the A and C traits (A6 and C11, respectively). Factor 2 contains seven of the E items, but also contains the A7 item 'Most people I know like me'. Factor 3 contains nine of the 12 O items; the item O8 ('I believe we should look to our religious authorities for decisions on moral issues') does not load on any dimension within the current five-factor solution. Factor 4 contains all 12 A items, but also has a significant negative loading for item N8 ('I often get angry at the way people treat me').
Factor 5 comprises all C items, but also positive loadings for items E11 ('I am a very active person') and A10 ('I generally try to be thoughtful and

Table 2
Intercorrelations (Pearson's r) between NEO-FFI trait scores (n = 1025) (a)

       N      E         O        A         C
N      –     −40***     07*     −22***    −36***
E             –         16***    22***     30***
O                       –        08*      −15***
A                                –         13***
C                                          –

(a) Decimal point dropped; two-tailed test; *** = P < 0.001; * = P < 0.02.
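The two-factor result reported in Section 3.2 can be checked approximately from the correlations in Table 2 alone. The sketch below is an illustration, not the authors' procedure: it takes the eigenvalues of the trait correlation matrix (a principal components extraction), assumes an eigenvalue-greater-than-1 retention rule that the text does not state for this analysis, and reports unrotated loadings rather than the varimax-rotated ones given in the paper. The proportion of variance explained by the retained components is unaffected by an orthogonal rotation, so it can be compared with the reported 59.9%.

```python
import numpy as np

traits = ["N", "E", "O", "A", "C"]

# Pearson correlations transcribed from Table 2 (decimal points restored).
R = np.array([
    [ 1.00, -0.40,  0.07, -0.22, -0.36],
    [-0.40,  1.00,  0.16,  0.22,  0.30],
    [ 0.07,  0.16,  1.00,  0.08, -0.15],
    [-0.22,  0.22,  0.08,  1.00,  0.13],
    [-0.36,  0.30, -0.15,  0.13,  1.00],
])

# eigh is appropriate for symmetric matrices; it returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]            # re-sort to descending order
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Assumed retention rule (not stated in the text for this analysis): eigenvalue > 1.
n_retained = int((eigenvalues > 1.0).sum())

# Share of total variance carried by the first two components.
explained_two = eigenvalues[:2].sum() / eigenvalues.sum()

# Unrotated component loadings: eigenvector scaled by sqrt(eigenvalue).
# The sign of each column is arbitrary, and no varimax rotation is applied here.
loadings = eigenvectors[:, :2] * np.sqrt(eigenvalues[:2])

for trait, row in zip(traits, loadings):
    print(trait, np.round(row, 2))
print("eigenvalues:", np.round(eigenvalues, 2))
print("variance explained by first two:", round(float(explained_two), 3))
```

Because Table 2 reports correlations rounded to two decimals, the eigenvalues obtained this way should be close to, but need not exactly match, the values given in the text.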
