Being Both Too Liberal and Too Conservative: The Perils of Treating Grouped Data as though They Were Independent

10.1177/1094428104268542
ORGANIZATIONAL RESEARCH METHODS
Bliese, Hanges / NONINDEPENDENCE AND TYPE I ERROR

Being Both Too Liberal and Too Conservative: The Perils of Treating Grouped Data as Though They Were Independent

PAUL D. BLIESE
U.S. Army Medical Research Unit–Europe

PAUL J. HANGES
University of Maryland

Organizational data are inherently nested; consequently, lower level data are typically influenced by higher level grouping factors. Stated another way, almost all lower level organizational data have some degree of nonindependence due to work group, geographic membership, and so on. Unaccounted-for nonindependence can be problematic because it affects standard error estimates used to determine statistical significance. Currently, researchers interested in modeling higher level variables routinely use multilevel modeling techniques to avoid well-known problems with Type I error rates. In this article, however, the authors examine how nonindependence affects statistical inferences in cases in which researchers are interested only in relationships among lower level variables. They show that ignoring nonindependence when modeling only lower level variables reduces power (increases Type II errors), and through simulations, the authors show where this loss of power is most pronounced.

Keywords: multilevel; power; error; applied

In recent years, organizational researchers have become increasingly cognizant of the need to incorporate hierarchical structure into statistical analyses (e.g., Bliese, 2002; Hofmann, 1997; Kozlowski & Klein, 2000). Organizational researchers recognize that any single data point is, in all likelihood, partially influenced by contextual factors—individual employee behavior is affected by work group membership, an organization’s performance is influenced by economic characteristics of the geographic region, and so on. Statistically, data with these characteristics are described as having nonindependence due to groups (see Kenny & Judd, 1986).
Although nonindependence is prevalent, researchers do not always account for nonindependence in their analyses. Thus, we describe the consequences of failing to model nonindependence. In so doing, we show that failing to model nonindependence affects standard error estimates, and we demonstrate that modeling the effects of nonindependence is important even if researchers have no interest in examining the role of higher level variables.

Authors’ Note: An earlier version of this article was presented at the 61st Academy of Management meetings, Washington, D.C., 2001.
Organizational Research Methods, Vol. 7 No. 4, October 2004 400-417
DOI: 10.1177/1094428104268542
© 2004 Sage Publications

Although nearly all organizational researchers work with nonindependent data, multilevel modeling techniques for dealing with nonindependence have been used primarily by researchers interested in examining higher level attributes as explanatory variables. For instance, it is widely accepted that individual behavior is influenced by group membership; however, issues related to nonindependence have been addressed almost exclusively among researchers interested in variables such as group collective efficacy (Jex & Bliese, 1999), group safety climate (Hofmann & Stetzer, 1998), or group cohesion (Kidwell, Mossholder, & Bennett, 1997). Indeed, issues of nonindependence are generally ignored unless researchers have specific interests in examining the role of higher level variables.

In this article, we demonstrate that multilevel modeling techniques such as random coefficient modeling (RCM) provide advantages to researchers who collect data from hierarchical structures even if the researchers have no particular interest in modeling the influence of higher level variables.
For example, a researcher might collect individual data and be interested in predicting individual performance from individual-level attributes such as individual cognitive ability, individual motivation, and individual experience. The individual performance data would almost certainly contain some degree of nonindependence if the data were collected from employees nested within work groups. Some work groups would have high average performance levels and others would have low average performance levels, and each individual’s performance would be partially related to group membership.

In situations in which nonindependence exists but in which researchers are theoretically and practically interested only in lower level processes, it may be tempting to ignore the fact that the data come from preexisting groups. Our goal is to show the implications of ignoring this nonindependence and help researchers determine whether they should be concerned about nonindependence even if they have no interest in modeling higher level variables.

Background

It is often stated that nonindependence poses substantial problems for the statistical analysis of data (Barcikowski, 1981; Bryk & Raudenbush, 1992; Heck & Thomas, 2000; Kenny & Judd, 1986; Kreft & de Leeuw, 1998). Many commonly used techniques such as regression and ANOVA assume responses are independently sampled from a population and that there are no preexisting interdependencies among data elements.
Many authors have pointed out that nonindependence poses a threat to common regression and ANOVA models because it affects the standard error or variance estimates used to establish statistical significance (see Bryk & Raudenbush, 1992). For example, Barcikowski (1981) used procedures described by Walsh (1947) and showed that relatively low levels of nonindependence (e.g., an intraclass correlation coefficient [ICC] of .05) combined with a relatively large group size of 25 resulted in an observed Type I error rate of .19. Clearly, it is problematic to believe one is testing a hypothesis with an alpha level of .05 when in fact the probability of a Type I error is .19. This constitutes a serious leniency bias.

In discussing RCM, authors often emphasize how RCM can be used to minimize the negative effects of nonindependence on standard error estimates. Just as illustrated by the Barcikowski (1981) example, researchers are warned of the need to model nonindependence to avoid too many Type I errors (Bryk & Raudenbush, 1992; Heck & Thomas, 2000; Snijders & Bosker, 1999). For instance, Heck and Thomas (2000) stated that “if the assumption of independent observations is violated, the standard errors of parameters in the model are underestimated—potentially resulting in a greater likelihood of the false attribution of statistical effects where none should exist” (p. 1).

Based on this literature, organizational researchers might reasonably conclude that ignoring nonindependence always leads to standard errors that are too small and statistical tests that are too liberal. We build on work in the area of ANOVA designs, however, to show that failing to control for nonindependence does not always lead to too many Type I errors. In fact, in certain situations, ignoring nonindependence can lead to too many Type II errors. That is, instead of making tests too liberal, ignoring preexisting interdependencies in the data may reduce the power of statistical tests.
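
The .19 figure cited from Barcikowski (1981) can be roughly reproduced with a short calculation. The sketch below is not Barcikowski's procedure. It assumes a balanced nested design and a large-sample, two-sided normal test, in which ignoring clustering inflates the test statistic by the square root of the design effect 1 + (m − 1) × ICC. The design-effect approximation and the function names are assumptions introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def design_effect(icc: float, group_size: int) -> float:
    """Variance inflation of a group-assigned treatment mean: 1 + (m - 1) * icc."""
    return 1.0 + (group_size - 1) * icc


def actual_type_i_rate(icc: float, group_size: int, nominal_alpha: float = 0.05) -> float:
    """Approximate two-sided Type I error rate when clustering is ignored.

    Assumption (not from the article): large samples, so the naive test
    statistic is treated as a z statistic inflated by sqrt(design effect).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - nominal_alpha / 2.0)
    # The naive statistic is too large by sqrt(design effect), so the nominal
    # critical value corresponds to a smaller cutoff on the correct scale.
    effective_cutoff = z_crit / sqrt(design_effect(icc, group_size))
    return 2.0 * (1.0 - std_normal.cdf(effective_cutoff))


if __name__ == "__main__":
    rate = actual_type_i_rate(icc=0.05, group_size=25)
    print(f"Approximate Type I error rate: {rate:.2f}")
```

Under these assumptions, an ICC of .05 with groups of 25 gives a rate of about .19 at a nominal alpha of .05, in line with the value reported above. With an ICC of 0, the function returns the nominal .05.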

ANOVA Designs

Applied organizational researchers rarely employ classic balanced ANOVA designs. Nonetheless, statistical work within the context of the ANOVA model provides some of the most complete descriptions of how nonindependence influences Type I and Type II error rates. Work by Kenny and Judd (1986) and Kenny, Kashy, and Bolger (1998), in particular, illustrates the differential effects of nonindependence in ANOVA models. Because the implications of this work may not be immediately obvious to researchers who work with nonexperimental models, we begin by reviewing the work of Kenny and Judd (1986). We subsequently use this framework to discuss nonindependence in nonexperimental designs. In all examples, we limit the discussion to models with only two hierarchical levels, although the basic ideas can be extended to more levels.

In their 1986 article, Kenny and Judd consider how nonindependence affects variance estimates in ANOVA models. Kenny and Judd use rho (ρ) to refer to nonindependence due to groups. For readers familiar with RCM notation, ρ is estimated using the ICC (Bryk & Raudenbush, 1992). In the ANOVA literature, ρ is commonly referred to as the ICC(1) or ICC(1,1) (see Bliese, 2000, for a complete discussion of terminology). In terms of interpretation, ρ is an estimate of the amount of total variance explained by group membership. Thus, a value of .10 indicates that 10% of the variance in any one individual’s response can be explained by the group to which he or she belongs.

Controlling for nonindependence is important in experimental ANOVA designs because nonzero values affect estimates of both the treatment mean square and error mean square. To understand the specific role of ρ on mean squares, we review Kenny and Judd (1986) and discuss the impact of nonindependence in nested versus crossed experimental designs. Unlike Kenny and Judd, however, we consider only cases where ρ is positive. As noted by Bliese (2000) and Kenny and Judd (1986), negative ρ values are possible when ρ is calculated within an ANOVA framework.
Nonetheless, negative ρ values are much more the exception than the norm within organizational research.

Nested and Crossed ANOVA Designs

In nested designs, nonindependence leads to too many Type I errors, so ignoring nonindependence makes statistical tests too liberal. Kenny and Judd (1986) illustrated this with a hierarchical nested design (Kirk, 1994). A specific example of this design is provided in Figure 1, and we use this example to illustrate Kenny and Judd’s (1986) argument. Figure 1 illustrates a situation in which a researcher is interested in the effect of some experimental treatment on a dependent variable. The experimental treatment has three levels (i.e., k levels): A, B, and C. For example, in organizational research, a researcher may use this design when he or she gets permission from g different organizations (i.e., nine organizations in the Figure 1 example) to sample employees for a study. Multiple individuals (m individuals) participate from each organization (i.e., 12 individuals in the Figure 1 example), and all the individuals from the same organization are randomly assigned to the same experimental treatment. In other words, organizations are nested within experimental treatments. In this example, a total of 36 individuals (i.e., n observations) are in each treatment level. In this nested design, the experimental treatment varies across organizations but is constant within organizations.

Because organizations are randomly assigned to treatment levels, there are no preexisting similarities among respondents in different treatment conditions. That is, individuals in one treatment condition should be independent of individuals in another treatment condition.
Notice, however, that when ρ is greater than zero, there will be some degree of preexisting similarity among respondents within the same treatment condition. This preexisting similarity will be an artifact of organizational membership. Perhaps not surprisingly, this nonindependence will be reflected in the treatment and error mean square terms. Specifically, Kenny and Judd (1986) show that the expected treatment mean square E(MS_t) expressed in terms of ρ is

$$E(MS_t) = \sigma^2 \left[ 1 + \rho (m - 1) \right] + n \sigma_\alpha^2, \tag{1}$$

and the expected mean square error E(MS_e) is

$$E(MS_e) = \sigma^2 \left[ 1 - \frac{\rho (m - 1)}{n - 1} \right]. \tag{2}$$

In these two equations, m represents the number of people from each organization (i.e., m = 12) and n the number of people within each treatment level (i.e., n = 36); σ² represents the variance of the residuals (error variance), and σ_α² represents the treatment effect.

[Figure 1: Illustrative Example of a Nested Design. Treatment (k) levels A, B, and C; Org 1 through Org 9, each with m = 12, nested within treatment levels.]

Examining Equation 1 reveals three things about nonindependence. First, when data are independent (ρ = 0), the expected treatment mean square is σ² + nσ_α². This is the expected treatment mean square formula for a one-way ANOVA typically reported in textbooks. Second, Equation 1 reveals that as ρ increases, the expected treatment mean square, E(MS_t), also increases. Unfortunately, it is common to interpret a large mean square as evidence of a strong treatment effect (i.e., an increase in σ_α²) rather than as an artifact of preexisting group membership.
Finally, it is evident from Equation 1 that increases in group size will lead to increases in E(MS_t) for any nonzero positive value of ρ. In other words, large group sizes can exaggerate the impact of a small ρ value on E(MS_t), resulting in a substantially inflated treatment effect.

Equation 2 reveals the impact of nonindependence on the mean square error term, E(MS_e). As ρ increases, E(MS_e) decreases. For instance, if ρ were .10 in our example, then σ² would be multiplied by 1 − (.10 × 11)/35 or .97. If ρ were .30, then σ² would be multiplied by .91. Clearly, the higher the value of ρ, the lower the estimate of E(MS_e).

Together, the two formulas demonstrate that ignoring nonindependence in nested designs leads to E(MS_t) values that are too large and E(MS_e) values that are too small. An inflation of the treatment effect coupled with a reduction in the error term will clearly lead to an inflated F value. Thus, the formulas presented by Kenny and Judd (1986) explain why Barcikowski (1981) could get Type I error rates of .19 with an ICC value of .05 and relatively large group sizes of 25.

Although this is an experimental model, there is an important analogue between this design and nonexperimental models familiar to organizational researchers. The key feature of the nested model is that all individuals in a specific organization are in the same treatment condition. Stated another way, all individuals in a preexisting group received the same value on an independent variable.

Interestingly, although nonindependence causes tests to be too liberal in nested designs, the opposite effect occurs in crossed designs. Figure 2 illustrates a crossed design in which treatment conditions vary within organizations. The figure represents a situation in which a researcher has access to 12 employees (m = 12) in each of nine organizations (g = 9). The experimental treatment in this case differs within organizations: Employees from any one organization are randomly assigned to three different experimental treatments.

In crossed designs, nonzero ρ values bias both E(MS_t) and E(MS_e).
The most pronounced effect, however, is on the treatment mean square, E(MS_t) (Kenny & Judd, 1986). In crossed designs, nonindependence reduces the distinctiveness of the treat-
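
Returning to the nested design, the arithmetic behind Equations 1 and 2 can be checked with a few lines of code. The sketch below evaluates both expected mean squares for the Figure 1 example (m = 12, n = 36) with σ² set to 1 and no treatment effect. The function names are illustrative, and the ratio of the two expected mean squares is only a rough indicator of how far the F statistic is pushed upward, since it is not the expected value of F itself.

```python
def expected_ms_treatment(sigma2: float, rho: float, m: int, n: int,
                          sigma2_alpha: float = 0.0) -> float:
    """Equation 1: sigma^2 * [1 + rho * (m - 1)] + n * sigma_alpha^2."""
    return sigma2 * (1.0 + rho * (m - 1)) + n * sigma2_alpha


def expected_ms_error(sigma2: float, rho: float, m: int, n: int) -> float:
    """Equation 2: sigma^2 * [1 - rho * (m - 1) / (n - 1)]."""
    return sigma2 * (1.0 - rho * (m - 1) / (n - 1))


if __name__ == "__main__":
    m, n = 12, 36  # people per organization, people per treatment level
    for rho in (0.0, 0.10, 0.30):
        ms_t = expected_ms_treatment(1.0, rho, m, n)  # null case: sigma_alpha^2 = 0
        ms_e = expected_ms_error(1.0, rho, m, n)
        print(f"rho={rho:.2f}  E(MS_t)={ms_t:.2f}  E(MS_e)={ms_e:.2f}  "
              f"ratio={ms_t / ms_e:.2f}")
```

With ρ = .10 the error multiplier comes out at .97 and with ρ = .30 at .91, matching the values given in the text, while the treatment mean square grows to 2.1 and 4.3 times σ² even though no treatment effect is present.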