A graphical technique for assessing differences among a set of rankings

JOURNAL OF CHEMOMETRICS, VOL. 8, 81-93 (1994)

SHORT COMMUNICATION

A GRAPHICAL TECHNIQUE FOR ASSESSING DIFFERENCES AMONG A SET OF RANKINGS

DAVID HIRST
Scottish Agricultural Statistics Service, Rowett Research Institute, Aberdeen, U.K.

AND

TORMOD NAES
Matforsk, Osloveien 1, N-1430 Aas, Norway

SUMMARY

A graphical method of assessing differences between sets of rankings based on cumulative ranks is developed. The method can be used to identify rankings that differ over all or just part of the range of objects ranked. The method is applied to an example of sensory evaluation of green peas in which ten assessors scored six attributes on each of 60 samples.

KEY WORDS: Sensory evaluation; Cumulative ranks; Assessor variation

1. INTRODUCTION

Suppose there are n objects ranked from 1 to n by m different assessors. It is of interest to know how these assessors differ in the way they have ranked the objects. Assume throughout that there are no tied ranks. Statistics such as Kendall's coefficient of concordance¹ can be used to get an overall measure of agreement among assessors, while pairwise rank correlations measure similarity between two assessors, but neither of these statistics gives any indication of how the relationship between assessors varies among the objects. For example, it is possible that the assessors agree exactly on the ranks of the 'top ten' objects but have no agreement among the others. Alternatively, they may agree on a division of the objects into two groups of high and low ranks but disagree on the ranks within those groups. In this communication a graphical technique based on cumulative ranks is developed for examining this kind of difference.

One area where this kind of technique is useful is in the treatment of data from sensory analysis of food products. This kind of analysis, preferably done by a trained sensory panel, is a very useful way to measure important quality characteristics of food products.
Typically, a number of assessors, m, either rank (or score on a numerical axis) n different products for a set of predetermined attributes. There are, however, a number of problems in this kind of analysis that require checking of the data before the final analysis. For instance, assessors can misunderstand or confuse attributes, they can use the numerical axis differently or they may simply vary in their ability to detect differences between the products. Quick techniques such as the cumulative rank plots for revealing individual differences among the assessors are very useful since they allow problems to be detected and corrected at an early stage. An application of the plot in sensory analysis of green peas is discussed later.

0886-9383/94/010081-13 © 1994 by John Wiley & Sons, Ltd. Received 18 March 1993; accepted 3 July 1993.

2. THE CUMULATIVE RANK PLOT

Initially assume that there is some known underlying ordering of the objects. Let $r_{ij}$ be the rank given by the jth assessor to the ith object, the objects being numbered according to this known order. Let its cumulative rank be $c_{ij}$, where

$$c_{ij} = \sum_{k=1}^{i} r_{kj} \qquad (1)$$

This is minimized for any $i = 1, \ldots, n$ if the assessor's ranking agrees exactly with the underlying order, i.e. if $r_{ij} = i$. Therefore define the minimum cumulative rank as $\mathrm{min}_i$, where

$$\mathrm{min}_i = \sum_{k=1}^{i} k = i(i+1)/2 \qquad (2)$$

If all the objects were given the same ranking, these rankings would clearly all equal $(n+1)/2$ and the cumulative rankings would be $\mathrm{equ}_i$, where $\mathrm{equ}_i = (n+1)i/2$. Now subtract this from $c_{ij}$ to get a polygonal graph $y_{ij}$ for the jth assessor defined by

$$y_{ij} = c_{ij} - \mathrm{equ}_i \qquad (3)$$

This final stage creates the distinctive U-shape of the plots. Also define the minimum possible graph $b_i$ as

$$b_i = \mathrm{min}_i - \mathrm{equ}_i \qquad (4)$$

3. PROPERTIES OF THE PLOTS

The basic idea of these plots is that if an assessor is in good agreement with the underlying order, then his graph will be close to the 'baseline' defined by the minimum graph. The following results are useful.
(a) The 'area under the graph', i.e. the area between an assessor's graph and the baseline, is proportional to one minus the rank correlation (Spearman's rho) between that assessor's ranks and the underlying order. Hence the further from the baseline an assessor's graph is, the worse is his agreement with the underlying order. For proof see the Appendix. This also implies that rankings which are positively correlated with the underlying order will lie largely within the area bounded by the U-shape, while rankings with a negative correlation will lie largely within its reflection in the line $y_{ij} = 0$. Hence it is very easy to distinguish between uncorrelated rankings and those with a high negative correlation.

(b) The plots are 'self-scaling' in that the size of the U-shape formed by the baseline depends only on the number of objects. All other graphs must remain above this line and below its 'mirror image' (which corresponds to the reverse of the underlying order). This makes it very easy to compare two different plots.

(c) The height of the graph (i.e. the difference between the graph and the baseline) at any point x is a measure of the ability of the assessor to distinguish objects of true rank x or lower from those of higher rank, i.e. is a measure of the 'confusion' at that point. See the Appendix for details.

4. ESTIMATING THE UNDERLYING ORDER

It is often the case that the underlying order is not known and therefore must be estimated. The simplest method, as recommended by Kendall, is to average the ranks over all assessors and to rank these mean ranks to get a 'consensus' ranking. This consensus has the property that of all possible rankings it maximizes the sum of correlations with the original rankings. Therefore it minimizes the mean area under the graphs. A disadvantage of this method is that it gives equal weight to each assessor and therefore the consensus is distorted by 'abnormal' assessors.
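As an illustration only, the graph of Section 2 and the mean-rank consensus just described can be sketched in a few lines of Python. The function names (`ranks_of`, `mean_rank_consensus`, `cumulative_rank_graph`, `spearman_rho`) and the small data set are hypothetical and do not come from the paper; ties among mean ranks are broken by object index, which is an assumption of this sketch.

```python
from itertools import accumulate


def ranks_of(values):
    """Rank values from 1 (smallest) upward; ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def mean_rank_consensus(rankings):
    """rankings[j][i] is the rank assessor j gives object i (hypothetical layout)."""
    n = len(rankings[0])
    means = [sum(r[i] for r in rankings) / len(rankings) for i in range(n)]
    return ranks_of(means)


def cumulative_rank_graph(ranking, consensus):
    """Return (y, b): the assessor's graph y_i and the baseline b_i, i = 1..n."""
    n = len(ranking)
    # Put the assessor's ranks in the order given by the consensus.
    in_order = [r for _, r in sorted(zip(consensus, ranking))]
    c = list(accumulate(in_order))                       # cumulative ranks c_i
    equ = [(n + 1) * i / 2 for i in range(1, n + 1)]     # equal-rank line equ_i
    y = [ci - ei for ci, ei in zip(c, equ)]
    b = [i * (i + 1) / 2 - ei for i, ei in zip(range(1, n + 1), equ)]
    return y, b


def spearman_rho(ranking, consensus):
    """Spearman's rho for two untied rankings of the same objects."""
    n = len(ranking)
    d2 = sum((r - s) ** 2 for r, s in zip(ranking, consensus))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Three made-up assessors ranking five objects.
rankings = [[1, 2, 3, 4, 5], [2, 1, 3, 5, 4], [1, 3, 2, 4, 5]]
consensus = mean_rank_consensus(rankings)
y, b = cumulative_rank_graph(rankings[1], consensus)
area = sum(yi - bi for yi, bi in zip(y, b))
print(consensus, y, b, area, spearman_rho(rankings[1], consensus))
```

For untied ranks the summed height above the baseline works out to $n(n^2-1)(1-\rho)/12$, which is consistent with result (a) of Section 3; in the example above both sides equal 2.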
In particular, if one or more assessors are negatively correlated with the true order, the consensus will make no sense. A better alternative is to take the first eigenvector of $XX^T$, where X is the n × m matrix of ranks, with columns representing assessors and rows objects, standardized by subtracting column means. (Note that the elements of this eigenvector are the scores of the first component of a principal component analysis of the data, where the assessors are regarded as variables.) The elements of this vector are then ranked. This has the following advantage: the eigenvector is equal to a weighted sum of the original rankings where the weights are proportional to the correlation with this sum. Hence negatively correlated assessors contribute in a sensible manner to the consensus, while uncorrelated assessors are downweighted. This is the method of estimating the underlying order used in all the examples in this paper.

5. A SCALE FOR THE PLOT

Result (a) in Section 3 says that there is a linear relationship between the area under the graph and the correlation between any ranking and the consensus. It is therefore possible to draw a scale on the plot which can be used to give a visual estimate of this correlation. If graphs corresponding to different correlations, e.g. 0.2, 0.4, 0.6 and 0.8, are constructed, then the area under these graphs can be compared with the area under an assessor's graph and an estimate of the correlation made. Clearly any graph with the correct area can be used as a scale, but there is one in particular that has a useful property. This is the 'expected graph' and is constructed as follows.

Call the objects $o_1, \ldots, o_n$, where the true rank of object $o_k$ is k. Assume that an assessor gives scores $s_1, \ldots, s_n$ to these objects, where $s_k \sim N(k, \sigma^2)$. Hence there is 'equal confusion' about each object. Let the scores be ranked to give rank $r_k$ to object $o_k$.
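A hedged numerical sketch of this scoring model follows. It assumes, in addition to what is stated above, that the scores are mutually independent; under that assumption the rank of $o_k$ is one plus the number of objects scored below it, so $E(r_k) = 1 + \sum_{j \neq k} \Phi\big((k-j)/(\sigma\sqrt{2})\big)$, where $\Phi$ is the standard normal distribution function. This expression is a direct consequence of the model and is not necessarily the form derived in the paper's Appendix; the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_ranks(n, sigma):
    """E(r_k), k = 1..n, assuming independent scores s_k ~ N(k, sigma^2)."""
    scale = sigma * math.sqrt(2.0)  # standard deviation of s_k - s_j
    return [
        1.0 + sum(normal_cdf((k - j) / scale) for j in range(1, n + 1) if j != k)
        for k in range(1, n + 1)
    ]


def expected_graph(n, sigma):
    """Cumulative expected ranks minus the equal-rank line (n + 1) i / 2."""
    graph, total = [], 0.0
    for i, e in enumerate(expected_ranks(n, sigma), start=1):
        total += e
        graph.append(total - (n + 1) * i / 2)
    return graph


# Example with n = 60 objects and a score standard deviation of 10.
graph = expected_graph(60, 10.0)
print(round(min(graph), 2), round(graph[-1], 6))
```

As the standard deviation shrinks towards zero the expected ranks approach 1, ..., n and the expected graph approaches the baseline; larger values pull it towards the line at zero.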
It can now be shown (see Appendix) that the expected rank of $o_k$ is $E(r_k)$, where [equation not recoverable from the source text] for 2k < n. There is a simple extension to larger k. These expected ranks can be used to create an 'expected graph' whose shape and area depend only on σ and n. For any σ it is easy to calculate this area and to relate it to the correlation. A plot of correlation against σ for n = 60 is shown in Figure 1. Using this plot, expected graphs corresponding to any desired correlations can be constructed and so a scale can be drawn on the cumulative rank plots. The advantage of this technique is twofold: firstly, it gives an easy method of drawing lines corresponding to any correlation; secondly, the shape of the line corresponds to an interesting hypothesis, namely that there is equal confusion about all objects. Hence any marked divergence from this hypothesis can be seen.

Figure 1. Plot of standard deviation of scores against correlation with consensus for n = 60: x-axis title, standard deviation; y-axis title, correlation with consensus

6. EXAMPLE

Sixty batches of peas consisting of 27 varieties at different degrees of maturity were prepared and served to ten trained assessors in two replicates. The serving order was randomized. Six attributes were assessed on a continuous scale from 1 to 9. They were 'pea-flavour', 'sweetness', 'fruitiness', 'off-flavour', 'mealiness' and 'hardness'. This technique is known as 'descriptive sensory analysis'. This experiment is perhaps rather unusual in the number of samples assessed, but it is by no means unique and serves as a good example of the use of the cumulative rank plot. It is possible that effects such as carry-over and order of tasting will have influenced the results to some degree, but for the purposes of this example they will not be considered.
In this kind of experiment there are often very large differences between assessors in terms of the proportion and area of the scale used and in their sensitivity to the attributes. For example, there is no particularly good reason why all the assessors' scores should be linear functions of some underlying scale, though this is often implicitly assumed. See Reference 2 for a discussion of some possible sources of variation between assessors. Often all that can reasonably be assumed is that an assessor's scores are some unknown monotonic function of an underlying scale. Therefore in this example only the ranks of the scores, averaged over the replicates, are considered, although this is not the usual way of analysing this type of data and it is possible that some information will be lost. Nevertheless, the cumulative rank plots give considerable insight into the performance of the assessors.

The cumulative rank plots for all six attributes are given in Figure 2. Consider first 'pea-flavour' in Figure 2(a). It is clear that three assessors differ markedly from the others, six are in good agreement with the consensus and one lies in between. The six 'good' assessors all lie below the 0.8 correlation line. It is also clear that there is much better overall agreement for peas with low ranks, i.e. those with the lowest pea-flavour. This can be interpreted as a difference in sensitivity among the assessors. Six assessors can detect and assess pea-flavour over the whole range, while three can only detect differences between the peas with least flavour.

For 'off-flavour' (Figure 2(d)) there is very little agreement for samples with low rank, but much better agreement for those peas with a high off-flavour. Six assessors agree quite well for samples with ranks above [illegible]0, while the other four only agree on the top ten ranks. There is clearly a difference in sensitivity here as well, but also there may well be no difference in off-flavour for the lowest-ranking peas.
For 'mealiness' (Figure 2(e)) there is very good agreement over the whole range among nine out of the ten assessors, with their graphs all lying below the 0.8 correlation line. One assessor, on the other hand, does not agree at all with the consensus. This was investigated further by constructing a 'replicate plot'. Here each assessor's first replicate is regarded as the 'consensus' and the cumulative rank graph plotted for the second. All assessors are plotted on the same graph, though obviously the consensus is not the same in each case. The plot can then be used to assess the consistency of the assessors. It is clear that for the one odd assessor there is no significant correlation between replicates, whereas for the others the correlations are all greater than 0.4, indicating a reasonable degree of consistency.

The hardness plot is given in Figure 2(f). Here it is clear that all assessors agree very well on the rankings and this is supported by the replicate plot in Figure 3(b). There is also a suggestion in both of these plots that the assessors find it slightly easier to rank the harder samples.

For comparison, Table 1 gives the rank correlations of each assessor with the consensus ranking for each attribute and also the coefficient of concordance for each attribute. (This is a measure of overall agreement among the assessors. See Reference 1 for details.) This table is in agreement with the conclusions from the plots, though less detailed information can be obtained from it.

Finally, the plots have been used to investigate the relationship between the attributes for each assessor. The method used here is to construct a 'consensus attribute' for each assessor by ranking the first eigenvector of $YY^T$, where Y is the 60 × 6 matrix of samples by attributes. This is then used as the consensus and the cumulative rank plot constructed for the six attributes. These plots are shown in Figures 4(a) and 4(b) for two of the assessors.
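The eigenvector-based consensus used here, and in Section 4 for assessors, might be sketched as follows. This is a minimal illustration and not the authors' implementation: it finds the leading eigenvector by power iteration, starts the iteration from the first column (which could fail in the unlikely case that this column is orthogonal to the leading eigenvector), and fixes the arbitrary sign of the eigenvector so that the weights sum to a non-negative value. The sign convention and the names `ranks_of` and `eigenvector_consensus` are assumptions of this sketch.

```python
import math


def ranks_of(values):
    """Rank values from 1 (smallest) upward; ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def eigenvector_consensus(columns, iterations=200):
    """columns[j][i] is the rank in column j (assessor or attribute) of object i.

    Returns (consensus ranking, weight of each column).
    """
    n = len(columns[0])
    centred = []
    for col in columns:
        mean = sum(col) / n
        centred.append([x - mean for x in col])  # subtract column means

    def column_weights(vector):
        # Inner product of each centred column with the current vector.
        return [sum(c * x for c, x in zip(col, vector)) for col in centred]

    v = centred[0][:]  # starting vector (assumption: not orthogonal to the answer)
    for _ in range(iterations):
        w = column_weights(v)
        # One step of power iteration on X X^T, computed as X (X^T v).
        v = [sum(wj * col[i] for wj, col in zip(w, centred)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            raise ValueError("power iteration collapsed to the zero vector")
        v = [x / norm for x in v]

    w = column_weights(v)
    if sum(w) < 0:  # assumed sign convention: most weight should be positive
        v = [-x for x in v]
        w = [-x for x in w]
    return ranks_of(v), w


# Three columns agree on a made-up ranking and one is its exact reverse.
agreed = [3, 1, 4, 2, 5]
reverse = [6 - r for r in agreed]
consensus, weights = eigenvector_consensus([agreed, agreed, agreed, reverse])
print(consensus, [round(x, 3) for x in weights])
```

In this example the reversed column receives a negative weight, so it reinforces rather than dilutes the consensus, which is the behaviour described in Section 4.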
It is clear that for the first assessor the attributes are very highly correlated, with all six lines lying either