A curvature-based multiresolution automatic karyotyping system

Machine Vision and Applications (2003) 14: 145–156
Digital Object Identifier (DOI) 10.1007/s00138-002-0076-z

Cristina Urdiales García, Antonio Bandera Rubio, Fabián Arrebola Pérez, Francisco Sandoval Hernández
Departamento Tecnología Electrónica, E.T.S.I. Telecomunicación, Universidad de Málaga, Campus de Teatinos, 29071 Málaga, Spain

Received: 26 June 2000 / Accepted: 4 December 2001
Published online: 3 June 2003 – © Springer-Verlag 2003

Abstract. This paper presents a method to segment, characterise and pair the set of chromosomes in a cell of a eukaryotic organism. The method offers several new features: (i) chromosomes are captured at non-uniform resolution to minimise the problem instance; (ii) segmentation is conducted adaptively and quickly by means of a hierarchical structure; (iii) the curvature of each chromosome is studied at high resolution by means of attentive steps; (iv) a very short and uncorrelated feature vector is extracted from the curvature by analysing its spectral components; and (v) a multistage benchmark classifier is used to pair chromosomes according to shape and banding. The method has been tested with publicly available databases. Results compared successfully with manual karyotypes.

Key words: Karyotyping – Analysis of chromosomes – Multiresolution images – Curvature function – Chromosome classification

1 Introduction

Chromosome analysis is an important task for the clinical diagnosis of several genetic defects. Traditionally, cells are classified according to their karyotype, which is a tabular array in which the chromosomes are aligned in pairs. Manual classification of human chromosomes is a slow and tedious operation; hence, for over two decades much effort has been focused on automating it.
Consequently, the karyotyping error rate has been progressively reduced, and efficient karyotyping systems are now coming into use (Granum 1982; Carothers and Piper 1994; Graham and Piper 1994). Automatic karyotyping systems (AKS) aim at producing a karyotype without the intervention of a human operator. An AKS must integrate the following stages in an efficient way (Ritter and Gallegos 1997): (i) acquisition of an image of a cell, (ii) segmentation of the image to locate the different chromosomes, (iii) characterisation of each chromosome, and (iv) classification of the chromosomes into pairs. These processes pose different technical difficulties and, in fact, an important group of commercially available AKSs (Van Vliet et al. 1990; Graham and Piper 1994) require human intervention to perform certain tasks.

Correspondence to: Cristina Urdiales García (e-mail:, Tel.: +34-952-132757, Fax: +34-952-131447)

Chromosomes are only visible as distinct bodies towards the end of the cell division cycle, from prophase to metaphase, when they are long, string-like objects. At these stages, the image of the cell is segmented into regions to locate and extract the different chromosomes from the background. Images presenting all chromosomes at a suitable resolution are large, so locating each object in the image may require too many computational resources. Hence, segmentation is usually performed by a simple algorithm, typically a thresholding method (Graham and Piper 1994; Ji 1994; Nickolls et al. 1981; Piper et al. 1980; Van Vliet et al. 1990; Lundsteen and Piper 1989), to keep the computational complexity bounded. However, a more complex method is required to cope with fine details. In particular, the most efficient systems use a procedure that iterates between a thresholding phase and a classification phase until a stable outcome is achieved.
After the image has been segmented into a set of regions, those corresponding to separate chromosomes need to be detected. This process is not easy because of touching and overlapping chromosomes, but several methods have been proposed to disentangle them (Denisov and Dudkin 1994; Ji 1994; Van Vliet et al. 1990; Agam and Dinstein 1997; Charters and Graham 1999).

Traditionally, chromosomes are classified according to banding techniques (Piper and Granum 1989; Charters and Graham 1999). However, banding techniques may not be accurate enough to achieve the desired success rate. Most authors propose shape-based techniques to complement banding ones. The combination of both techniques reduces the error rate of karyotyping systems (Carothers and Piper 1994; Lerner 1998). Hence, it is necessary to find a suitable set of features to characterise a given chromosome. Such features should be as uncorrelated as possible in order to achieve a short feature vector, which is easier to handle (Lerner 1998). It is also desirable to choose features that are as resistant as possible to transformations and noise, because chromosomes rarely appear in optimal conditions. A typical feature vector for a chromosome has between 5 and 20 elements, and commonly used features are area, length, weighted density distribution (WDD) (Piper and Granum 1989) and the centromeric index (Denisov and Dudkin 1994; Graham and Piper 1994; Piper et al. 1980; Van Vliet 1990). However, the medial axis of the chromosome is required to obtain most of those features, and its calculation is difficult (Ritter and Schreib 2001). Some medial axis calculation methods are based on second order moments of the chromosome grey level values (Groen et al. 1989), but they fail if chromosomes are bent.
Other techniques rely on a skeletonisation algorithm to compute the medial axis (Piper and Granum 1989).

Classifiers usually work at the chromosome or cell level. Chromosome-level classifiers rely on the maximum likelihood criterion (Graham and Piper 1994; Piper et al. 1980; Van Vliet et al. 1990), and their misclassification rates range between 5% and 20%, but rejection rates are usually not specified. Cell-level classifiers use several rules, valid for the karyotyping problem, to improve error rates. Also, if it can be assumed that all cells in a sample have an identical karyotype, that karyotype may be determined by combining information from several cells (Carothers and Piper 1994).

In this paper, we propose an automatic karyotyping method that includes some new features. Instead of capturing chromosomes at uniform resolution, non-uniform resolution images are used to enhance the efficiency of the segmentation process. To work with such images, segmentation is achieved in a fast and reliable way by using a hierarchical structure known as a foveal polygon. Objects detected after segmentation are not necessarily individual chromosomes, because of overlapping and touching objects. Hence, a post-processing disentangling step is proposed to deal with clusters of touching or overlapping chromosomes. This algorithm calculates the chromosome longitudinal axis (Ritter and Schreib 2001) and uses a banding pattern technique to disentangle overlaps. After all the chromosomes have been identified, their shapes are represented by their curvature, which is adaptively calculated to enhance its stability against noise (Bandera et al. 2000). Curvature functions are too long to be handled efficiently. Therefore, they are compressed into a short, low-correlated feature vector by using a pseudo-base. Then, chromosomes are classified into seven groups by using a group classifier that works with the normalised chromosome size and the proposed feature vector, which is extracted from the chromosome shape.
Finally, a type classifier splits the groups into different types by using a banding pattern-based method.

This paper is organised as follows: the segmentation algorithm is presented in Sect. 2. Section 3 presents the chromosome characterisation algorithm, and Sect. 4 presents the classification algorithm. Finally, tests and results are presented in Sect. 5 and conclusions are given in Sect. 6.

2 Segmentation stage

Chromosome segmentation is a procedure that requires a wide field of vision presenting all available chromosomes. In addition, chromosomes must be at high resolution to preserve the fine detail needed for further analysis. Vision systems dealing with large high resolution images work with an enormous load of data, and therefore require a large amount of computational resources. However, in chromosome analysis, the background is not relevant, so it is more efficient to use non-uniform resolution images, where only areas of interest are presented at high resolution. Such images may simultaneously present a wide field of view and a low data volume (Arrebola et al. 1998). This paper proposes the use of a new type of foveal image for this purpose. The proposed image geometry presents two important advantages: (i) it presents a clearly lower data volume; and (ii) it can be easily adapted to multiresolution segmentation techniques. It has been demonstrated that multiresolution segmentation techniques yield better results than conventional 2D techniques (Hird and Wilson 1989). Hence, a multiresolution adaptive segmentation technique is used in this work. This technique has been developed for the proposed image geometry, and its main advantage is that the resulting segmentation is adapted to the shape of the objects in the scene (Arrebola et al. 1998).

Fig. 1a–d. Non-uniform resolution geometries: a cartesian-exponential foveal geometry ($m = 3$, $d = 2$); b adaptive shifted cartesian-exponential foveal geometry ($m = 3$, $L_d = 3$, $R_d = 1$, $T_d = 1$, $B_d = 3$); c centred foveal multiresolution image ($m = 3$); and d adaptive shifted foveal multiresolution image ($m = 3$, $L_d = 4$, $R_d = 12$, $T_d = 4$, $B_d = 12$)

2.1 Foveal geometries

The most popular non-uniform resolution geometry is the cartesian-exponential foveal geometry (Bandera and Scott 1989), where images present a high resolution square area, known as the fovea, at their centre and a decreasing resolution profile towards the periphery (Fig. 1a). This geometry is defined by its number of resolution rings, $m$, and its subdivision factor, $d$ (the number of subrings inside each ring). Most existing processing techniques can be easily adapted to cartesian-exponential foveal images (Bandera and Scott 1989). However, camera repositioning is required to capture every object in the scene at the highest resolution level. Also, the size of the fovea must be at least equal to the area of the largest chromosome in the scene. Consequently, the image data volume is not optimised for smaller chromosomes, because a large part of the background is also presented at high resolution. To solve this problem, the authors proposed (Arrebola et al. 1998) a new foveal geometry where the fovea can be repositioned and resized (Fig. 1b). In this case, five parameters are required to define the structure: the number of resolution rings, $m$; the left ($L_d$) and right ($R_d$) subdivision factors; and the bottom ($B_d$) and top ($T_d$) subdivision factors. Subdivision factors can be easily calculated as a function of the size of the image and the position of the fovea (Arrebola et al. 1998).

Fig. 2. a Uniform resolution image; b adaptive shifted foveal image; c foveal adaptive polygon (FAP) structure; and d levels 0–3 ($m = 3$, $L_d = 10$, $R_d = 6$, $T_d = 14$, $B_d = 2$)

The main problem of working with foveal images is that they cannot be processed directly by means of standard algorithms. The luminosity data provided by a pixel value only has meaning if it is associated with a particular location in the image, which in uniform resolution images is implicitly given by its position in the defined grid. However, in non-uniform resolution images, both the location and the size of the area related to each pixel must be provided. This information might be presented by means of a four-column table or a linked list, but these structures are space-variant, and therefore incompatible with most image processing tools (Bandera and Scott 1989). The foveal adaptive polygon (FAP) is a more efficient data structure. Traditionally, a FAP is built by averaging the values of each $2 \times 2$ set of nodes of a given level $i$ into a single one, with the base level ($i = 0$) being the fovea of the image. Then, a link is established between those four nodes (sons) and the computed one (parent) to preserve the topological relationships in upper levels. The resulting area of computed nodes presents half the resolution of level $i$, and can therefore be grouped with ring $i+1$ into level $i+1$. This procedure goes on until a level presenting the whole field of view is available. This level is known as the 'waist'. If the waist still presents too large a number of nodes to allow fast processing, the structure may grow up to lower resolution levels. Although polygons were originally developed to work with a centred foveal geometry, they have been adapted to adaptive shifting geometries (Arrebola et al. 1998). Figure 2c shows a FAP built over a two-ring foveal image (Fig. 2b). Figure 2d shows the different levels of the structure. It can be seen how the resolution decreases progressively at higher levels. It can also be noted that the waist level presents the whole field of view (Fig. 2d). Levels above the waist also present the whole field of view at progressively decreasing resolutions.
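The level-building rule described above can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the authors' implementation: NumPy is assumed, the function names `reduce_2x2` and `build_levels` are hypothetical, only the centred geometry with a single subdivision factor `d` is covered, and each ring is supplied as a full array whose central region is overwritten by the reduced lower level.

```python
import numpy as np

def reduce_2x2(level):
    """Average each 2x2 block of son nodes into a single parent node."""
    h, w = level.shape
    return level.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def build_levels(fovea, rings, d):
    """Build polygon levels from a fovea and its surrounding rings.

    fovea : 2D array with even side lengths (level 0).
    rings : list of 2D arrays; ring k must be 2*d nodes larger per side
            than the reduced level below it (its centre is ignored).
    d     : subdivision factor (number of subrings per ring), d >= 1.
    """
    levels = [np.asarray(fovea, dtype=float)]
    for ring in rings:
        reduced = reduce_2x2(levels[-1])          # half the resolution
        expected = (reduced.shape[0] + 2 * d, reduced.shape[1] + 2 * d)
        if ring.shape != expected:
            raise ValueError(f"ring shape {ring.shape}, expected {expected}")
        nxt = np.array(ring, dtype=float)         # copy of the ring data
        nxt[d:-d, d:-d] = reduced                 # group reduced area with ring
        levels.append(nxt)
    return levels
```

With an 8 x 8 fovea and `d = 2`, every level keeps the same 8 x 8 node count while covering a field of view twice as wide, which is the property that keeps the data volume low in this simplified setting.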
2.2 Multiresolution segmentation

Once a FAP is available, a typical 2D segmentation technique can be applied to any level of the structure. The lower the number of nodes of the level, the lower the computational cost of the process. However, segmentation results obtained in this way are not adapted to the shape of the objects in the scene unless the process is carried out at a high resolution level. Segmentation can also be conducted in a hierarchical way by grouping areas of similar colour into as few nodes as possible in the upper levels. This kind of segmentation can be achieved by using the adaptive link principle, originally proposed for pyramidal structures (Burt et al. 1981). This technique has been adapted to the proposed FAPs as follows:

1. First, the FAP is stabilised according to the adaptive link principle. This procedure iteratively rearranges the links between nodes in successive levels according to their likeness and their spatial proximity in a bottom-up adaptive way. To stabilise two successive levels $i$ and $i+1$, the following steps are required:
   (a) For each son node $S$ at level $i$, the parent node $P$ presenting the most similar colour value among the $2 \times 2$ set of nodes immediately above $S$ at level $i+1$ must be found. Then, $S$ is unlinked from its previous parent and relinked to $P$.
   (b) After all nodes at level $i$ have been relinked, level $i+1$ is recalculated according to the updated link set. The value of a given parent node at level $i+1$ is equal to the average of the nodes it is linked to at level $i$.
   (c) If level $i+1$ does not change, then levels $i$ and $i+1$ are stabilised. Otherwise, the process must be repeated.
   When the whole structure is stabilised, each of its nodes is linked to an irregular region of pixels of homogeneous colour at the base.
2. The bounding box associated with each region at the base is calculated. Each bounding box encloses a different region, but some bounding boxes are related to the background and some objects might be split into several bounding boxes.
3. To remove background bounding boxes, every node of the working level is tested as the root of a chromosome. A node is the root of a chromosome if the average grey level of the nodes linked to the node at waist level is different from the background colour. Since the background presents a homogeneous colour, this step is very easy in this type of image.
4. Small boxes associated with background stains are removed by using a rejection criterion. A root node is removed if the area of the region at the base it is linked to is smaller than a threshold $U_{area}$. This threshold is easily fixed because the size of the smallest chromosomes (chromosome pairs 21 and 22 or chromosome Y) is known for the capture conditions.
5. To obtain a single bounding box for each detected object, a merging procedure is applied. Two regions are merged into a single one if (a) their bounding boxes overlap, and (b) the nodes linked to their roots at the waist level are connected.
6. To obtain separate chromosomes, all detected objects are analysed. If there are two or more chromosomes inside a single bounding box, they are split into different boxes. If chromosomes are overlapped or touching, they need to be disentangled.

It is important to note that no assumption about the total number of objects in the image is made during the segmentation process. When these steps are accomplished, each bounding box at the base contains a single object, which may consist of several overlapping chromosomes. Figure 3 shows an example of the proposed procedure. Initially, a FAP is built over the foveal image in Fig. 3a. Figure 3b shows all the detected objects after removing bounding boxes containing no objects. Figure 3c shows the final detected objects at base level.
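Step 1 of the procedure above (stabilising two successive levels) can be sketched in Python as follows. This is an illustrative sketch only, assuming NumPy, grey level values as the "colour", and a regular level pair in which level $i+1$ has half as many nodes per side as level $i$. The names `candidate_parents` and `stabilise`, the choice of the nearest $2 \times 2$ parents as the candidate set, the clipping at the borders, and the first-candidate tie-break are all assumptions made for the example.

```python
import numpy as np

def candidate_parents(r, c, n_rows, n_cols):
    """Return the (up to) 2x2 parents nearest to son (r, c), clipped at borders."""
    pr, pc = r // 2, c // 2
    nr = min(max(pr + (1 if r % 2 else -1), 0), n_rows - 1)
    nc = min(max(pc + (1 if c % 2 else -1), 0), n_cols - 1)
    return [(i, j) for i in sorted({pr, nr}) for j in sorted({pc, nc})]

def stabilise(sons, max_iter=100):
    """Relink sons to their most similar candidate parent until level i+1 is stable.

    sons : 2D array of grey values at level i (even side lengths assumed).
    Returns the stabilised parent values and the son -> parent link table.
    """
    sons = np.asarray(sons, dtype=float)
    h, w = sons.shape
    ph, pw = h // 2, w // 2
    # Initial parents: plain 2x2 averaging, as when the structure is first built.
    parents = sons.reshape(ph, 2, pw, 2).mean(axis=(1, 3))
    links = {}
    for _ in range(max_iter):
        # (a) link every son to the candidate parent with the closest value
        links = {}
        for r in range(h):
            for c in range(w):
                cands = candidate_parents(r, c, ph, pw)
                links[(r, c)] = min(cands, key=lambda p: abs(parents[p] - sons[r, c]))
        # (b) recompute each parent as the average of the sons linked to it
        sums = np.zeros((ph, pw))
        counts = np.zeros((ph, pw))
        for (r, c), p in links.items():
            sums[p] += sons[r, c]
            counts[p] += 1
        # parents that lost all their sons keep their previous value (assumption)
        updated = np.where(counts > 0, sums / np.maximum(counts, 1), parents)
        # (c) stop when level i+1 no longer changes
        if np.allclose(updated, parents):
            return updated, links
        parents = updated
    return parents, links
```

For a 4 x 4 level whose last column is bright and whose other columns are dark, the parents that initially average both values end up linked only to the bright sons, so the region boundary follows the object rather than the regular $2 \times 2$ grid.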
Some boxes may seem to contain more than one object, because all boxes are printed over the same image. Actually, each box contains a single object (Fig. 3d). It can be noted that in Fig. 3c all detected chromosomes are presented at high resolution. This is achieved by means of attentional steps. Each attentional step consists of repositioning and resizing the fovea to cover each bounding box in Fig. 3b completely. It can also be appreciated that the contrast of the original image is enhanced by using a histogram stretching procedure. This enhancement is required for further steps.

Fig. 3a–d. Hierarchical segmentation process: a foveal image; b detected objects after background removal and merging procedure; c final detected objects at base level; d individual objects

The main advantage of the proposed procedure is its speed. Thresholding-based segmentation methods are very fast, but they require heavy post-processing to achieve object detection. Also, it is not obvious which threshold to use. Most methods rely on iterative thresholding depending on classification results, but these methods are obviously slower because the whole karyotyping procedure needs to be performed several times. More complex segmentation methods, like the popular split and merge algorithm, take several minutes to segment $512 \times 512$ pixel images like those presented in this paper. Pyramid-based hierarchical techniques require approximately 20 seconds to achieve segmentation. The proposed polygon-based hierarchical technique requires less than a second to segment a $512 \times 512$ image.

2.3 Disentangling of overlapping chromosomes

Automatic separation of overlapping chromosomes is essential for the analysis of chromosome images, especially at the prophase stage. Ji (1994) proposed a disentangling method that decomposes a thresholded object into individual components by reasoning about shapes. Agam and Dinstein (1997) also applied reasoning about shape to separate touching or slightly overlapping chromosomes.
Charters and Graham (1999) proposed an alternative approach to disentangling overlaps. Their method consists of identifying consistent pairs of short sections of a banding pattern to make believable chromosomes.

In this paper, an algorithm based on the banding pattern approach is presented. This algorithm is only applied to detected objects whose longitudinal axis presents two or more sections. Initially, a set of rules is defined to obtain the chromosome arm endings. These rules are very similar to those proposed by Ritter and Schreib (2001), but we allow the existence of more than two chromosome arm endings. Hence, we can process overlapped and touching chromosomes. Then, a potential field inside the object body is calculated to obtain the skeleton of the object. Finally, chromosome arm endings are joined to this skeleton. The resulting longitudinal axis can be noisy, so a post-processing algorithm is used to prune paths that are not connected to a valid chromosome arm ending. Figures 4b–d show the different steps of the longitudinal axis calculation method.

Fig. 4. a Overlapping chromosomes; b object body; c detected dominant points (squares are section endings); d longitudinal axis and potential field; e longitudinal axis; f sections associated to e; g section-chromosome comparison; and h the 12 possible comparisons of the four sections of f with a chromosome prototype and the most plausible combination

The proposed method for disentangling is explained with the schematic pair of overlapping chromosomes in Fig. 4a. The banding pattern in the overlapping region in Fig. 4a is not valid, but there are four valid sections of banding patterns (Fig. 4f). Each of these four sections may be matched to all chromosome banding patterns to obtain the probability of belonging to a given chromosome cluster. Each chromosome has two arms, and it is not known a priori which arm a given section corresponds to.
Hence, there are two probability values associated with each chromosome-section matching: the probability of being a section of one arm of that chromosome, and the probability of being a section of the other one. The probability of belonging to a given chromosome is obtained by correlating the density profile of the chromosome prototype with the density profile of the section (Charters and Graham 1999). Figure 4g presents the density profile of section 1 in Fig. 4f and the density profiles from both arms of chromosome 7. It may be observed that the density profile from one arm of a chromosome is a mirrored version of the density profile from the other arm.

If there are $N_{section}$ sections available for a given object, then there are $N_{section}(N_{section} - 1)$ ways to build a chromosome prototype using these sections (Fig. 4h). The score of each combination is given by

$$p_{ij} = CI_i \cdot l_i + CI_j \cdot l_j \qquad (1)$$

$CI_i$ and $CI_j$ being the correlation indices between the prototype and sections $i$ and $j$, respectively, and $l_i$ and $l_j$ being the lengths of sections $i$ and $j$. Prototypes yielding the maximum average $p_{ij}$ by using all sections are the overlapped chromosomes. Figure 4h shows all the combinations of the four sections in Fig. 4f. The combination presenting the highest score of being a real chromosome is given by sections 1 and 3. Such a combination is very similar to chromosome 7.

In this work, chromosome prototypes were obtained by using the karyotypes provided by the Wisconsin State Lab. of Hygiene and ZooWeb¹. The density profile of each prototype is equal to the average profile of 60 different chromosomes of the same cluster.
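The scoring rule of Eq. (1) can be illustrated with a small Python sketch that enumerates the $N_{section}(N_{section} - 1)$ ordered section pairs for one prototype. This is a sketch under stated assumptions rather than the method as implemented in the paper: NumPy is assumed, the correlation index is taken to be the Pearson correlation after linearly resampling the section to the arm length, the two prototype arms are passed in as separate density profiles, and the names `resample`, `correlation_index` and `score_pairs` are hypothetical.

```python
from itertools import permutations

import numpy as np

def resample(profile, n):
    """Linearly resample a density profile to n samples (assumed normalisation)."""
    x_old = np.linspace(0.0, 1.0, len(profile))
    x_new = np.linspace(0.0, 1.0, n)
    return np.interp(x_new, x_old, profile)

def correlation_index(section, arm):
    """Pearson correlation between a section and a prototype arm profile.

    Both profiles must be non-constant, otherwise the correlation is undefined.
    """
    s = resample(np.asarray(section, dtype=float), len(arm))
    return float(np.corrcoef(s, np.asarray(arm, dtype=float))[0, 1])

def score_pairs(sections, arm_a, arm_b):
    """Score every ordered pair (i, j): section i as arm A, section j as arm B.

    Implements p_ij = CI_i * l_i + CI_j * l_j, with l the section length
    in samples. Returns a dict mapping (i, j) to its score.
    """
    scores = {}
    for i, j in permutations(range(len(sections)), 2):
        ci_i = correlation_index(sections[i], arm_a)
        ci_j = correlation_index(sections[j], arm_b)
        scores[(i, j)] = ci_i * len(sections[i]) + ci_j * len(sections[j])
    return scores
```

Calling `max(scores, key=scores.get)` on the result gives the most plausible pair for that prototype; repeating the call for every prototype and comparing the best scores would then follow the selection described in the text.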
These prototypes have been previously normalised in length (Carothers and Piper 1994).

The efficiency of the proposed disentangling algorithm has been tested using 200 real overlapping pairs of chromosomes. Even though sections may not individually match any given prototype completely, 62% of overlaps are correctly resolved. This rate is similar to rates reported by banding-based methods, but it could be improved by using partial chromosome models (Charters and Graham 1999). Further tests including segment orientation information (Popescu et al. 1999) did not improve the results, because most errors are due to too large a non-studied area (segments a and b in Fig. 4e).

¹ http://worms.zoology.wiscedu/zooweb/Phelps/karyotypehtml