Astronomy & Astrophysics manuscript no. aa17904-11, © ESO 2011, December 20, 2011

A conjugate gradient algorithm for the astrometric core solution of Gaia

A. Bombrun¹, L. Lindegren², D. Hobbs², B. Holl², U. Lammers³, and U. Bastian¹

¹ Astronomisches Rechen-Institut, Zentrum für Astronomie der Universität Heidelberg, Mönchhofstr. 12–14, DE-69120 Heidelberg, Germany; e-mail: abombrun@ari.uni-heidelberg.de, bastian@ari.uni-heidelberg.de
² Lund Observatory, Lund University, Box 43, SE-22100 Lund, Sweden; e-mail: lennart@astro.lu.se, berry@astro.lu.se, david@astro.lu.se
³ European Space Agency (ESA), European Space Astronomy Centre (ESAC), P.O. Box (Apdo. de Correos) 78, ES-28691 Villanueva de la Cañada, Madrid, Spain; e-mail: Uwe.Lammers@sciops.esa.int

Received 17 August 2011 / Accepted 25 November 2011

ABSTRACT

Context. The ESA space astrometry mission Gaia, planned to be launched in 2013, has been designed to make angular measurements on a global scale with micro-arcsecond accuracy. A key component of the data processing for Gaia is the astrometric core solution, which must implement an efficient and accurate numerical algorithm to solve the resulting, extremely large least-squares problem. The Astrometric Global Iterative Solution (AGIS) is a framework in which a range of different iterative solution schemes suitable for a scanning astrometric satellite can be implemented.

Aims. Our aim is to find a computationally efficient and numerically accurate iteration scheme for the astrometric solution, compatible with the AGIS framework, and a convergence criterion for deciding when to stop the iterations.

Methods. We study an adaptation of the classical conjugate gradient (CG) algorithm, and compare it to the so-called simple iteration (SI) scheme that was previously known to converge for this problem, although very slowly. The different schemes are implemented within a software test bed for AGIS known as AGISLab. This makes it possible to define, simulate and study scaled astrometric core solutions with a much smaller number of unknowns than in AGIS, and therefore to perform a large number of numerical experiments in a reasonable time. After successful testing in AGISLab, the CG scheme has also been implemented in AGIS.

Results. The two algorithms CG and SI eventually converge to identical solutions, to within the numerical noise (of the order of 0.00001 micro-arcsec). These solutions are moreover independent of the starting values (initial star catalogue), and we conclude that they are equivalent to a rigorous least-squares estimation of the astrometric parameters. The CG scheme converges up to a factor four faster than SI in the tested cases, and in particular spatially correlated truncation errors are much more efficiently damped out with the CG scheme. While it appears to be difficult to define a strict and robust convergence criterion, we have found that the sizes of the updates, and possibly the correlations between the updates in successive iterations, provide useful clues.

Key words. Astrometry – Methods: data analysis – Methods: numerical – Space vehicles: instruments

1. Introduction

The European Space Agency's Gaia mission (Perryman et al. 2001; Lindegren et al. 2008; Lindegren 2010) is designed to measure the astrometric parameters (positions, proper motions and parallaxes) of around one billion objects, mainly stars belonging to the Milky Way Galaxy and the Local Group.
The scientific processing of the Gaia observations is a complex task that requires the collaboration of many scientists and engineers with a broad range of expertise, from software development to CCDs. A consortium of European research centres and universities, the Gaia Data Processing and Analysis Consortium (DPAC), was set up in 2005 with the goal of designing, implementing and operating this process (Mignard et al. 2008). In this paper we focus on a central component of the scheme, namely the astrometric core solution, which solves the corresponding least-squares problem within a software framework known as the Astrometric Global Iterative Solution, or AGIS (Lammers et al. 2009; Lindegren et al. 2011; O'Mullane et al. 2011).

In a single solution, the AGIS software will simultaneously calibrate the instrument, determine the three-dimensional orientation (attitude) of the instrument as a function of time, produce the catalogue of astrometric parameters of the stars, and link it to an adopted celestial reference frame. This computation is based on the results of a preceding treatment of the raw satellite data, basically giving the measured transit times of the stars in the instrument focal plane (Lindegren 2010). The astrometric core solution can be considered as a least-squares problem with negligible non-linearities except for the outlier treatment. Indeed, it should only take into account so-called primary sources, that is, stars and other point-like objects (such as quasars) that can be treated astrometrically as single stars to the required accuracy. The selection of the primary sources is a key component of the astrometric solution, since the more that are used, the better the instrument can be calibrated, the more accurately the attitude can be determined, and the better the final catalogue will be. This selection, and the identification of outliers among the individual observations, will be made recursively after reviewing the residuals of previous solutions (Lindegren et al. 2011). What remains is then, ideally, a 'clean' set of data referring to the observations of primary sources, from which the astrometric core solution will be computed by means of AGIS.

From current estimates, based on the known instrument capabilities and star counts from a Galaxy model, it is expected that at least 100 million primary sources will be used in AGIS. Nonetheless, the solution would be strengthened if even more primary sources could be used. Moreover, it should be remembered that AGIS will be run many times as part of a cyclic data reduction scheme, where the (provisional) output of AGIS is used to improve the raw data treatment (the Intermediate Data Update; see O'Mullane et al. 2009). Hence, it is important to ensure both that AGIS can be run very efficiently from a computational viewpoint, and that the end results are numerically accurate, i.e., very close to the true solution of the given least-squares problem.

Based on the generic principle of self-calibration, the attitude and calibration parameters are derived from the same set of observational data as the astrometric parameters. The resulting strong coupling between the different kinds of parameters makes a direct solution of the resulting equations extremely difficult, or even unfeasible by several orders of magnitude with current computing resources (Bombrun et al. 2010).
On the other hand, this coupling is well suited for a block-wise organization of the equations, where, for example, all the equations for a given source are grouped together and solved, assuming that the relevant attitude and calibration parameters are already known. The problem then is of course that, in order to compute the astrometric parameters of the sources to a given accuracy, one first needs to know the attitude and calibration parameters to corresponding accuracies; these in turn can only be computed once the source parameters have been obtained to sufficient accuracy; and so on. This organization of the computations therefore naturally leads to an iterative solution process. Indeed, in AGIS the astrometric solution is broken down into (at least) three distinct blocks, corresponding to the source, attitude and calibration parameter updates, and the software is designed to optimize data throughput within this general processing framework (Lammers et al. 2009). Cyclically computing and applying the updates in these blocks corresponds to the so-called simple iteration (SI) scheme (Sect. 2.1), which is known to converge, although very slowly.

However, it is possible to implement many other iterative algorithms within this same processing framework, and some of them may exhibit better convergence properties than the SI scheme. For example, it is possible to speed up the convergence if the updates indicated by the simple iterations are extrapolated by a certain factor. More sophisticated algorithms could be derived from various iterative solution methods described in the literature.

The purpose of this paper is to describe one specific such algorithm, namely the conjugate gradient (CG) algorithm with a Gauss–Seidel preconditioner, and to show how it can be implemented within the AGIS processing framework. We want to make it plausible that it indeed provides a rigorous solution to the given least-squares problem. Also, we will study its convergence properties in comparison to the SI scheme and, if possible, derive a convergence criterion for stopping the iterations.

Our focus is on the high-level adaptation of the CG algorithm to the present problem, i.e., how the results from the different updating blocks in AGIS can be combined to provide the desired speed-up of the convergence. To test this, and to verify that the algorithm provides the correct results, we need to conduct many numerical experiments, including the simulation of input data with well-defined statistical properties, and iterate the solutions to the full precision allowed by the computer arithmetic. On the other hand, since it is not our purpose to validate the detailed source, instrument and attitude models employed by the updating blocks, we can accept a number of simplifications in the modelling of the data, such that the experiments can be completed in a reasonable time. The main simplifications used in the present study are as follows:

1. For conciseness we limit the present study to the source and attitude parameters, whose mutual disentanglement is by far the most critical for a successful astrometric solution (cf. Bombrun et al. 2010). For the final data reduction many calibration parameters must also be included, as well as global parameters (such as the PPN parameter γ; Hobbs et al. 2010), and possibly correction terms to the barycentric velocity of Gaia derived from stellar aberration (Butkevich & Klioner 2008). These extensions, within the CG scheme, have been implemented in AGIS but are not considered here.
2. We use a scaled-down version of AGIS, known as AGISLab (Sect. 4.1), which makes it possible to generate input data and perform solutions with a much smaller number of primary sources than would be required for the (full-scale) AGIS system. This reduces computing time by a large factor, while retaining the strong mutual entanglement of the source and attitude parameters, which is the main reason why the astrometric solution is so difficult to compute.

3. The rotation of the satellite is assumed to follow the so-called nominal scanning law, which is an analytical prescription for the pointing of the Gaia telescopes as a function of time. That is, we ignore the small (< 1 arcmin) pointing errors that the real mission will have, as well as attitude irregularities, data gaps, etc. The advantage is that the attitude modelling becomes comparatively simple and can use a smaller set of attitude parameters, compatible with the scaled-down version of the solution.

4. The input data are 'clean' in the sense that there are no outliers, and the observation noise is unbiased with known standard deviation. This highly idealised condition is important in order to test that the solution itself does not introduce unwanted biases and other distortions of the results.

An iterative scheme should in each iteration compute a better approximation to the exact solution of the least-squares problem. In this paper we aim to demonstrate that the SI and CG schemes are converging in the sense that the errors, relative to an exact solution, vanish for a sufficient number of iterations. Since we work with simulated data, we have a reference point in the true values of the source parameters (positions, proper motions and parallaxes) used to generate the observations. We also aim to demonstrate that the CG method is an efficient scheme to solve the astrometric least-squares problem, i.e., that it leads, in a reasonable number of iterations, to an approximation that is sufficiently close to the exact solution. An important problem when using iterative solution methods is how to know when to stop, and we study some possible convergence criteria with the aim of reaching the maximum possible numerical accuracy.

The paper provides both a detailed presentation of the SI and CG algorithms at work in AGIS and a study of their numerical behaviour through the use of the AGISLab software (Holl et al. 2010). The paper is organized as follows: Section 2 gives a brief overview of iterative methods to solve a linear least-squares problem. Section 3 describes in detail the algorithms considered here, viz., the SI and CG with different preconditioners. In Sect. 4 we analyze the convergence of these algorithms and some properties of the solution itself. Then, Sect. 5 presents the implementation status of the CG scheme in AGIS before the main findings of the paper are summarized in the concluding Sect. 6.
2. Iterative solution methods

This section presents the mathematical basis of the simple iteration and conjugate gradient algorithms to solve the linear least-squares problem. For a more detailed description of these and other iterative solution methods we refer to Björck (1996) and van der Vorst (2003). A history of the conjugate gradient method can be found in Golub & O'Leary (1989).

Let $Mx = h$ be the overdetermined set of observation (design) equations, where $x$ is the vector of unknowns, $M$ the design matrix, and $h$ the right-hand side of the design equations. The unknowns are assumed to be (small) corrections to a fixed set of reference values for the source and attitude parameters. These reference values must be close enough to the exact solution that non-linearities in $x$ can be neglected; thus $x = 0$ is still within the linear regime. Moreover, we assume that the design equations have been multiplied by the square root of their respective weights, so that they can be treated by ordinary (unweighted) least-squares. That is, we seek the vector $x$ that minimizes the sum of the squares of the design equation residuals,

  $Q = \| h - Mx \|^2$ ,  (1)

where $\|\cdot\|$ is the Euclidean norm. It is well known (cf. Appendix A) that if $M$ has full rank, i.e., $\|Mx\| > 0$ for all $x \neq 0$, this problem has a unique solution that can be obtained by solving the normal equations

  $Nx = b$ ,  (2)

where $N = M'M$ is the normal matrix, $M'$ is the transpose of $M$, and $b = M'h$ the right-hand side of the normals. This solution is denoted $\hat{x} = N^{-1}b$. In the following, the number of unknowns is denoted $n$ and the number of observations $m \gg n$. Thus $M$, $x$ and $h$ have dimensions $m \times n$, $n$ and $m$, respectively, and $N$ and $b$ have dimensions $n \times n$ and $n$.

The aim of the iterative solution is to generate a sequence of approximate solutions $x_0, x_1, x_2, \ldots$, such that $\|\epsilon_k\| \to 0$ as $k \to \infty$, where $\epsilon_k = x_k - \hat{x}$ is the truncation error in iteration $k$. The design equation residual vector at this point is denoted $s_k = h - Mx_k$ (of dimension $m$), and the normal equation residual vector is denoted $r_k = b - Nx_k = -N\epsilon_k$ (of dimension $n$). The least-squares solution $\hat{x}$ corresponds to $\hat{r} = 0$. At this point we still have in general $\|\hat{s}\| > 0$, since the design equations are overdetermined. If $x^{\rm (true)}$ are the true parameter values, we denote by $e_k = x_k - x^{\rm (true)}$ the estimation errors in iteration $k$. After convergence we have in general $\|\hat{e}\| > 0$ due to the observation noise. The progress of the iterations may thus potentially be judged from several different sequences of vectors, e.g.:

– the design equation residuals $s_k$, whose norm should be minimized;
– the vanishing normal equation residuals $r_k$;
– the vanishing parameter updates $d_k = x_{k+1} - x_k$;
– the vanishing truncation errors $\epsilon_k$; and
– the estimation errors $e_k$, which will generally decrease but not vanish.

The last two items are of course not available in the real experiment, but it may be helpful to study them in simulation experiments. We return in Sect. 4.4 to the definition of a convergence criterion in terms of the first three sequences.

Given the design matrix $M$ and right-hand side $h$ (or alternatively the normals $N$, $b$), we use the term iteration scheme for any systematic procedure that generates successive approximations $x_k$ starting from the arbitrary initial point $x_0$ (which could be zero). The schemes are based on some judicious choice of a preconditioner matrix $K$ that in some sense approximates the normal matrix $N$ (Sect. 2.3). The preconditioner must be such that the associated system of linear equations, $Kx = y$, can be solved with relative ease for any $y$.

For the astrometric problem $N$ is actually rank-deficient with a well-defined null space (see Sect. 3.3), and we seek in principle the pseudo-inverse solution, $\hat{x} = N^{\dagger}b$, which is orthogonal to the null space. By subtracting from each update its projection onto the null space, through the mechanism described in Sect. 3.3, we ensure that the successive approximations remain orthogonal to the null space. In this case the circumstance that the problem is rank-deficient has no impact on the convergence properties (see Lindegren et al. 2011, for details).
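As a concrete illustration of Eqs. (1)–(2) and the residual vectors just defined, the following minimal numpy sketch sets up a small random full-rank least-squares problem; the dimensions and data are arbitrary toy values, not related to the AGIS implementation, and only the variable names follow the text.

```python
import numpy as np

rng = np.random.default_rng(42)
m, n = 200, 10                       # m observations, n unknowns, m >> n

M = rng.standard_normal((m, n))      # design matrix (full rank here)
h = rng.standard_normal(m)           # weighted right-hand side

# Normal equations (2): N x = b with N = M'M and b = M'h
N = M.T @ M
b = M.T @ h
x_hat = np.linalg.solve(N, b)        # least-squares estimate

# Residual vectors for an arbitrary approximation x_k (here x_0 = 0)
x_k = np.zeros(n)
s_k = h - M @ x_k                    # design-equation residuals, length m
r_k = b - N @ x_k                    # normal-equation residuals, length n

# At the solution the normal-equation residual vanishes to rounding
# noise, while the design-equation residual generally does not:
print(np.linalg.norm(b - N @ x_hat))   # ~ 1e-13
print(np.linalg.norm(h - M @ x_hat))   # > 0 (overdetermined system)
```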
2.1. The simple iteration (SI) scheme

Given $N$, $b$, $K$ and an initial point $x_0$, successive approximations may be computed as

  $x_{k+1} = x_k + K^{-1} r_k$ ,  (3)

which is referred to as the simple iteration (SI) scheme. Its convergence is not guaranteed unless the absolute values of the eigenvalues of the so-called iteration matrix $I - K^{-1}N$ are all strictly less than one, i.e., $|\lambda_{\max}| < 1$, where $\lambda_{\max}$ is the eigenvalue with the largest absolute value. In this case it can be shown that the ratio of the norms of successive updates asymptotically approaches $|\lambda_{\max}|$. Naturally, $|\lambda_{\max}|$ will depend on the choice of $K$. The closer it is to 1, the slower the SI scheme converges.

Depending on the choice of the preconditioner, the simple iteration scheme may represent some classical iterative solution method. For example, if $K$ is the diagonal of $N$ then the scheme is called the Jacobi method; if $K$ is the lower triangular part of $N$ then it is called the Gauss–Seidel method.
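As an illustration of Eq. (3), the following minimal numpy sketch runs the SI scheme on a dense toy problem, using the Gauss–Seidel preconditioner (the lower triangular part of $N$) named above; the problem size and iteration count are arbitrary. For a symmetric positive-definite $N$, the Gauss–Seidel choice always satisfies the eigenvalue condition.

```python
import numpy as np

def simple_iteration(N, b, K_solve, x0, n_iter):
    """SI scheme of Eq. (3): x_{k+1} = x_k + K^{-1}(b - N x_k).

    K_solve(y) must return the solution of K x = y for the
    chosen preconditioner K.
    """
    x = x0.copy()
    for _ in range(n_iter):
        x = x + K_solve(b - N @ x)   # preconditioned update
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 8))
N, b = A.T @ A, A.T @ rng.standard_normal(50)

# Gauss-Seidel preconditioner: K = lower triangular part of N
# (diagonal included); converges for symmetric positive-definite N.
K = np.tril(N)
x = simple_iteration(N, b, lambda y: np.linalg.solve(K, y),
                     np.zeros(8), 200)
print(np.linalg.norm(b - N @ x))     # shrinks toward rounding noise
```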
2.2. The conjugate gradient (CG) scheme

The normal matrix $N$ defines the metric of a scalar product in the space of unknowns $\mathbb{R}^n$. Two non-zero vectors $u, v \in \mathbb{R}^n$ are said to be conjugate in this metric if $u'Nv = 0$. It is possible to find $n$ non-zero vectors in $\mathbb{R}^n$ that are mutually conjugate. If $N$ is positive definite, these vectors constitute a basis for $\mathbb{R}^n$.

Let $\{p_0, \ldots, p_{n-1}\}$ be such a conjugate basis. The desired solution can be expanded in this basis as $\hat{x} = x_0 + \sum_{k=0}^{n-1} \alpha_k p_k$. Mathematically, the sequence of approximations generated by the CG scheme corresponds to the truncated expansion

  $x_k = x_0 + \sum_{\kappa=0}^{k-1} \alpha_\kappa p_\kappa$ ,  (4)

with residual vectors

  $r_k \equiv N(\hat{x} - x_k) = \sum_{\kappa=k}^{n-1} \alpha_\kappa N p_\kappa$ .  (5)

Since $x_n = \hat{x}$ it follows, in principle, that the CG converges to the exact solution in at most $n$ iterations. This is of little practical use, however, since $n$ is a very large number and rounding errors in any case will modify the sequence of approximations long before this theoretical point is reached. The practical importance of the CG algorithm instead lies in the remarkable circumstance that a very good approximation to the exact solution is usually reached for $k \ll n$.

From Eq. (5) it is readily seen that $r_k$ is orthogonal to each of the basis vectors $p_0, \ldots, p_{k-1}$, and that $\alpha_k = p_k' r_k / (p_k' N p_k)$. In the CG scheme a conjugate basis is built up, step by step, at the same time as successive approximations of the solution are computed. The first basis vector is taken to be $r_0$, the next one is the conjugate vector closest to the resulting $r_1$, and so on.

Using that $x_{k+1} = x_k + \alpha_k p_k$ from Eq. (4), we have $s_{k+1} = s_k - \alpha_k M p_k$, from which

  $\| s_{k+1} \|^2 = \| s_k \|^2 - \alpha_k^2 \, p_k' N p_k \leq \| s_k \|^2$ .  (6)

Each iteration of the CG algorithm is therefore expected to decrease the norm of the design equation residuals $\|s_k\|$. By contrast, although the norm of the normal equation residual $\|r_k\|$ vanishes for sufficiently large $k$, it does not necessarily decrease monotonically, and indeed can temporarily increase in some iterations.

Using the CG in combination with a preconditioner $K$ means that the above scheme is applied to the solution of the preconditioned normal equations

  $K^{-1} N x = K^{-1} b$ .  (7)

For non-singular $K$ the solution of this system is clearly the same as for the original normals in Eq. (2), i.e., $\hat{x}$. Using a preconditioner can significantly reduce the number of CG iterations needed to reach a good approximation of $\hat{x}$. In Sect. 3 and Appendix B we describe in more detail the proposed algorithm, based on van der Vorst (2003).
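For comparison, here is a textbook preconditioned CG iteration for the normal equations, in the spirit of Eqs. (4)–(7). Note that this generic form assumes a symmetric positive-definite preconditioner (such as the block Jacobi or symmetric block Gauss–Seidel preconditioners of Sect. 2.3); the adaptation to the AGIS kernels is developed in Sect. 3 and Appendix B, and the toy problem in the usage lines is arbitrary.

```python
import numpy as np

def pcg(N, b, K_solve, x0, tol=1e-12, max_iter=1000):
    """Preconditioned conjugate gradients for N x = b, with N
    symmetric positive definite; K_solve(r) applies K^{-1}."""
    x = x0.copy()
    r = b - N @ x                    # normal-equation residual r_0
    z = K_solve(r)                   # preconditioned residual
    p = z.copy()                     # first basis (search) vector
    rho = r @ z
    for _ in range(max_iter):
        Np = N @ p
        alpha = rho / (p @ Np)       # step length along p_k
        x = x + alpha * p
        r = r - alpha * Np
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = K_solve(r)
        rho_new = r @ z
        beta = rho_new / rho         # keeps p_{k+1} conjugate to p_k
        p = z + beta * p
        rho = rho_new
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 12))
N, b = A.T @ A, A.T @ rng.standard_normal(100)
x = pcg(N, b, lambda r: r / np.diag(N), np.zeros(12))  # Jacobi K
print(np.linalg.norm(b - N @ x))     # ~ numerical noise
```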
2.3. Some possible preconditioners

The convergence properties of an iterative scheme such as the CG strongly depend on the choice of preconditioner, which is therefore a critical step in the construction of the algorithm. The choice represents a compromise between the complexity of solving the linear system $Kx = y$ and the proximity of this system to the original one in Eq. (2). Considering the sparseness structure of $M'M$ there are some 'natural' choices for $K$. For the astrometric core solution with only source and attitude unknowns, the design equations for source $i = 1 \ldots p$ (where $p$ is the number of primary sources) can be summarized as

  $S_i x_{si} + A_i x_a = h_{si}$ ,  (8)

with $x_{si}$ and $x_a$ being the source and attitude parts of the unknown parameter vector $x$ (for details, see Bombrun et al. 2010). The normal equations (2) then take the form

  $\begin{bmatrix} S_1'S_1 & 0 & \cdots & 0 & S_1'A_1 \\ 0 & S_2'S_2 & \cdots & 0 & S_2'A_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & S_p'S_p & S_p'A_p \\ A_1'S_1 & A_2'S_2 & \cdots & A_p'S_p & \sum_i A_i'A_i \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_p \\ x_a \end{bmatrix} = \begin{bmatrix} S_1'h_{s1} \\ S_2'h_{s2} \\ \vdots \\ S_p'h_{sp} \\ \sum_i A_i'h_{si} \end{bmatrix}$ .  (9)

It is important to note that the matrices $N_{si} \equiv S_i'S_i$ are small (typically $5 \times 5$), and that the matrix $N_a \equiv \sum_i A_i'A_i$, albeit large, has a simple band-diagonal structure thanks to our choice of representing the attitude through short-ranged splines. Moreover, natural gaps in the observation sequence make it possible to break up this last matrix into smaller attitude segments (indexed $j$ in the following), resulting in a blockwise band-diagonal structure. The band-diagonal block associated with attitude segment $j$ is denoted $N_{aj}$; hence $N_a = \mathrm{diag}(N_{a1}, N_{a2}, \ldots)$.

Considering only the diagonal blocks in the normal matrix, we obtain the block Jacobi preconditioner,

  $K_1 = \begin{bmatrix} S_1'S_1 & 0 & \cdots & 0 & 0 \\ 0 & S_2'S_2 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & S_p'S_p & 0 \\ 0 & 0 & \cdots & 0 & \sum_i A_i'A_i \end{bmatrix}$ .  (10)

Since the diagonal blocks correspond to independent systems that can be solved very easily, it is clear that $K_1 x = y$ can readily be solved for any $y$.

Considering in addition the lower triangular blocks we obtain the block Gauss–Seidel preconditioner,

  $K_2 = \begin{bmatrix} S_1'S_1 & 0 & \cdots & 0 & 0 \\ 0 & S_2'S_2 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & S_p'S_p & 0 \\ A_1'S_1 & A_2'S_2 & \cdots & A_p'S_p & \sum_i A_i'A_i \end{bmatrix}$ .  (11)

Again, considering the simple structure of the diagonal blocks, it is clear that $K_2 x = y$ can be solved for any $y$ by first solving for each $x_{si}$, whereupon substitution into the last row of equations makes it possible to solve for $x_a$.

$K_2$ is non-symmetric, and it is conceivable that this property is unfavourable for the convergence of some problems. On the other hand, the symmetric $K_1$ completely ignores the off-diagonal blocks in $N$, which is clearly undesirable. The symmetric block Gauss–Seidel preconditioner

  $K_3 = K_2 K_1^{-1} K_2'$  (12)

makes use of the off-diagonal blocks while retaining symmetry. The corresponding equations $K_3 x = y$ can be solved as two successive triangular systems: first, $K_2 z = y$ is solved for $z$, then $K_1^{-1} K_2' x = z$ is solved for $x$ (see below). It thus comes with the penalty of requiring roughly twice as many arithmetic operations per iteration as the non-symmetric Gauss–Seidel preconditioner.

If the normal matrix in Eq. (9) is formally written as

  $N = \begin{bmatrix} N_s & L' \\ L & N_a \end{bmatrix}$ ,  (13)

where $L$ is the block-triangular matrix below the main diagonal, and $N_a = \sum_i A_i'A_i$, the preconditioners become

  $K_1 = \begin{bmatrix} N_s & 0 \\ 0 & N_a \end{bmatrix}$ ,  $K_2 = \begin{bmatrix} N_s & 0 \\ L & N_a \end{bmatrix}$ ,  $K_3 = \begin{bmatrix} N_s & L' \\ L & N_a + L N_s^{-1} L' \end{bmatrix}$ .  (14)

The second system to be solved for the symmetric block Gauss–Seidel preconditioner involves the matrix

  $K_1^{-1} K_2' = \begin{bmatrix} I & N_s^{-1} L' \\ 0 & I \end{bmatrix}$ ,  (15)

where $I$ is the identity matrix. This second step therefore does not affect the attitude part of the solution vector.
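To illustrate how the block systems of Eqs. (13)–(15) are solved, the following numpy sketch treats the source parameters as one dense block $N_s$ (rather than the many $5 \times 5$ blocks of Eq. (9)) and uses arbitrary toy dimensions; it checks the two-step $K_3$ solve against an explicitly assembled $K_3$.

```python
import numpy as np

rng = np.random.default_rng(2)
ns, na = 10, 6                        # toy source / attitude block sizes

# Assemble an SPD matrix with the two-by-two block structure of Eq. (13)
A = rng.standard_normal((ns + na, ns + na))
Nfull = A.T @ A + (ns + na) * np.eye(ns + na)
Ns, L, Na = Nfull[:ns, :ns], Nfull[ns:, :ns], Nfull[ns:, ns:]

y = rng.standard_normal(ns + na)
ys, ya = y[:ns], y[ns:]

# Block Gauss-Seidel solve, K2 x = y (Eq. 14): source block first,
# then substitute into the attitude block.
xs = np.linalg.solve(Ns, ys)
xa = np.linalg.solve(Na, ya - L @ xs)

# Symmetric block Gauss-Seidel, K3 = K2 K1^{-1} K2' (Eq. 12): step 1
# is the K2 solve above (z = [xs, xa]); step 2 solves K1^{-1} K2' x = z,
# which by Eq. (15) corrects the source part and leaves xa unchanged.
zs, za = xs, xa
x3 = np.concatenate([zs - np.linalg.solve(Ns, L.T @ za), za])

# Cross-check against a direct solve with an explicitly formed K3
K1 = np.block([[Ns, np.zeros((ns, na))], [np.zeros((na, ns)), Na]])
K2 = np.block([[Ns, np.zeros((ns, na))], [L, Na]])
K3 = K2 @ np.linalg.solve(K1, K2.T)
print(np.linalg.norm(x3 - np.linalg.solve(K3, y)))   # ~ 1e-14
```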
3. Algorithms

In this section we present in pseudo-code some algorithms that implement the astrometric core solution using SI or CG. They are described in some detail since, despite being derived from well-known classical methods, they have to operate within an existing framework (viz., AGIS) which handles the very large number of unknowns and observations in an efficient manner. Indeed, the numerical behaviour of an algorithm may depend significantly on implementation details such as the order of certain operations, even if they are mathematically equivalent.

In the following, we distinguish between the already introduced iteration schemes on one hand, and the kernels on the other. The kernels are designed to set up and solve the preconditioner equations, and therefore encapsulate the computationally complex matrix–vector operations of each iteration. By contrast, the iteration schemes typically involve only scalar and vector operations. The AGIS framework has been set up to perform (as one of its tasks) a particular type of kernel operation, and it has been demonstrated that this can be done efficiently for the full-size astrometric problem (Lammers et al. 2009). By formulating the CG algorithm in terms of identical or similar kernel operations, it is likely that it, too, can be efficiently implemented with only minor changes to the AGIS framework.

The complete solution algorithm is made up of a particular combination of kernel and iteration scheme. Each combination has its own convergence behaviour, and in Sect. 4 we examine some of them. Although we describe, and have in fact implemented, several different kernels, most of the subsequent studies focus on the Gauss–Seidel preconditioner, which turns out to be both simple and efficient.

In the astrometric least-squares problem, the design matrix $M$ and the right-hand side $h$ of the design equations depend on the current values of the source and attitude parameters (which together form the vector of unknowns $x$), on the partial derivatives of the observed quantities with respect to $x$, and on the formal standard error of each observation (which is used for the weight normalization). Each observation corresponds to a row of elements in $M$ and $h$. For practical reasons, these elements are not stored but recomputed as they are needed, and we may generally consider them to be functions of $x$. For a particular choice of preconditioner and a given $x$, the kernel computes the scalar $Q$ and the two vectors $r$ and $w$ given by

  $Q = \| h - Mx \|^2$ ,  $r = M'(h - Mx)$ ,  $w = K^{-1} r$ .  (16)

For brevity, this operation is written

  $(Q, r, w) \leftarrow \mathrm{kernel}(x)$ .  (17)

For given $x$, the vector $r$ is thus the right-hand side of the normal equations and $w$ is the update suggested by the preconditioner, cf. Eq. (3). $Q = \|s\|^2$, the sum of the squares of the design equation residuals, is the $\chi^2$-type quantity to be minimized by the least-squares solution; it is needed for monitoring purposes (Sect. 4.4) and should be calculated in the kernel, as this requires access to the individual observations. It can be noted that $K$ also depends on $x$, although in the linear regime (which we assume) this dependence is negligible.
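In a dense toy setting, the kernel operation of Eqs. (16)–(17) reduces to a few matrix–vector products. The following sketch is only a functional stand-in for the interface: here $M$ and $h$ are fixed arrays, whereas in AGIS their rows are recomputed observation by observation rather than stored.

```python
import numpy as np

def kernel(x, M, h, K_solve):
    """Dense stand-in for Eq. (17): (Q, r, w) <- kernel(x).

    K_solve applies K^{-1} for the chosen preconditioner.
    """
    s = h - M @ x          # design-equation residual vector
    Q = float(s @ s)       # chi-square-type monitoring quantity, Eq. (16)
    r = M.T @ s            # right-hand side of the normal equations
    w = K_solve(r)         # update suggested by the preconditioner
    return Q, r, w
```

With this interface, one step of the SI scheme of Eq. (3) is simply $x \leftarrow x + w$, and the CG scheme of Sect. 2.2 can likewise be driven entirely through repeated kernel calls.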
For square, non-singular  B  the process of solving thesystem  Bx  =  b  is written in pseudo-code as  x  ←  solve([  B  |  b ]).AkeypartoftheAGISframeworkistheabilitytotakealltheobservations belonging to a given set of sources and e ffi cientlycalculatethecorrespondingdesignequations(8).Foreachobser-vation  l  of source  i , the corresponding row of the design equa-tions can be is written S l  x si  +  A l  x aj  =  h l  ,  (18)where  j  is the attitude segment to which the observation belongs, S l  and  A l  contain the matrix elements associated with the sourceand attitude unknowns  x si  and  x aj , respectively. 1 In practice, theright-hand side  h l  for observation  l  is not a fixed number, but isdynamically computed for current parameter values as the dif-ference between the observed and calculated quantity, dividedby its formal standard error. This means that  h l  takes the placeof the design equation residual  s l , and that the resulting  x  mustbe interpreted as a correction to the current parameter values. InAlgorithms 1–3 this complex set of operations is captured by the pseudo-code statement ‘calculate  S l ,  A l ,  h l ’.In the block Jacobi kernel (Algorithm 1), [  N  si  |  r si ]  ≡ [ S i  S i  |  S i   h i ] are the systems obtained by disregarding the o ff  -diagonal blocks in the upper part of Eq. (9). Similarly [  N  aj  |  r aj ],for the di ff  erent attitude segments  j , together make up the band-diagonal system [  i  A i   A i  |  i  A i   h i ] in the last row of Eq. (9).The kernel scheme for the block Gauss–Seidel precondi-tioner (Algorithm 2) di ff  ers from the above mainly in that theright-hand sides of the observation equations (  h l ) are modified(in line 11) to take into account the change in the source pa-rameters, before the normal equations for the attitude segmentsare accumulated. However, since the kernel must also return the 1 The observations are normally one-dimensional, in which case  S l and  A l  consist of a single row, and the right-hand side  h l  is a scalar.5