A Numerical Aggregation Algorithm for the Enzyme-Catalyzed Substrate Conversion

Hauke Busch¹, Werner Sandmann², and Verena Wolf³

¹ German Cancer Research Center, D-69120 Heidelberg, Germany, h.busch@dkfz.de
² University of Bamberg, D-96045 Bamberg, Germany, werner.sandmann@wiai.uni-bamberg.de
³ University of Mannheim, D-68131 Mannheim, Germany, wolf@informatik.uni-mannheim.de

Abstract. Computational models of biochemical systems are usually very large, and moreover, if reaction frequencies of different reaction types differ in orders of magnitude, models possess the mathematical property of stiffness, which renders system analysis difficult and often even impossible with traditional methods. Recently, an accelerated stochastic simulation technique based on a system partitioning, the slow-scale stochastic simulation algorithm, has been applied to the enzyme-catalyzed substrate conversion to circumvent the inefficiency of standard stochastic simulation in the presence of stiffness. We propose a numerical algorithm based on a similar partitioning but without resorting to simulation. The algorithm exploits the connection to continuous-time Markov chains and decomposes the overall problem to significantly smaller subproblems that become tractable. Numerical results show enormous efficiency improvements relative to accelerated stochastic simulation.

Keywords: Biochemical Reactions, Stochastic Model, Markov Chain, Aggregation.

1 Introduction

The complexity of living systems has led to a rapidly increasing interest in modeling and analysis of biochemically reacting systems. Different types of computational mathematical models exist, where quantitative and temporal relationships are often given in terms of rates and the specific meaning of these rates depends on the chosen model type. Of course, the different model types are intimately related since they represent the same type of system.
A comprehensive treatment of computational models can be found in [2].

Models, not only in the context of biochemical systems, are distinguished in terms of their states and state changes (transitions), where a state consists of a collection of variables that sufficiently well represents the relevant¹ parameters of the original system at any time. The set of all states, also referred to as the state space, may be either discrete, meaning only a countable number of states that can be mapped to a subset of the natural numbers $\mathbb{N}$, or the state space may be continuous. In both discrete and continuous state space models the state transitions may occur deterministically or stochastically.

¹ Any model is a simplified abstraction of the real system and both suitability of a model and the relevant parameters depend on the scope of the study.

C. Priami (Ed.): CMSB 2006, LNBI 4210, pp. 298–311, 2006. © Springer-Verlag Berlin Heidelberg 2006

For a long time the model type of choice for biochemically reacting systems was a deterministic model with continuous state space, based on the law of mass action and expressed in terms of the chemical rate equations leading to a system of nonlinear ordinary differential equations (ODE) that often turns out to be difficult to solve. The stochastic approach, motivated by the observation that biochemical reactions occur randomly, leads to a system of partial differential equations, the chemical master equation (CME).

Since direct solution of the CME is often analytically intractable, stochastic simulation is in widespread use to analyze biochemically reacting systems. In particular, Gillespie's stochastic simulation algorithm [12,13] and its enhancements [11,8] that are slightly modified implementations are very popular.
The algorithm basically consists of generating exponentially distributed times between successive reactions and drawing uniformly distributed numbers from the unit interval, the latter to decide which type of reaction occurs next. In that way the temporal evolution of the system is imitated by simulating an associated discrete-state Markov process or, in other words, an associated continuous-time Markov chain [3,10,14]. Stochastic simulation of continuous-time Markov chains has been well known since the early 1960s at the latest, as indicated by [10,17] and the references therein.

Although the CME arises from a stochastic model there is no need to apply stochastic solution methods. In particular there is a significant difference between a stochastic model and a stochastic simulation, although in the systems biology literature "the stochastic approach" and "the stochastic simulation algorithm" are often taken as the same thing. To open access to a wider range of analysis methodologies, it is important and useful to realize and exploit the link between biochemical reactions and Markov processes. Computational probability [16] has spent much effort to solve Markov processes analytically/numerically without resorting to stochastic simulation. In particular in computer systems performance analysis Markov chains with extremely large state spaces arise very often, and "numerical solution of Markov chains" is a vital research area [23,24].

A major drawback of stochastic simulation is the random nature of simulation results. Despite the fact that Gillespie's algorithm is termed exact, a stochastic simulation can never be exact. Mathematically, it constitutes a statistical estimation procedure, implying that the results are subject to statistical uncertainty, and in order to draw meaningful conclusions it is necessary to make statistically valid statements on the results.
The exactness of Gillespie's algorithm is only "in the sense that it takes full account of the fluctuations and correlations" [13] of reactions within a single simulation run. It is common sense in stochastic simulation theory and practice [20] that one should never rely on a single simulation run and Gillespie mentioned that it is "necessary to make several simulation runs from time 0 to the chosen time t, all identical with each other except for the initialization of the random number generator". In fact the reliability of simulation results strongly depends on a sufficiently large number of simulation runs, and a proper determination of that number has to be carefully done in terms of mathematical statistics (cf. [17,20]).

Furthermore, stochastic simulation is inherently costly. In many cases even a single simulation run is extremely computer time demanding and thus reducing the space complexity compared to numerical methods has to be paid by a significant increase of time complexity. Therefore, often approximations, as for example the explicit τ-leaping method [15], are required to achieve simulation speed up. As an immediate consequence even the exactness in the sense stated above gets lost. Serious difficulties arise, both for deterministic and stochastic models, in the presence of multiple time scales or stiffness. Several approximate stochastic simulation algorithms such as the implicit τ-leap method [22], the slow-scale stochastic simulation algorithm [5] and the multiscale stochastic simulation algorithm [6] have been proposed to deal with these specific problems. As a representative stiff reaction set, we consider the enzyme-catalyzed substrate conversion

\[ S_1 + S_2 \underset{c_2}{\overset{c_1}{\rightleftharpoons}} S_3 \overset{c_3}{\longrightarrow} S_1 + S_4 \qquad (1) \]

of a substrate $S_2$ into a product $S_4$ via an enzyme-substrate complex $S_3$, catalyzed (accelerated) by an enzyme $S_1$.
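The simulation procedure described above (an exponentially distributed time to the next reaction, then a uniform draw to select the reaction type) can be sketched for reaction set (1) as follows. This is only an illustrative sketch of the standard direct method, not the slow-scale or any other accelerated algorithm; the function names and the rate constants in the usage note are hypothetical, and the propensities are the mass-action forms $c_1 x_1 x_2$, $c_2 x_3$, $c_3 x_3$ given in Section 2.1.

```python
import random

# State change caused by each reaction type on (x1, x2, x3, x4).
STOICHIOMETRY = (
    (-1, -1, +1, 0),   # R1: S1 + S2 -> S3
    (+1, +1, -1, 0),   # R2: S3 -> S1 + S2
    (+1, 0, -1, +1),   # R3: S3 -> S1 + S4
)

def propensities(x, c):
    """Transition rates of R1, R2, R3 in state x = (x1, x2, x3, x4)."""
    x1, x2, x3, _ = x
    c1, c2, c3 = c
    return (c1 * x1 * x2, c2 * x3, c3 * x3)

def simulate(x0, c, t_end, rng):
    """One simulation run; returns the state occupied at time t_end."""
    x, t = tuple(x0), 0.0
    while True:
        rates = propensities(x, c)
        total = sum(rates)
        if total == 0.0:                 # no reaction can occur any more
            return x
        t += rng.expovariate(total)      # exponential time to next reaction
        if t > t_end:
            return x
        u = rng.random() * total         # uniform draw selects the reaction
        chosen = None
        for j, r in enumerate(rates):
            if r > 0.0:
                chosen = j               # remember the last feasible reaction
                if u < r:
                    break
                u -= r
        x = tuple(a + b for a, b in zip(x, STOICHIOMETRY[chosen]))
```

A call such as `simulate((5, 20, 0, 0), (1.0, 1.0, 0.1), 10.0, random.Random(1))` yields one sample of the state at time 10. Any estimate of a probability or a mean would have to be built from many such runs with different seeds.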
Stiffness and different time scales arise if the reversible reaction is much faster than the irreversible one. This is expressed by the condition $c_2 \gg c_3$ on the stochastic reaction rate constants (see 2.1 for details). Approximate stochastic simulation algorithms for (1) have been recently proposed in [21] and [7]. Both approaches are closely related in that they are based on the idea of partitioning the system and solving subproblems by different simulation techniques. A similar idea also appeared in [18]. Techniques based on partitioning the system are often also referred to as aggregation techniques.

As outlined above a clear disadvantage of stochastic simulation compared to numerical analysis, provided that such an analysis would be possible, is the random nature of simulation results. Thus, we argue that if a problem may be tackled both by stochastic simulation and by numerical analysis, the latter should be preferred. We propose an aggregation technique based on a partitioning similar to that in [21] and [7] but without resorting to simulation. Instead, in our method all resulting subproblems become tractable and are solved numerically. Since the simulation methods mentioned above are approximations, they obviously have two sources of inaccuracy, the approximation error due to the partitioning and the inherent statistical uncertainty of stochastic simulation, whereas our method only has the approximation error.

The basic ingredients of our method are the continuous-time Markov chain interpretation as an abstraction from the original system under consideration and the specific aggregation of states and transitions. We thereby revisit ideas from the analysis of fault-tolerant computer systems [1] and we appropriately modify these ideas according to our requirements. The remainder of this paper is organized as follows. In section 2 we formally describe our model and discuss solution approaches.
The numerical aggregation algorithm (NAA) is derived in section 3, and its accuracy and efficiency are demonstrated in section 4. Finally, section 5 concludes the paper and gives directions of further research.

2 Mathematical Model

The general stochastic framework for biochemical systems leading to the CME has been well known for a long time (cf. [12,13]). Here, we first establish and elucidate the intimate connection to continuous-time Markov chains (CTMC) and we introduce our notations thereby focusing on system (1). Then a brief exposition of numerical solution methods and the arising problems is given with a particular emphasis on large and stiff systems.

2.1 Biochemical Reactions and Markov Chains

Let $X(t) = (X_1(t), X_2(t), X_3(t), X_4(t))$ be a vector such that $X_i(t)$, $i \in \{1,2,3,4\}$, is a discrete random variable describing the number of molecules of species $S_i$ at time instant $t$. If $X(t) = x := (x_1, x_2, x_3, x_4) \in \mathbb{N}^4$, the system is in state $x$ at time $t$, meaning that for each $S_i$ the current number of molecules is $x_i$. Assume that initially the number of enzyme molecules is $x_1^{(0)}$ and for the substrate it is $x_2^{(0)}$, whereas no molecules of the enzyme-substrate complex or the product are present. For all possible states of the system: $x_1 + x_3 = x_1^{(0)}$ and $x_2 + x_4 = x_2^{(0)}$. Hence, the maximum numbers of molecules of $S_1$ and $S_3$ are $x_1^{(0)}$ and for $S_2$ and $S_4$ they are $x_2^{(0)}$. This implies a state space of size $n = (x_1^{(0)} + 1) \cdot (x_2^{(0)} + 1)$. Note that $n$ is usually very large. For example, if $x_1^{(0)} = 200$ and $x_2^{(0)} = 3000$, it follows $n = 201 \cdot 3001 \approx 6 \cdot 10^5$. Here, the state space grows exponentially in the number of involved species. Moreover, the number of molecules of a species can be very large. Let $S := \{ (x_1, \ldots, x_4) : x_1 + x_3 = x_1^{(0)} \wedge x_2 + x_4 = x_2^{(0)} \}$ be the state space of $X(t)$.
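The size formula and the set $S$ as defined above can be checked with a few lines of code. The sketch below also shows one possible numbering of $S$ by natural numbers, a row-major order over $(x_1, x_2)$; this particular numbering and all function names are assumptions made for illustration and are not taken from the paper.

```python
def state_space_size(e0, s0):
    """n = (x1^(0) + 1) * (x2^(0) + 1) for e0 enzyme and s0 substrate molecules."""
    return (e0 + 1) * (s0 + 1)

def state_index(x, s0):
    """Map a state (x1, x2, x3, x4) of S to an index in {0, ..., n - 1}."""
    x1, x2, _, _ = x
    return x1 * (s0 + 1) + x2

def state_from_index(k, e0, s0):
    """Inverse of state_index: x3 and x4 follow from the two constraints on S."""
    x1, x2 = divmod(k, s0 + 1)
    return (x1, x2, e0 - x1, s0 - x2)

# The example from the text: 200 enzyme and 3000 substrate molecules.
n = state_space_size(200, 3000)   # 201 * 3001 = 603201
```

Storing a full $n \times n$ matrix over such an index set is already impractical for this example, which is why sparse representations and state space reductions matter in what follows.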
System (1) consists of the three biochemical reactions

\[ R_1 : S_1 + S_2 \xrightarrow{c_1} S_3, \qquad R_2 : S_3 \xrightarrow{c_2} S_1 + S_2, \qquad R_3 : S_3 \xrightarrow{c_3} S_1 + S_4, \]

where the stochastic interpretation is that the reaction rates (also called transition rates) are proportional to the number of participating molecules and to the stochastic reaction rate constants $c_j$, $j \in \{1,2,3\}$. For details and a rigorous formal justification see [12,14]. The propensity function that gives the transition rates of the reactions $R_j$ is defined by

\[ \lambda_1(x) = c_1 x_1 x_2, \qquad \lambda_2(x) = c_2 x_3, \qquad \lambda_3(x) = c_3 x_3. \]

Note that the $c_j$ do not depend on the specific time $t$. The next state of the system only depends on $x$ and the reaction type. If there exists a transition from state $x$ to state $x'$ with transition rate $q(x,x') \in \{ \lambda_1(x), \lambda_2(x), \lambda_3(x) \}$ then we write $x \xrightarrow{q(x,x')} x'$. More precisely, for $x = (x_1, x_2, x_3, x_4)$

\[ R_1 : x \xrightarrow{c_1 x_1 x_2} (x_1 - 1, x_2 - 1, x_3 + 1, x_4), \quad \text{if } x_1, x_2 > 0 \text{ and } x_3 < x_1^{(0)}, \]
\[ R_2 : x \xrightarrow{c_2 x_3} (x_1 + 1, x_2 + 1, x_3 - 1, x_4), \quad \text{if } x_1 < x_1^{(0)}, \ x_2 < x_2^{(0)} \text{ and } x_3 > 0, \]
\[ R_3 : x \xrightarrow{c_3 x_3} (x_1 + 1, x_2, x_3 - 1, x_4 + 1), \quad \text{if } x_1 < x_1^{(0)}, \ x_4 < x_2^{(0)} \text{ and } x_3 > 0. \]

The probability of leaving $x$ within a small time interval of length $\Delta t$ via a reaction of type $R_j$ is given by $\lambda_j(x) \Delta t$. Correspondingly, the probability of staying in $x$ within this interval is given by $1 - \Lambda(x) \Delta t$ where $\Lambda(x) := \lambda_1(x) + \lambda_2(x) + \lambda_3(x)$ equals the sum of all outgoing rates of $x$ and is called the exit rate. If $\Lambda(x)$ is small, state $x$ is slow whereas otherwise $x$ is a fast state, which is due to the fact that $1/\Lambda(x)$ is the mean sojourn time in $x$. Let $p_t(x)$ be the probability that $X(t) = x$. Then

\[ p_{t+\Delta t}(x) = (1 - \Lambda(x) \Delta t) \cdot p_t(x) + \sum_{x' :\, x' \neq x,\; x' \xrightarrow{q(x',x)} x} q(x',x) \, \Delta t \cdot p_t(x'). \]
This leads to the differential equations

\[ \dot{p}_t = \frac{d}{dt} p_t = \lim_{\Delta t \to 0} \frac{p_{t+\Delta t} - p_t}{\Delta t} = Q p_t \]

where $p_t \in \mathbb{R}^n_{\geq 0}$ is the vector with entries $p_t(x)$ and $Q \in \mathbb{R}^{n \times n}$ is defined by²

\[ Q(x,x') := \begin{cases} -\Lambda(x), & \text{if } x = x', \\ q(x,x'), & \text{if } x \xrightarrow{q(x,x')} x', \\ 0, & \text{otherwise.} \end{cases} \]

Process $X(t)$ is called a (homogeneous) continuous-time Markov chain (or CTMC for short). A CTMC is uniquely described by the (infinitesimal) generator matrix $Q$ and an initial distribution (cf. [3,10]). In general the stochastic interpretation of chemical equations in the style of (1) always yields a CTMC as indicated in [12,13].

2.2 Numerical Solution of Markov Chains

Stochastic systems in general, and in particular Markov chains, are analyzed with respect to their temporal evolution where one distinguishes transient and steady-state analysis. The latter refers to systems in equilibrium whereas the former refers to the phase where an equilibrium has not yet been reached. A large amount of work exists on the numerical solution of Markov chains [24], where numerical solution means to compute probability distributions, either time-dependent transient distributions or steady-state distributions.

² We assume that the state space is mapped to $\mathbb{N}$.
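To make the construction of Section 2.1 concrete, the following sketch builds the generator entries $Q(x,x')$ for system (1) and iterates the discrete-time relation for $p_{t+\Delta t}(x)$ with a fixed step. It is a naive illustration under stated assumptions, not the numerical aggregation algorithm: states are discovered by following transitions from the initial state, the guards only require that the reactants of a reaction are present, $Q$ is stored as a dictionary of sparse rows, and the step must satisfy $\Lambda(x)\,\Delta t \le 1$ for every state. All names are hypothetical.

```python
from collections import deque

def transitions(x, c):
    """Yield (rate, successor) pairs for reactions R1, R2, R3 in state x."""
    x1, x2, x3, x4 = x
    c1, c2, c3 = c
    if x1 > 0 and x2 > 0:
        yield c1 * x1 * x2, (x1 - 1, x2 - 1, x3 + 1, x4)   # R1
    if x3 > 0:
        yield c2 * x3, (x1 + 1, x2 + 1, x3 - 1, x4)        # R2
        yield c3 * x3, (x1 + 1, x2, x3 - 1, x4 + 1)        # R3

def build_generator(x0, c):
    """Explore states from x0; return (states, Q) with Q[i][j] = Q(x_i, x_j)."""
    index, states, Q = {x0: 0}, [x0], {}
    queue = deque([x0])
    while queue:
        x = queue.popleft()
        i = index[x]
        row = Q.setdefault(i, {})
        exit_rate = 0.0
        for rate, y in transitions(x, c):
            if y not in index:            # number states in order of discovery
                index[y] = len(states)
                states.append(y)
                queue.append(y)
            row[index[y]] = rate
            exit_rate += rate
        row[i] = -exit_rate               # diagonal entry is -Lambda(x)
    return states, Q

def transient(Q, n, t_end, dt):
    """Step p(x) <- (1 - Lambda(x) dt) p(x) + sum over x' of q(x', x) dt p(x')."""
    p = [0.0] * n
    p[0] = 1.0                            # all mass on the initial state
    for _ in range(int(round(t_end / dt))):
        new = p[:]
        for i, row in Q.items():
            for j, q in row.items():
                new[j] += q * dt * p[i]
        p = new
    return p
```

For stiff parameter sets the fast states force a very small step, so this direct iteration becomes expensive quickly. That cost is the kind of difficulty the methods discussed in this section are meant to address.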