A Numerical Aggregation Algorithm for the Enzyme-Catalyzed Substrate Conversion

Hauke Busch¹, Werner Sandmann², and Verena Wolf³

¹ German Cancer Research Center, D-69120 Heidelberg, Germany, h.busch@dkfz.de
² University of Bamberg, D-96045 Bamberg, Germany, werner.sandmann@wiai.uni-bamberg.de
³ University of Mannheim, D-68131 Mannheim, Germany, wolf@informatik.uni-mannheim.de
Abstract. Computational models of biochemical systems are usually very large, and moreover, if reaction frequencies of different reaction types differ by orders of magnitude, models possess the mathematical property of stiffness, which renders system analysis difficult and often even impossible with traditional methods. Recently, an accelerated stochastic simulation technique based on a system partitioning, the slow-scale stochastic simulation algorithm, has been applied to the enzyme-catalyzed substrate conversion to circumvent the inefficiency of standard stochastic simulation in the presence of stiffness. We propose a numerical algorithm based on a similar partitioning but without resorting to simulation. The algorithm exploits the connection to continuous-time Markov chains and decomposes the overall problem into significantly smaller subproblems that become tractable. Numerical results show enormous efficiency improvements relative to accelerated stochastic simulation.

Keywords: Biochemical Reactions, Stochastic Model, Markov Chain, Aggregation.
1 Introduction
The complexity of living systems has led to a rapidly increasing interest in modeling and analysis of biochemically reacting systems. Different types of computational mathematical models exist, where quantitative and temporal relationships are often given in terms of rates, and the specific meaning of these rates depends on the chosen model type. Of course, the different model types are intimately related since they represent the same type of system. A comprehensive treatment of computational models can be found in [2].

Models, not only in the context of biochemical systems, are distinguished in terms of their states and state changes (transitions), where a state consists of a collection of variables that sufficiently well represents the relevant¹ parameters of the original system at any time. The set of all states, also referred to as the state space, may be either discrete, meaning only a countable number of states that can be mapped to a subset of the natural numbers $\mathbb{N}$, or the state space may be continuous. In both discrete and continuous state space models the state transitions may occur deterministically or stochastically.

¹ Any model is a simplified abstraction of the real system and both suitability of a model and the relevant parameters depend on the scope of the study.

C. Priami (Ed.): CMSB 2006, LNBI 4210, pp. 298–311, 2006. © Springer-Verlag Berlin Heidelberg 2006

For a long time the model type of choice for biochemically reacting systems was a deterministic model with continuous state space, based on the law of mass action and expressed in terms of the chemical rate equations, leading to a system of nonlinear ordinary differential equations (ODE) that often turns out to be difficult to solve. The stochastic approach, motivated by the observation that biochemical reactions occur randomly, leads to a system of partial differential equations, the chemical master equation (CME).

Since direct solution of the CME is often analytically intractable, stochastic simulation is in widespread use to analyze biochemically reacting systems. In particular, Gillespie's stochastic simulation algorithm [12,13] and its enhancements [11,8] that are slightly modified implementations are very popular. The algorithm basically consists of generating exponentially distributed times between successive reactions and drawing uniformly distributed numbers from the unit interval, the latter to decide which type of reaction occurs next. In that way the temporal evolution of the system is imitated by simulating an associated discrete-state Markov process or, in other words, an associated continuous-time Markov chain [3,10,14]. Stochastic simulation of continuous-time Markov chains is well known at the latest since the early 1960s as indicated by [10,17] and the references therein.

Although the CME arises from a stochastic model there is no need to apply stochastic solution methods. In particular there is a significant difference between a stochastic model and a stochastic simulation, although in the systems biology literature "the stochastic approach" and "the stochastic simulation algorithm" are often taken as the same thing. To open access to a wider range of analysis methodologies, it is important and useful to realize and exploit the link between biochemical reactions and Markov processes. Computational probability [16] has spent much effort to solve Markov processes analytically/numerically without resorting to stochastic simulation. In particular in computer systems performance analysis Markov chains with extremely large state spaces arise very often, and "numerical solution of Markov chains" is a vital research area [23,24].

A major drawback of stochastic simulation is the random nature of simulation results. Despite the fact that Gillespie's algorithm is termed exact, a stochastic simulation can never be exact.
Mathematically, it constitutes a statistical estimation procedure implying that the results are subject to statistical uncertainty and in order to draw meaningful conclusions it is necessary to make statistically valid statements on the results. The exactness of Gillespie's algorithm is only "in the sense that it takes full account of the fluctuations and correlations" [13] of reactions within a single simulation run. It is common sense in stochastic simulation theory and practice [20] that one should never rely on a single simulation run and Gillespie mentioned that it is "necessary to make several simulation runs from time 0 to the chosen time $t$, all identical with each other except for the initialization of the random number generator". In fact the reliability of simulation results strongly depends on a sufficiently large number of simulation runs, and a proper determination of that number has to be carefully done in terms of mathematical statistics (cf. [17,20]).

Furthermore, stochastic simulation is inherently costly. In many cases even a single simulation run is extremely computer time demanding and thus reducing the space complexity compared to numerical methods has to be paid by a significant increase of time complexity. Therefore, often approximations, as for example the explicit $\tau$-leaping method [15], are required to achieve simulation speed up. As an immediate consequence even the exactness in the sense stated above gets lost. Serious difficulties arise, both for deterministic and stochastic models, in the presence of multiple time scales or stiffness. Several approximate stochastic simulation algorithms such as the implicit $\tau$-leap method [22], the slow-scale stochastic simulation algorithm [5] and the multiscale stochastic simulation algorithm [6] have been proposed to deal with these specific problems. As a representative stiff reaction set, we consider the enzyme-catalyzed substrate conversion
$$S_1 + S_2 \underset{c_2}{\overset{c_1}{\rightleftharpoons}} S_3 \overset{c_3}{\longrightarrow} S_1 + S_4 \qquad (1)$$

of a substrate $S_2$ into a product $S_4$ via an enzyme-substrate complex $S_3$, catalyzed (accelerated) by an enzyme $S_1$. Stiffness and different time scales arise if the reversible reaction is much faster than the irreversible one. This is expressed by the condition $c_2 \gg c_3$ on the stochastic reaction rate constants (see 2.1 for details). Approximate stochastic simulation algorithms for (1) have been
recently proposed in [21] and [7]. Both approaches are closely related in that they are based on the idea of partitioning the system and solving subproblems by different simulation techniques. A similar idea also appeared in [18]. Techniques based on partitioning the system are often also referred to as aggregation techniques.

As outlined above a clear disadvantage of stochastic simulation compared to numerical analysis, provided that such an analysis would be possible, is the random nature of simulation results. Thus, we argue that if a problem may be tackled both by stochastic simulation and by numerical analysis, the latter should be preferred. We propose an aggregation technique based on a partitioning similar to that in [21] and [7] but without resorting to simulation. Instead, in our method all resulting subproblems become tractable and are solved numerically. Since the simulation methods mentioned above are approximations, they obviously have two sources of inaccuracy, the approximation error due to the partitioning and the inherent statistical uncertainty of stochastic simulation, whereas our method only has the approximation error.

The basic ingredients of our method are the continuous-time Markov chain interpretation as an abstraction from the original system under consideration and the specific aggregation of states and transitions. We thereby revisit ideas from the analysis of fault-tolerant computer systems [1] and we appropriately modify these ideas according to our requirements. The remainder of this paper is organized as follows. In section 2 we formally describe our model and discuss solution approaches. The numerical aggregation algorithm (NAA) is derived in section 3, and its accuracy and efficiency are demonstrated in section 4. Finally, section 5 concludes the paper and gives directions of further research.
2 Mathematical Model
The general stochastic framework for biochemical systems leading to the CME has been well known for a long time (cf. [12,13]). Here, we first establish and elucidate the intimate connection to continuous-time Markov chains (CTMC) and we introduce our notations thereby focusing on system (1). Then a brief exposition of numerical solution methods and the arising problems is given with a particular emphasis on large and stiff systems.
2.1 Biochemical Reactions and Markov Chains
Let $X(t) = (X_1(t), X_2(t), X_3(t), X_4(t))$ be a vector such that $X_i(t)$, $i \in \{1,2,3,4\}$, is a discrete random variable describing the number of molecules of species $S_i$ at time instant $t$. If $X(t) = x := (x_1, x_2, x_3, x_4) \in \mathbb{N}^4$, the system is in state $x$ at time $t$, meaning that for each $S_i$ the current number of molecules is $x_i$. Assume that initially the number of enzyme molecules is $x_1^{(0)}$ and for the substrate it is $x_2^{(0)}$, whereas no molecules of the enzyme-substrate complex or the product are present. Since each complex molecule binds one enzyme and one substrate molecule, for all possible states of the system: $x_1 + x_3 = x_1^{(0)}$ and $x_2 + x_3 + x_4 = x_2^{(0)}$. Hence, the maximum numbers of molecules of $S_1$ and $S_3$ are $x_1^{(0)}$ and for $S_2$ and $S_4$ they are $x_2^{(0)}$. This implies a state space of size $n \le (x_1^{(0)} + 1) \cdot (x_2^{(0)} + 1)$. Note that $n$ is usually very large. For example, if $x_1^{(0)} = 200$ and $x_2^{(0)} = 3000$, it follows $n \le 201 \cdot 3001 \approx 6 \cdot 10^5$. Here, the state space grows exponentially in the number of involved species. Moreover, the number of molecules of a species can be very large. Let $S := \{(x_1,\ldots,x_4) : x_1 + x_3 = x_1^{(0)} \wedge x_2 + x_3 + x_4 = x_2^{(0)}\}$ be the state space of $X(t)$.
System (1) consists of the three biochemical reactions

$$R_1: S_1 + S_2 \overset{c_1}{\longrightarrow} S_3, \qquad R_2: S_3 \overset{c_2}{\longrightarrow} S_1 + S_2, \qquad R_3: S_3 \overset{c_3}{\longrightarrow} S_1 + S_4,$$

where the stochastic interpretation is that the reaction rates (also called transition rates) are proportional to the number of participating molecules and to the stochastic reaction rate constants $c_j$, $j \in \{1,2,3\}$. For details and a rigorous formal justification see [12,14]. The propensity function that gives the transition rates of the reactions $R_j$ is defined by

$$\lambda_1(x) = c_1 x_1 x_2, \qquad \lambda_2(x) = c_2 x_3, \qquad \lambda_3(x) = c_3 x_3.$$
Note that the $c_j$ do not depend on the specific time $t$. The next state of the system only depends on $x$ and the reaction type. If there exists a transition from state $x$ to state $x'$ with transition rate $q(x,x') \in \{\lambda_1(x), \lambda_2(x), \lambda_3(x)\}$ then we write $x \xrightarrow{q(x,x')} x'$. More precisely, for $x = (x_1, x_2, x_3, x_4)$

$$\begin{aligned}
R_1 &: x \xrightarrow{c_1 x_1 x_2} (x_1 - 1,\, x_2 - 1,\, x_3 + 1,\, x_4), && \text{if } x_1, x_2 > 0 \text{ and } x_3 < x_1^{(0)},\\
R_2 &: x \xrightarrow{c_2 x_3} (x_1 + 1,\, x_2 + 1,\, x_3 - 1,\, x_4), && \text{if } x_1 < x_1^{(0)},\ x_2 < x_2^{(0)} \text{ and } x_3 > 0,\\
R_3 &: x \xrightarrow{c_3 x_3} (x_1 + 1,\, x_2,\, x_3 - 1,\, x_4 + 1), && \text{if } x_1 < x_1^{(0)},\ x_4 < x_2^{(0)} \text{ and } x_3 > 0.
\end{aligned}$$
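The transition rules above, combined with the simulation procedure summarized in the introduction (exponentially distributed waiting times, a uniform draw to select the reaction type), can be sketched in a few lines. This is a minimal illustration of Gillespie's direct method, not the authors' implementation; the rate constants `C`, the initial counts, and the function names `propensities`, `ssa_run`, and `replicate` are assumptions made for the example. The `replicate` helper reflects the point that several independent runs, differing only in the random number generator seed, are needed before a statistically meaningful statement can be made.

```python
import math
import random

# Illustrative rate constants (assumed, not from the paper), chosen so that c2 >> c3.
C = (1e-4, 1.0, 0.1)

# State-change vectors of R1, R2, R3 acting on x = (x1, x2, x3, x4).
CHANGES = ((-1, -1, +1, 0), (+1, +1, -1, 0), (+1, 0, -1, +1))


def propensities(x, c=C):
    """Transition rates lambda_1, lambda_2, lambda_3 in state x."""
    x1, x2, x3, _ = x
    return (c[0] * x1 * x2, c[1] * x3, c[2] * x3)


def ssa_run(x0, t_end, rng, c=C):
    """Simulate one trajectory and return the state occupied at time t_end."""
    x, t = tuple(x0), 0.0
    while True:
        rates = propensities(x, c)
        total = sum(rates)               # exit rate of the current state
        if total == 0.0:                 # no reaction enabled: absorbing state
            return x
        t += rng.expovariate(total)      # exponentially distributed sojourn time
        if t > t_end:
            return x
        # A uniform draw on [0, total) selects the reaction type.
        u, acc = rng.random() * total, 0.0
        for rate, change in zip(rates, CHANGES):
            if rate == 0.0:
                continue                 # disabled reactions are never chosen
            acc += rate
            chosen = change
            if u < acc:
                break
        x = tuple(xi + di for xi, di in zip(x, chosen))


def replicate(x0, t_end, runs, seed=0, c=C):
    """Mean and standard error of the product count X4(t_end) over independent runs."""
    samples = []
    for r in range(runs):
        rng = random.Random(seed + r)    # runs differ only in the generator seed
        samples.append(ssa_run(x0, t_end, rng, c)[3])
    mean = sum(samples) / runs
    var = sum((s - mean) ** 2 for s in samples) / (runs - 1)
    return mean, math.sqrt(var / runs)


if __name__ == "__main__":
    mean, stderr = replicate((20, 300, 0, 0), t_end=5.0, runs=200)
    print(f"E[X4(5)] ~ {mean:.2f} +/- {1.96 * stderr:.2f} (approx. 95% interval)")
```

The reported interval shrinks only with the square root of the number of runs, which is one way to see why simulation becomes costly when tight estimates are required.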
The probability of leaving $x$ within a small time interval of length $\Delta t$ via a reaction of type $R_j$ is given by $\lambda_j(x)\Delta t$. Correspondingly, the probability of staying in $x$ within this interval is given by $1 - \Lambda(x)\Delta t$ where $\Lambda(x) := \lambda_1(x) + \lambda_2(x) + \lambda_3(x)$ equals the sum of all outgoing rates of $x$ and is called the exit rate. If $\Lambda(x)$ is small, state $x$ is slow whereas otherwise $x$ is a fast state, which is due to the fact that $1/\Lambda(x)$ is the mean sojourn time in $x$. Let $p_t(x)$ be the probability that $X(t) = x$. Then

$$p_{t+\Delta t}(x) = (1 - \Lambda(x)\Delta t) \cdot p_t(x) + \sum_{x' :\, x' \neq x,\ x' \xrightarrow{q(x',x)} x} q(x',x)\,\Delta t \cdot p_t(x').$$
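This leads to the differential equations

$$\dot{p}_t = \frac{d}{dt}\,p_t = \lim_{\Delta t \to 0} \frac{p_{t+\Delta t} - p_t}{\Delta t} = Q\,p_t$$

where $p_t \in \mathbb{R}^n_{\geq 0}$ is the vector with entries $p_t(x)$ and $Q \in \mathbb{R}^{n \times n}$ is defined by²

$$Q(x,x') := \begin{cases} -\Lambda(x), & \text{if } x = x',\\ q(x',x), & \text{if } x' \xrightarrow{q(x',x)} x,\\ 0, & \text{otherwise.} \end{cases}$$

With this indexing the entry in row $x$ and column $x'$ is the rate of moving from $x'$ to $x$, so that $Q\,p_t$ reproduces the sum in the preceding equation. Process $X(t)$ is called a (homogeneous) continuous-time Markov chain (or CTMC for short). A CTMC is uniquely described by the (infinitesimal) generator matrix $Q$ and an initial distribution (cf. [3,10]). In general the stochastic interpretation of chemical equations in the style of (1) always yields a CTMC as indicated in [12,13].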
2.2 Numerical Solution of Markov Chains
Stochastic systems in general, and in particular Markov chains, are analyzed with respect to their temporal evolution where one distinguishes transient and steady-state analysis. The latter refers to systems in equilibrium whereas the former refers to the phase where an equilibrium has not yet been reached. A large amount of work exists on the numerical solution of Markov chains [24], where numerical solution means to compute probability distributions, either time-dependent transient distributions or steady-state distributions.
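To make the distinction concrete, consider a generic two-state chain (a textbook illustration, not taken from the paper) with hypothetical rates $a > 0$ from state 0 to state 1 and $b > 0$ back. Using $p_t(1) = 1 - p_t(0)$, both kinds of distribution can be written in closed form:

```latex
\begin{align*}
\frac{d}{dt}\,p_t(0) &= -a\,p_t(0) + b\,p_t(1) = b - (a+b)\,p_t(0),\\
p_t(0) &= \frac{b}{a+b} + \Bigl(p_0(0) - \frac{b}{a+b}\Bigr)\,e^{-(a+b)t},\\
\lim_{t\to\infty} \bigl(p_t(0),\, p_t(1)\bigr) &= \Bigl(\frac{b}{a+b},\ \frac{a}{a+b}\Bigr).
\end{align*}
```

The second line is the transient distribution; the limit in the third line is the steady-state distribution, which is approached at rate $a+b$ regardless of the initial distribution.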
² We assume that the state space is mapped to $\mathbb{N}$.
