A general probabilistic model of the PCR process

Validation and Estimation of Parameters for a General Probabilistic Model of the PCR Process

Nilanjan Saha, Layne T. Watson, Karen Kafadar, Naren Ramakrishnan, Alexey Onufriev, Shrinivas Rao Mane, and Cecilia Vasquez-Robinet

September 28, 2006

Abstract

Earlier work by Saha et al. rigorously derived a general probabilistic model for the PCR process that includes as a special case the Velikanov-Kapral model where all nucleotide reaction rates are the same. In this model the probability of binding of deoxy-nucleoside triphosphate (dNTP) molecules with template strands is derived from the microscopic chemical kinetics. A recursive solution for the probability function of binding of dNTPs is developed for a single cycle and is used to calculate expected yield for a multicycle PCR. The model is able to reproduce important features of the PCR amplification process quantitatively. With a set of favorable reaction conditions, the amplification of the target sequence is fast enough to rapidly outnumber all side products. Furthermore, the final yield of the target sequence in a multicycle PCR run always approaches an asymptotic limit that is less than one. The amplification process itself is highly sensitive to initial concentrations and the reaction rates of addition to the template strand of each type of dNTP in the solution. This paper extends the earlier Saha model with a physics based model of the dependence of the reaction rates on temperature, and estimates parameters in this new model by nonlinear regression. The calibrated model is validated using RT-PCR data.

Key words: Levenberg-Marquardt algorithm, multicycle PCR, nonlinear regression, polymerase chain reaction (PCR), probabilistic model, yield.

1 INTRODUCTION

The polymerase chain reaction (PCR) is a powerful technique used for the amplification of specific segments of DNA or mRNA. The PCR process has become a technique of choice for bioinformatics researchers due to its capabilities of detecting and amplifying low copy segments.
However, in practice, the PCR process does not always have a consistent relation between the initial target amount and the absolute amount of the synthesized product. This is due to the PCR's high sensitivity to several variables whose effects on the containers (where the reaction takes place) are difficult to model. Therefore, comparisons of the amount of product to that of an external control standard do not always lead to accurate quantifications. This problem, however, is addressed in quantitative competitive PCR (QC-PCR).

In the QC-PCR, a competitive mRNA or DNA, namely an allelic variant of the target template, is used as an internal standard to provide an internal control in the amplification process. Quantification is assessed by determining the amounts of co-amplified products from replicated proportions of the target with the dilution series of the competitor. A normalization based on co-amplification of a heterologous sequence, however, does not optimally address the difference in yield due to different template efficiencies in the amplification. It is quite difficult to rigorously quantify these differences.

Another popular version of PCR, called real time PCR (RT-PCR), is used widely as an industry standard to validate gene expression data obtained from microarray experiments. Determining yield by following the real time kinetics of PCR eliminates the need for a competitor to be co-amplified with the target for the internal standard. Quantitation can be performed by the more basic method of preparing a standard curve and determining an unknown amount by comparison to the standard curve. Real time PCR quantitation eliminates post-PCR processing of PCR products (which is necessary in QC-PCR).
This helps to increase throughput, reduce the chances of carryover contamination, and remove post-PCR processing as a potential source of error.

PCR is an extremely important technique for biologists, with applications in research (e.g., clinical/food/veterinary microbiology), as well as clinical medicine (e.g., oncology, disease identification, chromosomal translocations). One of the most important uses is in measuring gene expressions. Here, accuracy of quantification is extremely important as slight variations in estimating initial mRNA (or cDNA) concentrations can lead to false conclusions. Due to PCR's exponential growth in product, the process of estimating this initial concentration involves errors of many types and is, mathematically, ill-conditioned. Sensitivities to these errors increase with cycle number and the yield (ratio of actual product to exponentially increasing product) becomes nonlinear beyond a certain cycle number.

Therefore, in spite of the popularity of the PCR, theoretical considerations to reliably describe its different applications have relied mostly on experimental inferences rather than on mathematical derivations from biophysical principles; consequently, the currently used expressions lack consistency since their foundations have not been clearly established, frequently leading to empirical fitting procedures of experimental data that result in poor quantifications. It is clear that a physics based model will predict the yield of the PCR amplification in a much better fashion. However, developing any type of generalized model is not an easy task. The present study extends and validates such a model developed by Saha et al. (2004). Before summarizing the formulation for this model, some background on the PCR amplification is given.

The PCR amplification process in general is conducted in vitro.
The three primary ingredients for this process are the three nucleic acid segments: a double-stranded DNA containing the sequence to be amplified and two single-stranded primers. They react in an environment containing a DNA polymerase enzyme, deoxy-nucleoside triphosphates (dNTPs), a buffer, and a magnesium salt (MgCl$_2$). Through cycles of combined denaturing, annealing (a vast number of primers is added to ensure complete annealing), and DNA synthesis, the primers hybridize to opposite strands of the target sequence such that the synthesis stage proceeds across the region between the primers, thus doubling the DNA amount. Therefore, the products formed in successive cycles should result in geometric accumulation and the target amplification after $n$ cycles can be approximated by $N_n = 2^n N_0$, where $N_0$ is the initial amount of DNA segment to be amplified.

The quantitative reliability of the PCR, however, is limited by the amplification process itself. Due to its geometric nature, small differences in any of the control variables will dramatically affect the reaction yield. The variables that influence the yield of the PCR process are the concentrations of the DNA polymerase, dNTPs, magnesium salt (MgCl$_2$), DNA, and primers; the denaturing, annealing, and synthesis temperatures; the length and the number of cycles; ramping times; and the presence of contaminating DNA and inhibitors in the sample. Even if extreme care is taken to strictly control these parameters, the tube-to-tube variation may sometimes affect the outcome of the reaction. The physical basis of such variation is not yet known. Some researchers [8,19] indicated that this variation might be due to small temperature differences along the thermal cylinder block during the first few cycles. According to Wang et al. (1989) and Gilliland et al. (1990), normalization based on co-amplification does not optimally characterize the variation in yield due to differences in template efficiencies.
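The geometric approximation $N_n = 2^n N_0$ and its sensitivity to small per-cycle differences can be sketched numerically. The following Python fragment is only an illustration, not part of the model: the function names and the constant per-cycle growth factor `factor` are hypothetical stand-ins for the combined effect of the control variables listed above.

```python
def ideal_copies(n0, cycles):
    """Ideal geometric accumulation: N_n = 2**n * N_0."""
    return n0 * 2 ** cycles


def copies_with_growth_factor(n0, cycles, factor):
    """Accumulation when every cycle multiplies the amount by `factor`.

    `factor` is a hypothetical constant (2.0 recovers the ideal case);
    it is used here only to show how small differences compound.
    """
    return n0 * factor ** cycles


if __name__ == "__main__":
    n0, cycles = 100, 30
    ideal = ideal_copies(n0, cycles)
    reduced = copies_with_growth_factor(n0, cycles, 1.9)
    # Ratio of the slightly reduced run to the ideal run after 30 cycles.
    print(ideal, reduced / ideal)
```

Under these assumptions, lowering the growth factor from 2.0 to 1.9 leaves only about a fifth of the ideal product after 30 cycles, which is the kind of compounding that makes the final amount so sensitive to reaction conditions.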
In reality it is a well-observed fact that the reaction efficiency is never 100 percent and does not remain constant during the cycles. Hence, the accumulation trend is better represented as $N_n = \left[\prod_{i=1}^{n} (1 + \epsilon_i)\right] N_0$, where $\epsilon_i$ is the efficiency of cycle $i$ and is estimated empirically from the experimental data.

A different deterministic and more physics based approach was proposed by Schnell and Mendoza (1997a), who used the law of mass action to derive the kinetic equations for PCR. Stochastic models for PCR have also been developed (Mullis and Faloona, 1987; Mullis et al., 1994; Saiki et al., 1988; Stolovitzky and Cecchi, 1996; Wang et al., 1998; Weiss and Von Haeseler, 1995). Finally, a combined deterministic and stochastic approach was proposed by Stolovitzky and Cecchi (1996). They used a deterministic mass action equation to compute the amplification efficiency and estimate the number of PCR cycles. Although these models lead to a better quantification for the phenomenon, they still do not provide an accurate solution because the efficiency is assumed to be approximately constant during all cycles.

Velikanov and Kapral (1999) proposed a probabilistic approach to the kinetics of the PCR, which focused on the microscopic nature of the amplification process. Their results indicated that the model was able to reproduce the main qualitative features of PCR kinetics, namely sensitivity to reaction conditions and leveling-off of the yield with increasing number of cycles (the plateau effect). Though they were able to obtain a closed form solution for their model, the model itself involved two unrealistic assumptions. First, the model assumed that the reaction rates of all nucleotides were identical. In reality, the chemical kinetics of nucleotides binding to the template strand depends strongly on the specific nucleotide (Goodman, 1995). Second, the model assumed that the initial concentrations of the four nucleotides were the same.
In fact, the number of each type of nucleotide at the beginning of each cycle may not be the same and may influence the dynamics of the reaction in subsequent cycles (Saha et al., 2004). Saha et al. (2004) modified the master equation developed by Velikanov and Kapral (1999) to accommodate the fact that the initial template strand consists of the four different types of nucleotides, namely A, C, T, and G, and that the initial numbers of these nucleotides present in the buffer solution at the beginning of each cycle are independent of each other. The next section summarizes this general model derived rigorously in Saha et al. (2004).

2 PROBABILISTIC MODELING OF POLYMERASE CHAIN REACTION

Let $L$ denote the length of the template strand and $\ell$ denote the length of the growing strand at time $t$; $\ell_0$ denotes the length at $t = 0$. A reasonable assumption is that, at the molecular level, the rate of change of the probability of a reaction event is proportional to the number of ways in which the molecules of the reactants available in the system can be combined for the reaction to take place. It can be further assumed that the template strand consists of all of the four different nucleotides ($A$, $C$, $T$, and $G$) in an arbitrary but given order.

For this given template strand, the rate of change of the probability of a single nucleotide to be added depends on the rate of reaction of the particular nucleotide $\hat{\ell} \in \{A, C, T, G\}$ that is complementary to the $(\ell + 1)$st nucleotide on the template strand, and the number $n_{\hat{\ell}}$ of such nucleotides present in the system. This probability rate of change is denoted as $w(\ell, t)$. It is important to note here that $\hat{\ell}$ is the type of nucleotide that is complementary to the next nucleotide on the template strand when the length of the growing strand is $\ell$. So, in this notation,

$$w(\ell, t) = k(\ell, t)\, n_{\hat{\ell}}, \tag{1}$$

where $k(\ell, t)$ is the reaction rate coefficient that also depends on temperature.
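As a minimal sketch of how Equation (1) is evaluated for a given template, the Python fragment below looks up the nucleotide type complementary to the next template position and multiplies its rate coefficient by the number of such nucleotides available. The function names, the dictionary layout, and the simplification that the rate coefficient depends only on the nucleotide type (at a fixed time and temperature) are assumptions made for illustration; they are not taken from the paper.

```python
# Watson-Crick pairing: the nucleotide added is the complement of the
# template nucleotide it binds to.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def next_nucleotide(template, ell):
    """Type of nucleotide complementary to the (ell + 1)st template nucleotide."""
    if not 0 <= ell < len(template):
        raise ValueError("growing strand length must satisfy 0 <= ell < L")
    # Python strings are 0-indexed, so position ell is the (ell + 1)st nucleotide.
    return COMPLEMENT[template[ell]]


def transition_rate(template, ell, k_by_type, n_by_type):
    """Equation (1): w = k * n for the nucleotide type needed next.

    k_by_type: hypothetical rate coefficient per nucleotide type.
    n_by_type: number of free nucleotides of each type in the system.
    """
    needed = next_nucleotide(template, ell)
    return k_by_type[needed] * n_by_type[needed]


if __name__ == "__main__":
    k = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 0.5}
    n = {"A": 10, "C": 20, "G": 30, "T": 100}
    # The first template nucleotide is A, so a T must be added next.
    print(transition_rate("ACGT", 0, k, n))
```

Keeping a separate coefficient for each nucleotide type reflects the point that the binding kinetics depend on the specific nucleotide; setting all four coefficients equal would correspond to the identical-rates assumption of the Velikanov-Kapral model.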
Two more parameters are necessary. The first one is $m^0_{\hat{\ell}}$, which denotes the initial number of nucleotides of type $\hat{\ell}$ in the system. (The experiment is executed with a target value for this number in mind, but, in practice, $m^0_{\hat{\ell}}$ is known only to within some degree of error, which can be estimated from the data; see section IIID.) The other one is $X_{\hat{\ell}}$, which indicates the ratio of the number of nucleotides of type $\hat{\ell}$ to the total number of nucleotides of all types in the growing strand when the length of the growing strand is $\ell$. It is reasonable to assume that the total number of nucleotides of each type remains constant, so

$$n_{\hat{\ell}} = m^0_{\hat{\ell}} - \ell X_{\hat{\ell}}. \tag{2}$$

The evolution of the probability function is governed by a master equation (Cox and Miller, 1965; Gardiner, 1985). The master equation for the primer extension process was further developed by Velikanov and Kapral (1999) and is