## How can \(V_m\) and \(K_m\) be determined from experimental data?

From initial rate data: The most common way is to determine initial rates, \(v_0\), from experimental values of \(P\) or \(S\) as a function of time. Hyperbolic graphs of \(v_0\) vs. \([S]\) can be fit directly or transformed, as we explored with the different mathematical transformations of the hyperbolic binding equation used to determine \(K_d\). These included:

- nonlinear hyperbolic fit
- double reciprocal plot
- Scatchard plot

The double-reciprocal plot is commonly used to analyze initial velocity vs. substrate concentration data. When used for such purposes, the graphs are referred to as *Lineweaver-Burk plots*, where plots of \(1/v\) vs. \(1/[S]\) are straight lines with slope \(m = \frac{K_M}{V_{max}}\) and y-intercept \(b = \frac{1}{V_{max}}\). These plots should not be analyzed using ordinary (unweighted) linear regression, however, since that method assumes constant error in the y-axis data (in this case \(\frac{1}{v}\)), and taking reciprocals magnifies the error in the smallest rates. A weighted linear regression or, even better, a nonlinear fit to the hyperbolic equation should be used. The Mathcad template below shows such a nonlinear fit. In the laboratory, we will use a series of programs developed by W. W. Cleland specifically designed to analyze initial rate data of enzyme-catalyzed reactions. A rearrangement of the corresponding Scatchard equation, the *Eadie-Hofstee plot*, is also commonly used.

Mathcad 8 - Nonlinear Hyperbolic Fit. Vm and Km.
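For readers without Mathcad, the same idea can be sketched in Python. This is only an illustration under stated assumptions: the "true" parameters, substrate concentrations, and noise level are invented synthetic values, and `scipy.optimize.curve_fit` stands in for whatever fitting routine the Mathcad template or Cleland's programs actually use. The unweighted Lineweaver-Burk estimate is computed alongside purely for comparison.

```python
import numpy as np
from scipy.optimize import curve_fit


def michaelis_menten(s, vmax, km):
    """Hyperbolic rate law: v0 = Vmax*[S] / (Km + [S])."""
    return vmax * s / (km + s)


# Synthetic initial-rate data (assumed values, arbitrary units).
true_vmax, true_km = 10.0, 2.0
s = np.array([0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
rng = np.random.default_rng(0)
v0 = michaelis_menten(s, true_vmax, true_km) + rng.normal(0.0, 0.1, s.size)

# Nonlinear least-squares fit directly to the hyperbola.
# Starting guesses: largest observed rate for Vmax, median [S] for Km.
popt, pcov = curve_fit(michaelis_menten, s, v0, p0=[v0.max(), np.median(s)])
vmax_fit, km_fit = popt
vmax_err, km_err = np.sqrt(np.diag(pcov))  # approximate standard errors

# Unweighted Lineweaver-Burk fit: 1/v = (Km/Vmax)(1/[S]) + 1/Vmax.
slope, intercept = np.polyfit(1.0 / s, 1.0 / v0, 1)
vmax_lb = 1.0 / intercept
km_lb = slope * vmax_lb

print(f"nonlinear fit:   Vmax = {vmax_fit:.3f} +/- {vmax_err:.3f}, "
      f"Km = {km_fit:.3f} +/- {km_err:.3f}")
print(f"Lineweaver-Burk: Vmax = {vmax_lb:.3f}, Km = {km_lb:.3f}")
```

The nonlinear fit works on the untransformed rates, so each point keeps its original error. The double-reciprocal fit gives the most weight to the lowest, noisiest rates.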

## Common Error in Biochemistry Textbooks

From integrated rate equations: \(K_M\) and \(V_m\) can be extracted from progress curves of \(A\) or \(P\) as a function of \(t\) at a single \(A_0\) concentration by deriving an integrated rate equation for \(A\) or \(P\) as a function of \(t\), as we did in equation 2, which shows the integrated rate equation for the conversion of \(A \rightarrow P\) in the absence of enzyme. In principle this method would be better than the initial rate method. Why? One reason is that it is not easy to be certain you are measuring the initial rate for each and every \([S]\), which should vary over a wide range. Initial rate measurements are also time intensive. In addition, think how much of the data is discarded if you take an entire progress curve at each substrate concentration but use only its initial slope, especially if you quench the reaction at a given time point, which effectively limits the data to one time point per substrate concentration.

In practice, the mathematics is complicated, as it is not possible to get a simple explicit function of \([P]\) or \([S]\) as a function of time. Nevertheless, progress has been made in progress curve analysis. Let us consider the simple case of a single substrate \(S\) (or \(A\)) being converted to product \(P\) in an enzyme-catalyzed reaction. The analogous equations for first-order, non-catalyzed rates were

\[A = A_0 e^{-k_1 t}\]

or

\[P = A_0 \left(1 - e^{-k_1 t}\right).\]

Now let's derive the equations for the enzyme-catalyzed reaction.

The derivation of the relevant equations is shown below.

\[ v = -\dfrac{dS}{dt} = \dfrac{dP}{dt} = \dfrac{V_M S}{K_M+S} \label{15}\]

Separating variables and integrating from \(S_0\) at time 0 to \(S\) at time \(t\) gives

\[ \int_{S_0}^{S} \dfrac{K_M+S}{V_M S}\,dS = -\int_0^t dt = -t = \dfrac{S + K_M \ln S - S_0 - K_M \ln S_0}{V_M}\]

which on rearrangement gives

\[S_0 - S + K_M \ln \dfrac{S_0}{S} = V_M t \label{16}\]

This equation is implicit, not explicit, as it does not give \(S(t)\) explicitly as a function of \(t\). Equation \ref{16} can be written with respect to the product \(P\) using

\( P = S_0 - S\) or \( S = S_0 - P\)

\[ S_0 - (S_0 - P) + K_M \ln \dfrac{S_0}{S_0 - P} = V_M t\]

\[ P - K_M \ln \left( \dfrac{S_0 - P}{S_0} \right) = V_M t \label{17}\]

Rearranging Equation \ref{17} gives

\[ \dfrac{P}{t} = \dfrac{K_M \ln \left( \dfrac{S_0 - P}{S_0} \right)}{t} + V_M\]

or

\[ \dfrac{P}{t} = \dfrac{K_M \ln \left( 1 - \dfrac{P}{S_0} \right)}{t} + V_M \label{18}\]

A graph of \(P/t\) vs. \([\ln (1 - P/S_0)]/t\) (shown below) from Equation \ref{18} gives a straight line with a slope of \(K_M\) and a \(y\)-intercept of \(V_M\). Note that the calculated values of \(V_M\) and \(K_M\) are derived from only one substrate concentration, and the values may be affected by product inhibition.

*Figure: Enzyme Kinetics Progress Curve*
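The linearized progress-curve relation above can be checked numerically with a short sketch. The parameter values are assumed for illustration, and the curve is noise-free. Because the integrated rate equation gives \(t\) explicitly as a function of \(P\), the sketch picks product concentrations first and computes the corresponding times.

```python
import numpy as np

# Assumed illustrative parameters (arbitrary but consistent units).
vm, km, s0 = 5.0, 2.0, 10.0

# The integrated rate law gives t explicitly in terms of P:
#   t = [P - K_M * ln(1 - P/S0)] / V_M
p = np.linspace(0.5, 9.5, 19)               # product concentrations, 0 < P < S0
t = (p - km * np.log(1.0 - p / s0)) / vm    # time at which each P is reached

# Linearized variables: y = P/t plotted against x = ln(1 - P/S0)/t.
x = np.log(1.0 - p / s0) / t
y = p / t

# A straight-line fit should return slope = K_M and intercept = V_M.
slope, intercept = np.polyfit(x, y, 1)
print(f"slope (K_M) = {slope:.6f}, intercept (V_M) = {intercept:.6f}")
```

With real data both axes contain the measured \(P\) and \(t\), so their errors are correlated and a nonlinear fit of the progress curve is usually preferable. The sketch only confirms the algebra.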

Can a simple explicit equation of \(P\) vs. \(t\) be derived? Not in terms of elementary functions. However, the progress curve can be written explicitly using the *Lambert* \(W\) *function*, as shown in the derivation below.

Rearrange Equation \ref{16} to get Equation \ref{19}

\[ S + K_M \ln S = S_0 + K_M \ln S_0 - V_M t \label{19}\]

Let

\[\phi = \dfrac{S}{K_M} \label{19.5}\]

noting that \(S\) changes with time. Substitute into Equation \ref{19}.

\[ \phi K_M + K_M \ln (\phi \, K_M) = S_0 + K_M \ln(S_0) - V_M \, t\]

Divide by \(K_M\) and rearrange to get

\[ \phi + \ln(\phi) = \dfrac{S_0}{K_M} + \ln \left( \dfrac{S_0}{K_M} \right) - \dfrac{V_M \, t}{K_M} \label{20}\]

Now consider the right-hand side of Equation \ref{20} and note the following identity, which holds for any real number \(a\):

\[a = \ln \left( e^{a} \right)\]

Apply this identity to the right-hand side of Equation \ref{20}

\[ \dfrac{S_0}{K_M} + \ln \left( \dfrac{S_0}{K_M} \right) - \dfrac{V_M \, t}{K_M} = \ln \left[ e^{\dfrac{S_0}{K_M} + \ln \left( \dfrac{S_0}{K_M} \right) - \dfrac{V_M \, t}{K_M} } \right] = \ln \left[ \dfrac{S_0}{K_M} e^{\frac{S_0 - V_M t}{K_M}} \right] = \ln (x) \label{21}\]

where

\[x = \dfrac{S_0}{K_M} e^{\frac{S_0 - V_M \, t}{K_M}}\]

Now substitute Equation \ref{21} into Equation \ref{20}

\[ \phi + \ln (\phi) = \ln \left[ \dfrac{S_0}{K_M} e^{\frac{S_0 - V_M t}{K_M}} \right] = \ln (x) \label{22}\]

Equation \ref{22} has the same form as the defining relation of the *Lambert* \(W\) *function*, \(W(x) e^{W(x)} = x\), which in logarithmic form reads

\[ W(x) + \ln W(x) = \ln(x) \label{23}\]

Equating the left-hand sides of Equations \ref{22} and \ref{23} gives

\[ \phi = W(x) = W \left( \dfrac{S_0}{K_M} e^{\frac{S_0 - V_M t}{K_M}} \right) \label{24}\]

and from Equation \ref{19.5}

\[ S = K_M W \left( \dfrac{S_0}{K_M} e^{\frac{S_0 - V_M t}{K_M}} \right) \label{25}\]

which gives \(S(t)\) as an explicit function of \(t\) and the constants \(V_M\) and \(K_M\).
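The closed-form result can be evaluated with `scipy.special.lambertw`, which computes the principal branch of \(W\). The sketch below uses assumed parameter values and a hypothetical helper name, `substrate_vs_time`. It computes \(S(t)\) from the Lambert \(W\) expression and then substitutes the result back into the implicit integrated rate equation, \(S_0 - S + K_M \ln(S_0/S) = V_M t\), as a consistency check.

```python
import numpy as np
from scipy.special import lambertw


def substrate_vs_time(t, s0, vm, km):
    """S(t) = K_M * W( (S0/K_M) * exp((S0 - V_M*t)/K_M) ), principal branch."""
    x = (s0 / km) * np.exp((s0 - vm * t) / km)
    # lambertw returns complex values; for x > 0 the imaginary part is zero.
    return km * lambertw(x).real


# Assumed illustrative parameters (arbitrary but consistent units).
vm, km, s0 = 5.0, 2.0, 10.0
t = np.linspace(0.0, 5.0, 11)

s = substrate_vs_time(t, s0, vm, km)
p = s0 - s  # product formed, from P = S0 - S

# Residual of the implicit equation; it should vanish to numerical precision.
residual = s0 - s + km * np.log(s0 / s) - vm * t
print("max |residual| =", np.max(np.abs(residual)))
```

For very large \(S_0/K_M\) the exponential can overflow in floating point. Evaluating in logarithmic form would avoid that, but it is omitted here for brevity.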

- Enzyme Kinetics: Interactive Java Applet - Change E, S, P, \(k_{cat}\), \(K_M\), Hill Coefficient

## B8. Experimental Determination of Kinetic Parameters - Biology

SABIO-RK (http://sabio.h-its.org/) is a web-accessible, manually curated database that has been established as a resource for biochemical reactions and their kinetic properties, with a focus on supporting the computational modelling of biochemical reaction networks. SABIO-RK data are mainly extracted from the literature but are also directly submitted from lab experiments. In most cases the information in the literature is distributed across the whole publication, insufficiently structured and often described without standard terminology. Therefore the manual extraction of knowledge from the literature requires biological experts to understand the paper and interpret the data. The database offers the literature data in a structured format, including annotations to controlled vocabularies, ontologies and external databases, which supports modellers, as well as experimentalists, in the very time-consuming process of collecting information from different publications.

Here we describe the data extraction and curation efforts needed for SABIO-RK and give recommendations for publishing kinetic data in a complete and structured manner.

## Introduction

Enzyme kinetics is an essential topic in undergraduate biochemistry courses. Steady-state kinetic studies, usually initial rate measurements, are a first approach in the characterization of enzyme function. The variation of initial rate with substrate and/or product concentration provides not only quantitative values of macroscopic reaction constants but also important information about the mechanism involved in the catalysis 1, 2 . Through kinetic analysis, a model for an enzyme-catalyzed reaction can be proposed, certain kinetic constants determined and a kinetic equation deduced. Although this approach alone cannot completely elucidate the mechanism of enzyme catalysis, it can provide useful information regarding the steps involved in the catalysis, that is, the order in which substrates add and products leave the enzyme.

With the purpose of introducing undergraduate students to basic enzyme kinetics, a laboratory experiment was designed in order to address the concepts of time course measurements, reaction rate determination, definition and importance of initial reaction velocity in steady-state conditions, initial rate dependence with substrate and enzyme concentration, and nonlinear regression analysis to obtain kinetic parameters. A week before the enzyme kinetics module starts, students attend two lectures, in which general concepts of enzymes, catalytic mechanisms, allosterism, and cooperativity are discussed. The enzyme kinetics module is organized in one seminar followed by two laboratory sessions (4 hr each). During the seminar, the principal concepts of basic enzyme kinetics are introduced, including transition state theory, steady state and initial velocity concepts, deduction of Michaelis–Menten equation, and enzyme inhibition. Even though only Michaelis–Menten kinetics is discussed, it is pointed out that there are non-hyperbolic behaviors on the initial velocity versus substrate concentration dependence. The significance of the kinetic parameters obtained and the limitations in terms of describing the enzyme catalytic mechanism are also discussed.

In the laboratory sessions, alkaline phosphatase is used as a model. This enzyme has not only been used for academic and teaching purposes 3-7, but its activity is also determined in the clinical laboratory, as its increase in blood is associated with multiple pathologies, mainly hepatobiliary and bone disorders 8. Alkaline phosphatase is a promiscuous enzyme with broad substrate specificity that catalyzes the hydrolysis of phosphate esters, with optimum *in vitro* activity at a pH of 8–10 9, 10. One of the substrates used to determine phosphatase activity is sodium phenyl-phosphate, which is hydrolyzed to phenol and phosphate. As neither the substrate nor the products absorb in the visible region of the spectrum, the reaction can be monitored by measuring the inorganic phosphate released, using the colorimetric Fiske-Subbarow reagent 6, 11. Because adding this reagent not only allows phosphate quantification but also stops the reaction through an abrupt change in the pH of the reaction medium, fixed-time assays are performed to determine the product generated after a set period of time. To point out that enzyme activity can also be determined by continuously monitoring changes in substrate or product concentration when these can be directly detected, a data sheet of a commercial kit for the determination of alkaline phosphatase activity in serum using 4-nitro-phenylphosphate as a substrate is discussed with students.

It is important to remark that we have successfully used this reaction system for several years in different laboratory experiences as part of a course for third-year undergraduate Biochemistry students and it proved to be very robust. The assay is inexpensive and utilizes commercially available reagents and basic equipment. In the present work we describe the experimental approach used for the past 4 years.

## Abstract

In an attempt to find a novel catalyst system for atom transfer radical polymerization (ATRP), a parameter estimation method based on nonlinear regression was developed to evaluate various catalyst systems by determining the kinetic parameters of polymerization. Using a model system, a small-molecule atom transfer addition reaction, we found that the equilibrium constant of the atom transfer reaction could be successfully determined with our parameter estimation method. However, the individual values of the activation and deactivation rate constants are hard to determine uniquely because of their poor sensitivities and trapping in local minima. Applying a second minimization algorithm gives the parameter estimation a higher propensity to reach the global minimum, although not every time. Simulations using the kinetic rate constants determined by our method show better agreement with the experimental data than those using literature values. This is because the current method uses fewer assumptions than other literature methods in determining rate constants. We also demonstrated the determination of kinetic constants in the polymerization of styrene and MMA using various metal catalysts, and the simulations using these kinetic constants agree well with the experimental data.

## Prelaboratory Expectations and Lab Skills Required

The concepts of Michaelis-Menten kinetics and enzyme inhibition should have been covered in the accompanying lecture course prior to the laboratory exercise so that students are familiar with the Michaelis-Menten equation, methods of graphing enzyme kinetic data, and the common types of inhibition (competitive, uncompetitive, mixed, and noncompetitive). In the laboratory, students should be experienced users of micropipettes and be familiar with operating a spectrophotometer. To ensure that our students were prepared to do the necessary calculations for the lab exercise, the following prelab questions were assigned and collected at the start of the lab period.

A 0.5 M phosphate buffer at pH 6.5 and a 10 mM substrate stock solution are available for preparing enzyme assays. Each assay must have a total volume of 3 mL and a final phosphate concentration of 100 mM. For a particular assay, a final substrate concentration of 3 mM is desired, and 100 μL of the enzyme solution will be added to the assay. Calculate the volume of buffer, volume of substrate stock solution, and volume of water that should be placed in a cuvette for the assay.
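The dilution arithmetic behind this question follows \(C_1 V_1 = C_2 V_2\) for each stock, with water making up the remaining volume. A minimal sketch, assuming the enzyme solution and the substrate stock contribute no phosphate of their own (the function name `assay_volumes` is hypothetical):

```python
def assay_volumes(total_ml, buffer_stock_mm, buffer_final_mm,
                  substrate_stock_mm, substrate_final_mm, enzyme_ml):
    """Return (buffer, substrate, water) volumes in mL using C1*V1 = C2*V2."""
    buffer_ml = buffer_final_mm * total_ml / buffer_stock_mm
    substrate_ml = substrate_final_mm * total_ml / substrate_stock_mm
    # Water fills whatever volume the other three components leave over.
    water_ml = total_ml - buffer_ml - substrate_ml - enzyme_ml
    if water_ml < 0:
        raise ValueError("stock solutions are too dilute for this assay volume")
    return buffer_ml, substrate_ml, water_ml


# Values from the question: 0.5 M (500 mM) buffer stock, 100 mM final phosphate,
# 10 mM substrate stock, 3 mM final substrate, 3 mL total, 0.1 mL enzyme.
buffer_ml, substrate_ml, water_ml = assay_volumes(
    total_ml=3.0, buffer_stock_mm=500.0, buffer_final_mm=100.0,
    substrate_stock_mm=10.0, substrate_final_mm=3.0, enzyme_ml=0.1)

print(f"buffer {buffer_ml:.2f} mL, substrate {substrate_ml:.2f} mL, "
      f"water {water_ml:.2f} mL")
# prints: buffer 0.60 mL, substrate 0.90 mL, water 1.40 mL
```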

For an enzyme kinetic experiment it is recommended that the substrate concentrations extend from about 0.3 times *K*_{m}, to at least 5 times *K*_{m}. There should be at least 10 total data points, 5 of them having [S] above *K*_{m} and 5 of them having [S] below *K*_{m} 21-24 . The data points should be spaced more closely at low [S], with at least one at high [S] approaching *V*_{max}. If the *K*_{m} of an enzyme is known to be about 1.5 mM, list 10 substrate concentrations that could be used for the assay series.
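One way to build such a series is to space the points geometrically, which places them closer together at low [S]. The sketch below is only one possible answer. The inner limits of 0.9 and 1.2 times *K*_{m} are assumptions chosen so that exactly five points fall on each side of *K*_{m}, and the upper limit can be raised above 5 times *K*_{m} if a point closer to *V*_{max} is wanted.

```python
import numpy as np

km = 1.5  # mM, the approximate Km given in the question

# Five geometrically spaced points below Km and five above it.
# The 0.9*Km and 1.2*Km inner limits are assumed, not prescribed.
low = np.geomspace(0.3 * km, 0.9 * km, 5)
high = np.geomspace(1.2 * km, 5.0 * km, 5)
series = np.concatenate([low, high])

# Fraction of Vmax expected at each concentration: v/Vmax = [S]/(Km + [S]).
fraction_of_vmax = series / (km + series)

for s, f in zip(series, fraction_of_vmax):
    print(f"[S] = {s:5.2f} mM  ->  v/Vmax = {f:.2f}")
```

At the top of this range, 5 times *K*_{m}, the expected rate is 5/6 of *V*_{max}.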

## Introduction

Single-molecule measurements are providing insight into many phenomena that were previously intractable because of the ensemble averaging present in bulk measurements. 1 – 8 In particular, the dynamics of conformationally heterogeneous systems are benefiting from single-molecule studies. Protein folding and conformational dynamics, 9 – 15 enzymology, 16 – 20 ribozyme function, 21 bacterial light harvesting, 12 , 22 , 23 and protein–nucleic acid interactions 24 are just a few examples of complex systems that have benefitted from the application of single-molecule techniques.

One goal of single-molecule measurements has been to extract the rate of a dynamic process from a single-molecule trajectory. Single-molecule experiments have been used to obtain kinetic rate information about a variety of biological processes, including protein conformational dynamics and folding, enzymatic turnovers, RNA and DNA conformational changes, and fluctuations and function of large biological assemblies. The rates that have been determined have ranged from 1000 to 0.01 Hz. Indirect evidence of the presence of faster processes is common from heterogeneity detected in distributions determined from binned data. However, experimental and data analysis limitations have prevented their quantification.

Comparisons of rates between single molecules can show evidence of heterogeneity or conformational memory. 12 , 13 , 19 , 25 In the context of a molecular system, conformational memory or intermittency 26 results from transitions between unobservable states that modulate the dynamics of the observable states and can result in apparent dynamics of the observable states being non-Markovian, even if the underlying dynamics involving the observable and unobservable states is Markovian. Rigler and co-workers have reported non-Markovian dynamics and molecule-to-molecule differences in activity in the rate of single enzymatic turnovers. They characterized the dynamics in terms of a non-Markovian function that is sensitive to memory in the trajectory. 19 Dovichi and co-workers reported that differences in single alkaline phosphatase catalytic activity result from differing degrees of glycosylation or protease degradation, using the total intensity of a fluorescent product turned over during a set incubation time. 27 Such conclusions require that rates be extracted from observations of single molecules and that reliable uncertainty estimates can be made; otherwise, the heterogeneity between molecules cannot be determined with confidence.

The capability of routinely making single-molecule measurements has driven the need for new methods of analyzing single-molecule data that take full advantage of the new and increased information they provide. 28 Essentially three approaches have been used to quantify the rates of single-molecule fluctuations: the fitting of dwell-time histograms, the analysis of the dependence of the shape of a distribution on binning time, and the calculation of correlation functions. Of these methods, only higher-order correlation functions seem to be likely to fully utilize the information present in single-molecule measurements. 18 , 19 , 25 , 29

The most commonly employed of these three methods is the fitting of dwell-time histograms. When a single-molecule trajectory has sufficient contrast between states, thresholds can be applied to distinguish the states of the molecule. These thresholds are typically chosen manually and can introduce subjectivity into the analysis. Runs of each state are tallied to give histograms of the state dwell times, allowing for the determination of kinetic parameters by exponential fitting. Typically, this technique is limited to systems showing large modulations of the fluorescence signal. Binning of the data is also required, and this limits the temporal resolution of the measurement to be 1 or 2 orders of magnitude lower than the photon count rate to overcome the effects of shot noise. To mitigate the effects of shot noise, some investigators have applied filters to the data prior to applying a threshold. 11 This can substantially improve the time resolution of the experiment by mitigating some of the effects of shot noise, but there is still the difficulty associated with choosing a threshold.
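The dwell-time approach can be illustrated with simulated data. This is a sketch under assumptions, not the procedure of any cited study: the dwell times are drawn directly from an exponential distribution, so no thresholding or shot noise is involved, and the rate, sample size, bin count, and minimum-count cutoff are invented. The histogram is fit by unweighted least squares on the logarithm of the counts, which is one simple form of exponential fitting. The maximum-likelihood estimate (the reciprocal of the mean dwell time) is shown for comparison.

```python
import numpy as np

rng = np.random.default_rng(1)
true_rate = 50.0                      # s^-1, assumed rate of leaving the state
dwell = rng.exponential(1.0 / true_rate, size=5000)  # simulated dwell times (s)

# Histogram of dwell times; keep only well-populated bins for the fit.
counts, edges = np.histogram(dwell, bins=30)
centers = 0.5 * (edges[:-1] + edges[1:])
mask = counts >= 20

# For a single-exponential decay, ln(counts) falls linearly with slope -k.
slope, _ = np.polyfit(centers[mask], np.log(counts[mask]), 1)
rate_hist = -slope

# Maximum-likelihood estimate for exponential dwell times, with standard error.
rate_mle = 1.0 / dwell.mean()
rate_se = rate_mle / np.sqrt(dwell.size)

print(f"histogram fit: k = {rate_hist:.1f} s^-1")
print(f"MLE:           k = {rate_mle:.1f} +/- {rate_se:.1f} s^-1")
```

The histogram estimate depends on the binning and the cutoff, whereas the likelihood estimate uses every dwell time without binning.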

Distribution narrowing has been used to estimate rates in cases where clear assignment of states is not possible. If the data are acquired at sufficiently high temporal resolution, they can be “rebinned” at a lower resolution, effectively averaging over some of the conformational fluctuations by causing exchange between different portions of a distribution. The bin-width dependence of a distribution can allow the time of interchange to be estimated by analogy with motional narrowing of spectral features in wavelength-resolved bulk spectroscopic measurements. This rebinning technique has been demonstrated for the conformational fluctuations in polypeptides and proteins 9 , 12 , 14 and appears to be useful for making estimates of interchange times when clear contrast between interchanging states does not exist and adequate trajectories are not available to determine correlation functions.

Correlation analysis is also commonly used and can provide a great deal of information regarding the time scales of fluctuations in the system. Correlation functions can be formally defined in terms of integrals over time with infinite limits. Conceptually, this corresponds to replacing the bulk ensemble average with a single-molecule time average, but such an approach can lead to difficulties for time scales that are not at least an order of magnitude faster than the average total observation time of a single molecule. A single molecule does not typically sample enough of its fluctuation spectrum during a single measurement to allow robust correlation analysis. In practice, a large number of trajectories must be averaged to obtain adequate mathematical accuracy, 11 , 19 , 30 , 31 particularly for the higher-order correlation functions that are sensitive to memory effects and temporal heterogeneity. 18 , 19 , 25 , 29 This prevents the examination of differences between single molecules. Finally, it can be difficult to determine the degree to which a model successfully describes the data using correlation functions.

One type of single-molecule experiment involves the observation of fluorescence fluctuations from an individual member of an equilibrium ensemble. Confocal microscopy coupled with high-sensitivity detection for time-correlated single-photon counting can monitor changes in fluorescence polarization, spectrum, lifetime, and intensity that arise from fluctuations in the system. 32 Single-molecule fluorescence measurements have some important fundamental limitations that restrict the rate, amount, and quality of information obtainable from the system. Because individual single fluorophores can emit only one photon at a time, they exhibit fluorescence anti-bunching at time scales very short compared to the fluorescence lifetime, thus limiting the maximum average observable count rate. Organic dyes are typically used as labels and always have a finite cross section for photobleaching. This limits the total number of photons that can be observed on average from a single molecule. Furthermore, in solution, there will be contributions from spontaneous Raman scattering of the solvent. Even though Raman scattering is weak, the high concentration of the solvent relative to a single molecule makes it a significant source of background in single-molecule fluorescence measurements. Background photons are uncorrelated with the state of the system and therefore degrade the average information content of the photon stream.

Converting the stream of detected photons, or photon arrival trajectory, into knowledge regarding the unobservable and dynamically changing state of the molecule is the goal of single-molecule data analysis and the topic of this paper. In this paper, we present a novel application of a statistical analysis method for extracting information about dynamic processes from single-molecule photon arrival trajectories. We specifically address the problem of extracting the rate of conversion between states and the number of states involved. We include treatment of the statistical uncertainties present in this type of single-molecule measurement and analysis to allow the determination of the significance of any differences observed between molecules. Our method allows us to demonstrate the fundamental limits of precision for determining this dynamic information by applying it to simulated data and to determine the degree to which experimental limitations due to background and detector crosstalk further limit the determination of dynamic information. Quantification of the fundamental precision limits of parameters derived from single-molecule trajectories has important ramifications for experimental design and interpretation.

We have in mind a single-molecule fluorescence measurement in which the molecular dynamics of interest will result in the signal switching from one detection channel to the other. We call this the “two-color problem”. Many single-molecule phenomena can be interpreted within this context, including spectral diffusion, fluorescence anisotropy, and FRET colocalization. Single-molecule measurements are limited in precision because of the finite number of kinetic transitions in the observation period. We show how to calculate this “kinetic shot noise” limit. The arrival of photons is stochastic and occurs at a finite rate. We quantify the degree to which these characteristics limit the fastest time scales that can be accurately measured. The knowledge of such limits is critical in experimental design, as it allows for the estimation of the lowest possible intensity that will still permit measurement of the fastest time scale of interest, which is important in minimizing the effect of photobleaching. We show how background and crosstalk between detector channels degrades the accuracy of the rates calculated. We show that our data analysis methods give substantial improvements in the time scales that can be measured for single-molecule photon arrival trajectories.

The utility of maximum-likelihood methods for analysis of single-molecule experiments has been previously noted. 33 – 36 For example, single-molecule energy transfer distribution measurements often report energy transfer yields that are negative or greater than 1. This unphysical result has been attributed to the broadening of the distribution due to finite statistical sampling (i.e., shot noise). This is a result of directly calculating molecular properties as one would from bulk measurements. A likelihood-based approach would not give such unphysical results, because the likelihood function includes information regarding the physical process generating the signal. As a result, the most likely parameters that give rise to the observed signal can be determined and will not include artifacts from shot-noise broadening.

The methods we describe operate directly on the photon arrival trajectory of a single molecule by evaluating a likelihood function without the need for averaging over many molecules such as is required for correlation functions. The likelihood function is defined by the solutions of the master equation for the kinetic process of interest and incorporates [by means of a hidden Markov model (HMM)] the corruption of the molecular state information due to background photons and spectral crosstalk. The HMM formalism also allows us to directly model the effects of temporal heterogeneity, which can be considered to be the corruption of the molecular state information by the spectroscopic degeneracy of multiple molecular states. We demonstrate that this likelihood-based approach yields unbiased estimates of the molecular interconversion rate from the raw data stream with little or no user “tweaking” of the algorithm and that the uncertainty in the estimate of that rate remains low even when the interconversion rate reaches or exceeds the photon detection rate. We also show that the HMM approach is remarkably robust with respect to degradation of the signal by background and crosstalk photons. These results not only confirm the utility of the methodology, but also are useful in experimental design. We demonstrate how HMM-based methods, together with statistical model selection, can be used as an alternative to higher-order correlation function analysis for the detection of intermittency or temporal heterogeneity with a simple example involving a kinetic model previously studied by Schenter et al. 18 in the context of non-Markovian fluctuations of enzymatic reaction rates.

## SABIO-RK database

SABIO-RK is a manually curated database for enzymatic reaction kinetics. Data are either extracted from scientific articles [32] or directly submitted by wet-laboratory experimentalists [30] . During a typical workflow (Fig. 1), published data are manually inserted by the students or biological experts who first read the publications using a web-based input interface. Subsequently, the same input interface is used by database curators who read the paper a second time to validate the data and to adjust them to SABIO-RK data standards. This double check is needed to avoid errors and inconsistencies. Finally, the data are transferred to the public online database.

In the SABIO-RK database, biochemical reactions are defined by their reaction participants (substrates, products), modifiers (inhibitors, activators, cofactors), catalyst details (e.g. EC enzyme classification, UniProtKB accession numbers, protein complex composition of the active enzyme, isozymes, wild-type/mutant information, molecular weight) and their biological source (organism, tissue/cell type, cell location). This is not restricted to any organism classes. SABIO-RK data can be simply accessed through web-based user interfaces and web services. Various search criteria are selectable to search for biochemical reactions and their kinetics. Beside a free text search, complex and detailed queries can be executed in the advanced search. This may include the combination of several search criteria [e.g. reaction participants (substrates, products, inhibitors, activators etc.), pathways, enzymes, organisms, tissues or cellular locations, kinetic parameters, environmental conditions or literature sources]. When entering the search terms, the number of kinetic data entries is displayed that is available in the database matching the search criteria. Further sorting and grouping features are implemented in three views with a different focus, which also offer alternatives for further modification of the query. The search criteria also comprise SABIO-RK internal identifiers and identifiers from external databases (e.g. UniProtKB [3] , KEGG [33] , ChEBI [34] ) based on supplementary added annotations. Selected complete database entries or grouped datasets can be exported in different file formats: SBML, BioPAX/SBPAX and a simple table format. With the exception of the latter, annotations to external databases and ontologies are always included [7] .

SABIO-RK stores all of the kinetic information for one specific reaction under specific experimental conditions from a defined biological source in one dataset called the database entry. This information can be viewed and exported as a single dataset. As shown in Fig. 2, the general information about the organism and the tissue is described, as well as the enzyme and reaction participants (yellow), followed by kinetic information including rate laws and formulas and the corresponding parameters (red). Then, the experimental conditions pH, temperature and buffer are represented (green) and, finally, the original source of the data is cited (blue).

One of the main goals of the SABIO-RK database is to facilitate and support the process of computational modelling. Accordingly, SABIO-RK is integrated in systems biology applications [35] and a number of modelling platforms, including CellDesigner [36], Virtual Cell [37] and SYCAMORE [38], which make use of either SABIO-RK's web services or the web interface.

The SABIO-RK database is mainly populated with data manually extracted from the literature, which requires biological expert knowledge for an understanding of the publication, the extraction and standardization of relevant information, and the guarantee of high-quality data in the database. Because publications often lack controlled vocabularies and annotations to standard identifiers, manual data extraction involves extra work for the biological expert to interpret and assign the information. To reduce errors and inconsistencies during data insertion, database internal selection lists with controlled vocabularies are used and constraints are included to check and structure the data. For example, a consistency check is implemented within the input interface to compare the parameters given in the rate equation with the list of available parameters. If not all of the parameters are given in the paper, for consistency reasons, ‘dummy’ parameters are created with values of ‘null’ to offer complete datasets for modellers during data export, especially in SBML format. Therefore, parameters are sometimes defined in the database, although no values were provided in the original literature. Only 78.8% of the database entries in SABIO-RK containing a rate equation include a substrate affinity constant (*K*_{m} or *S_*half) together with a reaction velocity constant (*V*_{max} or *k*_{cat}). Therefore, for more than 20% of the entries, either a substrate affinity constant or a velocity constant, or both, are missing, with the last case representing, for example, rate equations for inhibitions where only an inhibition constant is given without further substrate or reaction-related parameters.

The manual data extraction and curation process also includes the annotation of data to ontologies, controlled vocabularies and external databases. SABIO-RK uses the following biological ontologies and controlled vocabularies for the various attributes: ChEBI [34], SBO [20], BTO (BRENDA Tissue Ontology) [39], NCBI (National Center for Biotechnology Information) organism taxonomy [40] and Gene Ontology [41]. These annotations make possible the correct interpretation, exchange, comparison and cross-referencing of data. Conversely, external databases such as KEGG [33], UniProtKB [3] or ChEBI use these annotations to cross-reference to SABIO-RK database entries [7].

As of May 2013, the SABIO-RK database stores kinetic parameters for 5737 different biochemical reactions in approximately 44 000 database entries. On average, ten database entries are extracted from one publication because any possible variation of the experimental conditions or tissues or organisms results in the creation of a new entry in the database. Based on the information available in publications, rate equations are available for approximately 52% of all entries in the database. The majority of these database entries are defined as being of Michaelis–Menten kinetic law type and represent 42% of all SABIO-RK entries (Fig. 3). Michaelis–Menten kinetics is provided for 3659 different biochemical reactions. Because only one-third of these reactions are single-substrate reactions (water is ignored as a substrate for that calculation), two-thirds of the biochemical reactions with Michaelis–Menten kinetics in SABIO-RK are multiple-substrate reactions and therefore do not represent real Michaelis–Menten laws *in vivo*. For multiple-substrate reactions, there are different types of kinetic mechanisms (e.g. ordered or random sequential ternary-complex, Theorell-Chance or Ping-Pong mechanisms for Bi-Bi reactions). These kinetic law types are usually only tagged by the type of reaction, although the corresponding rate equation is not provided by the authors. This missing information cannot be supplied by SABIO-RK database curators without making further assumptions, and therefore it cannot be considered by modellers for their computational model set-up [42]. If needed, modellers are able to handle this by using convenience kinetics [1], in contrast to enzymologists who typically need to know the correct and detailed enzyme kinetic mechanism.

To determine kinetic parameter values for multiple-substrate reactions, Michaelis–Menten kinetics can be applied if one substrate is varied and the other substrate(s) are kept constant at a saturating level. Kinetic parameters such as *K*_{m} values are therefore measured under pseudo-single-substrate conditions: one substrate is varied in concentration to determine its *K*_{m} value, whereas all other substrates are kept constant at saturating concentrations. Many publications describe the use of Michaelis–Menten kinetics for this single substrate without explaining the detailed kinetics of the whole reaction. Under these conditions, the reaction can be regarded as a single-substrate reaction and the parameter values should be given as ‘apparent’ values. In the literature, the naming of such parameter values as ‘apparent’ is not consistent. These *in vitro* analyses of multiple-substrate reactions cannot be extrapolated to *in vivo* conditions within living cells, where the concentrations of the reaction participants differ from the *in vitro* saturating conditions [15]. Because of numerous problems with measuring *in vivo* data, it should be realized that all current data in SABIO-RK are *in vitro* data, and therefore any models built from these data should be regarded critically and extrapolated with caution to the situation in living cells. ‘Models are not descriptions of reality they are descriptions of our assumptions about reality’ [10].
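
To illustrate why such values are only 'apparent', the following minimal sketch assumes a generic sequential Bi-Bi rate law in its common textbook form (it is not taken from SABIO-RK). The function names and all numerical values are hypothetical. With the second substrate B held fixed, the two-substrate law collapses to a Michaelis–Menten expression in A whose constants depend on the chosen concentration of B.

```python
def bi_bi_rate(a, b, vmax, k_a, k_b, k_ia):
    """Sequential Bi-Bi rate law (assumed textbook form, for illustration only)."""
    return vmax * a * b / (k_ia * k_b + k_b * a + k_a * b + a * b)


def apparent_parameters(b, vmax, k_a, k_b, k_ia):
    """Apparent Vmax and Km for substrate A when B is fixed at concentration b."""
    vmax_app = vmax * b / (k_b + b)
    km_app = (k_ia * k_b + k_a * b) / (k_b + b)
    return vmax_app, km_app


def michaelis_menten(s, vmax, km):
    """Single-substrate Michaelis-Menten rate."""
    return vmax * s / (km + s)


# Hypothetical constants (arbitrary units).
VMAX, K_A, K_B, K_IA = 10.0, 1.0, 2.0, 3.0

for b in (0.5, 2.0, 1.0e6):  # sub-saturating, intermediate, saturating B
    vmax_app, km_app = apparent_parameters(b, VMAX, K_A, K_B, K_IA)
    print(f"[B] = {b:g}: apparent Vmax = {vmax_app:.4g}, apparent Km = {km_app:.4g}")
```

Under this assumed rate law the apparent constants approach `VMAX` and `K_A` only when B is saturating; at lower B they can differ substantially, which is one reason values measured *in vitro* at saturation do not transfer directly to cellular concentrations.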

## Methods

The problem we address here is to infer the regulatory structure of a metabolic system, given a known structure for the reaction network (stoichiometry) and experimental time series for the dynamic behavior of that system. To address this question, and to explore the practical problems associated, we consider the following general representation of a biochemical network:
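
The displayed equation did not survive in this copy. A form consistent with the definitions that follow, offered as a reconstruction rather than the authors' exact notation (the reaction count *p* is a label assumed here), is:

```latex
\frac{dX_i}{dt} = \sum_{r=1}^{p} \mu_{i,r}\, v_r , \qquad i = 1, \dots, n \tag{1}
```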

where *X*_{i} denotes the concentration of metabolite *i*, *μ*_{i,r} is the stoichiometric coefficient of metabolite *i* in process *r*, which indicates the number of molecules of type *i* produced or destroyed by process *r*, and *v*_{r} is the rate function of this process. In general, *v*_{r} is represented as:
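
This equation is also missing here. Judging from the surrounding text, it presumably states only that each rate depends on the metabolite concentrations and on a set of parameters *θ*; the argument list below is an assumption:

```latex
v_r = v_r\left(X_1, \dots, X_{n+m};\, \theta\right) \tag{2}
```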

There are two critical issues in defining this model. One is the selection of an appropriate mathematical representation for *v*_{r}, which may be a function of an arbitrary number of variables (substrates, products, and modifiers). In most cases the mechanism of each process is unknown, and choosing a specific mechanistic rate law, such as a Michaelis-Menten rate law, becomes an act of faith. The other issue is the problem of identifying the regulatory structure of the system.

The most straightforward and theoretically well-supported solution to both issues is the use of an approximate formalism based on a standard mathematical representation [10]. By adopting such a kinetic representation, identifying the regulatory structure of the system becomes synonymous with determining the set of values *θ* for the model parameters that best fit the available data. Hence, without loss of generality, and as a first step towards a more complex framework, we will consider the case where the rates are modeled using a power-law formalism. Note, however, that our approach could easily be extended to accommodate any other structured kinetic formalism.

### Power-law models

Using the power-law representation, the rate *v*_{r} is expressed as follows:
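
The equation is absent from this copy; the standard power-law form that matches the definitions below is:

```latex
v_r = \gamma_r \prod_{j=1}^{n+m} X_j^{f_{r,j}} \tag{3}
```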

where *γ*_{r} is an apparent rate constant for reaction *r*, and *f*_{r,j} is the kinetic order of metabolite *j* in that process. Note that this equation accounts for the effect of *n + m* metabolites (*n* dependent and *m* independent) on each reaction.

The advantage of this representation is that the same functional form represents all the rates. The reaction structure of the system will constrain the range of admissible values for some of the parameters. For example, all *γ* and *f* parameters for the substrates and catalysts of the reactions are by definition larger than zero. In addition, the values of the *f* parameters for all metabolites that are not directly involved in a given process are zero in the rate that describes the process.

By adopting such a kinetic representation, we can pose the problem of identifying the regulatory signals in a very compact mathematical form. If *X*_{j} is a modifier of *v*_{r}, then the corresponding kinetic order *f*_{r,j} will be different from zero (positive if it is an activator, and negative if it is an inhibitor). By substituting (3) into equation (1), we get what is known as a Generalized Mass-Action (GMA) model.

Note that the power-law formalism accounts for both the stoichiometry of the system (*the network structure*), and the reaction and regulatory structures (*kinetic orders*) using a single systematic nonlinear representation. This property is very important for defining a systematic way of exploring alternative regulatory signals. We will make use of this general and compact formalism in the derivation of the equations for the parameter estimation model.
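
A GMA right-hand side can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: the three-reaction toy network, its parameter values, and the function names are all hypothetical, and independent metabolites are omitted for brevity. It shows how one functional form serves every rate, with a negative kinetic order encoding inhibition and a zero encoding no influence.

```python
from math import prod


def power_law_rate(gamma, orders, x):
    """Power-law rate: gamma * product over j of x[j] ** orders[j]."""
    return gamma * prod(xj ** f for xj, f in zip(x, orders))


def gma_rhs(mu, gammas, orders, x):
    """Time derivatives dX_i/dt = sum over r of mu[i][r] * v_r."""
    rates = [power_law_rate(g, f, x) for g, f in zip(gammas, orders)]
    return [sum(m * v for m, v in zip(row, rates)) for row in mu]


# Hypothetical toy pathway: -> X1 -> X2 ->, with X2 inhibiting the first step.
mu = [
    [1, -1, 0],   # X1 is produced by r1 and consumed by r2
    [0, 1, -1],   # X2 is produced by r2 and consumed by r3
]
gammas = [1.0, 2.0, 1.0]
orders = [
    [0.0, -0.5],  # r1: no substrate among X1, X2; inhibited by X2 (negative order)
    [0.5, 0.0],   # r2: depends on X1 only
    [0.0, 1.0],   # r3: depends on X2 only
]
x = [1.0, 4.0]

print(gma_rhs(mu, gammas, orders, x))
```

Changing a single entry of `orders` from zero to a nonzero value adds or removes a regulatory interaction without touching the stoichiometry in `mu`, which is the property the estimation method exploits.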

### Parameter estimation in a GMA model

Given a set of experimental observations (i.e., time courses for the metabolites), our goal is to find the values of the apparent constants and kinetic orders that minimize the sum of least squared errors between the experimental data and the predicted dynamic profiles. This problem can be expressed in compact form as follows:
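
The problem statement itself is missing from this copy. One plausible reading, assembled from the symbols defined below and from equations (1) and (3), is given here; whether the original objective carries a normalizing factor (for instance involving *k* or *n*) cannot be recovered, so none is shown:

```latex
\begin{aligned}
\min_{\gamma_r,\, f_{r,j}} \quad & \sum_{i} \sum_{u \in U} \left( X_{i,u}^{exp} - X_{i,u}^{mod} \right)^2 \\
\text{s.t.} \quad & \frac{dX_i}{dt} = \sum_{r} \mu_{i,r}\, \gamma_r \prod_{j=1}^{n+m} X_j^{f_{r,j}}, \qquad X_i(t_0) = X_{0i}, \\
& X_{i,u}^{mod} = X_i(t_u), \qquad u \in U
\end{aligned}
```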

where *X*_{i} represents the state variables (i.e., metabolite concentrations), *X*_{0i} their initial conditions, *X*_{i,u}^{exp} denotes the experimental observations, and *X*_{i,u}^{mod} are the values calculated by the dynamic model (i.e., model predictions). The index *i* runs over the set of state variables whose derivatives explicitly appear in the model, *γ*_{r} and *f*_{r,j} are the parameters to be estimated, and *t*_{u} is the time associated with experimental point *u* belonging to the set *U* of observations. *k* is the total number of experimental data points and *n* is the number of time-dependent variables.

Conventional parameter estimation approaches seek parameter values that minimize the approximation error assuming a given regulatory scheme (i.e., fixing some *f*_{r,j} to zero beforehand according to a priori biochemical knowledge of the system). While this assumption simplifies the calculations, it can lead to poor approximations and, at the same time, hamper the discovery of new regulatory loops. In this work we introduce a rigorous and systematic parameter estimation and network identification method that makes no assumption regarding the regulatory network topology.

To model the existence of a regulatory interaction, we make use of the following disjunction:
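
The disjunction is not reproduced in this copy. Based on the description that follows, it can be written in a form such as the one below, where the layout is an assumption:

```latex
\begin{bmatrix} Y_{r,j}^{-} \\ f_{r,j} \le -\epsilon \end{bmatrix}
\;\underline{\vee}\;
\begin{bmatrix} Y_{r,j} \\ f_{r,j} = 0 \end{bmatrix}
\;\underline{\vee}\;
\begin{bmatrix} Y_{r,j}^{+} \\ f_{r,j} \ge \epsilon \end{bmatrix}
```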

in which *Y*_{r,j}^{-}, *Y*_{r,j} and *Y*_{r,j}^{+} are Boolean variables that are true if parameter *f*_{r,j} is negative, zero or positive, respectively, and false otherwise, and ϵ is a very small positive parameter. Note that only one term of the disjunction can be active (i.e., it is an exclusive disjunction), while the others must be false. Hence, if *Y*_{r,j} is true, metabolite *j* takes no part in velocity *r*. Conversely, if this metabolite has an influence on *r*, then *Y*_{r,j} is false and either *Y*_{r,j}^{-} or *Y*_{r,j}^{+} will be active. This disjunction can be translated into standard algebraic equations using either the big-M or convex-hull reformulations [29]. By applying the former, we get:
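
The resulting constraints are missing here. A generic big-M reformulation of a three-term exclusive disjunction of this kind, given as a sketch and not necessarily the authors' exact constraints, reads:

```latex
\begin{aligned}
f_{r,j} &\le -\epsilon + M \left( 1 - y_{r,j}^{-} \right) \\
-M \left( 1 - y_{r,j} \right) &\le f_{r,j} \le M \left( 1 - y_{r,j} \right) \\
f_{r,j} &\ge \epsilon - M \left( 1 - y_{r,j}^{+} \right) \\
y_{r,j}^{-} + y_{r,j} + y_{r,j}^{+} &= 1, \qquad y_{r,j}^{-},\, y_{r,j},\, y_{r,j}^{+} \in \{0, 1\}
\end{aligned}
```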

where Boolean variables *Y* have been replaced by auxiliary binary variables *y*. In these equations, M is a sufficiently large parameter whose value must be carefully set according to the bounds defined for the kinetic parameters.

A key issue in our approach is how to avoid overfitting. To this end, we make use of the Akaike criterion, which captures the trade-off between the number of kinetic parameters contained in the model and its ability to accurately reproduce the experimental data. If we assume that the error of the observations follows a normal distribution, the Akaike criterion takes the following mathematical form [17]:
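
The formula did not survive extraction. For least-squares fitting with normally distributed errors, the Akaike criterion is commonly written as follows, where *p* (the number of estimated parameters) and *N* (the number of fitted observations) are labels assumed here because the original symbols are not recoverable:

```latex
AIC = 2p + N \ln\!\left( \frac{1}{N} \sum_{i} \sum_{u \in U} \left( X_{i,u}^{exp} - X_{i,u}^{mod} \right)^2 \right) + C
```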

where *AIC* denotes the value of the Akaike criterion and *C* is a constant value that does not affect the optimization. The parameter estimation problem can finally be posed in mathematical terms using the following MIDO (mixed-integer dynamic optimization) formulation:

Several methods exist for solving this MIDO problem (see [25]). Without loss of generality, we propose here to reformulate it into an equivalent algebraic MINLP (mixed-integer nonlinear program) using orthogonal collocation on finite elements. This makes it possible to exploit the rich optimization theory and software available for MINLP when solving the MIDO. Note that the reformulated MINLP might be nonconvex. This gives rise to multimodality (i.e., the existence of multiple local optima), which prevents standard gradient-based solvers from guaranteeing the global optimum. Deterministic global optimization methods could be applied to solve the MINLP, but they might lead to large CPU times given the size and complexity of a standard dynamic problem of this type. Details on the application of deterministic global optimization methods to parameter estimation problems of small/medium size can be found elsewhere [30, 31]. For the reasons given above, in this work we will solve the reformulated MINLP using local optimizers.

One important feature of our approach is that, rather than calculating a single optimal solution, it identifies a set of plausible regulatory topologies by solving the model iteratively. That is, the model is first solved to identify a potential regulatory configuration represented by a binary solution (i.e., a set of values of the binary variables). The model is then solved again, this time adding the following integer cut, which excludes the solutions identified in previous iterations from the search space:
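
The cut is missing from this copy. The standard integer cut for binary variables, written with a generic index *b* over all binary variables (an assumed notation), is:

```latex
\sum_{b \in ONE_{it}} y_b \;-\; \sum_{b \in ZERO_{it}} y_b \;\le\; \left| ONE_{it} \right| - 1
```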

where *ONE*_{it} and *ZERO*_{it} represent the sets of binary variables that take a value of one and zero, respectively, in iteration *it* of the algorithm. After adding the integer cut, the model is solved again to produce a new regulatory topology, and this procedure is repeated iteratively until a desired number of configurations is generated. Hence, the algorithm produces as output a set of potential network configurations (encoded in the values of the binary solutions) rather than a single topology. Note that these regulatory topologies are obtained in order of worsening Akaike criterion values.

KL contributed to the conceptualization of the project, carried out the MS and kinetic analyses and wrote the paper. GM performed all the structural biology experiments, analysed the data and wrote a major part of the paper. RGD contributed to critical discussion of the results and to writing the paper. SLS contributed to the purification of the ACE constructs and the MS analysis. EDS conceived the project, analysed the data and wrote the paper. KRA conceived the structural biology part of the experiments, analysed the data and edited the paper. All authors reviewed the results and approved the final version of the manuscript.

We thank the scientists at station IO3, Diamond Light Source, Didcot, Oxon (UK), for their support during X‐ray diffraction data collection. KRA and EDS also thank the University of Cape Town (South Africa) and University of Bath (UK) respectively for Visiting Professorships. This work was supported by the Medical Research Council (UK) Project Grant G1001685 (to KRA) and the National Research Foundation (South Africa) CPRR grant 13082029517 (to EDS).
