From ProcessDB
Revision as of 08:39, 11 November 2015 by Jzwolak (Talk | contribs) (Performance)

The ProcessDB simulator is custom designed for the types of models typically seen in biological modeling. This design includes features that biological models require, such as discrete events, and speeds up simulation by as much as sixteen times. Our simulator is composed of three parts:

  • CVODE - a public domain, robust, and fast stiff ODE solver;
  • interpreter - our custom code for evaluating mathematical expressions, conditionals, and functions;
  • simulator core - our custom code that integrates CVODE and the interpreter, manages control flow, processes biological events, and outputs data for plotting or other use.

As an end user, you mostly do not need to know about this simulator or think about it. However, there are some reasons you may want to be familiar with how it works. In particular, if the output is not what you expect, an understanding of what the simulator is doing may help you see why the output is the way it is or, in some cases, change the input to the simulator so that the output is what you expect. Note that in the latter case this is not like optimizing model parameters and does not indicate a problem in the simulator. Rather, some inputs to the simulator, such as the tolerances of the system, are critical to performing a correct numerical integration of a system of ordinary differential equations. Tolerances are domain specific (i.e., specific to your model), and automatic selection of them is not always possible.

Choosing Tolerances

There are two types of tolerances: absolute and relative. ProcessDB lets you set both of them. The absolute tolerance is how close the magnitude of a number must be to zero before it is considered equivalent to zero. The relative tolerance is how close two numbers have to be to each other relative to their magnitude before they are considered equivalent.
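To make the two tolerances concrete, the following sketch shows a CVODE-style local error test: each error estimate is weighted by 1/(rtol·|y| + atol), and a step is accepted when the weighted root-mean-square norm is at most 1. This is an illustrative stand-alone function, not ProcessDB code; the example values are made up.

```python
import math

def wrms_norm(errors, values, rtol, atol):
    """Weighted root-mean-square norm of local error estimates,
    using the CVODE-style weight 1 / (rtol*|y_i| + atol)."""
    total = 0.0
    for e, y in zip(errors, values):
        weight = 1.0 / (rtol * abs(y) + atol)
        total += (e * weight) ** 2
    return math.sqrt(total / len(errors))

# Hypothetical state and error estimates for a two-variable model.
values = [1.0, 250.0]
small_err = [1e-5, 1e-3]   # well within tolerance
large_err = [1e-2, 5.0]    # too large; the solver would retry the step
rtol, atol = 1e-4, 1e-8

print(wrms_norm(small_err, values, rtol, atol) <= 1.0)  # step accepted
print(wrms_norm(large_err, values, rtol, atol) <= 1.0)  # step rejected
```

Note how atol dominates the weight when a variable is near zero, while rtol dominates when it is large; this is why both tolerances matter.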

In general, smaller values make the simulations more accurate and take more time to complete. Values that are too large or too small will cause the simulator to fail entirely. Finally, if your simulations do not look correct and you suspect a "problem" in the simulator, try experimenting with different tolerance values. Usually, moving the tolerances a couple of orders of magnitude up or down will not make an appreciable difference in the simulator output. If it does, then definitely choose the values that make the output look correct.

The values of the tolerances cannot be used to fit experimental data; they do not have that kind of control over the simulation output. If bad tolerances are chosen, you may see symptoms including the following:

  • large negative numbers for variables that have no physical possibility of being negative,
  • variables blow up to infinity when the system constrains them to some maximum (e.g., dx/dt=1-x with an initial value of x=0),
  • jagged lines (i.e., variables bouncing up and down creating sharp peaks and valleys when they should be steady or smooth),
  • small jagged lines in what you know should be a steady state value, or
  • non-horizontal straight lines on a linear plot when you expect a curve of any sort.

Notice how these symptoms represent a deviation from anything physically meaningful or valid. That is the underlying clue that bad tolerances are being used (though, it is also possible the model has an error in the math, so you might check that, too).
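The dx/dt = 1 - x example above can be demonstrated numerically. ProcessDB uses CVODE, which chooses its step sizes adaptively from your tolerances; the sketch below instead uses fixed-step forward Euler (not what ProcessDB does) purely to show how oversized steps, the kind loose tolerances permit, produce exactly the blow-up and jagged-oscillation symptoms listed above.

```python
def euler(step, n_steps):
    """Integrate dx/dt = 1 - x from x(0) = 0 with fixed-step forward Euler."""
    x = 0.0
    for _ in range(n_steps):
        x += step * (1.0 - x)
    return x

# Small steps (analogous to tight tolerances): x settles at its maximum, 1.
print(euler(0.1, 100))   # close to 1.0

# Oversized steps (analogous to loose tolerances): x oscillates with growing
# amplitude, even though the true solution can never exceed 1.
print(euler(2.5, 30))    # enormous magnitude, nothing like the true solution
```

The true solution, x(t) = 1 - e^(-t), rises smoothly to 1; only the numerics, not the model, produce the divergence in the second run.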

Choosing the Absolute Tolerance

When choosing the absolute tolerance, pick a value small enough that meaningful values of your variables lie above it, while values below it fall within the biological noise of zero for the system.

For example, if you're modeling a system where numbers of molecules are used as the values of your variables, then any number below one is not meaningful physically. You may not know the noise of the model, but you can say that if the noise is greater than 0.5 then you would not be able to distinguish between neighboring values in some cases (e.g., with noise of 0.7, a value of 1.7 could be 1 or 2). If you know the actual noise of your system, use that number; if not, a value of 0.5 for the absolute tolerance should be adequate for simulating your model. It is worth noting that with high noise, a value of 10, 100, 1000, or more molecules could also be equivalent to zero, and your absolute tolerance could be 10, 100, or 1000, respectively, in such cases. It is also fine to choose something like 1.0e-4 for your absolute tolerance, but this may make the simulator work hard and generate an error.

If your model is using concentrations to represent values of variables then it is likely your absolute tolerance should be much smaller than the previous example. Values of 1.0e-6 to 1.0e-12 have worked in a large number of cases. You may try something in the middle, like 1.0e-8, and see how things simulate. Again, if you know the noise of your system then use that.

CVODE supports setting the absolute tolerance independently for each variable. ProcessDB does not support this feature at this time, so the absolute tolerance is the same for all variables.

Choosing the Relative Tolerance

The relative tolerance defines two values as equivalent if their difference, compared to their magnitude, is smaller than the value of the relative tolerance. With a relative tolerance of 1.0e-4, the values 1.0001 and 1.0000 would be considered equivalent, whereas 1.0002 and 1.0000 would not.
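This equivalence test is short enough to write out directly. The helper below is illustrative (not part of ProcessDB); it reproduces the two comparisons just described.

```python
def relatively_equal(a, b, rtol):
    """True when a and b differ by no more than rtol relative to their magnitude."""
    return abs(a - b) <= rtol * max(abs(a), abs(b))

print(relatively_equal(1.0001, 1.0000, 1e-4))  # True: difference is within rtol
print(relatively_equal(1.0002, 1.0000, 1e-4))  # False: difference is twice rtol
```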

Choosing the relative tolerance mostly has to do with how accurate you want your simulations to be. The more accurate the simulation, the more time it will take to complete. Choose a relative tolerance much smaller than the error in experimental data to ensure you can reproduce the data within the experimental error.

Unlike the absolute tolerance, the relative tolerance doesn't change depending on whether you are using concentrations or molecule counts for values of your variables. Ranges for the relative tolerance that have worked for a large number of models are 1.0e-6 to 1.0e-12. In many cases 1.0e-4 also works well. If you are uncertain then try starting with 1.0e-8 and adjust from there.

Negative Numbers

As stated in the CVODE FAQ, negative values may appear in a simulation for variables that cannot be negative in the physical world. This is normal as long as the magnitude of those values is comparable to or less than the absolute tolerance.

The numerical integrator (CVODE) cannot, due to the practical limits of computing with finite-precision numbers and analytic algorithms, calculate the exact values of the variables in your model. This may be surprising or somewhat disconcerting, but be reassured that this problem has been researched for over half a century; numerical analysts understand it well and have minimized its impact on simulations. In fact, with stable numerical integrators (such as CVODE), the error in the simulation is guaranteed to be bounded by some tolerance, and you get to choose the tolerance.

When the simulator is doing its calculations, a number with magnitude less than the absolute tolerance is equivalent to zero. You can treat such numbers as zero, and it is fair to report them as zero in publications. Future versions of ProcessDB may set these numbers to zero just before output for display or plotting, while internally retaining the small errors that keep the algorithms stable.


Performance

The ProcessDB simulator is so fast that many simulations finish before your eyes move from the parameter you changed to the plot of the results. This enables dialable parameters that can be adjusted smoothly with the scroll of a mouse wheel or a two-finger gesture. Benchmarks of identical, numerically tough models run in both the ProcessDB simulator and Berkeley Madonna show the ProcessDB simulator to be sixteen times faster.

This performance is also crucial for optimization algorithms which may require millions of simulations.