Posted by Rick Kossik
As I noted in previous blog posts, I am currently working on an online CT Course. To provide some useful material for users in the meantime, I plan to post excerpts from time to time as I slowly progress through the development of the Course. The first excerpt simply provided the outline describing how the Course will be organized. The second excerpt discussed the key decisions required before starting to build a contaminant transport model. In this excerpt, from one of the introductory Units of the Course, I discuss the problem of uncertainty in contaminant transport models and then summarize the underlying philosophy of the GoldSim Contaminant Transport Module.
The Problem of Uncertainty in Contaminant Transport Models
For most real-world systems, at least some of the controlling parameters, processes and events are uncertain (i.e., poorly understood or quantified) and/or stochastic (i.e., inherently variable in time). It is for this reason that GoldSim was specifically designed as a powerful probabilistic simulator.
Although this “problem of uncertainty” applies to any kind of system you may want to simulate (and is unfortunately often ignored), due to the nature of the systems involved it is especially important when dealing with environmental systems, and particularly so when carrying out contaminant transport modeling.
Sources of Uncertainty in Contaminant Transport Models
If you give it a little thought, the reasons for the high uncertainty associated with contaminant transport modeling should be clear. Engineered systems (such as a factory or a machine) are, for the most part, well-defined and understood (e.g., you can typically measure the important variables and properties of these systems, may have many similar systems that you can look at to better understand performance, and can often build and test prototypes). They still have uncertainty in their performance (primarily due to exogenous environmental variables that may be important, or perhaps behavior over very long time periods that is difficult to prototype), but this uncertainty is typically not extremely large (which is why complex machines such as airplanes are quite safe and predictable).
For environmental systems, however, we often have a much poorer understanding of the system. A major reason for this is that such systems often cannot be easily characterized (i.e., the relevant parameters cannot be easily measured). For example, when trying to predict contaminant transport through groundwater, it is not practical or feasible to completely characterize the subsurface environment and determine the relevant properties (instead, you may have only a handful of data points). This is complicated by the fact that the properties themselves (e.g., hydraulic conductivity, chemical environment) are spatially variable. This spatial variability is important because some of the key parameters controlling mass transport for a contaminant, such as its solubility and its partition coefficients to various solids, may be extremely sensitive to local environmental conditions (e.g., pH and redox) that are difficult to characterize and potentially highly variable. Relatively small changes in local environmental conditions can result in order-of-magnitude changes in these parameters.
Some parameters used to describe contaminant transport are difficult to measure at all. An example of this is dispersivity. Dispersivity is unusual in that, unlike a property such as porosity, it is not really meaningful to say that it has some value at a particular point in space. This is because its value is typically considered to be scale-dependent (due to the phenomenon of macrodispersion). Hence, it can be thought of as a property of the entire system. This, of course, can make it difficult to quantify (doing so may require a large-scale field experiment that may not be feasible or practical).
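For reference, in the conventional advection-dispersion formulation (standard in the transport literature, not specific to GoldSim), the longitudinal dispersion coefficient is written as

D_L = α_L · v + D_m

where α_L is the longitudinal dispersivity, v is the seepage velocity, and D_m is the effective molecular diffusion coefficient. Because the apparent value of α_L inferred from tracer tests tends to grow with the scale of the test or plume, dispersivity behaves as a property of the system as a whole rather than of any point within it.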
In some cases, the processes themselves may be poorly understood and extremely difficult to quantify. For example, if your model included a pond with many chemical constituents, and during the simulation the pond went dry due to evaporation, concentrations in the pond would become very high (and spatially variable over small distances), and hence the various precipitation reactions taking place as the pond evaporates would be quite difficult to predict accurately. As a result, the behavior of such a system would be highly uncertain.
In many cases, extremely important variables required for predicting performance may be almost entirely unavailable and/or may need to be estimated using very poor information. For contaminant transport models, particularly for existing waste sites, the classic example of this is the source term: there may be very limited information available regarding what was disposed and when. This imposes a very large uncertainty on any predictive model.
In addition to these issues, it is often not practical or feasible to carry out experiments on, or evaluate alternative designs for, the environmental systems you are trying to model. In many cases, it is simply not possible to build and test alternative designs for a proposed system (such as a mine); the system is simply too large to build and test realistic prototypes. The long time frames involved for some systems also make this impossible. For example, when disposing of radioactive waste, highly engineered waste packages are often used. Laboratory tests can be carried out on these for months or perhaps years to evaluate their performance, but their design life is typically thousands of years, and it is very difficult to design experiments that extrapolate performance over such time frames. Models with long time frames present many other difficulties. For example, climatic factors (e.g., rainfall) typically play an important role in environmental models, but predicting climate thousands (or even tens or hundreds) of years into the future is difficult. And, of course, even for models of very short duration (months or years), the stochastic nature of weather adds uncertainty to many environmental models.
All of these factors result in very large uncertainties (in some cases, several orders of magnitude) in many of the parameters, processes and events associated with contaminant transport models.
Why is Dealing with Uncertainty So Important?
Unfortunately, the large uncertainties discussed above are often ignored. If uncertainties are acknowledged at all, the modeler often does so by selecting a single value for each parameter and labeling it a “best estimate” or perhaps a “worst case estimate”. These inputs are evaluated in the simulation model, which then outputs a single deterministic result that presumably represents a “best estimate” or “worst case” prediction. The modeler may also carry out some simple sensitivity analyses on some of the parameters.
Dealing with large uncertainties in this way when making predictions about the performance of an environmental system is, unfortunately, highly problematic.
Oftentimes, the potential behaviors you care about most (e.g., exceeding an environmental regulation) arise from an unknown combination of parameter values and assumptions, which the “best estimate” approach may not capture. Because of this, defending “best estimate” approaches is often very difficult. In a confrontational environment (e.g., demonstrating that a particular facility will meet certain regulations), “best estimate” analyses will typically evolve into “worst case” analyses. However, “worst case” analyses can be extremely misleading. A worst case analysis of a system is likely to be grossly conservative and therefore completely unrealistic (i.e., by definition, picking lots of unlikely “pessimistic” values will almost certainly have an extremely low probability of actually representing the future behavior of the system). And in a deterministic simulation it is not possible to quantify how conservative a “worst case” simulation actually is (i.e., to define its probability). Using a highly improbable estimate to guide policy-making (e.g., “is the design safe?”) is likely to result in very poor decisions.
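To see just how improbable a compounded “worst case” can be, consider a hypothetical back-of-the-envelope calculation (my own illustration, not from the Course material): suppose each of several independent parameters is set at a pessimistic value that has only a 5% chance of being exceeded.

```python
# Hypothetical illustration of stacked conservatisms: each parameter is
# independently set at a value with only a 5% chance of being exceeded.
p = 0.05  # per-parameter exceedance probability (assumed)

for n in (1, 3, 6, 10):
    print(f"{n:2d} parameter(s): joint probability {p**n:.1e}")
```

With just six such parameters, the joint probability that the real system is at least that unfavorable in every respect at once is about 1.6 × 10^-8. A deterministic “worst case” run gives no hint of this; a probabilistic simulation makes it explicit.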
Due to the large uncertainties in the parameters, processes and events associated with contaminant transport models, and the problems with deterministic approaches described above, GoldSim’s modeling philosophy revolves around the belief that explicitly representing these uncertainties (by carrying out probabilistic simulations) is critical. And as we will see below, realistically acknowledging and representing uncertainty has important implications for how you should build and structure your models.
GoldSim Modeling Philosophy
The large uncertainties associated with contaminant transport modeling discussed above strongly inform the modeling philosophy embodied in GoldSim. This philosophy revolves around the idea that the complexity and detail that you include in your model should be consistent with the amount of uncertainty in the system. In particular, when building a model and deciding whether to add detail to a certain process, you should not just ask if you can (e.g., can you use a more detailed equation or more discretization?); you should ask if you should (does the amount of uncertainty I have in this process justify a more detailed model?).
An easy way to illustrate this is to consider a trivial example. Imagine that we had a process described by the equation Y = A + B, where A and B have similar magnitudes. A is a parameter that has three orders of magnitude of uncertainty (that cannot be easily reduced). B, on the other hand, has only one order of magnitude of uncertainty, and by using a more detailed model, the uncertainty in B could be cut in half. Should we add more detail to the model of B? What we need to ask is how reducing our uncertainty in B will reduce our uncertainty in the “answer” (i.e., Y). Of course, in this trivial example, reducing the uncertainty in B has no significant impact on the uncertainty in Y (because it is dominated by the uncertainty in A).
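To see this numerically, here is a minimal Monte Carlo sketch of the example (the specific distributions are my own assumptions, chosen only to give A three orders of magnitude of uncertainty and B one):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# A: log-uniform over 3 orders of magnitude (10 to 10,000), assumed
A = 10 ** rng.uniform(1, 4, n)
# B: log-uniform over 1 order of magnitude (100 to 1,000), assumed
B = 10 ** rng.uniform(2, 3, n)
# B with its uncertainty "cut in half" (half an order of magnitude)
B_refined = 10 ** rng.uniform(2.25, 2.75, n)

for label, Y in (("original B", A + B), ("refined B ", A + B_refined)):
    lo, hi = np.percentile(Y, [5, 95])
    print(f"{label}: 90% interval for Y = [{lo:,.0f}, {hi:,.0f}]")
```

Running this shows that the 90% interval for Y barely moves when B is refined: essentially all of the uncertainty in Y comes from A.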
In the real world, of course, models are not so trivial. But the basic idea still applies: we should only add detail (to reduce uncertainty) to a process or parameter if by doing so we can reduce our uncertainty in the ultimate measure that we are trying to predict or the question that we are trying to answer. (In fact, probabilistic modeling supports uncertainty and sensitivity analyses that allow you to specifically determine whether this is the case; a simple example is sketched below.)
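For instance, a rank-correlation sensitivity measure (a generic technique, not anything specific to GoldSim) immediately identifies which input dominates the output uncertainty in the Y = A + B example above:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n = 100_000
A = 10 ** rng.uniform(1, 4, n)  # 3 orders of magnitude of uncertainty
B = 10 ** rng.uniform(2, 3, n)  # 1 order of magnitude of uncertainty
Y = A + B

rho_A, _ = spearmanr(A, Y)  # large: A drives the answer
rho_B, _ = spearmanr(B, Y)  # near zero: refining B cannot help much
print(f"rank correlation with Y: A = {rho_A:+.2f}, B = {rho_B:+.2f}")
```

If the correlation for a parameter is already near zero, adding detail to reduce its uncertainty cannot meaningfully reduce the uncertainty in the answer.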
This discussion may seem obvious, but anyone who has reviewed environmental models will readily acknowledge that most models contain detail that simply cannot be justified. The problem is not just that you waste time and money adding these details; it is that such models can be misleading. The extreme detail can act to mask the fact that there are huge uncertainties in the model, making the model seem more “correct” than it really is.
Top-Down Modeling
To avoid this problem, GoldSim supports and embodies a top-down modeling approach.
In general terms, there are two ways to approach any kind of modeling problem: from the "bottom-up", or from the "top-down". “Bottom-up” modeling approaches attempt from the outset to model the various processes in great detail, and typically make use of complex process-level models for the various system components. The emphasis is on understanding and explaining all processes in great detail in order to eventually describe the behavior of the entire system.
While such an approach may seem at first glance to be "scientifically correct", for the following reasons it is generally not the best way to solve many real-world problems:
- The level of detail in a model developed from the bottom-up often becomes inconsistent with the amount of available information and the uncertainty involved. That is, a model is only as good as its inputs, and if you don't have much information, a detailed model is generally no better than a simple one.
- It is often difficult to appropriately integrate and capture interdependencies among the various model components in a bottom-up model, since it is often impossible (or computationally impractical) to dynamically couple the various detailed process-level models used for the components of the system. As a result, important interactions in the system are often intentionally or unintentionally ignored in a bottom-up model.
- It is easy for a bottom-up modeling project to lose sight of the "big picture" (i.e., to "not see the forest for the trees"). As a result, such an approach can be very time-consuming and expensive, with much effort being spent on modeling processes that prove to have little or no impact on the ultimate modeling objectives.
- Finally, such models tend to be very difficult to understand and explain (and hence to be used) outside of the group of people who created them.
In a "top-down" modeling approach, on the other hand, the controlling processes may initially be represented by approximate (i.e., less detailed or “abstracted”) models and parameters. The model can then evolve by adding detail (and reducing uncertainty) for specific components as more is learned about the system. Such an approach can help to keep a project focused on objectives of the modeling exercise without getting lost in what may prove to be unnecessary details. Moreover, because a properly designed top-down model tends to be only as complex as necessary, is well-organized, and is hierarchical, it is generally easier to understand and explain to others.
There are four key points in the application of a top-down modeling approach:
- Top-down models must incorporate a representation of the model and parameter uncertainty, particularly that due to any simplifications or approximations.
- As opposed to representing all processes with great detail from the outset, details are only added when justified (e.g., if additional data are available, and if simulation results indicate that performance is sensitive to a process that is currently represented in a simplified manner). That is, details are only added to those processes that are identified as being important with respect to your modeling objectives and where additional detail will reduce the uncertainty resulting from model simplifications.
- A top-down model does not have to be “simple”. Whereas a “simple” model might completely ignore a key process, a well-designed top-down model approximates the process at an appropriate level while explicitly incorporating the uncertainty that this approximation introduces.
- A top-down model lends itself to being a “total system” model, in which all relevant processes are integrated into a single coupled model.
Iterative Modeling
How GoldSim Supports Iterative, Top-Down Modeling
GoldSim differs from detailed, process-level simulation tools in several ways that directly support this iterative, top-down approach:
- Spatial Resolution: Whereas a finite element or finite difference model typically has a very high level of discretization (often many thousands of finite volumes), a GoldSim model has a much lower level of spatial resolution (perhaps tens or hundreds of finite volumes). Due to this lower spatial resolution, processes will typically be “abstracted”, “lumped” and/or averaged to some extent.
- Integration: Detailed, high resolution models are typically designed to do one thing very well (e.g., model groundwater flow and transport or model geochemistry). It is often difficult to appropriately integrate and capture interdependencies among the various model components when using such an approach, since it is often impossible (or computationally impractical) to dynamically couple the various detailed models used for the components of the system. As a result, important interactions in the system are often intentionally or unintentionally ignored. GoldSim, on the other hand, allows you to build a single model that focuses on integrating and coupling all system components.
- Uncertainty: For computational reasons, it is quite difficult to deal with uncertainty in detailed, high-resolution models. GoldSim, on the other hand, allows you to explicitly represent the (typically large) uncertainty present in contaminant transport models.
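To make the contrast concrete, here is a minimal sketch (in Python, and emphatically not GoldSim itself) of the kind of lumped, low-resolution representation described above: contaminant mass moving through a short chain of well-mixed cells, with an uncertain flow rate sampled for each Monte Carlo realization. All names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

n_cells = 5            # a handful of mixing cells, not thousands of nodes
volume = 1000.0        # m^3 per well-mixed cell (assumed)
dt = 1.0               # time step, yr
n_steps = 2000
n_realizations = 500

peaks = []
for _ in range(n_realizations):
    # Uncertain flow rate: one order of magnitude, log-uniform (assumed)
    flow = 10 ** rng.uniform(0.5, 1.5)          # m^3/yr
    mass = np.zeros(n_cells)
    mass[0] = 1.0                               # unit mass released in cell 0
    outflux = np.zeros(n_steps)
    for t in range(n_steps):
        transfer = (flow / volume) * mass * dt  # first-order flushing
        mass -= transfer
        mass[1:] += transfer[:-1]               # cascade mass down the chain
        outflux[t] = transfer[-1] / dt          # release rate from last cell
    peaks.append(outflux.max())

lo, hi = np.percentile(peaks, [5, 95])
print(f"peak release rate, 5th to 95th percentile: [{lo:.2e}, {hi:.2e}] per yr")
```

Five well-mixed cells and one sampled parameter stand in for what a finite difference code would resolve with thousands of nodes, and the Monte Carlo loop turns the uncertain input into a distribution of peak release rates rather than a single number.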
"Since all models are wrong the scientist cannot obtain a 'correct' one by excessive elaboration… Just as the ability to devise simple but evocative models is the signature of the great scientist, so overelaboration and overparameterization is often the mark of mediocrity."
- George E. P. Box