GoldSim Blog: Modeling Spatially Correlated Rainfall in GoldSim

Posted by Jason Lillywhite

Effective water resource management hinges on accurately modeling precipitation. But what happens when rainfall patterns differ significantly between two locations within your study area, such as a valley floor and the mountainous watershed nearby? This post explores a practical method for using GoldSim to simulate precipitation that is linked, or spatially correlated, across different locations.

We'll use real-world daily rainfall data from two distinct sites in Utah to demonstrate how to set up and parameterize such a model. Read on to see how accounting for the rainfall correlation between these valley and mountain locations led to more realistic hydrological simulations in GoldSim.

Introduction

When we simulate rainfall that occurs randomly over time (stochastic precipitation) across larger areas, it's important to account for how rainfall varies from one place to another. Measurements at one rain gauge can be quite different from those at a nearby gauge, particularly if there are significant differences in elevation. While GoldSim, through tools like PrecipGen, allows us to create stochastic precipitation models for a single point, simulating multiple locations realistically often requires us to capture the dependent relationship, or spatial correlation, between them. Without accounting for this, our models might simulate a storm in the valley but dry conditions in the mountains, or vice versa, in a way that doesn't match reality.

Historical Data: Valley vs. Mountain Precipitation

To investigate this, I gathered daily total precipitation data from the Salt Lake City Airport, representing the valley, and the Parley's Summit SNOTEL site, high in the nearby Wasatch Mountains. Examining the historical data from these two gauges revealed a common pattern. The timing of wet and dry periods generally aligned: when a storm system moved through the region, both locations tended to receive some precipitation around the same time. However, the amount of rainfall often varied. There were also instances, typically lasting one to three days, where one location received rain while the other remained dry, indicating that not all storms affected both sites concurrently.

Locations of Salt Lake City Intl. Airport and Parley's Summit SNOTEL sites in Utah, USA

To properly parameterize our GoldSim model, historical daily precipitation data from 1979 to 2024 for both locations were analyzed. This analysis focused on two key aspects of their relationship: the correlation in rainfall amounts (intensity) and the correlation in the occurrence of wet or dry days.

Comparison of Measured Daily Precipitation at 2 sites (2017)

Correlating Rainfall Intensity

First, I explored how the amounts of precipitation at the two sites related to each other, but only on days when both locations recorded rainfall. Daily rainfall amounts are typically right-skewed, so a simple square-root transformation was applied to the rainfall values on these concurrent wet days. The standard Pearson correlation coefficient calculated between the transformed amounts was 0.1. This value quantifies the linear relationship between the (transformed) rainfall amounts when both sites are wet and was used in GoldSim to link the model components that determine how much it rains at each site.

Screen capture showing Pearson correlation
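
As a rough illustration of the calculation described above, the steps (keep only concurrent wet days, take square roots, compute Pearson's r) might look like this in Python. The function names, the wet-day threshold, and the sample values are assumptions made for this sketch; the spreadsheet and script that accompany the post may do it differently.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def intensity_correlation(valley, mountain, wet_threshold=0.0):
    """Pearson r of square-root-transformed amounts on days when both sites are wet."""
    pairs = [
        (math.sqrt(v), math.sqrt(m))
        for v, m in zip(valley, mountain)
        if v > wet_threshold and m > wet_threshold  # concurrent wet days only
    ]
    if len(pairs) < 3:
        raise ValueError("need at least three concurrent wet days")
    x, y = zip(*pairs)
    return pearson(x, y)

# Made-up daily totals (mm) for illustration; these are not the Utah records.
valley = [0.0, 1.2, 5.0, 0.0, 0.3, 7.6, 2.1]
mountain = [2.5, 3.0, 4.1, 0.0, 6.4, 12.7, 1.0]
print(round(intensity_correlation(valley, mountain), 2))
```

Days on which either site is dry are dropped before the transform, so the result describes intensity only and says nothing about how often the two sites are wet together.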

Correlating the Occurrence of Wet and Dry Days

Next, I addressed the timing of wet and dry days. A straightforward initial approach was to calculate the correlation directly from the historical wet/dry status (treating 'Wet' as 1 and 'Dry' as 0). This calculation, known as the Phi Coefficient, resulted in a value of 0.27.

Screen capture showing the Phi calculation
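
For readers who want to reproduce this kind of number outside a spreadsheet, here is a minimal sketch of the Phi coefficient in Python. It uses the closed form based on the four day counts, which gives the same result as Pearson's r on the 0/1 series. The function and variable names (`n11`, `n10`, and so on) are chosen for this example only.

```python
import math

def phi_coefficient(site1_wet, site2_wet):
    """Phi coefficient between two wet/dry series (truthy = wet, falsy = dry)."""
    n11 = n10 = n01 = n00 = 0
    for w1, w2 in zip(site1_wet, site2_wet):
        if w1 and w2:
            n11 += 1  # both wet
        elif w1:
            n10 += 1  # site 1 wet, site 2 dry
        elif w2:
            n01 += 1  # site 1 dry, site 2 wet
        else:
            n00 += 1  # both dry
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        raise ValueError("each site needs at least one wet and one dry day")
    return (n11 * n00 - n10 * n01) / denom

# Made-up wet (1) / dry (0) flags for illustration; not the Utah records.
valley_wet = [1, 0, 0, 1, 0, 0, 1, 0]
mountain_wet = [1, 1, 0, 1, 0, 0, 0, 0]
print(round(phi_coefficient(valley_wet, mountain_wet), 2))
```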

While this 0.27 value measures the direct association in the historical record, it seemed a bit low for effectively driving the simulation's synchronization. In GoldSim, the occurrence of a wet or dry day is often determined by comparing a randomly generated number (the "chance" of rain) against each site's specific probability of being wet, which can change daily based on recent conditions (e.g., whether the previous day was wet or dry). Because the mountain site is naturally wetter than the valley site, their daily probabilities of rain often differ.

This led to the realization that a stronger underlying connection between their respective "chance" mechanisms might be needed to make the simulated wet/dry patterns align realistically, especially when their baseline probabilities of rain diverge. We need a method to estimate the correlation between these underlying, unobserved continuous "chance" factors that ultimately produce the observed binary (wet/dry) outcomes.

To explore this, I looked at the pattern of concurrent conditions: how often both sites were wet, both were dry, or one was wet while the other was dry. A useful measure in such situations is the Odds Ratio, which quantifies the strength of association between two binary conditions (here, the wet/dry status at the two sites). It's calculated by first creating a 2x2 contingency table showing the counts of the four possible daily combinations as shown below. The Odds Ratio is then calculated as (A * D) / (B * C). For this dataset, the Odds Ratio was approximately 3.5, indicating that the odds of one site being wet were about 3.5 times higher on days when the other site was wet than on days when it was dry.

We can use the counts from that 2x2 contingency table (A, B, C, D):

A = Both Wet

B = Site 1 Wet, Site 2 Dry

C = Site 1 Dry, Site 2 Wet

D = Both Dry

Odds Ratio (OR) = (A * D) / (B * C)
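
The table and ratio defined above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample flags are made up here, and the script in the download may be organized differently.

```python
def contingency_counts(site1_wet, site2_wet):
    """Return (A, B, C, D) for two wet/dry series (truthy = wet, falsy = dry)."""
    a = b = c = d = 0
    for w1, w2 in zip(site1_wet, site2_wet):
        if w1 and w2:
            a += 1  # A: both wet
        elif w1:
            b += 1  # B: site 1 wet, site 2 dry
        elif w2:
            c += 1  # C: site 1 dry, site 2 wet
        else:
            d += 1  # D: both dry
    return a, b, c, d

def odds_ratio(a, b, c, d):
    """Odds Ratio (A * D) / (B * C); undefined when B or C is zero."""
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when B or C is zero")
    return (a * d) / (b * c)

# Made-up wet (1) / dry (0) flags for illustration; not the Utah records.
site1 = [1, 1, 1, 0, 0, 0, 0, 1]
site2 = [1, 1, 0, 0, 0, 0, 1, 1]
a, b, c, d = contingency_counts(site1, site2)
print(a, b, c, d, odds_ratio(a, b, c, d))
```

With a long daily record, B and C are rarely zero. For short records, a common workaround is to add 0.5 to every cell before taking the ratio.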

This Odds Ratio, while not a direct correlation coefficient itself, can be used in a statistical approximation (related to a concept called tetrachoric correlation) to estimate the correlation of the underlying continuous variables that drive the wet/dry outcomes. This approach uses the observed wet/dry data to infer how strongly the underlying "chance" factors (like GoldSim's randomly drawn numbers) are related.

This method yielded an estimated correlation of 0.5. Because this value was higher than the simple Phi coefficient (0.5 vs. 0.27) and seemed to better reflect the necessary linkage for the simulation's random chance generators to achieve realistic synchronization, 0.5 was chosen to correlate the model elements responsible for determining wet or dry day occurrence (Wet_Dry_Chance Stochastic) between the two sites in GoldSim. This approach helps ensure that the model more accurately simulates the tendency for both sites to experience storms at the same time. The calculation of the odds ratio and the estimated underlying correlation was performed using a Python script.
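
The post does not say which approximation the script uses. One well-known option is Pearson's "cosine-pi" approximation to the tetrachoric correlation, r ≈ cos(π / (1 + sqrt(OR))). The sketch below assumes that formula purely for illustration, and the function name is made up here.

```python
import math

def tetrachoric_from_odds_ratio(odds_ratio):
    """Approximate tetrachoric correlation: cos(pi / (1 + sqrt(OR))).

    This is the cosine-pi approximation, assumed here for illustration;
    it is not necessarily the formula used in the downloadable script.
    """
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))

print(round(tetrachoric_from_odds_ratio(3.5), 2))
```

For an Odds Ratio of 3.5 this gives about 0.46, in line with the rounded 0.5 quoted above. An Odds Ratio of 1 (no association) maps to zero, and very large ratios approach 1. Other approximations exist and give somewhat different values, so the choice of formula matters.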

Implementing Correlated Precipitation in GoldSim

The stochastic precipitation simulators in GoldSim (PrecipGen or WGEN) use a daily Markov chain-gamma method, relying on two primary stochastic functions. The first, which we can call the "Wet/Dry Chance Stochastic," determines whether precipitation occurs. This is usually driven by comparing a uniformly distributed random number [U(0,1)] against monthly wet/dry transition probabilities (e.g., the probability of a wet day following a wet day, P(W|W), or a wet day following a dry day, P(W|D)) derived from historical records. The second, an "Intensity Stochastic," is activated only on simulated wet days and samples from a statistical distribution (like the Gamma distribution), fitted to historical data, to determine how much precipitation falls.

Screen capture of the PrecipGen model with Stochastic elements
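
To make the two-step logic concrete, here is a minimal single-site sketch of a Markov chain-gamma generator in Python. It is not the PrecipGen or WGEN implementation. It uses one fixed set of parameters rather than monthly values, and the function name, argument names, and numbers are all assumptions chosen for illustration.

```python
import random

def simulate_daily_precip(n_days, p_wet_given_wet, p_wet_given_dry,
                          gamma_shape, gamma_scale, seed=None, start_wet=False):
    """Generate a daily precipitation series with a two-state Markov chain
    for occurrence and a Gamma distribution for wet-day amounts."""
    rng = random.Random(seed)
    wet = start_wet
    series = []
    for _ in range(n_days):
        # Step 1: wet/dry chance. Compare U(0,1) with the transition
        # probability that applies given yesterday's state.
        p_wet = p_wet_given_wet if wet else p_wet_given_dry
        wet = rng.random() < p_wet
        # Step 2: intensity. Sample an amount only on wet days.
        amount = rng.gammavariate(gamma_shape, gamma_scale) if wet else 0.0
        series.append(amount)
    return series

# Hypothetical parameters, not fitted to either Utah site.
rain = simulate_daily_precip(365, p_wet_given_wet=0.55, p_wet_given_dry=0.20,
                             gamma_shape=0.8, gamma_scale=6.0, seed=42)
print(sum(1 for x in rain if x > 0), "wet days out of", len(rain))
```

A fuller version would look up the four parameters by calendar month, as the method described above does.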

A key feature of GoldSim is its ability to define correlations between these stochastic elements across different parts of a model. This provides the mechanism to link the precipitation behavior at our valley and mountain locations, using the correlation coefficients derived from our data analysis. For instance, the 0.5 correlation was applied to the Wet_Dry_Chance Stochastic elements of the two sites, and the 0.1 correlation was applied to their Intensity_Stochastic elements.

Stochastic element properties dialog with correlation coefficient
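
GoldSim handles the correlation internally once the coefficient is entered in the dialog. To build some intuition for why correlating the two "chance" draws synchronizes wet days, the sketch below generates a pair of correlated U(0,1) numbers through correlated standard normals (a Gaussian copula) and compares each with its site's wet-day probability. This is an assumed, simplified stand-in, not GoldSim's algorithm. It ignores the day-to-day Markov dependence, and the names and probabilities are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used to map normals to U(0,1)

def correlated_uniform_pair(rng, rho):
    """Return two U(0,1) numbers whose underlying normals have correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    return _STD_NORMAL.cdf(z1), _STD_NORMAL.cdf(z2)

def simulate_wet_days(n_days, p_wet_valley, p_wet_mountain, rho, seed=None):
    """Simulate wet/dry flags at two sites from correlated chance draws."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must be between -1 and 1")
    rng = random.Random(seed)
    valley, mountain = [], []
    for _ in range(n_days):
        u_valley, u_mountain = correlated_uniform_pair(rng, rho)
        valley.append(u_valley < p_wet_valley)
        mountain.append(u_mountain < p_wet_mountain)
    return valley, mountain

def agreement(valley, mountain):
    """Fraction of days on which both sites share the same wet/dry state."""
    return sum(v == m for v, m in zip(valley, mountain)) / len(valley)

# Hypothetical wet-day probabilities; the mountain site is the wetter one.
for rho in (0.0, 0.5):
    v, m = simulate_wet_days(20000, 0.25, 0.40, rho, seed=7)
    print(rho, round(agreement(v, m), 3))
```

Raising the coefficient increases the share of days on which the two sites agree, even though their individual wet-day probabilities are unchanged.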

Observing Model Results

Upon plotting the generated daily precipitation time series for the two correlated sites, the model produced storm events whose timing appeared graphically similar for most of the period, mirroring the behavior seen in the historical data.

Screen capture of the top level of the model for precipitation simulators for both locations

While one site might still receive significantly more or less precipitation than the other on any given wet day (due to their different intensity distributions and the modest correlation of 0.1 for intensity), the occurrence of wet periods largely lined up realistically. The separate correlation of the Wet/Dry Chance stochastic, guided by the analysis leading to the 0.5 coefficient, seemed effective in capturing this aspect of synchronized storm timing.

Comparison of Simulated (1 realization) and measured precipitation Feb-May

If you are interested in experimenting with this model and viewing the Excel file and Python script, please download a zip of the files from here.

Conclusion and Next Steps

This exercise demonstrated a practical workflow for modeling spatially correlated precipitation in GoldSim. The process involved recognizing the need for spatial correlation from site characteristics and data observations. Then, GoldSim's capability to correlate individual stochastic elements within its precipitation generator was leveraged. This required targeted data analysis on historical records to derive separate correlation coefficients: one for the occurrence of wet/dry days (informed by methods like analyzing the Odds Ratio to better capture underlying dependencies) and another for the magnitude of rainfall on concurrent wet days (e.g., using Pearson correlation on transformed data). Finally, these coefficients were implemented in the GoldSim model to link the appropriate stochastic elements determining wet/dry chance and rainfall intensity between the simulated locations.

By using tools like Python and Excel for data analysis, we were able to derive meaningful parameters that resulted in simulations exhibiting qualitatively realistic spatial precipitation patterns. This approach provides a tangible way to move beyond single-point rainfall generation and build more representative distributed hydrological models. Further refinement and testing of this method on other datasets will continue, and we anticipate sharing more results and insights in the future.

May 6, 2025
