Wednesday, November 7, 2018

Statistical Analysis of Streamflow Time Series

Posted by Jason Lillywhite


GoldSim's built-in probabilistic simulation capabilities make it straightforward to perform statistical analysis of time series data. We have added an example model to our library that demonstrates how Monte Carlo simulation can be used to analyze a daily streamflow time series and produce daily, monthly, and annual statistics. The model also finds a best-fit probability distribution for these statistics.
Below are screen captures of the model. The model uses USGS streamflow data from a gage on the Weber River near Oakley, Utah, but you can use it with any daily streamflow data. Please go here to download and use the model.


The above image is a screen capture of the model's main dashboard, which shows how your own data can be loaded and analyzed. This page displays the daily time history results along with percentile plots of the daily flow values for all years. From this page, you can navigate to other pages that show monthly statistics and probability distributions, along with how well the measured data fit analytical distributions such as Normal, Gamma, Log Normal, and Log Pearson III.
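As a rough illustration of the percentile plots described above, the following Python sketch groups daily flows by day of year and computes percentiles across all years. It is not how GoldSim computes these results internally; the function names, the chosen percentiles, and the sample values are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def percentile(values, q):
    """Linearly interpolated percentile of values, with q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile() needs at least one value")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def daily_percentiles(records, qs=(5, 50, 95)):
    """Group (date, flow) records by day of year and compute percentiles.

    Note: day-of-year numbers shift by one after February 29 in leap
    years; this sketch ignores that detail.
    """
    by_day = defaultdict(list)
    for day, flow in records:
        by_day[day.timetuple().tm_yday].append(flow)
    return {
        doy: {q: percentile(flows, q) for q in qs}
        for doy, flows in sorted(by_day.items())
    }


# Hypothetical daily flows (cfs) for the first two days of three years.
records = [
    (date(2015, 1, 1), 100.0),
    (date(2016, 1, 1), 140.0),
    (date(2017, 1, 1), 120.0),
    (date(2015, 1, 2), 90.0),
    (date(2016, 1, 2), 150.0),
    (date(2017, 1, 2), 110.0),
]
result = daily_percentiles(records)
for doy, stats in result.items():
    print(doy, stats)
```

Each day of the year ends up with one value per year on record, so the percentiles describe the year-to-year spread of flow on that day.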


The above screen capture shows another dashboard from the model that provides a summary of monthly and annual statistics. These results are captured by applying built-in functions to the probabilistic results of the model after it simulates all the years on record. Only 6 elements within the model are needed to produce them.
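Outside of GoldSim, the same kind of summary can be sketched in a few lines of Python by treating each year on record as one sample. This is an illustration only: the helper names are hypothetical, the record is synthetic, and calendar years are assumed (hydrologic studies often use water years instead).

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean, stdev


def period_means(records, key):
    """Mean daily flow per (year, period), returned as {period: [one value per year]}."""
    buckets = defaultdict(list)
    for day, flow in records:
        buckets[(day.year, key(day))].append(flow)
    grouped = defaultdict(list)
    for (_, period), flows in sorted(buckets.items()):
        grouped[period].append(mean(flows))
    return dict(grouped)


def summarize(values):
    """Summary statistics across years for one month (or for the year)."""
    return {
        "mean": mean(values),
        "sd": stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


# Synthetic record (assumption): three full years with a constant daily
# flow in each year, so the expected statistics are easy to check.
yearly_flow = {2015: 100.0, 2016: 200.0, 2017: 300.0}
records = []
for year, flow in yearly_flow.items():
    day = date(year, 1, 1)
    while day.year == year:
        records.append((day, flow))
        day += timedelta(days=1)

monthly = {
    month: summarize(values)
    for month, values in period_means(records, lambda d: d.month).items()
}
annual = summarize(period_means(records, lambda d: "annual")["annual"])

print(monthly[1])
print(annual)
```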


For each month and for the year, GoldSim captures a probability distribution from the values of all the years on record. These can be plotted as probability density functions (shown as the red lines in the plots above). In addition, the statistical summaries of these distributions can be used to derive an analytical distribution, which is plotted as a green line. The model evaluates how well the empirical distribution matches the shape of the analytical distribution by calculating a correlation coefficient for each month and for the year, and by showing scatter plots of paired values from the two. This is shown in the following dashboard:


In the above screen capture, we can see how well the empirical probability distributions fit the analytical ones. These plots are produced by sampling each of the distributions for each month at specified quantiles and then plotting the resulting pairs. Rank (Spearman) or value (Pearson) correlation coefficients are then calculated. These range between -1 and +1 and express the extent to which there is a linear relationship between the two sets of paired values. The value correlation coefficient is computed as follows:

$$C_{rp} = \frac{\sum_{i=1}^{n} (p_i - m_p)(r_i - m_r)}{\sqrt{\sum_{i=1}^{n} (p_i - m_p)^2 \, \sum_{i=1}^{n} (r_i - m_r)^2}}$$

Where:


  • $C_{rp}$ = the value correlation coefficient
  • $n$ = the number of selected data points (realizations)
  • $p_i$ = value of output p for realization i
  • $r_i$ = value of output r for realization i
  • $m_p$ = mean value of output p
  • $m_r$ = mean value of output r
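To make the quantile pairing and the value correlation coefficient concrete, here is a minimal Python sketch using the terms defined above. It assumes the analytical distributions are matched to the sample mean and standard deviation (method of moments), which may differ from what the GoldSim model does. Only Normal and Log Normal are shown, and the sample flows, quantile levels, and function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def empirical_quantile(values, prob):
    """Linearly interpolated sample quantile, with prob in [0, 1]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * prob
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def lognormal_quantile(m, s, prob):
    """Quantile of a log-normal with mean m and standard deviation s."""
    sigma2 = math.log(1.0 + (s / m) ** 2)
    mu = math.log(m) - sigma2 / 2.0
    return math.exp(mu + math.sqrt(sigma2) * NormalDist().inv_cdf(prob))


def value_correlation(p, r):
    """Value (Pearson) correlation coefficient C_rp of paired lists p and r."""
    n = len(p)
    if n != len(r) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    m_p, m_r = sum(p) / n, sum(r) / n
    cov = sum((pi - m_p) * (ri - m_r) for pi, ri in zip(p, r))
    ss_p = sum((pi - m_p) ** 2 for pi in p)
    ss_r = sum((ri - m_r) ** 2 for ri in r)
    return cov / math.sqrt(ss_p * ss_r)


# Hypothetical January mean flows (cfs), one value per year on record.
january = [82.0, 95.0, 101.0, 110.0, 124.0, 138.0, 160.0, 205.0]
m, s = mean(january), stdev(january)

# Sample every distribution at the same quantile levels, then pair them.
probs = [0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95]
empirical = [empirical_quantile(january, q) for q in probs]
normal = [NormalDist(m, s).inv_cdf(q) for q in probs]
lognormal = [lognormal_quantile(m, s, q) for q in probs]

c_normal = value_correlation(empirical, normal)
c_lognormal = value_correlation(empirical, lognormal)
print(f"Normal: {c_normal:.4f}  Log Normal: {c_lognormal:.4f}")
```

Because the coefficient measures linearity only, it is unaffected by a constant shift or scale difference between the two sets of quantiles, so it is best read together with the scatter plot.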


The more closely the points in the plots above follow a straight line, the better the correlation between the empirical and analytical distributions, which provides a guide for selecting the fitted distribution. In this model, the analytical distribution is selected automatically based on these correlation coefficients for each month and for the year. The following probability distribution types are evaluated: Normal, Gamma, Log Normal, and Log Pearson III.
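The selection step can be sketched as picking the candidate whose quantiles correlate most strongly with the empirical ones. The Python below is an assumption about the logic rather than the model's actual implementation. The quantile values are made up for illustration, and only two of the four distribution types are listed, because Gamma and Log Pearson III quantiles need inverse CDFs that the Python standard library does not provide.

```python
import math


def pearson(x, y):
    """Value (Pearson) correlation coefficient of two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    m_x, m_y = sum(x) / n, sum(y) / n
    cov = sum((a - m_x) * (b - m_y) for a, b in zip(x, y))
    ss_x = sum((a - m_x) ** 2 for a in x)
    ss_y = sum((b - m_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def select_best_fit(empirical, candidates):
    """Return (best name, {name: coefficient}) over the candidate quantile lists."""
    scores = {name: pearson(empirical, q) for name, q in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical quantiles for one month, sampled at the same probability levels.
empirical = [85.0, 100.0, 117.0, 140.0, 175.0]
candidates = {
    "Normal": [75.0, 100.0, 127.0, 154.0, 179.0],
    "Log Normal": [84.0, 101.0, 120.0, 144.0, 172.0],
}

best, scores = select_best_fit(empirical, candidates)
for name, score in scores.items():
    print(f"{name}: {score:.4f}")
print("Selected:", best)
```

Repeating this for each month and for the year would give one selected distribution type per period, mirroring the automatic selection described above.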

If you would like to try using this model, please go to our library to download it, here.

If you do not have the GoldSim software, but are interested in trying it out, please visit our website for a trial version.
