Sampling Methods: Sequential Sampling or Monte Carlo?

“Forecasts are difficult, especially when they concern the future.” No quote describes the fundamental challenge of the FI simulator better than this one. In the absence of a crystal ball, the simulator rests on the more or less intelligent generation of possible future asset-return series and on the analysis of the distribution of a large number of them. This article describes the details of this generation process and compares the advantages and disadvantages of the different methods.

1. Sequential Sampling

Sequential sampling (even if it was not explicitly labeled as such) was the only sampling method available in the simulator prior to version 0.7. As already described in the basics article, each starting month of our historical return series corresponds to one possible future development of our portfolio: we simply apply the historical returns of all assets following that starting month to today’s portfolio. If we use the data series reaching back the furthest, e.g. the Shiller data for the US stock market (from the beginning of 1871) or the Gielen data for the German stock market (from the end of 1869), we obtain approx. 154 years, or 1844 different starting months, which span a relatively broad spectrum of possible samples.
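To make the procedure concrete, here is a minimal sketch in Python (my own illustration, not the simulator’s actual code), assuming the historical data is available as a matrix of monthly returns with one row per month and one column per asset:

```python
import numpy as np

def sequential_samples(returns, horizon_months):
    """Build one sample per possible starting month by slicing the
    historical return matrix (shape: n_months x n_assets) sequentially."""
    n_months = returns.shape[0]
    samples = []
    for start in range(n_months - horizon_months + 1):
        # each sample is a literal window of history, so both the
        # correlations between assets and the temporal correlations
        # within each asset are preserved exactly
        samples.append(returns[start:start + horizon_months])
    return samples
```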

The advantage of this method, apart from its simple implementation, is the best possible preservation of temporal correlations. If markets, for example, revert to their long-term mean after irrational exaggerations, this is automatically reflected in all samples.

The disadvantages of this method are not only the fundamental statistical limitation to the available starting months, but also the high correlation between the generated samples. A return series that starts in September 1929 differs only marginally from one that starts in October 1929: both still have the entire Great Depression ahead of them, and the achievable withdrawal rates are therefore very similar. This becomes very clear if you plot the withdrawal rates as a function of the starting month, which follow a nice and steady curve:

Although this makes it somewhat easier to, e.g., analyze the economic reasons behind the resulting sequence-of-return risk, it clearly shows how related the data points actually are. Statistically, we certainly do not have 1800 independent data points. Furthermore, the history covered by our price data is limited as well. Yes, data from 1871 onwards certainly includes two world wars, various economic crises and several interest rate cycles, and our basic assumption that the future portfolio development will probably lie between the historical extremes still makes sense (I mean, almost everybody uses this method, right?). Nevertheless, the low statistics here always left me with an uneasy gut feeling. As the saying goes: How can you tell that economists have a sense of humor? They use decimal points. One look at my table for the exact calculation of withdrawal rates, quoted to two decimal places, shows that apparently I have a sense of humor as well.

As of version 0.7, there are now alternative ways of generating possible future return series in the simulator. These are based on Monte Carlo methods and hopefully address some of these concerns.

Problem: Correlations between different Assets

The simplest way to generate new return series would be to determine monthly returns with a simple random number generator and just string them together. Typically, one would generate returns that fluctuate around the historically known mean and also match the variance to the historical fluctuations (a completely arbitrary choice of mean and variance would of course make no sense at all). In a multi-asset simulation such as ours, however, this approach has a serious flaw: even if we carefully determine the historical mean returns and variances for each asset and use them in the simulation, we would completely ignore a fundamentally important phenomenon: the returns of different assets have historically always been correlated! For example, government bonds have often increased in value during crises, while share prices have taken a dive. This is ultimately the main reason why advisors recommend a mixture of bonds and stocks, and why we think so much about our own asset allocation in the first place. A simple Monte Carlo simulation would completely ignore such correlations between assets and would therefore simply be useless for such studies.

One possible countermeasure would be to also analyze these historical correlations and instruct the random number generator to only spit out suitably correlated random returns. However, this approach is ultimately limited by the fact that such correlations have unfortunately never been constant over time. And if you also try to take that into account, the accuracy of the calculations really suffers from the limited historical data. In other words, you end up feeding increasingly uncertain parameters into an increasingly complex simulation. Fortunately, though, there are modified Monte Carlo approaches that are simple enough but still take this problem into account.
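Such a correlated random number generator could look roughly like the sketch below, assuming multivariate normally distributed monthly returns with a constant historical covariance matrix, which is exactly the assumption criticized above (the simulator does not use this approach):

```python
import numpy as np

def correlated_mc_sample(returns, horizon_months, rng=None):
    """Draw random monthly returns that reproduce the historical per-asset
    means and the historical covariance between the assets."""
    if rng is None:
        rng = np.random.default_rng()
    mean = returns.mean(axis=0)            # per-asset mean monthly return
    cov = np.cov(returns, rowvar=False)    # cross-asset covariance matrix
    # the draws are correlated across assets, but independent across time,
    # and the covariance is assumed constant, which it historically was not
    return rng.multivariate_normal(mean, cov, size=horizon_months)
```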

2. Monte Carlo using IID Returns

A simple way out of this dilemma of wanting to preserve the correlations between assets in a Monte Carlo simulation is known as IID, short for “independent and identically distributed” returns. In this method, no new returns are generated at all; instead, historical returns are selected at random, independently of each other and with identical probability. In other words, assuming we need random price developments for 30 years, or 360 months, we would select 360 random returns from the existing 1844 historical returns and string them together into a new, random time series. And very importantly: in order to preserve the correlations between assets, each asset receives its return from the same original month.
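Sketched in the same Python notation as above (again my own illustration):

```python
import numpy as np

def iid_sample(returns, horizon_months, rng=None):
    """Resample whole historical months independently and uniformly (IID).

    Selecting complete rows preserves the correlations between assets,
    because every asset receives its return from the same original month."""
    if rng is None:
        rng = np.random.default_rng()
    n_months = returns.shape[0]
    # e.g. 360 independent uniform draws from the 1844 historical months
    months = rng.integers(0, n_months, size=horizon_months)
    return returns[months]                 # shape: (horizon_months, n_assets)
```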

Such a simulation is easy to implement and can be selected in the FI Simulator since version 0.7. To do this, open the “Asset allocation and Stock/Bond/Inflation data Settings” tab and simply select “IID” in the new “Sampling method” field:

If you then look at the calculated withdrawal rates, you see a very different picture than above:

In our simulator, this Monte Carlo method always generates 5000 random price developments, each corresponding to one of the points shown. These points are now spread along the horizontal axis purely for visual clarity and no longer correspond to a historical starting month as above. Looking at the vertical axis, the first thing you notice is that the range of possible withdrawal rates has increased. Apparently, a random selection of time series filled with IID returns contains even more extreme cases than our simple sequential simulation, whose samples differed only in their starting month. The “safe” withdrawal rate in the simple Trinity case shown here, with a $480,000 starting portfolio and 30 years of withdrawals, is now only a paltry $737 instead of the “usual” $1200 with sequential sampling. On the other hand, in the best case we could now withdraw over $9500 per month, compared to “only” $6000 with sequential sampling.

To make things even worse, these extreme values can no longer be reproduced exactly. Refreshing the browser window generates 5000 new random return series, which will show different minimum and maximum values. Sometimes these deviations can be quite large, and it is very helpful to get a feeling for this behavior by refreshing the browser window several times. Anyone who has previously calculated “safe” withdrawal rates in the simulator, i.e. exactly those minimum values with an alleged “0% Default-Probab.”, will definitely hate this behavior. I have deliberately chosen not to fix the so-called “seed” of the random number generator, which would have ensured that the same random sequence is generated on every run and thus made the results reproducible. However, this would have suppressed precisely this phenomenon, and in my view it is fundamentally important to be aware of this randomness of the distribution. After all, future price developments are also unpredictable, and the exact reproducibility of the results to date is simply an illusion, an artifact of the sequential sampling used previously.

But there is hope: let’s look not at the extremes but at the median of the distribution, which sits around $2960 and fluctuates only very slightly when the browser is refreshed. This is because the minimum and maximum are defined by individual, randomly bad or randomly good return series, whereas the median reflects the overall distribution. But although the median represents the typical withdrawal rate that can be realized from our portfolio, we are naturally more interested in the problematic cases, since we want to understand the associated risks and minimize their effects as far as possible. How can we do that with these new Monte Carlo methods?

Restrict Percentiles

For this purpose, since version 0.7 there is a new field “Boxplot Percentiles” in the same tab, set to 0%-100% by default. This field makes it possible to ignore the most extreme random return series. If we set the boxplot percentiles to, e.g., “1%-99%”, the worst 1% and the best 1% of all random return series are ignored. The lower limit of the boxplot on the right then moves upwards to approx. $1190 and again remains relatively constant after a refresh. Simply by ignoring the 2% most extreme price developments, we seem to have tamed chance sufficiently.
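In terms of the Python sketches above, restricting the percentiles amounts to nothing more than the following (the withdrawal rates here are dummy values, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng()
# stand-in for the 5000 simulated monthly withdrawal rates (dummy data,
# with a median near the $2960 discussed above)
rates = rng.lognormal(mean=np.log(2960), sigma=0.35, size=5000)

# 0%-100%: minimum and maximum are set by single extreme random series
print(rates.min(), rates.max())

# 1%-99%: ignore the worst and the best 1% of all random return series;
# these limits stay far more stable from one simulation run to the next
low, high = np.percentile(rates, [1, 99])
print(low, high)
```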

But which percentiles should we really use in the Monte Carlo simulations? This question is quite fundamental and not easy to answer, so I will avoid giving an explicit recommendation. However, I offer two contrasting perspectives: in favor of a relatively low threshold, one could argue that extreme price developments can and will happen in the future, and a conservative risk assessment should therefore take them into account. Another perspective, which I once read in one of William Bernstein’s books, favors a higher threshold: looking back in history, the probability of being personally affected by a war during our lifetime is probably in the region of 10%. Does it really make sense to factor in bad return series with a much lower probability of occurring, when such a war would entail completely different personal risks?

3. Monte Carlo using the Stationary Block Bootstrap (SBB) Method

We have seen that Monte Carlo simulations with IID returns have the advantage over simple Monte Carlo methods of preserving the correlations between different assets. However, temporal correlations within an asset are completely lost due to the condition of “independent and identically distributed” returns. In order to take such temporal correlations into account when generating random return series, we need to enhance our method. The algorithm required for this is known as the “stationary block bootstrap” and was introduced in 1994 by Dimitris Politis and Joseph Romano. In somewhat simplified terms, this method generates a random return series from the historical return series as follows:

First, an average block length is defined with which all series are to be generated. At the start, a random starting month is selected from the historical series, and all assets are assigned the returns from that month. With probability (1 - 1/block length), the very next month in the history is then selected and its returns are again copied to the new series. At some point in the time series, however, a new block starts (with probability 1/block length), which again begins at a random starting month in the historical series. In the end, the random new time series is composed of many historical blocks of varying lengths, whose average length corresponds exactly to the block length specified above. Temporal correlations up to approximately this block length are thus retained in the newly generated return series.
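In the Python notation of the earlier sketches, the core of the procedure might look like this (my own illustration; note that this sketch wraps around at the end of the historical series, a common variant of the method):

```python
import numpy as np

def sbb_sample(returns, horizon_months, block_length, rng=None):
    """Stationary block bootstrap: string together historical blocks whose
    lengths are geometrically distributed with mean `block_length`."""
    if rng is None:
        rng = np.random.default_rng()
    n_months = returns.shape[0]
    sample = np.empty((horizon_months, returns.shape[1]))
    month = rng.integers(0, n_months)          # random starting month
    for t in range(horizon_months):
        sample[t] = returns[month]             # copy all assets' returns
        if rng.random() < 1.0 / block_length:
            month = rng.integers(0, n_months)  # start a new random block
        else:
            month = (month + 1) % n_months     # continue the current block
    return sample
```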

This method can now also be selected in the “Sampling method” field as “Block Bootstrap (SBB)”. If chosen, an additional input field appears next to it in which the block length can be set:

By default, this block length is set to 120 months, i.e. the random price developments are then composed of historical blocks that are 10 years long on average.

If you look at the calculated withdrawal rates for the standard Trinity example, at first glance you see no real differences from the IID simulation (the percentiles were left at 1%-99% here, i.e. the bars of the boxplots only cover this range of points):

4. Comparing the different Sampling-Methods

Even if the results of the block bootstrap method look very similar to those of the Monte Carlo with IID returns at first glance, it is worth comparing the methods in more detail. To do this, we first look at the two limiting cases of the block bootstrap method:

According to the definition above, a block bootstrap with block length 1 corresponds exactly to the Monte Carlo simulation with IID returns: a new block then starts every single month, so all temporal correlations are discarded and IID emerges as the limiting case of the block bootstrap for block length 1.

At the other extreme, a block bootstrap with infinite block length corresponds exactly to our previous sequential simulation. We should therefore see the difference from our previous results shrink as the block length increases.
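With the sbb_sample sketch from above, these two limiting cases are easy to try out (dummy data again, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng()
# dummy history: 1844 months x 2 assets of random monthly returns
returns = rng.normal(0.005, 0.04, size=(1844, 2))

# block length 1: a new block starts every single month, so the result
# is equivalent to iid_sample; all temporal correlations are discarded
iid_like = sbb_sample(returns, 360, block_length=1, rng=rng)

# huge block length: a new block practically never starts, so each sample
# is (apart from the random starting month and a possible wrap-around) one
# sequential slice of history, just like sequential sampling
sequential_like = sbb_sample(returns, 360, block_length=10**9, rng=rng)
```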

As of version 0.7, such a comparison of the sampling methods is now directly available as a sub-tab “Compare Sampling-Methods” below the tab “Calculation of Exact Withdrawal Rates”. Important: This comparison then of course ignores the preset sampling method and the block length, as it displays all available methods and a selection of block lengths side by side:

After the above considerations, it should come as no surprise that IID is shown on the far left, next to SBB with block length L=1. Both methods should produce identical results (within statistical margins). As you proceed to the right, the block length is gradually increased, and you can see that the range of results first widens up to a block length of approx. 15-30 and then narrows again. With longer block lengths, the range of results approaches that of the previous sequential sampling (shown on the far right), as expected.

It may initially come as a surprise that a block bootstrap with a block length of around 20-25 apparently generates even more “extreme extremes” than the IID method. However, if you bear in mind that real stock market price developments are not a pure random walk but are heavily influenced by market sentiment, this behavior makes sense: there are market phases in which one record return chases the next out of boundless euphoria, and just as many phases in which frustration leads to ever more severe price slumps. The block bootstrap retains such periods of emotional exaggeration, and if the procedure then randomly strings together many of these extreme phases of euphoria or frustration, the resulting price developments will naturally contain significantly more “extreme extremes” than a purely random IID draw.

Last but not least, it can be seen that the median of the distributions remains almost completely unaffected by the choice of sampling method.

5. Conclusion

The inclusion of Monte Carlo simulations in the FI Simulator will probably raise a number of questions, not least because the new methods apparently lack the familiar “precision” in the calculation of withdrawal rates, and stable results now require a rather arbitrary restriction of the percentiles of the distribution. I can imagine that some people will be unhappy with this, but I find this approach more honest, because it ultimately makes transparent a) how large the uncertainty in our results actually still is and b) how subjective we still have to be when interpreting those results. In this sense, the previous sequential sampling of historical data, whose samples differed only in their starting month, in my view only produced a pseudo-precision that did not actually exist. Monte Carlo methods, with their inherent randomness, make these limits of our predictive power much clearer and are therefore probably even better suited for a realistic risk analysis.