Constrained Latin hypercube sampling




Latin Hypercube Sampling vs. Monte Carlo Sampling

Monte Carlo Sampling (MCS) and Latin Hypercube Sampling (LHS) are two methods of sampling from a given probability distribution. In MCS we obtain a sample in a purely random fashion, whereas in LHS we obtain a pseudo-random sample, that is, a sample that mimics a random structure. To give a rough idea, MC simulation can be compared to simple random sampling and Latin Hypercube Sampling can be compared to stratified sampling. Suppose we want to pick 20 people from a city which has 5 districts. In simple random sampling the 20 people are chosen randomly, without the use of any structured method; in stratified sampling, 4 people are chosen randomly from each of the 5 districts. The advantage of stratified sampling over simple random sampling is that, even though it is not purely random, it requires a smaller sample size to attain the same precision as simple random sampling. We will see something similar when simulating with MCS and LHS.

I interpret the literature cited in the accepted answer differently. The original poster was looking for an amount of "variance reduction" in the Latin hypercube, and the plots they showed were the confidence intervals for the mean of their cost function with increasing sample size in 1 and 2 dimensions. The chapter cited by the accepted answer discusses the effectiveness of variance reduction, or efficiency, measured relative to some base algorithm such as simple random sampling. The conclusions in the literature are clear:

For functions which are "additive" in the margins of the Latin hypercube, the variance of the estimate is always less than that of a simple random sample of equivalent size, regardless of the number of dimensions and regardless of the sample size (see the accepted answer, and also Stein 1987 and Owen 1997).

For non-additive functions, the Latin hypercube sample may still provide a benefit, but it is less certain to do so in every case. An LHS of size $n > 1$ has variance in the non-additive estimator less than or equal to that of a simple random sample of size $n - 1$; Owen 1997 says this is "not much worse than" simple random sampling.

These conclusions hold irrespective of the number of dimensions in the sample: there is no upper bound on the number of dimensions for which LHS is proven to be effective.
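To put those two claims in symbols (the notation here is mine, not from the original discussion): let $\hat{\mu}_{LHS,n}$ and $\hat{\mu}_{SRS,n}$ denote the sample-mean estimators of $E[f(X)]$ from a Latin hypercube sample and a simple random sample of size $n$, and let $\sigma^2 = \mathrm{Var}(f(X))$. The statements above then amount to

$$\mathrm{Var}(\hat{\mu}_{LHS,n}) \le \mathrm{Var}(\hat{\mu}_{SRS,n}) = \frac{\sigma^2}{n} \qquad \text{for } f \text{ additive in its margins},$$

$$\mathrm{Var}(\hat{\mu}_{LHS,n}) \le \frac{\sigma^2}{n-1} = \mathrm{Var}(\hat{\mu}_{SRS,n-1}) \qquad \text{for general } f \text{ and } n > 1,$$

with both bounds holding for any number of input dimensions.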
I agree with the answer by R Carnell: there is no upper bound on the number of parameters/dimensions for which LHS is proven to be effective, though in many settings I've noticed that the relative benefits of LHS compared to simple random sampling tend to decrease as the number of dimensions increases. In practice, this behaviour doesn't actually matter: LHS is essentially never worse than simple random sampling, so you can always use it as a default sampling method and this decision won't cost you anything.

There's an interesting blog post by David Vose in which he explains why he doesn't implement LHS in his ModelRisk software. He seems to be considering the situation where it is trivial (by modern computing standards) to evaluate the output function at each sampled point in parameter space, so I don't think this article is a reason to avoid LHS. Indeed, many researchers continue to use LHS regularly as a default sampling option. I also note a blog post by Lonnie Chrisman which argues in favour of LHS as a default for sampling. This latter article suggests a rule of thumb that LHS is most effective when at most 3 inputs/dimensions contribute most of the variation in the output, and it contains a number of references to the literature: some researchers have found that LHS substantially outperforms simple random sampling, whereas others have noted minimal improvements. Once you move outside the realm of additive functions, it's very hard to predict how much of an improvement you'll get.
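These qualitative claims are easy to check numerically. Below is a minimal sketch of my own in Python with NumPy (the function names and test functions are illustrative choices, not anything from the posts cited above). It builds a Latin hypercube sample exactly as in the stratified-sampling analogy: each margin of the unit hypercube is cut into n equal strata, one point is drawn per stratum, and the strata are matched across dimensions by independent random permutations. It then compares the variance of the sample-mean estimator under LHS and simple random sampling for an additive and a non-additive test function at a few dimension counts.

import numpy as np

rng = np.random.default_rng(0)

def lhs(n, d):
    # Latin hypercube on [0, 1)^d: split each margin into n equal strata,
    # draw one point per stratum, and pair strata across dimensions with
    # an independent random permutation per column.
    strata = np.argsort(rng.random((n, d)), axis=0)  # a permutation of 0..n-1 in each column
    return (strata + rng.random((n, d))) / n

def srs(n, d):
    # Plain Monte Carlo / simple random sample on [0, 1)^d.
    return rng.random((n, d))

def var_of_mean(sampler, f, n, d, reps=2000):
    # Variance of the sample-mean estimate of E[f(X)] over repeated designs.
    return np.var([f(sampler(n, d)).mean() for _ in range(reps)])

f_additive = lambda x: x.sum(axis=1)             # additive in the margins
f_product = lambda x: np.prod(4.0 * x, axis=1)   # strongly non-additive

for d in (2, 5, 20):
    for name, f in (("additive", f_additive), ("non-additive", f_product)):
        ratio = var_of_mean(lhs, f, 50, d) / var_of_mean(srs, f, 50, d)
        print(f"d={d:2d}  {name:12s}  LHS/SRS variance ratio = {ratio:.3f}")

With this kind of experiment you should see the variance ratio stay well below 1 for the additive function at every dimension count, while for the non-additive product function it typically drifts towards 1 as d grows, which matches both the "no upper bound on dimensions" result and the observation that the relative benefit often shrinks in higher dimensions without LHS ever doing worse.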





