Importance Sampling

Importance Sampling is a way to compute an expectation/integral more efficiently by sampling from a different distribution that puts more probability mass on the parts of the domain that matter the most, and then reweighing them to keep the estimator correct.

Motivation: Monte Carlo Integration

For example, consider MC Integration:

where

  • where is bounded, i.e. .
  • is bounded and integrable (continuous)
  • is a PDF where with .

Note that and

Special Case: CMC via a Uniform Choice

More generally, any integral on can be rewritten as an expectation by introducing a PDF on :

One simple choice is the Uniform density on :

If we plug this is in,

such that

as desired.

Importance Sampling (Variance Reduction)

Instead of sampling from the “default” density, choose a proposal density on (with wherever ) and reweigh:

This gives

for and . Some remarks:

  • is called the importance function
  • is the likelihood ratio

Algorithm

  1. Generate iid.
  2. Set

Theorem

It’s worth asking what is the optimal to minimize the variance of . The variance

is minimized at PDF given by

Thus, the minimum variance is

Proof:

Let

Then with i.i.d., so

Using :

and

Therefore

Since does not depend on , making small makes the estimator variance small. Finally, the left term of the subtraction is minimized via Jensen’s Inequality.

Remarks

  • is the optimal importance function
  • If on then and so is . Intuitively, , so the above is .
  • But this requires knowing , which is already what we’re trying to do.
  • In general, involves knowing which is similar to , and practically, using is not possible.

Example 1

Consider on where and on .

By Crude MC,

the variance is

Note that

Example 2 (Using Variance Reduction)

Let’s try

Desmos Graph

This gives the estimator

for iid. So,

Then

by applying the variance formula for estimators above. What if we overdo it with a large ?

Desmos Graph

Let’s try for . being when is not ideal. What does that look like?

The estimator

with . Then

which shows it’s a bad estimator.