The goal is to construct a Markov Chain with a stationary distribution .
Metropolis Algorithm
First, we construct the proposal transition matrix where is symmetric. Then, choose an initial state randomly. Given ,
- Sample from , i.e. for all . We assume is observed.
- Let and generate .
- If then accept and .
- If then reject and .
This means we accept with probability . The output is a sequence
Remarks About Computation
- is a Markov Chain with .
- We can choose to be simple, like uniform jumps, where .
- We only need to now the ratio , so we don’t need to know the normalization value .
- is a proposal state.
- In practice, is easy to calculate.
- This algorithm works for non-symmetric and
Example 1 (1D Ising Model)
Recall the Ising Model, where we have particles each with spin or . So,
or the set of length strings of alphabet . Obviously .
Let
where 1. represents the “energy” of state .
Our target distribution2 is
where is the inverse temperature , and is normalization (partition function). This is hard to compute.
Note that a large small many and many
Here, we can apply the Metropolis Algorithm. Choose given by the random spin flips. For example, if :
we get the following. If , then the transition matrix is:
which is symmetric. Choose given . Some notes,
- represents the interaction strength between particles. It can be treated as a constant.
- is the external magnetic field on the particles. We can treat it as a constant.
Algorithm:
- Choose a site for uniformly at random among .
- Flip the th site to obtain the proposal state: Note that . This shows is symmetric.
- Compute Recall that we do this to avoid the partition function.
- Three cases for :
- If , (we picked a particle in the middle of the chain) then
- If (the left boundary), then
- If (the right boundary), similarly Thus,
- We can then accept with probability
- Thus,
- If then and set . (Here, the system lost energy, i.e. became more stable, so the second term is positive and thus greater than . We will accept).
- If then . (This is the opposite of above.)
- Generate .
- If , accept and
- If , reject and .
Why does it work?
- In step 2, we do not calculate since it is computationally inefficient. However, we can calculate the change, . Every term in cancels out but the changed term. This means we only need to find the sum of a few terms.
- The Ising model only involves interactions with the nearest neighbors. This is why we see changes in and , reducing the computations needed from to .
Theorem (Metropolis Satisfies Detailed Balance)
- The transition matrix of the Metropolis Chain is
- The idea in case is that we propose moving from with probability in Metropolis, but we accept this movement times. This is just the probability of these two events happening together.
- In case , accounts for the probability of rejection. Indeed, is the sum of chances we tried to move but failed.
- and satisfy detailed balance: for all . (Hence is a stationary distribution.)
- If and , then is unique, and
Proof:
: For
Then since is a stochastic matrix, for any state , the sum by law of total probability. This implies
which represents the “rejection probability”. The chain stays at state if either
- The proposal suggests staying at ,
- or if the the proposal was made, but was rejected with probability .
: We want to show that
Trivially, this is true for . Suppose . Then
However, this is only true because we let , i.e. the matrix must be symmetric. Thus Metropolis satisfies detailed balance, and is stationary.
: The condition ensures the chain is irreducible (any state can reach any other in one step). The condition ensures the chain is aperiodic. An irreducible, aperiodic chain on a finite state space is regular. By the FTMC, a regular matrix has a unique stationary distribution and .
Remark (Metropolis on Ising Model)
Using Metropolis on the Ising Model, we had
where sign flips would give us
which defines our proposal probability as
We can show that gives a regular Markov Chain. The idea is that any states in are at most flips apart.
Metropolis-Hastings Algorithm
We can run Metropolis with an asymmetric proposal matrix . The purpose of this is to relax the condition that must be symmetric in Theorem (Metropolis Satisfies Detailed Balance). This allows us to correct the model over-sampling states frequently proposed by regardless of the actual probability in the target distribution .
- Given , choose from .
- Accept with probability
The output is a Markov Chain of .
Convergence is hard to prove. Heuristically, consider that . So as , .
Footnotes
-
This is the Hamiltonian. ↩
-
This is the Boltzmann Distribution. ↩