This project parallelizes Monte Carlo methods. A Monte Carlo method uses random sampling to better
understand a distribution of interest. Remember in problem set 1 how we estimated pi by throwing
darts at a square? Our project is harder.
We contribute to the Monte Carlo canon by parallelizing across proposals, densities,
and parameters. This is achieved through MapReduce, MPI, and CUDA, respectively.
Our first contribution is parallelizing weight-based sampling through MapReduce.
Our input is a target distribution of interest and a proposal distribution we can sample
from. MapReduce first MAPS sampling and weighting duties out to various processes, then
REDUCES to calculate summary statistics. With scale we can improve the properties of our sample
estimates, especially for complicated target distributions. While no mainstream
library performs this computation, we provide two weight-based methods that work in both
one and two dimensions.
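The map/reduce split described above can be sketched in a few lines. This is a minimal single-process illustration, assuming a standard-normal target and a Uniform(-5, 5) proposal; the names (`target_density`, `map_task`, `reduce_task`) are our own stand-ins, not functions from any mainstream library.

```python
import math
import random

def target_density(x):
    # Unnormalized stand-in target: a standard normal.
    return math.exp(-0.5 * x * x)

def map_task(n, seed):
    """One mapper: draw n proposal samples and emit (sample, weight) pairs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.uniform(-5.0, 5.0)       # draw from the proposal
        w = target_density(x) / 0.1      # weight = target / proposal density
        pairs.append((x, w))
    return pairs

def reduce_task(all_pairs):
    """Reducer: self-normalized weighted estimate of the target mean."""
    total_w = sum(w for _, w in all_pairs)
    return sum(x * w for x, w in all_pairs) / total_w

# Simulate four mappers running in parallel under distinct seeds.
pairs = []
for seed in range(4):
    pairs.extend(map_task(5000, seed))
estimate = reduce_task(pairs)   # close to 0, the mean of the target
```

Because each mapper only needs its own seed, the sampling phase is embarrassingly parallel; all communication is deferred to the reduce step.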
Our second contribution is performing parallel tempering in two different ways using MPI.
Our input here is a target distribution with multiple regions of high density, separated
by a region of low density, for which standard MCMC techniques may fail to draw representative
samples, as seen here. Parallel tempering fixes this by running additional chains on flattened
("heated") versions of the original density and occasionally swapping states between chains.
We present two parallel implementations: one using the conventional master/slave model,
and another using coupled random number generation. The latter achieves a three-fold
speedup over the former, showing the importance of reducing unnecessary communication. Our
final results seem pretty awesome.
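The swap rule at the heart of parallel tempering can be sketched in a single process; in our actual implementations each chain lives on its own MPI rank. The bimodal target and the two-temperature ladder below are illustrative assumptions, not our exact setup.

```python
import math
import random

def log_target(x):
    # Two well-separated modes, at -4 and +4.
    return math.log(math.exp(-0.5 * (x - 4) ** 2) + math.exp(-0.5 * (x + 4) ** 2))

def metropolis_step(x, beta, rng, scale=1.0):
    """One random-walk Metropolis step on the flattened density p(x)^beta."""
    y = x + rng.gauss(0.0, scale)
    if math.log(rng.random()) < beta * (log_target(y) - log_target(x)):
        return y
    return x

rng = random.Random(0)
betas = [1.0, 0.3]       # beta = 1 is the cold chain; beta = 0.3 is flattened
states = [4.0, 4.0]      # both chains start stuck in the right-hand mode
cold_samples = []
for step in range(20000):
    states = [metropolis_step(x, b, rng) for x, b in zip(states, betas)]
    if step % 10 == 0:   # periodically propose swapping the chains' states
        log_alpha = (betas[0] - betas[1]) * (log_target(states[1]) - log_target(states[0]))
        if math.log(rng.random()) < log_alpha:
            states.reverse()
    cold_samples.append(states[0])

# Thanks to the swaps, the cold chain visits the left-hand mode too.
frac_left = sum(1 for x in cold_samples if x < 0) / len(cold_samples)
```

The only data the chains ever exchange is the swap decision, which is why cutting that communication down (as our coupled-RNG version does) pays off so much.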
Our third contribution is a pixel-clustering algorithm, achieved through Metropolized-Gibbs
sampling in CUDA.
We wish to assign each pixel to one of four blocking groups to 4-cluster this image. For each pixel,
we propose a switch of blocking group, and the pixel accepts or rejects the proposal based on
how strongly it prefers to block with similar neighboring pixels. We take advantage of the
Ising-model neighborhood structure to checkerboard-parallelize this in CUDA. We see that
oftentimes the best method of parallelism is specific to the problem. Here is our president
in 4 shades of gray; our parallel implementation performs reasonably well.
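The checkerboard idea can be sketched on the CPU: pixels of one checkerboard color share no neighbors, so all of them can be updated simultaneously, which is exactly what a CUDA kernel launch exploits. The 4-label energy, the constants, and the tiny synthetic image below are illustrative assumptions, not our exact kernel or photograph.

```python
import math
import random

K = 4            # number of blocking groups (gray levels)
BETA = 1.5       # strength of agreement with neighbors
H, W = 16, 16
rng = random.Random(1)

# Synthetic "image": left half dark (group 0), right half bright (group 3),
# corrupted with random labels.
observed = [[(0 if j < W // 2 else 3) if rng.random() < 0.7 else rng.randrange(K)
             for j in range(W)] for i in range(H)]
labels = [row[:] for row in observed]

def conditional(i, j):
    """Full conditional over the K groups for pixel (i, j)."""
    weights = []
    for k in range(K):
        agree = sum(1 for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= i + di < H and 0 <= j + dj < W
                    and labels[i + di][j + dj] == k)
        fidelity = -abs(k - observed[i][j])      # crude data-agreement term
        weights.append(math.exp(BETA * agree + fidelity))
    total = sum(weights)
    return [w / total for w in weights]

def metropolized_gibbs(i, j):
    """Propose a *different* group from the conditional, then accept/reject."""
    probs = conditional(i, j)
    cur = labels[i][j]
    others = [k for k in range(K) if k != cur]
    prop = rng.choices(others, weights=[probs[k] for k in others])[0]
    # Metropolized-Gibbs acceptance ratio (always defined, since probs[k] < 1).
    if rng.random() < (1 - probs[cur]) / (1 - probs[prop]):
        labels[i][j] = prop

def sweep(color):
    """Update every pixel of one checkerboard color; they are independent."""
    for i in range(H):
        for j in range(W):
            if (i + j) % 2 == color:
                metropolized_gibbs(i, j)

for _ in range(20):
    sweep(0)    # in CUDA this is one kernel launch over the "black" squares
    sweep(1)    # and one over the "white" squares
```

After a few sweeps the noisy labels smooth out into the two coherent regions, which is the clustering behavior we rely on for the full four-group image.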
Overall, our project uses the parallel computing frameworks learned in CS 205 to tackle a variety
of problems present within Monte Carlo methods. Thanks so much everyone! :D :D