This project parallelizes Monte Carlo methods. A Monte Carlo method uses random sampling to better
understand a distribution of interest. Remember in problem set 1 how we estimated pi by throwing
darts at a square? Our project is harder.
We contribute to the Monte Carlo canon by parallelizing across proposals, densities,
and parameters. This is achieved through MapReduce, MPI, and CUDA, respectively.
Our first contribution is parallelizing weight-based sampling through MapReduce.
Our input is a target distribution of interest and a proposal distribution we can sample
from. MapReduce first MAPS sampling and weighting duties out to various processes, then
REDUCES to calculate summary statistics. With scale we can improve the properties of our sample
estimates, especially for complicated target distributions. While no mainstream
library performs this computation, we provide two weight-based methods that work in both
one and two dimensions.
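The map/reduce split described above can be sketched in a few lines. This is a minimal single-process illustration, assuming a standard-normal target and a Uniform(-5, 5) proposal; the names (`target_density`, `map_task`, `reduce_task`) are our own stand-ins, not functions from any mainstream library.

```python
import math
import random

def target_density(x):
    # Unnormalized stand-in target: a standard normal.
    return math.exp(-0.5 * x * x)

def map_task(n, seed):
    """One mapper: draw n proposal samples and emit (sample, weight) pairs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.uniform(-5.0, 5.0)       # draw from the proposal
        w = target_density(x) / 0.1      # weight = target / proposal density
        pairs.append((x, w))
    return pairs

def reduce_task(all_pairs):
    """Reducer: self-normalized weighted estimate of the target mean."""
    total_w = sum(w for _, w in all_pairs)
    return sum(x * w for x, w in all_pairs) / total_w

# Simulate four mappers running in parallel under distinct seeds.
pairs = []
for seed in range(4):
    pairs.extend(map_task(5000, seed))
estimate = reduce_task(pairs)   # close to 0, the mean of the target
```

Because each mapper only needs its own seed, the sampling phase is embarrassingly parallel; all communication is deferred to the reduce step.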
Our second contribution is performing parallel tempering in two different ways using MPI.
Our input here is a target distribution with multiple regions of high density, separated
by a region of low density, for which standard MCMC techniques may fail to draw representative
samples, as seen here. Parallel tempering fixes this by running additional chains on flattened
("heated") versions of the original density and occasionally swapping states between chains.
We present two parallel implementations: one using the conventional master/slave model,
and another using coupled random number generation. The latter achieves a three-fold
speedup over the former, showing the importance of reducing unnecessary communication. Our
final results seem pretty awesome.
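The swap rule at the heart of parallel tempering can be sketched in a single process; in our actual implementations each chain lives on its own MPI rank. The bimodal target and the two-temperature ladder below are illustrative assumptions, not our exact setup.

```python
import math
import random

def log_target(x):
    # Two well-separated modes, at -4 and +4.
    return math.log(math.exp(-0.5 * (x - 4) ** 2) + math.exp(-0.5 * (x + 4) ** 2))

def metropolis_step(x, beta, rng, scale=1.0):
    """One random-walk Metropolis step on the flattened density p(x)^beta."""
    y = x + rng.gauss(0.0, scale)
    if math.log(rng.random()) < beta * (log_target(y) - log_target(x)):
        return y
    return x

rng = random.Random(0)
betas = [1.0, 0.3]       # beta = 1 is the cold chain; beta = 0.3 is flattened
states = [4.0, 4.0]      # both chains start stuck in the right-hand mode
cold_samples = []
for step in range(20000):
    states = [metropolis_step(x, b, rng) for x, b in zip(states, betas)]
    if step % 10 == 0:   # periodically propose swapping the chains' states
        log_alpha = (betas[0] - betas[1]) * (log_target(states[1]) - log_target(states[0]))
        if math.log(rng.random()) < log_alpha:
            states.reverse()
    cold_samples.append(states[0])

# Thanks to the swaps, the cold chain visits the left-hand mode too.
frac_left = sum(1 for x in cold_samples if x < 0) / len(cold_samples)
```

The only data the chains ever exchange is the swap decision, which is why cutting that communication down (as our coupled-RNG version does) pays off so much.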
Our third contribution is a pixel-clustering algorithm, achieved through Metropolized-Gibbs
sampling in CUDA.
We wish to assign each pixel to one of four blocking groups to 4-cluster this image. For each pixel,
we propose a switch of blocking group, and the pixel accepts or rejects the proposal based on
how strongly it prefers to block with similar neighboring pixels. We take advantage of the
Ising-model neighborhood structure to checkerboard-parallelize this in CUDA. We see that
oftentimes the best method of parallelism is specific to the problem. Here is our president
in 4 shades of gray; our parallel implementation performs reasonably well.
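The checkerboard idea can be sketched on the CPU: pixels of one checkerboard color share no neighbors, so all of them can be updated simultaneously, which is exactly what a CUDA kernel launch exploits. The 4-label energy, the constants, and the tiny synthetic image below are illustrative assumptions, not our exact kernel or photograph.

```python
import math
import random

K = 4            # number of blocking groups (gray levels)
BETA = 1.5       # strength of agreement with neighbors
H, W = 16, 16
rng = random.Random(1)

# Synthetic "image": left half dark (group 0), right half bright (group 3),
# corrupted with random labels.
observed = [[(0 if j < W // 2 else 3) if rng.random() < 0.7 else rng.randrange(K)
             for j in range(W)] for i in range(H)]
labels = [row[:] for row in observed]

def conditional(i, j):
    """Full conditional over the K groups for pixel (i, j)."""
    weights = []
    for k in range(K):
        agree = sum(1 for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= i + di < H and 0 <= j + dj < W
                    and labels[i + di][j + dj] == k)
        fidelity = -abs(k - observed[i][j])      # crude data-agreement term
        weights.append(math.exp(BETA * agree + fidelity))
    total = sum(weights)
    return [w / total for w in weights]

def metropolized_gibbs(i, j):
    """Propose a *different* group from the conditional, then accept/reject."""
    probs = conditional(i, j)
    cur = labels[i][j]
    others = [k for k in range(K) if k != cur]
    prop = rng.choices(others, weights=[probs[k] for k in others])[0]
    # Metropolized-Gibbs acceptance ratio (always defined, since probs[k] < 1).
    if rng.random() < (1 - probs[cur]) / (1 - probs[prop]):
        labels[i][j] = prop

def sweep(color):
    """Update every pixel of one checkerboard color; they are independent."""
    for i in range(H):
        for j in range(W):
            if (i + j) % 2 == color:
                metropolized_gibbs(i, j)

for _ in range(20):
    sweep(0)    # in CUDA this is one kernel launch over the "black" squares
    sweep(1)    # and one over the "white" squares
```

After a few sweeps the noisy labels smooth out into the two coherent regions, which is the clustering behavior we rely on for the full four-group image.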
Overall, our project uses the parallel computing frameworks learned in CS 205 to tackle a variety
of problems present within Monte Carlo methods. Thanks so much everyone! :D :D