My Introduction to parallel computing in R

I derive a sadistic satisfaction from making my computer work hard to run my code: peripheral applications lag, the cooling fan ramps up, and the CPU load hits 100%.

I love making my computer work hard.

The project

My latest research project requires me to improve models used to predict 'Black Spot' spore dispersal from infested crop residue following field peas. The model will be used to assess the timing and risk of spore showers for growers wanting to plant field peas, so they can either 'dodge' heavy spore showers by shifting the planting window or apply a fungicide.

Needless to say, simulating millions, billions, trillions, quadrillions, perhaps even over an octillion spores and their movements needs a hefty amount of computing power.

I translated an old model built to run in the Mathematica programming language, which used what seemed like a lot of 'for' loops. After many errors, mainly due to data formatting changes, I managed to test-run the code on one time increment.

One time increment for the simulation is one hour out of the whole season.

When I first successfully ran the code and no warnings spammed my console, I fist-pumped the air and my computer fan whirred. I felt a surge of dopamine pulse down my spine and into the rest of my body. Fifteen minutes and a celebratory cup of tea later, my computer finished the computation. Scaling this up to 4464 hours would need some coding improvements and a bigger, faster computer.

for(i in for_a_long_wait){ }

This led me to discover that 'for' loops were horribly inefficient, and my cascade of three nested loops was making my code run very slowly. I needed to learn to use 'apply' functions, which I had been avoiding all my programming life like an awkward teenage boy avoiding something he didn't understand ... like a girl.

The switch from 'for' loops to 'apply' functions was frustrating at first but extremely rewarding. Unless your computation is genuinely sequential, where each iteration relies on the result of the last, I suggest you avoid 'for' loops.
The computation time fell from 15 minutes -> 5 minutes -> ~1.5 minutes; however, I was still only using 1 core of the 8 on my machine.
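For illustration, the shape of the switch looks like this (a toy square-root computation, not the actual spore model):

```r
xs <- 1:1000

# 'for' loop version: pre-allocate, then fill the result element by element
out_loop <- numeric(length(xs))
for (i in seq_along(xs)) {
  out_loop[i] <- sqrt(xs[i])
}

# 'lapply' version: no explicit indexing or pre-allocation needed
out_apply <- unlist(lapply(xs, sqrt))

identical(out_loop, out_apply)  # TRUE
```

The 'lapply' call expresses the same computation in one line, and because there is no dependency between iterations, it is exactly the kind of work that can later be handed to a parallel analogue.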

Learning parallel

To improve computation time further, I needed to parallelise the computation to make use of every core. There are many R packages for parallel computation; 'parallel' is the one I used. However, I ran into problems.

# 'parallel' ships with base R (since R 2.14), so no install.packages() call is needed
library(parallel)

The simplest 'lapply' analogue for parallel computing, 'mclapply', relies on forking, which Windows does not support, so on a Windows version of R it falls back to using a single core. Therefore I needed to use the 'parLapply' function, which needs a few extra expressions before it runs successfully.

Parallel Help file usage

parLapply(cl = NULL, X, fun, ..., chunk.size = NULL)

The first argument 'cl' takes a cluster object, which determines how many cores are allocated to the computation. I have 8, and I allocated them using the following code.

cl8 <- makeCluster(getOption("cl.cores", 8))
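If you would rather not hard-code the core count, 'parallel' provides 'detectCores()'. A small sketch (leaving one core free is a common courtesy to the rest of the system, not something the package requires):

```r
library(parallel)

# Ask R how many logical cores this machine has,
# and keep one back so the system stays responsive
n_cores <- max(1, detectCores() - 1)

cl <- makeCluster(n_cores)
print(n_cores)
stopCluster(cl)  # release the workers when done
```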

The second argument 'X' is the data passed into the function. In my case it is a 'list' containing the coordinates the spores will travel from, along with an index of the spore load at each location.

The third argument 'fun' is the function I have written to compute the output 'list': where the spores will land.

This did not run successfully, and produced errors like:
Error in checkForRemoteErrors(val) : 
  8 nodes produced errors; first error: could not find function "partyInMyPants"

Error in checkForRemoteErrors(val) : 
  8 nodes produced errors; first error: object 'ofMyAffection' not found

As far as I can make out, these errors occur because each node/cluster worker starts with a fresh environment: any non-base functions and objects used inside 'fun' that live in the R global environment are invisible to the workers. All such objects and functions need to be exported to the cluster prior to the 'parLapply' call using the function 'clusterExport'. For example:

clusterExport(cl8, list("partyInMyPants", "ofMyAffection"))
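Putting it all together, a minimal sketch of the whole workflow. The function bodies here are toy stand-ins for my real 'partyInMyPants' function and 'ofMyAffection' object, not the actual spore model:

```r
library(parallel)

# Toy stand-ins for a helper function and a global object
# that the workers must be able to see
partyInMyPants <- function(x, shift) x * 2 + shift
ofMyAffection <- 10

# The function passed as 'fun'; it calls the helper and reads the global object
sporeFun <- function(x) partyInMyPants(x, ofMyAffection)

cl8 <- makeCluster(getOption("cl.cores", 8))

# Export everything 'sporeFun' touches into each worker's environment
clusterExport(cl8, list("partyInMyPants", "ofMyAffection"))

res <- parLapply(cl8, 1:4, sporeFun)

stopCluster(cl8)  # always release the workers when finished

unlist(res)  # 12 14 16 18
```

Without the 'clusterExport' line, this reproduces exactly the "could not find function" and "object not found" errors above.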

Success!!

My 8-core i5 Dell laptop, running Windows 10, ran the computation in 19 seconds.

When I uploaded the code to the 32-core high-performance computing RStudio Server instance at USQ, the computation took under 9 seconds.

However, using a remote server to run the computation left unrequited my sadistic thrill of hearing the fan speed max out and my computer lag. Perhaps that is the next thing I need to simulate.
