# Parallel Programming with Python

Thomas Langford, Ph.D.

Yale Center for Research Computing

Yale Wright Laboratory

November 11, 2020

# Tools and Requirements

• Language: Python 3.8
• Modules: pandas, numpy, multiprocessing, PIL (for image processing), mpi4py, matplotlib, cupy (for GPU parallelism)
• Jupyter notebook
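
Before starting, it can help to confirm that the listed modules are importable in the current environment. The following is a minimal sketch using only the standard library. The helper name `check_modules` is hypothetical, and the sketch assumes the import names match the list above (PIL is provided by the Pillow package).

```python
import importlib.util
import sys

# Import names taken from the module list above
REQUIRED = ["pandas", "numpy", "multiprocessing", "PIL", "mpi4py", "matplotlib", "cupy"]

def check_modules(names):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
for name, found in check_modules(REQUIRED).items():
    print(f"{name:16s} {'ok' if found else 'MISSING'}")
```

A missing cupy is expected on machines without a GPU and only affects the GPU section.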

# Comment: Python 2 versus 3

## Clone from GitHub

Navigate to the GitHub repository: https://github.com/ycrc/parallel_python

Clone the repository or download the zip file, which contains this notebook and the required data.

## MyBinder

Launches a live AWS container with the required tools installed and ready to go:

https://mybinder.org/v2/gh/ycrc/parallel_python/master

# Outline and Overview

• Introduction to parallel concepts
• Classes of parallel problems
• Python implementations of parallel processing
• GPU Parallelism
• Tools for further exploration

# Introduction to parallel concepts

## Serial Execution

Typical programs execute their lines sequentially:

In :
import numpy as np

# Define an array of numbers
foo = np.array([0, 1, 2, 3, 4, 5])

# Define a function that squares numbers
def bar(x):
    return x * x

# Loop over each element and perform an action on it
for element in foo:
    # Print the result of bar
    print(bar(element))

0
1
4
9
16
25


## The map function

A key tool that we will use later is map, which applies a function to each element of a list or array:

In :
# (Very) inefficient way to define a map function
def my_map(function, array):
    # create a container for the results
    output = []

    # loop over each element
    for element in array:
        # add the intermediate result to the container
        output.append(function(element))

    # return the now-filled container
    return output

In :
my_map(bar, foo)

Out:
[0, 1, 4, 9, 16, 25]

Python helpfully provides a built-in map function:

In :
list(map(bar, foo))

# NB: in Python 3, map returns a lazy iterator, so we convert it to a list for this comparison

Out:
[0, 1, 4, 9, 16, 25]

The built-in map function is more flexible than ours (for example, it accepts several iterables and evaluates its results lazily), so it is best to use that one instead.
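
As a brief illustration of that flexibility, here is a minimal sketch of two behaviours of the built-in map that our version lacks. The variable names are made up for this example.

```python
import operator

# map accepts several iterables and stops at the shortest one
lengths = [1, 2, 3]
widths = [10, 20, 30, 40]
areas = list(map(operator.mul, lengths, widths))
print(areas)  # [10, 40, 90]

# map is lazy: nothing is computed until the result is consumed
lazy_squares = map(lambda x: x * x, range(4))
first = next(lazy_squares)      # computes only the first value
remaining = list(lazy_squares)  # computes the rest
print(first, remaining)  # 0 [1, 4, 9]
```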

## Parallel Workers

In the example we showed before, no step of the map call depends on any other step.

Rather than waiting for the function to loop over each value, we could create multiple instances of the function bar and apply it to each value simultaneously.

This is achieved with the multiprocessing module and a pool of workers.

## The Multiprocessing module

The multiprocessing module provides a number of tools that simplify parallel processing.

One such tool is the Pool class. It lets us set up a group of processes to execute tasks in parallel. This is called a pool of worker processes.

First we will create the pool with a specified number of workers. We will then use the pool's map method to apply a function to our array.

In :
import multiprocessing

# Create a pool of processes
with multiprocessing.Pool(processes=6) as pool:
    # map the np.square function on our foo array
    result = pool.map(np.square, foo)

# output the results
print(result)

[0, 1, 4, 9, 16, 25]


The difference here is that the elements of this list can be handled by different worker processes rather than by a single one.

To show how this is actually being handled, let's create a new function:

In :
import os

def parallel_test(x):
    # print the index of the job and its process ID number
    print(f"x = {x}, PID = {os.getpid()}\n")


Now we can map this function on the foo array from before. First with the built-in map function:

In :
list(map(parallel_test, foo));

x = 0, PID = 150565

x = 1, PID = 150565

x = 2, PID = 150565

x = 3, PID = 150565

x = 4, PID = 150565

x = 5, PID = 150565



We see that every step is handled by the same process and that the steps are executed in order.

Now we try the same process using multiprocessing:

In :
with multiprocessing.Pool(processes=6) as pool:
    result = pool.map(parallel_test, foo)

x = 2, PID = 150587
x = 3, PID = 150588
x = 1, PID = 150586
x = 4, PID = 150589
x = 5, PID = 150590

x = 0, PID = 150585



Two things are worth noting:

1. In this run, each element is processed by a worker with a different PID
2. The tasks are not executed in order!
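
Although the workers print in whatever order they finish, pool.map still returns its results in the order of the inputs. A minimal sketch of this follows, assuming it is run as a script. The `if __name__ == "__main__":` guard is needed on platforms that start workers by launching a fresh interpreter. The function name `tag_with_pid` is hypothetical.

```python
import multiprocessing
import os

def tag_with_pid(x):
    """Return the input together with the PID of the process that handled it."""
    return x, os.getpid()

if __name__ == "__main__":
    inputs = [0, 1, 2, 3, 4, 5]

    with multiprocessing.Pool(processes=6) as pool:
        results = pool.map(tag_with_pid, inputs)

    # The first item of each pair follows the input order,
    # whichever worker happened to run it
    for x, pid in results:
        print(f"x = {x}, PID = {pid}")

    print("worker PIDs used:", sorted({pid for _, pid in results}))
```

If the order of the results does not matter, `pool.imap_unordered` yields each result as soon as it is ready instead.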

## Key Take-aways

1. The map function is designed to apply the same function to each item in an iterable
2. In serial processing, this works like a for-loop
3. Parallel execution sets up multiple worker processes that act separately and simultaneously

# Embarrassingly parallel problems

Many problems split into independent tasks and can be converted to parallel execution with little effort using the multiprocessing module.

## Example 1: Monte Carlo Pi Calculation

• Run multiple instances of the same simulation with different random number generator seeds
• Define a function to calculate pi that takes the random seed as input, then map it on an array of random seeds

In :
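import matplotlib.pyplot as plt
import numpy as np

# Shade the quarter circle of radius 1 inside the unit square
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(5, 5))
x = np.linspace(0, 1, 100)
ax.fill_between(x, np.sqrt(1 - x**2), 0, alpha=0.1)
ax.set(xlim=(0, 1.03), ylim=(0, 1.03), xlabel='X', ylabel='Y')

# Scatter 100 uniformly random points over the square
x = np.random.random(size=100)
y = np.random.random(size=100)
ax.plot(x, y, marker='.', linestyle='None');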

In :
def pi_mc(seed):
    num_trials = 500000
    counter = 0
    np.random.seed(seed)

    for j in range(num_trials):
        x_val = np.random.random_sample()
        y_val = np.random.random_sample()

        # count only the points that land inside the quarter circle of radius 1
        if x_val**2 + y_val**2 < 1:
            counter += 1

    return 4*counter/num_trials
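
The factor of 4 in the return value comes from comparing areas. The short derivation below assumes the points are drawn uniformly over the unit square and that the counter records the points with $x^2 + y^2 < 1$ (the shaded quarter circle in the plot above). The symbols $N_{\text{in}}$ and $N$ are introduced here for the counter and num_trials.

$$
\frac{N_{\text{in}}}{N} \approx \frac{A_{\text{quarter circle}}}{A_{\text{square}}} = \frac{\pi \cdot 1^2 / 4}{1^2} = \frac{\pi}{4}
\quad\Longrightarrow\quad
\pi \approx \frac{4\,N_{\text{in}}}{N}
$$

The statistical uncertainty of this estimate shrinks as $1/\sqrt{N}$, so running several independent seeds and averaging the results improves it.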


### Serial vs Parallel

In :
%timeit pi_mc(1)

1 loop, best of 5: 319 ms per loop

In :
seed_array = list(range(4))
%timeit list(map(pi_mc, seed_array))

1 loop, best of 5: 1.27 s per loop

In :
%%timeit

with multiprocessing.Pool(processes=4) as pool:
result = pool.map(pi_mc, seed_array)

1 loop, best of 5: 422 ms per loop


While the serial execution scales linearly (about 4x longer than a single call), the parallel execution does not quite match the single-call time (422 ms versus 319 ms). Starting the worker processes and passing data between them adds overhead that needs to be considered.
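
The %timeit magics only work inside IPython or Jupyter. A rough sketch of the same comparison as a plain script is shown here, using time.perf_counter for the timing. To stay self-contained it swaps NumPy for the standard-library random module, so the function `pi_mc_stdlib` and the helper `timed` are stand-ins made up for this example, not the notebook's pi_mc, and the timings will differ from those quoted above.

```python
import multiprocessing
import random
import time

def pi_mc_stdlib(seed, num_trials=200_000):
    """Monte Carlo estimate of pi using the standard-library generator."""
    rng = random.Random(seed)  # separate, seeded generator for each call
    inside = 0
    for _ in range(num_trials):
        x, y = rng.random(), rng.random()
        if x * x + y * y < 1.0:
            inside += 1
    return 4 * inside / num_trials

def timed(label, func):
    """Run func once, print the wall-clock time, and return its result."""
    start = time.perf_counter()
    result = func()
    print(f"{label}: {time.perf_counter() - start:.3f} s")
    return result

if __name__ == "__main__":
    seeds = list(range(4))

    serial = timed("serial map", lambda: list(map(pi_mc_stdlib, seeds)))

    # Pool start-up is inside the timed region, as in the %%timeit cell
    def run_parallel():
        with multiprocessing.Pool(processes=4) as pool:
            return pool.map(pi_mc_stdlib, seeds)

    parallel = timed("pool map (4 workers)", run_parallel)

    # Both runs use the same seeds, so the estimates can be compared directly
    print("identical results:", serial == parallel)
```

A single timed run is noisier than %timeit, which repeats the measurement, so the numbers are only indicative.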

## Example 2: Processing multiple input files

Say we have a number of input files, such as .jpg images, and we want to perform the same actions on each one, for example rotating it by 180 degrees and converting it to a different format.

We can define a function that takes a file as input and performs these actions, then map it on a list of files.
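
One way this could look as a standalone script is sketched below. It is an illustration under stated assumptions rather than the notebook's own implementation: the function names `output_path` and `rotate_and_convert`, the `./data/*.jpg` input pattern, the `_rotated` suffix, and the choice of PNG as the target format are all hypothetical.

```python
import glob
import multiprocessing
import os

def output_path(path, new_ext=".png"):
    """Build the output file name: 'a/b.jpg' -> 'a/b_rotated.png'."""
    root, _ = os.path.splitext(path)
    return f"{root}_rotated{new_ext}"

def rotate_and_convert(path):
    """Rotate one image by 180 degrees and save it in a different format."""
    # Imported inside the worker function so that this module can be
    # loaded even where Pillow is not installed
    from PIL import Image

    with Image.open(path) as im:
        # Pillow picks the output format from the file extension
        im.rotate(angle=180).save(output_path(path))
    return output_path(path)

if __name__ == "__main__":
    # Assumed location of the input images
    files = sorted(glob.glob("./data/*.jpg"))

    if files:
        with multiprocessing.Pool(processes=4) as pool:
            written = pool.map(rotate_and_convert, files)
        print(written)
    else:
        print("no input files found")
```

Because each file is processed independently and writes to its own output file, the workers never need to coordinate with one another.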

In :
# import python image library functions
from PIL import Image

from matplotlib.pyplot import imshow
%matplotlib inline

In :
#Read image
im = Image.open( './data/kings_cross.jpg' )
#Display image
im

Out:

In :
im.rotate(angle=180)

Out: