Here, Pool creates the pool of
processes that control the workers
and gets the environment ready to
run multiple tasks. Pool provides two
prominent functions, map and apply,
that are the parallel equivalents of
Python's built-in map function and
the classic apply function.
These will lock the main program until
each process has finished, which is
useful when we want to obtain results
in a specific order. If we want to obtain
results as soon as they are processed
and do not require them to be in a
particular order, we can use Pool’s
asynchronous variants, apply_async
and map_async. While using the
asynchronous versions, we need to use
a get method after the async call in
order to obtain the return values of the
finished processes. The map function
here takes as input a function and an
“iterable” of parameters. The function
is then called for each parameter in
the iterable, distributing the calls over
the processes in the pool, and the
results are put into a list.
Pool.close() informs the
pool that no new tasks will be
added. It is required to
call either Pool.close or Pool.
terminate before calling Pool.join.
Pool.join() will halt the program
execution and wait for all the worker
processes to complete. Several such details
and best practices are outlined in the
module’s Programming Guidelines [4].
Executing the parallel code with
two processes takes 145 seconds,
while running with three processes
takes just 109 seconds, giving an
almost 3x speedup. As shown in
the accompanying figure [1], we
can execute the parallel version with
different numbers of processes
to spawn. However, in this
example, we notice that the time
taken when launching more than three
processes is only marginally lower.
In fact, we observe that the time
taken remains almost constant up to
a certain number of processes and
then, counterintuitively, increases as
we increase the number of processes.
This increase can be attributed to the
fact that the CPU consists of only four
cores and is also running background
system processes (including those of
the operating system). The fourth core thus
does not have enough spare capacity
to further increase performance
when more processes are spawned.
Other factors (such as data transfer,
hardware cache levels, and inter-
process communication) also create
an overhead with every new process
launched. For more complex tasks,
certain issues (such as balancing the
load between processes) come into play.
Steps for designing parallel programs.