Parallel For Loops in Python: Examples with Joblib

WIP Alert: This is a work in progress. Current information is correct but more content may be added in the future.

Tested under Python 3.x

The joblib.Parallel construct is a convenient tool for spreading computation across multiple cores.

It is useful when you need to loop over a large iterable object (a list, a pandas DataFrame, etc.) and you think that your task is CPU-intensive.

In rough terms, it spawns multiple Python processes and handles each part of the iterable in a separate process. Then it joins everything at the end.

Simplest possible example

from math import sqrt
from joblib import Parallel, delayed

# single-core code
sqroots_1 = [sqrt(i ** 2) for i in range(10)]

# parallel code
sqroots_2 = Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(10))
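
To see the "separate processes, then join" behaviour described above, here is a minimal sketch. The helper name `worker_pid` is made up for this illustration. It assumes joblib's default process-based backend, and the number of distinct process IDs you see will depend on your machine.

```python
import os

from joblib import Parallel, delayed


def worker_pid(i):
    # Return the input together with the ID of the process that handled it
    return i, os.getpid()


results = Parallel(n_jobs=2)(delayed(worker_pid)(i) for i in range(8))

# Results are joined back into a list in the same order as the inputs
indices = [i for i, _ in results]
# Set of process IDs that did the work
pids = {pid for _, pid in results}

print(indices)
print("parent pid:", os.getpid(), "worker pids:", pids)
```

The results come back in input order, even though the individual calls may have run in different processes.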

A more complex example (process a large XML file)

The function must return a value

To adapt the example above to your own function, just define it and pass its name to delayed:

# Python XML Processing Module
import xml.etree.ElementTree as ET

from joblib import Parallel, delayed

FILE = 'path/to/your/file'

tree = ET.parse(FILE)
dataset = tree.getroot()

def process_node(xml_node):
    # extract some information from
    # the xml node

    return 'node information'

# n_jobs=-1 means: use all available cores
element_information = Parallel(n_jobs=-1)(delayed(process_node)(node) for node in dataset)
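
If you want to try the pattern above without a file on disk, the following sketch parses a small in-memory document instead. The XML content and the fields extracted in `process_node` (an `id` attribute and a `title` child) are invented for this illustration and are not part of the original example.

```python
import xml.etree.ElementTree as ET

from joblib import Parallel, delayed

# Hypothetical sample document, used only for illustration
XML_TEXT = """
<library>
  <book id="1"><title>First</title></book>
  <book id="2"><title>Second</title></book>
  <book id="3"><title>Third</title></book>
</library>
"""


def process_node(xml_node):
    # Extract the id attribute and the text of the <title> child
    return xml_node.get("id"), xml_node.findtext("title")


dataset = ET.fromstring(XML_TEXT)

# One task per child node of the root element
element_information = Parallel(n_jobs=2)(
    delayed(process_node)(node) for node in dataset
)

print(element_information)
```

Each call returns a plain tuple, so the final result is a list with one tuple per `<book>` node, in document order.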

Troubleshooting: name bindings

TODO example with tfidfvectorizer and analyze

Troubleshooting: Python won't use all processors

This may happen if you define a function that takes a large parameter.

Make sure large variables are already bound when you define the function.

This ensures each worker has access to all the information it needs to process its part.
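
One possible reading of this advice, as a rough sketch: instead of handing the large object to the function on every call, the function refers to a name that already exists when it is defined. The `EMBEDDINGS` dictionary and the `squared_norm` function below are hypothetical stand-ins for a real embeddings index, and whether this helps in your case depends on your data and setup.

```python
from joblib import Parallel, delayed

# Hypothetical stand-in for a large lookup table (e.g. an embeddings index)
EMBEDDINGS = {f"word{i}": [float(i), float(i) * 2] for i in range(1000)}


def squared_norm(word):
    # EMBEDDINGS is already bound when this function is defined,
    # so only the small `word` argument is passed for each task
    vector = EMBEDDINGS[word]
    return sum(x * x for x in vector)


words = ["word1", "word2", "word3"]
norms = Parallel(n_jobs=2)(delayed(squared_norm)(w) for w in words)

print(norms)
```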

TODO example with embeddings index

