multiprocessing uses the pickle module to serialize objects sent between processes, but pickle does not support functions with closures, lambdas, or functions defined in __main__:
In [1]: import pickle

In [2]: pickle.dumps(lambda x: x)
PicklingError: Can't pickle <function <lambda> at 0x0000000005AC8048>: attribute lookup <lambda> on __main__ failed
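For contrast, pickle handles plain importable functions without trouble, because it only stores a reference (module plus name); a minimal check along these lines:

import math
import pickle

# An importable function is pickled by reference (module + name), so this works.
payload = pickle.dumps(math.sqrt)
print(pickle.loads(payload)(4.0))   # 2.0

# A lambda has no importable name, hence the PicklingError above.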
Here is an example I tried:
In [1]: from multiprocessing import Pool

class C(object):
    def __init__(self):
        self.x = 2

    def method(self):
        def square():
            return self.x**2
        p = Pool(4)
        p.map(square, range(10))

c = C()
c.method()

AttributeError: Can't pickle local object 'C.method.<locals>.square'
solution: dill
In [1]: import dill
   ...: dill.dumps(lambda x: x)
Out[1]: b'\x80\x03cdill.dill\n_create_function\nq\x00(cdill.dill\n_load_type\nq\x01X\x08\x00\x00\x00CodeTypeq\x02\x85q\x03Rq\x04(K\x01K\x00K\x01K\x01KCC\x04|\x00S\x00q\x05N\x85q\x06)X\x01\x00\x00\x00xq\x07\x85q\x08X\x1f\x00\x00\x00<ipython-input-14-80b851ab3476>q\tX\x08\x00\x00\x00<lambda>q\nK\x02C\x00q\x0b))tq\x0cRq\rc__builtin__\n__main__\nh\nNN}q\x0etq\x0fRq\x10.'
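A quick round-trip check (assuming dill is installed) shows the serialized lambda can actually be restored and called:

import dill

# dill serializes the code object itself, not just a name reference,
# so the lambda survives a dumps/loads round trip.
restored = dill.loads(dill.dumps(lambda x: x ** 2))
print(restored(3))   # 9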
dill and multiprocessing: pathos
– dill: a utility to serialize all of python
– pox: utilities for filesystem exploration and automated builds
– klepto: persistent caching to memory, disk, or database
– multiprocess: better multiprocessing and multithreading in python (see the sketch after this list)
– ppft: distributed and parallel python
– pyina: MPI parallel map and cluster scheduling
– pathos: graph management and execution in heterogeneous computing
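Since multiprocess is meant as a drop-in replacement for the standard library module (it simply swaps pickle for dill), the failing lambda case from the top should work by changing only the import; a rough sketch:

# multiprocess mirrors the multiprocessing API but serializes with dill,
# so lambdas and closures can be shipped to worker processes.
from multiprocess import Pool

with Pool(4) as p:
    print(p.map(lambda x: x ** 2, range(10)))   # [0, 1, 4, 9, ..., 81]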
from pathos.multiprocessing import Pool

p = Pool(4)
p.map(lambda x: x**2, range(10))
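The class example from earlier should also go through with pathos, since dill can serialize the nested square function; a sketch (square now takes the mapped argument, so the map call makes sense):

from pathos.multiprocessing import Pool

class C(object):
    def __init__(self):
        self.x = 2

    def method(self):
        # A nested function closing over self: plain pickle refuses this,
        # but pathos serializes it with dill.
        def square(i):
            return self.x ** i
        p = Pool(4)
        return p.map(square, range(10))

c = C()
print(c.method())   # [1, 2, 4, 8, ..., 512]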