I encountered a weird problem while using the Python multiprocessing library.
My code is sketched below: I spawn a process for each “symbol, date” tuple. I combine the results afterwards.
I expected that once a process has finished computing for a “symbol, date” tuple, it would release its memory, but apparently that’s not the case. I see dozens of processes (even though I set the process pool to size 7) that are suspended¹ on the machine. They consume no CPU, and they don’t release their memory.
How do I let a process release its memory, after it has done its computation?
¹ by “suspended” I mean their status in the ps command is shown as “S+”
    def do_one_symbol(symbol, all_date_strings):
        pool = Pool(processes=7)
        results = []
        for date in all_date_strings:
            res = pool.apply_async(work, [symbol, date])
            results.append(res)

        gg = mm = ss = 0
        for res in results:
            g, m, s = res.get()
            gg += g
            mm += m
            ss += s
Try setting the maxtasksperchild argument on the pool. If you don’t, the pool reuses each process over and over again, so its memory is never released. When set, the process is allowed to die after completing that many tasks, and a new one is created in its place. That effectively cleans up the memory.
I guess it’s new in 2.7: http://docs.python.org/2/library/multiprocessing.html#module-multiprocessing.pool
You should probably call close() followed by join() on your Pool. From the docs on join():

Wait for the worker processes to exit. One must call close() or terminate() before using join().