Python multiprocessing – How to release memory when a process is done?


I encountered a weird problem while using the Python multiprocessing library.

My code is sketched below: I spawn a process for each “symbol, date” tuple. I combine the results afterwards.

I expect that when a process has finished computing for a “symbol, date” tuple, it should release its memory — but apparently that’s not the case. I see dozens of processes (even though I set the process pool to size 7) that are suspended¹ on the machine. They consume no CPU, and they don’t release their memory.

How do I let a process release its memory, after it has done its computation?


¹ by “suspended” I mean their status in the ps command is shown as “S+”

from multiprocessing import Pool

def do_one_symbol(symbol, all_date_strings):
    pool = Pool(processes=7)
    results = []
    for date in all_date_strings:
        res = pool.apply_async(work, [symbol, date])
        results.append(res)  # keep the AsyncResult so we can collect it below

    gg = mm = ss = 0
    for res in results:
        g, m, s = res.get()
        gg += g
        mm += m
        ss += s

Did you try closing the pool with pool.close() and then waiting for the processes to finish with pool.join()? If the parent process keeps running and does not wait for its child processes, they become zombies.

Try setting the maxtasksperchild argument on the pool (I guess it’s new in 2.7). If you don’t, the process is reused over and over again by the pool, so the memory is never released. When set, the process will be allowed to die and a new one created in its place. That will effectively clean up the memory.

You should probably call close() followed by join() on your Pool object.

Wait for the worker processes to exit. One must call close() or terminate() before using join().

The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.