Running in parallel

GNU parallel is a handy tool for easily executing jobs in parallel across any number of cores.

Suppose we have 32 jobs we want to run across 4 cores. A simple way to parallelize this is to assign 8 jobs to each core up front, but if the jobs take different amounts of time, some cores finish early and sit idle. GNU parallel instead starts a new job as soon as one finishes, keeping all cores active and reducing the dead time.
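
To see why this matters, here is a minimal sketch in plain bash that compares the two strategies on made-up job durations. The function names `static_makespan` and `dynamic_makespan` and the durations are illustrative assumptions, and the "dynamic" version is only a simplified model of starting the next job whenever a core frees up, not GNU parallel's actual code.

```shell
#!/usr/bin/env bash
# Toy comparison of two scheduling strategies (all durations are invented).

# Static split: cut the job list into equal contiguous chunks, one per core.
# Prints the time at which the slowest core finishes.
static_makespan() {
  local cores=$1; shift
  local -a dur=("$@")
  local per_core=$(( (${#dur[@]} + cores - 1) / cores ))
  local max=0 c i sum
  for (( c = 0; c < cores; c++ )); do
    sum=0
    for (( i = c * per_core; i < (c + 1) * per_core && i < ${#dur[@]}; i++ )); do
      (( sum += dur[i] ))
    done
    (( sum > max )) && max=$sum
  done
  echo "$max"
}

# Dynamic: hand each job, in order, to whichever core becomes free first.
dynamic_makespan() {
  local cores=$1; shift
  local -a free_at=()
  local c d best max=0
  for (( c = 0; c < cores; c++ )); do free_at[c]=0; done
  for d in "$@"; do
    best=0
    for (( c = 1; c < cores; c++ )); do
      (( free_at[c] < free_at[best] )) && best=$c
    done
    (( free_at[best] += d ))
  done
  for (( c = 0; c < cores; c++ )); do
    (( free_at[c] > max )) && max=${free_at[c]}
  done
  echo "$max"
}

# Eight jobs on two cores: four long jobs first, then four short ones.
echo "static:  $(static_makespan 2 9 8 7 6 1 1 1 1)"
echo "dynamic: $(dynamic_makespan 2 9 8 7 6 1 1 1 1)"
```

With these invented numbers the static split takes 30 time units, because one core receives all the long jobs, while the dynamic approach finishes in 17.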

Installation:

    (wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash

Suppose you then have X jobs named script_n.sh, where n is an identifier. The easiest way to run them across Y cores is:

    parallel -j Y bash -c ::: ./script_*.sh
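
As a rough end-to-end check, the sketch below creates four throwaway scripts in a temporary directory and runs them two at a time. The script contents, the `out_n.txt` file names and the choice of `-j 2` are assumptions made for illustration. Note that `bash -c ./script_1.sh` executes the file as a command, so each script needs its executable bit set.

```shell
#!/usr/bin/env bash
# Sketch: build four trivial jobs and run them two at a time.
workdir=$(mktemp -d)

for n in 1 2 3 4; do
  # Each hypothetical job just writes a marker file when it runs.
  printf '#!/bin/bash\necho "job %s done" > out_%s.txt\n' "$n" "$n" \
    > "$workdir/script_$n.sh"
  chmod +x "$workdir/script_$n.sh"   # required because bash -c runs it as a command
done

if command -v parallel >/dev/null 2>&1; then
  # Run inside the temp dir so the glob and the output files stay together.
  (cd "$workdir" && parallel -j 2 bash -c ::: ./script_*.sh && cat out_*.txt) \
    || echo "parallel run failed" >&2
else
  echo "GNU parallel not found; install it first" >&2
fi
```

If a script is not executable, each job fails with a "Permission denied" error; running `chmod +x script_*.sh` beforehand avoids this.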

To keep parallel running after you log out, for example when your remote session expires, run it with nohup:

    nohup parallel -j Y bash -c ::: ./script_*.sh
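
On its own, nohup still occupies the terminal until the run finishes, and it writes output to `nohup.out` when stdout is a terminal. A common pattern is to append `&` and redirect the output to a log file of your choosing. The helper `run_detached`, the variable `detached_pid` and the log file name below are hypothetical, shown only as a sketch of that pattern.

```shell
#!/usr/bin/env bash
# Sketch: detach a long run from the terminal and keep its output in a log.
# run_detached is a hypothetical helper, not part of GNU parallel or nohup.
run_detached() {
  local log=$1; shift
  # nohup makes the command ignore the hangup signal sent at logout;
  # '&' puts it in the background so the prompt returns immediately.
  nohup "$@" > "$log" 2>&1 &
  detached_pid=$!   # PID of the background job, for a later wait or kill
}

# Usage with the command from above (4 stands in for Y):
#   run_detached parallel.log parallel -j 4 bash -c ::: ./script_*.sh
#   echo "started as PID $detached_pid"
```

You can then follow progress with `tail -f parallel.log` from any later session.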

An excellent full how-to guide is found here

