Questions tagged [gnu-parallel]

GNU parallel is a command-line utility to run programs in parallel, on a single (usually multiprocessor) machine or over multiple machines via ssh.

For questions about running programs in parallel in general, or about the different tool called parallel from Joey Hess's moreutils, use the tags dedicated to those topics.

255 questions
255
votes
10 answers

Parallelize a Bash FOR Loop

I have been trying to parallelize the following script, specifically each of the three FOR loop instances, using GNU Parallel but haven't been able to. The 4 commands contained within the FOR loop run in series, each loop taking around 10 minutes.…
Ravnoor S Gill
  • 2,653
  • 3
  • 12
  • 4
50
votes
1 answer

GNU parallel vs & (I mean background) vs xargs -P

I'm confused about the difference or advantage (if any) of running a set of tasks in a .sh script using GNU parallel, e.g. Ole Tange's answer: parallel ./pngout -s0 {} R{} ::: *.png, rather than, say, looping through them and putting them in the background…
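The three approaches named in this question can be put side by side. This is a minimal sketch, not any answerer's method: the function names are hypothetical, `echo processed` stands in for a real per-file command such as pngout, and the cap of 4 jobs is an arbitrary choice.

```shell
# Background jobs: every task starts at once; nothing caps concurrency.
run_background() {
  for item in "$@"; do
    echo "processed $item" &
  done
  wait   # block until all background jobs have exited
}

# xargs -P: at most 4 tasks run at the same time.
run_xargs() {
  printf '%s\n' "$@" | xargs -n 1 -P 4 echo processed
}

# GNU parallel: by default runs about one job per CPU and keeps
# each job's output together instead of interleaving it.
run_parallel() {
  parallel echo processed {} ::: "$@"
}
```

With any of the three, the order of the output lines is not guaranteed, since it depends on which task finishes first.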
37
votes
8 answers

Parallelise rsync using GNU Parallel

I have been using an rsync script to synchronize data at one host with the data at another host. The data has numerous small-sized files that contribute to almost 1.2TB. In order to sync those files, I have been using the rsync command as follows: rsync…
Mandar Shinde
  • 3,156
  • 11
  • 39
  • 58
20
votes
3 answers

Can GNU parallel output stdout before the program has exited?

echo 'echo "hello, world!";sleep 3;' | parallel This command does not output anything until it has completed. Parallel's man page claims: GNU parallel makes sure output from the commands is the same output as you would get had you run the commands…
Luc
  • 3,418
  • 3
  • 26
  • 37
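The delay described in this question comes from GNU parallel grouping each job's output until the job ends. A minimal sketch of one way around it, assuming a GNU parallel version that provides the documented `--line-buffer` option; the helper name `run_streaming` is hypothetical.

```shell
# Read commands from stdin, one per line, and print each complete
# output line as soon as a job writes it instead of waiting for the
# job to exit. Lines from different jobs may interleave, but a single
# line is not split in the middle.
run_streaming() {
  parallel --line-buffer
}
```

For example, `echo 'echo "hello, world!"; sleep 3' | run_streaming` should show the greeting before the sleep finishes. The `--ungroup` option goes further and disables buffering entirely, at the risk of mixing partial lines from different jobs.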
18
votes
6 answers

using parallel to process unique input files to unique output files

I have a shell scripting problem where I'm given a directory full of input files (each file containing many input lines), and I need to process them individually, redirecting each of their outputs to a unique file (aka, file_1.input needs to be…
J Jones
  • 325
  • 1
  • 3
  • 8
16
votes
2 answers

Why does (GNU?) parallel fail silently, and how do I fix it?

In a larger script to post-process some simulation data I had the following line: parallel bnzip2 -- *.bz2 Which, if I understand parallel correctly (and I may not), should run n-core threads of the program over all files with the listed extension.…
Hooked
  • 1,343
  • 3
  • 17
  • 24
13
votes
1 answer

Why doesn't GNU parallel work with "bash -c"?

% echo -e '1\n2' | parallel "bash -c 'echo :\$1' '' {}" :1 :2 % echo -e '1\n2' | parallel bash -c 'echo :\$1' '' {} % I'd expect the second line to act the same.
Raitis Veinbahs
  • 343
  • 3
  • 9
13
votes
3 answers

How would I use GNU Parallel for this while loop?

So I have a while loop: cat live_hosts | while read host; do \ sortstuff.sh -a "$host" > sortedstuff-"$host"; done But this can take a long time. How would I use GNU Parallel for this while loop?
Proletariat
  • 669
  • 3
  • 16
  • 28
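One possible translation of the while loop in this question, as a minimal sketch: it assumes GNU parallel is installed and that sortstuff.sh is on the PATH. The wrapper name `sort_hosts` is hypothetical; the command string mirrors the loop body from the question.

```shell
# Run one sortstuff.sh job per line of the host list.
# $1 is the file of host names (live_hosts in the question).
# {} is replaced by the current line; the redirection is inside the
# quotes so that each job writes to its own output file.
sort_hosts() {
  hosts_file=$1
  parallel 'sortstuff.sh -a {} > sortedstuff-{}' :::: "$hosts_file"
}
```

Calling `sort_hosts live_hosts` then corresponds to the original loop, with several hosts processed at the same time.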
11
votes
2 answers

GNU Parallel: immediately display job stderr/stdout one-at-a-time by jobs order

I know that GNU Parallel buffers stdout/stderr because it doesn't want jobs' output to be mangled, but if I run my jobs with parallel do_something ::: task_1 task_2 task_3, is there any way for task_1's output to be displayed immediately, then after…
Hai Luong Dong
  • 566
  • 1
  • 4
  • 8
10
votes
1 answer

Does "parallel --jobs 10" mean that exactly 10 jobs will run?

When specifying the option --jobs to GNU parallel, what exactly does it mean? I execute: parallel --jobs 10 ./program ::: {1..100} where program is an intensive task, and the jobs are completely independent of each other. {1..100} represents…
a06e
  • 1,627
  • 4
  • 24
  • 31
9
votes
4 answers

Using GNU Parallel With Split

I'm loading a pretty gigantic file to a postgresql database. To do this I first use split on the file to get smaller files (30Gb each) and then I load each smaller file to the database using GNU Parallel and psql copy. The problem is that it takes…
Topo
  • 285
  • 1
  • 4
  • 9
9
votes
3 answers

How to get GNU parallel on Amazon Linux?

Preferably without having to compile it from source. I tried adding repositories I found on Google: CentOS 6 and CentOS 5, but both give me: [ec2-user@ip-10-0-1-202 yum.repos.d]$ sudo yum install parallel -y Loaded plugins: priorities, update-motd,…
Matt Chambers
  • 241
  • 1
  • 2
  • 5
9
votes
3 answers

How to use GNU parallel effectively

Suppose I want to find all the matches in compressed text file: $ gzcat file.txt.gz | pv --rate -i 5 | grep some-pattern pv --rate used here for measuring pipe throughput. On my machine it's about 420Mb/s (after decompression). Now I'm trying to do…
Denis Bazhenov
  • 191
  • 1
  • 5
9
votes
2 answers

GNU Parallel Limit Memory Usage

Is it possible to limit the memory usage of all processes started by GNU parallel? I realize there are ways to limit the number of jobs, but in cases where it isn't easy to predict the memory usage ahead of time it can be difficult to tune this…
Joe
  • 341
  • 4
  • 7
9
votes
2 answers

Can GNU Parallel execute more parallel processes?

Can I for example execute: parallel -j 200 < list0 Where "list" has: nice -n -20 parallel -j 100 < list2 nice -n -20 parallel -j 100 < list1 Would this be feasible/possible?
Dominique
  • 5,155
  • 8
  • 26
  • 29