5

Can scp be used to send a single file to multiple remote servers at the same time? If so, how? If not, what's the alternative?

jasonwryan
N. F.
  • Why do you need that "at the same time"? What is your scenario? – Nils Sep 15 '12 at 22:06
  • at the same time -> in parallel – N. F. Sep 19 '12 at 02:33
  • I know what at the same time is. The question is WHY? It is one thing asking a question - but I am interested in your use-case, too. I never needed that up until now... – Nils Sep 19 '12 at 13:35
  • Well, I needed to do some experiments where the root site sends copies of the same thing to multiple sites at the same time, and the sites do some processing and then send the results back to the root site. – N. F. Sep 19 '12 at 19:44

2 Answers

10

pdcp from the pdsh package is one option. pdsh was written to help with management of HPC clusters - I've used it for that, and I've also used it for management of multiple non-clustered machines.

pdsh and pdcp use genders to define hosts and groups of hosts (a "group" is any arbitrary tag you choose to assign to a host, and hosts can have as many tags as you want.)

For example, if you had a group called 'webservers' in /etc/genders that included hostA, hostB, hostC, then pdcp -g webservers myscript.sh /usr/local/bin would copy myscript.sh into /usr/local/bin/ on all three hosts.
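
As a rough illustration of the example above, the matching entries in /etc/genders might look like the fragment below. This is a sketch that assumes the usual genders layout of one node name followed by a comma-separated list of attributes; the host names and the 'webservers' and 'all' tags are the ones used in this answer, not anything pdsh requires.

```
# /etc/genders (illustrative): node name, then its comma-separated tags
hostA webservers,all
hostB webservers,all
hostC webservers,all
```

With entries like these, -g webservers would select all three hosts, and so would -g all.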

Similarly, pdsh -g all uname -r would run uname -r on every host tagged with "all" in /etc/genders, with the output from each host prefixed with the host's name.

$ pdsh -g all uname -r
indra: 3.2.0-3-amd64
kali: 3.2.0-3-amd64
ganesh: 3.2.0-3-amd64
hanuman: 3.2.0-2-686-pae

pdsh commands and pdcp copies are executed in parallel (with limits and timeouts to prevent overloading of the originating system).

When the command being run produces multi-line output, it can get quite confusing to read. Another program in the pdsh package, called dshbak, can group the output by hostname for easier reading.
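
A minimal sketch of how dshbak is typically used: pipe the pdsh output through it. The wrapper name pdsh_grouped is made up for this example; the -c option asks dshbak to print identical output from different hosts only once, alongside the list of hosts that produced it.

```shell
# Hypothetical helper: run a command via pdsh and let dshbak tidy the output.
# All arguments are passed straight through to pdsh.
pdsh_grouped() {
    pdsh "$@" | dshbak -c
}
```

For example, `pdsh_grouped -g all uname -r` would group the hosts that report the same kernel version instead of printing one line per host.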


After seeing all your comments, I think pdsh and pdcp may be overkill for your needs. They're really designed as a system admin's tool rather than a normal non-root user's tool.

A simple shell script wrapper around scp may be good enough for you. For example, here's an extremely simple, minimalist version of such a wrapper script:

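#!/bin/bash

# a better version would use a command line arg (e.g. -h) to get a
# comma-separated list of hostnames, but hard-coding it here illustrates
# the concept well enough.
HOSTS="[email protected] [email protected]"

# need at least one file plus the target directory
if [ "$#" -lt 2 ]; then
    echo "usage: $0 file... target_dir" >&2
    exit 1
fi

# last argument is the target directory on the remote hosts
target_dir="${@: -1}"

# all but the last arg are the files to copy (an array keeps
# filenames containing spaces intact)
files=("${@:1:$#-1}")

# copy to each host in turn (serially, one scp at a time)
for h in $HOSTS; do
    scp "${files[@]}" "$h:$target_dir"
done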
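
The loop above copies to one host at a time. If the copies need to run concurrently, one possible variation is to start each scp in the background and then wait for all of them. This is only a sketch: the function name parallel_scp and the PSCP_HOSTS variable are invented for this example, and it assumes password-less (public key) ssh access to every host, since backgrounded scp processes cannot sensibly prompt for passwords.

```shell
# Sketch: copy files to several hosts concurrently by backgrounding each scp.
# PSCP_HOSTS (hypothetical name) holds a space-separated list of user@host targets.
parallel_scp() {
    if [ "$#" -lt 2 ]; then
        echo "usage: parallel_scp file... target_dir" >&2
        return 2
    fi

    # last argument is the target directory on the remote hosts
    local target_dir="${@: -1}"
    # all but the last argument are the files to copy
    local files=("${@:1:$#-1}")
    local h pid failed=0
    local pids=()

    for h in $PSCP_HOSTS; do
        scp "${files[@]}" "$h:$target_dir" &   # start the copy in the background
        pids+=("$!")                           # remember its process ID
    done

    # wait for every transfer and note whether any of them failed
    for pid in "${pids[@]}"; do
        wait "$pid" || failed=1
    done
    return "$failed"
}
```

After sourcing this, something like `PSCP_HOSTS="user@host1 user@host2" parallel_scp file1 file2 /remote/dir` would start one scp per host at once and return non-zero if any copy failed. Unlike pdcp, nothing here limits how many copies run simultaneously, so it is best suited to a handful of hosts.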
cas
  • Can you shed some light on how can I use it in an environment where I have 3 remote servers each in different locations, how to install and get it working, any man page? – N. F. Sep 15 '12 at 22:55
  • It uses ssh. if the commands you want to run need root privs, that means you'll need to enable root ssh. it doesn't matter where the remote servers are, as long as you can ssh to them. install pdsh, genders and optionally rdist (for pdcp) onto your client machine that you'll be running pdsh/pdcp from, and on the remote servers - these are all available packaged for debian, ubuntu, and other distros. add your hosts to /etc/genders. man pages are included with the software. – cas Sep 16 '12 at 00:02
  • Is there any way around using the genders database? I have 1 root site (from which I send files to all 3 client sites). From the man page I found that I can use the pdcp -w user@[host1,host2...hostn] command; please correct me if that's wrong. Also, how do I specify the destination path on the hosts? Is it like pdcp -w user@[host1, host2, host3] /home/user/Test_Data (say this is the common destination path for each host)? – N. F. Sep 17 '12 at 07:04
  • yes, you can specify hostnames directly with `-w`, even without an /etc/genders file. your 'root site' will still need to have your three client hosts in ~/.ssh/known_hosts, and your 3 remote hosts will need to be configured to allow password-less public_key access from your root site. using `ssh-copy-id` should achieve both requirements. i'm puzzled by what your second question could mean because the obvious interpretation is answered both in my answer above and in your question itself...as with `cp`, the destination is the final argument. – cas Sep 17 '12 at 07:20
  • I asked that because I was puzzled by the man page too: it says the path can't be specified the way we specify it in scp. By that they meant the use of : followed by the file path. I didn't get it then, but now it's clear. – N. F. Sep 17 '12 at 07:58
  • I have installed pdsh on all the hosts, and also configured all the hosts for passwordless access from the root (by copying root's public keys to them and vice versa), but the problem is that when I issue the pdcp command from root to send the same files to the remote hosts, it says: rcmd socket: permission denied. What is the solution for that? – N. F. Sep 18 '12 at 05:11
  • pdsh/pdcp default to using rsh for some reason (no idea why, unencrypted connections aren't safe...but some HPC clusters are on private heavily-firewalled networks so they assume that it's still OK to use rsh and telnet. user inertia, lots of them still use csh and think that fortran's the only language that's any good for scientific programming). anyway, you can use the `-R ssh` option on every `pdsh` and `pdcp` command line, or you can change the default with: `mkdir -p /etc/pdsh ; echo ssh > /etc/pdsh/rcmd` (as root, or prefix both commands with sudo, of course) – cas Sep 18 '12 at 07:24
  • I am trying ../local/bin/pdcp -R ssh -w [email protected],looper.comgrid.ac test.txt /home/mindfreak/Test_Data, from the Test_Data folder of root. It returns the following error: looper: bash: /global/home/mindfreak/Test_Data/../local/bin/pdcp: No such file or directory pdcp@seawolf2: nestor: ssh exited with exit code 127 asper: bash: /global/home/mindfreak/Test_Data/../local/bin/pdcp: No such file or directory pdcp@seawolf2: jasper: ssh exited with exit code 127 – N. F. Sep 18 '12 at 09:44
  • have you installed pdsh and pdcp on the slave hosts? it looks like it's trying to find pdcp in the same directory on the slaves as on the master ("root"). – cas Sep 18 '12 at 11:00
  • OK, let me briefly tell you how I installed it: on the root PC, I issued the following command (inside the pdsh-2.26 directory): ./configure --with-ssh --prefix=/home/mindfreak/local, then make, then make install. I repeated the same process on the other hosts, where I have the same file structure under home/mindfreak – N. F. Sep 18 '12 at 20:30
  • i've only ever used pdsh & pdcp on systems where I have root and can install it into the standard system paths...but I suspect that the `-e PATH` option or `PDSH_REMOTE_PDCP_PATH` env var may be part of the solution here. I think this is getting beyond what's possible to resolve in a Q&A site like this - have you tried the pdsh wiki or issues on https://code.google.com/p/pdsh/ – cas Sep 18 '12 at 22:13
  • No, I haven't. The reason I installed it into my own path is that I don't have permissions in local/usr/bin or etc. I'll try discussing it on code.google. – N. F. Sep 18 '12 at 22:20
  • Thanks, man! What you suspected was correct: one of the guys from that group told me to specify the path to the pdcp binaries using -e PATH, and it worked just fine. Though it seems somewhat slow to me :P Anyway, could you help with one more little thing? How can I set up global variables for the pdcp binaries on root and the other hosts, so that I can just type pdcp instead of writing the full path to the pdcp binaries every time? – N. F. Sep 19 '12 at 02:24
  • And thanks for your script, but it sends the files serially, one per iteration of the loop. I want them to be sent in parallel. – N. F. Sep 19 '12 at 02:29
  • I'm glad you got it working. If you need default options for pdcp (or any command, really) then you can make an alias, function, or shell script. e.g. `alias pdcp='pdcp -R ssh -e /path/to/pdcp'`. Also, as i mentioned, if you do something like `export PDSH_REMOTE_PDCP_PATH=/path/to/pdcp`, then pdcp will use that. Similarly, you can `export PDSH_RCMD_TYPE=ssh` instead of using the `-R ssh` option. see the man pages for pdcp and pdsh for details on options and env vars. – cas Sep 19 '12 at 02:47
1

Instead of sending a file to multiple targets at the same time, you could have the targets read it at the same time.

Put the file on an NFS share, mount that filesystem on your targets, and copy it from NFS to each target's local destination.

To do so manually and concurrently, you could use cluster-ssh (cssh) on the targets.

Nils