
I have one or more large files (say 10 GB) that I want to transfer from one machine to many (say 50) identical machines connected to a LAN.

It makes sense to me that transferring like this:

             /--> 4 ---
            / 
1 -----> 2 -----> 5 ---
   \                           and so on...
    \--> 3 -----> 6 ---
             \
              \-> 7 ---

would be faster, because many transfers happen at the same time (and I have a fast switch, so that should work).

Surely someone coded this already :P

What is the appropriate package for this?


Everything runs Unix/Linux and I have sudo access, so anything goes.

thanasisp
Radost
  • https://unix.stackexchange.com/a/596051 – Artem S. Tashkinov Nov 16 '20 at 18:39
  • Sometimes USB keys are a better alternative, if you have physical access to the 50 machines. The other point: I wouldn't start too many transfers at the same time, because if something goes wrong on your network (cf. old network cards), you will have to restart all the transfers. I initiate only a few transfers simultaneously to minimize the risk and impact of trouble (don't forget to compute and check the MD5 checksums). – Jacques Nov 16 '20 at 19:25
  • @ArtemS.Tashkinov Isn't udpbroadcast transfer potentially packet-droppy? – Radost Nov 16 '20 at 20:29
  • @Jacques I do, but then I'd rather leave something overnight than run around with a USB stick – Radost Nov 16 '20 at 20:30
  • @Radost TCP also drops packets. However, TCP has built-in retry. Retry can also be added to UDP. – ctrl-alt-delor Nov 16 '20 at 20:56
  • @ctrl-alt-delor Can you elaborate on UDP retry in this context? – Radost Nov 17 '20 at 00:14
  • @Radost `TCP ≈ UDP + stream + retry - multicast`. There is nothing stopping a program that uses UDP from adding retry. It is not magical. Just add this layer. Look at the layers in the [OSI model](https://en.wikipedia.org/wiki/OSI_model) for how it can be done. – ctrl-alt-delor Nov 17 '20 at 18:09
  • @ctrl-alt-delor I've coded some things with sockets a long time ago, however, as usual there are people who are waaaay more qualified to code something like that. It'd seem this kind of software could be reasonably universal. – Radost Nov 17 '20 at 22:43
  • I am not saying that you should write it. I am just pointing out an (inferred) **invalid assumption** "UDP has to be packet dropping". – ctrl-alt-delor Nov 18 '20 at 10:41
  • @Radost The broadcasting capability is not available everywhere, like many clouds turn off that feature because they don't want you to flood their entire network with broadcast packets. It takes a significant amount of time to develop solutions that work in a data center. Now, with TCP, you could use `scp` and you should be able to write a high level bash script that does the copies for you. – Alexis Wilke Nov 23 '20 at 17:43
  • @AlexisWilke I think you've missed the point -- doing scp from server1 to server$i is crap because time scales like number of targets (while in principle it could scale like log number of targets) – Radost Nov 24 '20 at 12:15
  • @Radost, the scp can happen on any machine, so once it's on #2, you can further scp to the next machine(s), like you've shown in your diagram. If you can instead program UDP with broadcast, that would indeed be much better (i.e. one single send and all the other machines receive a copy _simultaneously_). I do that with one of my systems, I think it's a really cool feature. – Alexis Wilke Nov 24 '20 at 14:56
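
The chained-copy idea discussed in the comments (every machine that already has the file passes it on) can be sketched as a schedule of rounds. This is only an illustration of why the number of rounds grows like the logarithm of the number of machines, not a finished tool: the host names are hypothetical, `doubling_schedule` is a made-up helper, and it assumes every transfer in a round takes about the same time. The tree it produces is not exactly the one drawn in the question, but the principle is the same.

```python
import math


def doubling_schedule(hosts):
    """Return a list of rounds; each round is a list of (sender, receiver) pairs.

    hosts[0] is assumed to hold the file at the start. In every round, each
    host that already has the file sends it to one host that does not.
    """
    have = [hosts[0]]
    waiting = list(hosts[1:])
    rounds = []
    while waiting:
        pairs = []
        # Snapshot the current holders so new receivers only send next round.
        for sender in list(have):
            if not waiting:
                break
            receiver = waiting.pop(0)
            pairs.append((sender, receiver))
            have.append(receiver)
        rounds.append(pairs)
    return rounds


if __name__ == "__main__":
    # Hypothetical names: 1 source plus 50 targets.
    hosts = [f"host{i}" for i in range(1, 52)]
    schedule = doubling_schedule(hosts)
    for number, pairs in enumerate(schedule, 1):
        print(f"round {number}: {len(pairs)} parallel transfers")
    # The number of rounds is ceil(log2(number of hosts)).
    assert len(schedule) == math.ceil(math.log2(len(hosts)))
```

For 1 source and 50 targets this gives 6 rounds instead of 50 sequential copies. Each `(sender, receiver)` pair could then be turned into an `scp` invocation run on the sender, which is left out here because paths and SSH setup differ per site.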

1 Answer


BitTorrent

BitTorrent will allow this. It does not follow a tree structure (as in your picture). It uses a mesh. The file is broken into parts. The file parts are sent from any machine that has a copy.

If the master is put into super-seeding mode, then it strictly sends different parts to each of the clients. The clients will then cooperate in sharing what they have until they all have everything.
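
To illustrate what "broken into parts" means, here is a minimal Python sketch that splits a file into fixed-size pieces and hashes each one, roughly the way a version 1 torrent file records a SHA-1 digest per piece so that peers can verify every part they receive independently. The helper name `piece_hashes` and the 256 KiB piece size are assumptions made for the example; a real client picks its own piece length and does much more than this.

```python
import hashlib

PIECE_SIZE = 256 * 1024  # assumed piece length, chosen only for illustration


def piece_hashes(path, piece_size=PIECE_SIZE):
    """Yield (index, sha1_hex) for each fixed-size piece of the file at path."""
    with open(path, "rb") as f:
        index = 0
        while True:
            piece = f.read(piece_size)
            if not piece:
                break  # end of file; the last piece may be shorter
            yield index, hashlib.sha1(piece).hexdigest()
            index += 1
```

Because each piece can be checked on its own, a machine can safely fetch different pieces from different peers, and a corrupted piece only needs to be fetched again rather than the whole file.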

Roman Riabenko
ctrl-alt-delor
  • I've just tried this with Transmission. Created a torrent file on one peer and copied it to another. Needed to enable Local Peer Discovery (LPD) protocol in the peers to make them find each other without a tracker. It wasn't enabled by default in Debian. – Roman Riabenko Nov 16 '20 at 21:07