Is there any difference between doing e.g. `dd bs=4M if=archlinux.iso of=/dev/sdx status=progress oflag=sync` and doing `cp archlinux.iso /dev/sdx && sync`, and is there any reason to use one over the other (aside from the pretty progress bar in `dd`)?

Valeriy
  • less wear on flash disk? – A.B Dec 20 '19 at 14:26
  • My (rather uneducated) guess would be that this is about writing to a block device, not a 'regular' file. – Panki Dec 20 '19 at 14:54
  • Does this answer your question? [dd vs cat -- is dd still relevant these days?](https://unix.stackexchange.com/questions/12532/dd-vs-cat-is-dd-still-relevant-these-days) – Gilles 'SO- stop being evil' Dec 20 '19 at 14:55
  • @A.B No, that would make zero difference. – Gilles 'SO- stop being evil' Dec 20 '19 at 14:55
  • @Panki The “magic” for writing to a block device is in `/dev/sda`, not in `dd`. `cp` can do the job just as well. – Gilles 'SO- stop being evil' Dec 20 '19 at 14:55
  • This question should state the operating system. It is important to any answer. Is it an old one? Or a modern one? Does it have block devices? Or only raw devices? – JdeBP Dec 20 '19 at 14:58
  • @JdeBP I am using Arch Linux – CcVHKakalLLOOPPOkKkkKk Dec 20 '19 at 14:59
  • When writing blocks of 4096 bytes or bigger, there is almost no difference in speed on modern Linux systems when writing to USB devices. See [this link](https://askubuntu.com/questions/931581/flashing-ubuntu-iso-to-usb-stick-with-dd-recommended-block-size/931588#931588). -- I *think* that `cat` and `cp` write blocks of 4096 bytes, but the tool that is really writing to the device might do something else. – sudodus Dec 20 '19 at 15:40

2 Answers

One difference is efficiency, and thus speed. For example, a `cat` with an idealized implementation, or the `cat` of older systems (for example BSD4), would get the bytes one by one and copy them to the device:

    cat archlinux.iso > /dev/sdx

In these implementations `cat` moves each byte independently. That is a slow process, although in practice there will be buffers involved. Note that modern `cat` implementations read whole blocks (see below).

With `dd` and a good block size, the copy will be faster.

With `cp`, it depends on the buffer size `cp` uses (which is not under your control) and on the other buffers along the way. Its efficiency lies somewhere between the idealized implementation of `cat` and `dd` with the optimum block size.

In practice, though, modern `cat` and `cp` will ask the system for the preferred block size: `st_blksize`. Note that this is not necessarily the optimum block size.
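As a rough illustration of where that preferred size can be inspected, the sketch below assumes GNU coreutils `stat` on Linux, where `%o` in file mode reports the optimal I/O transfer size hint (the `st_blksize` field) and `%s` in `-f` (file system) mode reports the file system's block size for faster transfers. The temporary file is only a hypothetical stand-in for a real image.

```shell
# Assumes GNU coreutils stat on Linux; BSD/macOS stat uses different flags.
tmpfile=$(mktemp)                     # hypothetical stand-in for archlinux.iso

# Per-file hint: the st_blksize field returned by stat(2).
file_hint=$(stat -c %o "$tmpfile")

# File-system-wide value: block size "for faster transfers".
fs_hint=$(stat -f -c %s "$tmpfile")

echo "st_blksize hint for the file: $file_hint bytes"
echo "file system transfer size:    $fs_hint bytes"

rm -f "$tmpfile"
```

Running `stat -c %o` against the target device node instead shows the hint for that device. Whether a given `cat` or `cp` uses exactly that value depends on the implementation.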

An analogy: it is like pouring the contents of a glass into another glass.

  • idealized `cat` would do it one drop at a time.

  • `dd` will use a spoon, and you define exactly how big the spoon is (system limits apply).

  • `cp` and modern `cat` will use their own spoon (`stat -f -c %s filename` will tell you how big it is).
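The spoon sizes can be compared on scratch files without touching a real device. This is a minimal sketch assuming GNU `dd`, `head -c`, and `cmp`; the temporary files are hypothetical stand-ins for `archlinux.iso` and `/dev/sdx`, and the 8 MiB size is an arbitrary choice for illustration.

```shell
# Scratch files standing in for the ISO and the target device.
src=$(mktemp); out_small=$(mktemp); out_big=$(mktemp); out_cp=$(mktemp)

# 8 MiB (8 * 1024 * 1024 bytes) of test data.
head -c 8388608 /dev/urandom > "$src"

# Small spoon: 512-byte blocks, so many read/write calls.
# dd prints its statistics on stderr, captured here with 2>&1.
small_stats=$(LC_ALL=C dd if="$src" of="$out_small" bs=512 2>&1)

# Big spoon: 4 MiB blocks, so only a couple of calls.
big_stats=$(LC_ALL=C dd if="$src" of="$out_big" bs=4M 2>&1)

# cp picks its own buffer size.
cp "$src" "$out_cp"

echo "bs=512:"; echo "$small_stats"
echo "bs=4M:";  echo "$big_stats"

# Whatever the spoon, the copied bytes are the same.
if cmp -s "$src" "$out_small" && cmp -s "$src" "$out_big" && cmp -s "$src" "$out_cp"; then
    result="identical"
else
    result="different"
fi
echo "copies are $result"

rm -f "$src" "$out_small" "$out_big" "$out_cp"
```

With 8 MiB of input, the `bs=512` run reports 16384 records and the `bs=4M` run reports 2; the elapsed times that `dd` prints next to them depend entirely on the machine and on caching, so treat them as indicative only.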

Eduardo Trápani
  • The analogy to the glass is really nice! – Panki Dec 20 '19 at 15:06
  • **This answer is completely false.** _"In theory cat will move each byte independently._" - No, it will not. `cat` reads and writes blocks just like `dd` and `cp` do. Modern `cat` (e.g. GNU `cat`) actually [asks the OS what the preferred block size is](https://unix.stackexchange.com/questions/245499/how-does-cat-know-the-optimum-block-size-to-use), for optimum speed. On my system, `cat` uses 128KiB blocks, compared to `dd` which only moves 512 bytes at a time. There's no reason to use `dd` here. And where it matters, it's slower except if you manually match the block size. – marcelm Feb 25 '22 at 19:09
  • _"With `dd` and a good block size (usually related to the physical block size) it will be faster."_ - More misunderstandings; optimal block size is completely unrelated to physical block size (typically 512B, sometimes 4kB). The OS will take larger amounts of data and write it out to multiple physical blocks, no problem. The concerns for userspace applications in choosing block size have to do with 1) minimizing the number of system calls and 2) optimizing CPU cache usage. This is why `cat` uses 128KiB; large enough that the number of syscalls is small, and fits CPU caches well. – marcelm Feb 25 '22 at 19:16
  • @marcelm thanks for the input. I hope the edit and the links make it clearer. – Eduardo Trápani Mar 02 '22 at 05:00

I use `dd` mainly because of the `status=progress` you mentioned; what can I say, I am impatient and need to know :-)

Even if you forgot to add that and have already started the job, you can send `dd` a `SIGUSR1` signal and it will print the current I/O statistics to stderr (which, unless you redirected it, is your terminal).
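A sketch of that signal trick, assuming GNU `dd` on Linux (on BSD and macOS the corresponding signal is `SIGINFO`, and `SIGUSR1` would terminate `dd`). The job here copies `/dev/zero` to `/dev/null` purely as a harmless stand-in for a long-running copy, and the one-second pauses are arbitrary.

```shell
stats=$(mktemp)                        # dd's stderr goes here instead of the terminal

# Harmless stand-in for a long copy: runs until stopped, touches no disk.
LC_ALL=C dd if=/dev/zero of=/dev/null bs=1M 2>"$stats" &
dd_pid=$!

sleep 1                                # let dd install its handler first;
                                       # a SIGUSR1 sent too early would kill it
kill -USR1 "$dd_pid"                   # ask for the current I/O statistics
sleep 1                                # give dd a moment to print them

kill "$dd_pid"                         # stop the stand-in job
wait "$dd_pid" 2>/dev/null || true     # reap it; a non-zero status is expected

progress=$(cat "$stats")
echo "$progress"
rm -f "$stats"
```

For a real copy started from another terminal, something like `pkill -USR1 -x dd` should have the same effect, with the caveat that it signals every process named exactly `dd`.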