
For the purpose of compiling code on fast drives (NVMe, for example), is there a clear winner among the popular file systems (EXT4, XFS, BTRFS, ZFS), or are they roughly comparable?

I'd assume any efficiency gain from compression would be negated by its CPU overhead, since those cycles could otherwise be used for compilation.

ideasman42
  • Nothing beats a tmpfs for compilation ;-). – Stephen Kitt Oct 22 '22 at 11:33
  • I wholeheartedly support @StephenKitt's comment… provided you have a comfortable amount of RAM. I would not venture into a chromium compilation (for example) with only an 8G tmpfs (meaning 16G RAM total), and it is precisely when tmpfs is not an option that OP's question makes a lot of sense. – MC68020 Oct 22 '22 at 12:33
  • forums.gentoo.org would probably have opinions on this topic, as it's routine for users to compile their packages (even very large programs like chromium). – preferred_anon Oct 22 '22 at 19:18
  • https://www.phoronix.com/review/linux-50-filesystems/3 has some numbers for EXT4 vs. XFS vs. BTRFS for Linux 5.0, on an NVMe drive. (The article's from Jan 2019 :/ I'd assume Phoronix has done some more recent testing that included timed compilation on different filesystems, since that's part of their test suite.) – Peter Cordes Oct 22 '22 at 19:47
  • Linux caches disk access aggressively in RAM. With enough RAM, I'd expect the difference between filesystems to be fairly small, especially considering that compilation tends to be CPU-bound, not I/O-bound. But you know the exact workload better than we do. Have you tried benchmarking different filesystems? – marcelm Oct 22 '22 at 21:06
  • @marcelm: You wrote: *"compilation tends to be CPU-bound, not I/O bound."* I would not swear to that. My point being: why (under modern CFS) is `make -j NCPU+1` actually more efficient than `make -j NCPU`? – MC68020 Oct 22 '22 at 22:53
  • @MC68020 Well, I just did a few quick benchmarks, compiling the Linux kernel with default config, and I found no difference at all between ext4-on-lvm-on-luks-on-nvme vs tmpfs. -j5 was a tiny bit slower than -j4. But I'm using an aging 4-core i5, so maybe I don't have enough CPU power to make a difference. – marcelm Oct 23 '22 at 12:29
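
The `-j NCPU` versus `-j NCPU+1` comparison discussed in these comments can be reproduced with something like the sketch below. The function name `compare_jobs` and the `MAKE` override are hypothetical, and it assumes a Linux system with `nproc` and a build tree that has a `clean` target.

```shell
# Hypothetical helper: time a clean build at -jN and -j(N+1), where N is the
# number of available CPUs. Run it from the top of the build tree.
# MAKE may be set to use a different make binary.
compare_jobs() {
    n=$(nproc)
    for j in "$n" "$((n + 1))"; do
        ${MAKE:-make} clean >/dev/null 2>&1    # start each run from scratch
        start=$(date +%s)
        ${MAKE:-make} -j"$j" >/dev/null 2>&1
        end=$(date +%s)
        printf 'jobs=%s seconds=%s\n' "$j" "$((end - start))"
    done
}
```

Timings are in whole seconds, so this only makes sense for builds that run for minutes; repeating each run a few times helps separate a real difference from noise.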

1 Answer


I've compiled in tmpfs for over a decade now. It's the fastest option bar none if you have enough RAM. It's a filesystem which resides entirely in your RAM.
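
A minimal sketch of setting up such a build directory, assuming a Linux system with `sudo`. The function name, the `/mnt/build` mount point, the `16G` cap and the `DRY_RUN` switch are all hypothetical choices for illustration:

```shell
# Hypothetical helper: mount a size-capped tmpfs to build in.
# Usage: mount_build_tmpfs [MOUNTPOINT] [SIZE]
# With DRY_RUN=1 the commands are printed instead of executed.
mount_build_tmpfs() {
    mnt=${1:-/mnt/build}      # hypothetical mount point
    size=${2:-16G}            # upper bound; pages are only allocated as files are written
    run=${DRY_RUN:+echo}      # expands to "echo" only when DRY_RUN is set and non-empty
    $run sudo mkdir -p "$mnt" &&
    $run sudo mount -t tmpfs -o "size=$size,mode=1777" tmpfs "$mnt"
}
```

After `mount_build_tmpfs /mnt/build 16G`, unpack the sources under `/mnt/build` and build there. Everything in the tmpfs is lost on unmount or reboot, so copy out any artifacts you want to keep.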

Both btrfs and zfs feel like the worst options considering their overhead. Ext4 (especially without a journal) and XFS are both extremely fast.

Phoronix has a ton of reviews; Google for them.

Here's one of the freshest ones: https://www.phoronix.com/news/Linux-5.14-File-Systems

Artem S. Tashkinov
  • BTW tmpfs is indeed the best option… as long as you have a UPS. If you don't, a power cut means restarting your build from scratch. Which leads to a question about journaling that I still haven't settled: I have no doubt that a build will be faster on a non-journaled FS, but in case of a power outage, don't I run the risk of having to restart the build from scratch? – MC68020 Oct 22 '22 at 12:47
  • I've got a blazingly fast CPU, and I've not had power outages for years, so not sure I have any concerns about that. – Artem S. Tashkinov Oct 22 '22 at 13:27
  • Few projects take long enough to build that restarting after an outage is an issue, so that might be an optimization that is going too far. – Simon Richter Oct 22 '22 at 14:11
  • @SimonRichter: Granted! You are absolutely correct. I am convinced only people with a "*blazingly fast cpu*" should embark on some projects. It appears that, apart from those sponsored by RH or the like (many thanks to them), they usually don't. And how many of them will realize that their *blazingly fast cpu* can remain *"mostly idle"* thanks to the work of many devs (absurdly?) working on a Core 2 with 8 GB? – MC68020 Oct 22 '22 at 21:22
  • Speaking from actual experience, BTRFS is actually not that bad. Most software builds are almost pure WORM workloads that almost exclusively write out whole files at a time, which is actually a near optimal workload for a CoW filesystem. It won’t ever beat ext4 without a journal, but in a RAID setup it actually can approximately match, or possibly even outperform, XFS on DM-RAID depending on the rest of the storage stack. – Austin Hemmelgarn Oct 22 '22 at 22:32
  • @MC68020, my point is: outages are rare, so optimizing for this case at the cost of performance in the normal case is not useful. When I compile stuff on my Amiga, I normally disable journalling for the build tree, because it causes head movements and slows down the compile. If I lose power, I will need to delete the build tree and check the file system, but that hasn't happened so far. – Simon Richter Oct 23 '22 at 00:05
  • Thanks, I'm aware of Phoronix's benchmarks; I just didn't see a recent one that focused on code compilation on NVMe drives, although I might have missed it. – ideasman42 Oct 23 '22 at 03:02
  • @ArtemS.Tashkinov What performance benefit are you seeing from building on tmpfs? Can you share some numbers compared to regular filesystems? – marcelm Oct 23 '22 at 12:30
  • @marcelm it's highly dependent on the code you're trying to compile. You **will** get a huge performance improvement when building e.g. the Linux kernel (provided you have a fast modern CPU and loads of RAM), which is relatively simple C code, but if you build something like Firefox, LibreOffice or KDE, improvements will be a lot smaller, since these three projects' C++ code is very dense and C++ is a lot harder to compile in general. Secondly, untarring on `tmpfs` is near instant vs. the actual storage. Lastly, with tmpfs you can omit `-pipe`, which is absolutely necessary for the actual storage. – Artem S. Tashkinov Oct 24 '22 at 10:17
  • @ArtemS.Tashkinov Interesting; I briefly benchmarked building the Linux kernel on ext4 vs tmpfs (see my comment on the question), and I saw absolutely no difference. While my CPU is not impressive, if tmpfs is so much faster, I'd expect to see _some_ difference even on my limited CPU... – marcelm Oct 24 '22 at 16:24
  • Try repeating the test, but run `echo 3 | sudo tee /proc/sys/vm/drop_caches` before it and `sync` after it. You should notice a difference ;-) If you just untar the linux kernel source tree and run the compilation right after, the entire source tree will still be in your RAM. The first command deals with that :-) The second command makes sure the generated object files have actually been stored on your device. – Artem S. Tashkinov Oct 24 '22 at 22:35
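
The cold-cache procedure from the last comment could be wrapped up roughly as follows. This is a sketch, not a rigorous benchmark: the function name `cold_time` and the `KEEP_CACHE` switch are hypothetical, and the cache drop needs root.

```shell
# Hypothetical helper: time a command starting from a cold page cache and
# include the final flush to the device in the measurement.
# Usage: cold_time make -j4
# Set KEEP_CACHE=1 to skip the cache drop and measure a warm-cache run instead.
cold_time() {
    sync                                   # write out dirty pages so they can be dropped
    if [ -z "${KEEP_CACHE:-}" ]; then
        echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null
    fi
    start=$(date +%s)
    "$@" >/dev/null 2>&1                   # the build command under test
    status=$?
    sync                                   # wait until the object files reach the device
    end=$(date +%s)
    echo "$((end - start))"                # elapsed whole seconds
    return "$status"
}
```

Running `cold_time make -j4` in a tree on the real filesystem and again in a copy on tmpfs gives two comparable numbers. Dropping caches does not evict files stored in tmpfs, so the tmpfs run still reads its sources from RAM, which is the advantage being measured.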