
I came across an SSD with a very significant performance drop (about 20 times). As it uses an exFAT filesystem, I suspect the drop might be due to fragmentation.

Is there a tool available in the open source / free software world (i.e. with a permissive or affordable license) to defragment the filesystem?

Yes, I know about the good old way of backing up, reformatting and restoring. In this case it would be quite lengthy (some TBs of data in an embedded measurement system).

d.c.
  • Low performance due to fragmentation really only happens on spinning rust, as the system waits for the heads to fly repeatedly across the platters. That 20 times drop may be a sign that the drive does not have very long to go. – Bib Nov 15 '21 at 12:57
  • @Bib Not really, most USB flash drive controllers are slow as hell and have atrocious random IO speeds. – Artem S. Tashkinov Nov 15 '21 at 15:03
  • Uhm, yes, if the performance _dropped_ 20 fold over time. – Bib Nov 15 '21 at 15:19
  • How do you have TB of data on ExFAT? That seems to be a *very* odd choice for large datasets! (I know this doesn't help you, but it might help shift the effort/benefit tradeoff for move away, discard all and format as something different, move back as option.) – Marcus Müller Nov 15 '21 at 15:30
  • But I do agree, if it's an SSD, it's really rather questionable that defragmentation would help. Do benchmarks (e.g. `gnome-disks`) tell you that random access is the problem, or is linear access also slower than you'd like? – Marcus Müller Nov 15 '21 at 15:33
  • Well, the choice is simple. We use Linux and FPGA in our devices. Some of them need to handle very large datasets. So U.2 SSD are the only choice besides the big array. If customer needs to handle data in his/her machine (s)he may put U.2 to Win server/workstation. NTFS under embedded Linux is not working well enough. – d.c. Nov 15 '21 at 15:34
  • According to SSD's counters it's not running out of spare blocks, nvme smart-log says: available_spare : 100% available_spare_threshold : 5% – d.c. Nov 15 '21 at 15:35
  • @d.c. interestingly I heard the same argument from a customer of a company I know, who demanded ready-for-consumption data from the high-rate loggers that company produces; instead of just using custom software (easier to write) to simply extract the data from the continuous stream of measurement recorded to an SSD array, they insisted the hardware demuxes different streams and saves them separately – which of course makes the whole system (that the customer pays for) much more complex (and harder to write at high speeds, too). But if a customer threatens you with loads of money, you comply... – Marcus Müller Nov 15 '21 at 15:37
  • @d.c. so if you want to hear my biased point of view: tell the customer that a file system isn't what they need, they need the full disk to function as database / tagged stream recorder / whatever you application actually needs instead of files with names, lengths and blocks strewn across, and that their readout software on the Win workstation needs to be able to deal with that. Writing a windows software to read raw volumes is easier than doing file systems in embedded hardware, and probably serves your purpose better. – Marcus Müller Nov 15 '21 at 15:39
  • @d.c. since that's probably not feasible: Your time is probably valuable. Buy a hard drive, move the files to the hard drive, format/discard, move back: defragmentation would take longer than that, inherently. If possible, change the software that writes the data reserve the file at full size at creation, to avoid future fragmentation. (fragmentation only happens if files are "concurrently/alternatingly" extended, or deleted). But honestly, I have very serious doubts fragmentation is the problem here! – Marcus Müller Nov 15 '21 at 15:40
  • @Marcus Mueller - I agree with most of your comments. But I have very limited influence on the customers. 1. I am surprised to see such a performance drop with such an expensive drive. 2. I would like to trace the reason (and possibly avoid such things in my favorite OSes). 3. I need to provide customers with a guide to solve performance drops too deep for the device to work well. I am in the process of testing (removing measurement data to backup etc.) and it is lengthy (some TB). THX! – d.c. Nov 15 '21 at 15:45
  • hm, it's wise to *consider* fragmentation, but I really doubt it's your original problem! really, take one of the slow drives and just `dd` from it (to `/dev/null`) if you will, and see how fast that is. The things you *can* optimize (because, as commented below, I don't even think *windows* has defragmentation for exFAT!) would be access patterns (especially preallocations). – Marcus Müller Nov 15 '21 at 15:58
  • Heavy fragmentation is absolutely easy to achieve: start writing several streams of data simultaneously using 4K blocks one by one. Once this way my MySQL produced ISAM files which had over 200K fragments and it was impossible to work with it (it wasn't me who set it up). It took **many hours** just to dump the files and move everything into InnoDB after which everything started to fly. The FS was ext3. – Artem S. Tashkinov Nov 15 '21 at 16:08
  • I was not given an opportunity to do thorough investigation, but reformatting did help. So I can't decide whether it was caused by heavy fragmentation and overhead connected to it or whether it was because of some other filesystem (in)consistency issue (fsck is run before every mount, but anyway). THX for valuable insights to everyone, especially to Marcus Mueller and Artem S. Tashkinov! – d.c. Nov 18 '21 at 21:26
  • Further investigation led to a somewhat surprising observation: it is actually very easy to create an inconsistency in an exFAT filesystem, but there is no easy remedy. I tried different versions of fsck.exfat (there is a fork, but some fixes are going ahead in the original branch) on Linux and two tools on Win 10. Sometimes a combination of tools and manual deletion of broken files helps. And yes, it *has* a huge impact on filesystem performance. No fragmentation necessary. – d.c. Nov 29 '21 at 13:27
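The raw sequential-read check suggested in the comments above can be sketched as follows; `/dev/nvme0n1` is an assumed device name, so substitute your own. If this is already slow, the filesystem (and its fragmentation) is not the culprit.

```shell
# Read a few GiB straight off the block device, bypassing the filesystem
# entirely; dd prints throughput on completion. Needs root to read the
# raw device. /dev/nvme0n1 is an assumption -- check lsblk first.
sudo dd if=/dev/nvme0n1 of=/dev/null bs=1M count=4096 status=progress
```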

1 Answer


No.

The same applies to NTFS and FAT32. AFAIK, of all the filesystems that Linux supports, only ext4 can be defragmented (individual files, one by one, using e4defrag) and XFS (full defragmentation available via xfs_fsr).
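For comparison, this is what those two defragmenters look like in practice; the mount points and file name below are made up for illustration:

```shell
# ext4: online, per-file defragmentation (e4defrag ships with e2fsprogs).
sudo e4defrag -c /mnt/data            # -c only reports a fragmentation score
sudo e4defrag /mnt/data/big.file      # actually relocate the file's extents

# XFS: filesystem reorganizer (xfs_fsr ships with xfsprogs).
sudo xfs_fsr -v /mnt/xfsvol
```

Nothing comparable exists for exFAT on Linux.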

As a last resort you could install a trial version of Windows 10 Enterprise and defragment from there. Windows has no built-in defragmenter for exFAT either, but there are third-party tools, e.g. Defraggler, O&O Defrag and UltraDefrag.

Defragmenting SSD/NVMe storage is generally not recommended anyway: it causes unnecessary wear on the flash erase blocks for little or no gain.

Some fragmentation issues are specific to spinning HDDs (seek time), but others can also show up on SSDs when a fragmentation-prone filesystem is used.
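Before assuming fragmentation, it may be worth measuring it. filefrag (from e2fsprogs) prints a file's extent count; whether it works on exFAT depends on the kernel driver supporting the FIEMAP/FIBMAP ioctls, so treat a failure as "unknown", not "unfragmented". The path below is a made-up example:

```shell
# A file reported as one (or a few) extents is effectively contiguous;
# hundreds or thousands of extents indicate heavy fragmentation.
filefrag -v /mnt/exfat/capture.bin
```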

In a Linux-only environment, a cycle of backup, re-format and restore may be the only (or the easiest) option.
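One possible shape of that cycle, sketched below; the device name `/dev/sdb1`, the mount point `/mnt/exfat` and the `/backup` destination are all assumptions, so substitute your own. `-rt` (recursive, preserve mtimes) is used instead of rsync's usual `-a` because exFAT cannot store POSIX owners and permissions:

```shell
# 1. Copy everything off the volume.
sudo rsync -rt --info=progress2 /mnt/exfat/ /backup/exfat/
# 2. Re-create the filesystem (mkfs.exfat ships with exfatprogs).
sudo umount /mnt/exfat
sudo mkfs.exfat /dev/sdb1
sudo mount /dev/sdb1 /mnt/exfat
# 3. Copy everything back; files are now written contiguously.
sudo rsync -rt --info=progress2 /backup/exfat/ /mnt/exfat/
```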

Artem S. Tashkinov