1

I'm experimenting with Systemd and in particular Journald on a Yocto Poky (Linux) based embedded device with eMMC storage but I'm worried about flash wear since the write stats are excessively high as shown by both /sys/block/mmcblk0/stat and blktrace.

Note: I wish to keep the logs across reboots and I'm using the structured fields journald/journalctl provide, so a volatile journal with syslog forwarding doesn't seem feasible unless there's a way to keep the additional fields. That said, I can afford to lose some low-prio messages in a crash provided high-prio messages make it.

As an example, a test case that was logging a few short DEBUG message every 8 seconds wrote to MMC at a rate of 1.4 GB/day despite the messages themselves being only 5 MB/day (but note that using journalctl -o export | wc as a metric gives 125 MB/day - presumably due to metadata).

There is nearly 300x write multiplication in that example, which seems very high (I would have expected 10x). I would like to know what can be done to reduce the multiplication factor?

blkparse output:
jbd2/mmcblk0p4- (180)
 Reads Queued:           0,        0KiB  Writes Queued:       33611,   134444KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:        0,        0KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        0,        0KiB
 Read Merges:            0,        0KiB  Write Merges:        17722,    70888KiB
 IO unplugs:          7835               Timer unplugs:          91
 Allocation wait:        0               Allocation wait:         0
 Dispatch wait:          0               Dispatch wait:           0
 Completion wait:        0               Completion wait:         0
journal-offline (248, ...)
 Reads Queued:           0,        0KiB  Writes Queued:        6277,    46528KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:        0,        0KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        0,        0KiB
 Read Merges:            0,        0KiB  Write Merges:          154,     3840KiB
 IO unplugs:           563               Timer unplugs:          30
 Allocation wait:        0               Allocation wait:         0
 Dispatch wait:          0               Dispatch wait:           0
 Completion wait:        0               Completion wait:         0
kworker/u2:0 (253, ...)
 Reads Queued:           0,        0KiB  Writes Queued:       32284,   288288KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:        0,        0KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        0,        0KiB
 Read Merges:            0,        0KiB  Write Merges:         1351,    32744KiB
 IO unplugs:          2553               Timer unplugs:         710
 Allocation wait:        0               Allocation wait:         0
 Dispatch wait:          0               Dispatch wait:           0
 Completion wait:        0               Completion wait:         0
kworker/u2:1 (414, ...)
 Reads Queued:           0,        0KiB  Writes Queued:       28428,   260332KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:        0,        0KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        0,        0KiB
 Read Merges:            0,        0KiB  Write Merges:         1256,    29428KiB
 IO unplugs:          2290               Timer unplugs:         679
 Allocation wait:        0               Allocation wait:         0
 Dispatch wait:          0               Dispatch wait:           0
 Completion wait:        0               Completion wait:         0
kworker/u2:2 (208, ...)
 Reads Queued:           1,        4KiB  Writes Queued:       22264,   200632KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:        0,        0KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        0,        0KiB
 Read Merges:            0,        0KiB  Write Merges:          918,    21048KiB
 IO unplugs:          1782               Timer unplugs:         481
 Allocation wait:        0               Allocation wait:         0
 Dispatch wait:          0               Dispatch wait:           0
 Completion wait:        0               Completion wait:         0
systemd-journal (206)
 Reads Queued:          13,       52KiB  Writes Queued:         190,      508KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:        0,        0KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        0,        0KiB
 Read Merges:            4,       16KiB  Write Merges:            0,        0KiB
 IO unplugs:           128               Timer unplugs:           4
 Allocation wait:        0               Allocation wait:         0
 Dispatch wait:          0               Dispatch wait:           0
 Completion wait:        0               Completion wait:         0
CPU0 (mmcblk0p4):
 Reads Queued:         632,    61712KiB  Writes Queued:      123054,   930732KiB
 Read Dispatches:     8616,    61712KiB  Write Dispatches:   101590,   930732KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:     8616,    61712KiB  Writes Completed:   109579,   930732KiB
 Read Merges:            4,       16KiB  Write Merges:        21401,   157948KiB
 Read depth:             2               Write depth:             9
 IO unplugs:         15952               Timer unplugs:        2012

Things I've tried:

  • adjusting SyncIntervalSec in journald.conf - it didn't seem to affect the totals much, just the timing
  • F2FS instead of ext4 - actually made it a bit worse but I haven't tried tuning mount/mkfs params yet (any tuning ideas welcome)
  • sysctl vm.dirty_writeback_centisecs=0 - this is because I noticed from blktrace that much of the amplification seemed to be from kworkers in between journal-offline runs, presumably from journald mmap() usage? that gave around 400 MB/day so it's a significant improvement, but I'm not sure that CRIT messages are still syncing immediately like the journald doc says
  • https://github.com/systemd/systemd/pull/21183 seems like it might help, but it's got a way to go before being stable
  • fs-level compression - results not clear yet, not sure if this methodology will evaluate it properly?
squashed
  • 61
  • 3
  • Lennart and co don't give a damn: https://github.com/systemd/systemd/issues/15292 Also check this: https://github.com/systemd/systemd/issues/22940 – Artem S. Tashkinov Jun 02 '22 at 10:11
  • I'm not saying it's broken though - it's not designed for the application I'm using it in. E.g. if my MMC was the size of a typical SSD (and the SSD did ideal wear leveling) even this level of writing wouldn't be an issue. But I'm hoping someone knows of ways to tune either it or the OS to bring it down a bit... – squashed Jun 02 '22 at 23:22

0 Answers0