My swap disk can write 100 MB/s, but when the system swaps out it only writes 8-20 MB/s.
iostat says the device is 100% active, so I suspect that Linux is seeking on the drive or swapping out in small chunks.
I could explain this if there were swap-ins at the same time, but there are no swap-ins and no other disk I/O.
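To cross-check the numbers above independently of iostat, the swap-out rate can be estimated from the `pswpin` and `pswpout` counters in `/proc/vmstat`, which count pages swapped in and out since boot. The following is only a minimal sketch: the helper names (`parse_vmstat`, `swap_rate_mb_s`, `sample`) are made up for illustration, and the page size defaults to an assumed 4 KiB unless it is queried from the system.

```python
import os
import time


def parse_vmstat(text):
    """Parse /proc/vmstat-style text ("name value" per line) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_rate_mb_s(before, after, seconds, page_size=4096):
    """Turn the pswpin/pswpout page-count deltas into MB/s (1 MB = 10**6 bytes)."""
    rates = {}
    for key in ("pswpin", "pswpout"):
        pages = after.get(key, 0) - before.get(key, 0)
        rates[key] = pages * page_size / seconds / 1e6
    return rates


def sample(path="/proc/vmstat", interval=1.0):
    """Read the counters twice, `interval` seconds apart, and return MB/s."""
    with open(path) as f:
        before = parse_vmstat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_vmstat(f.read())
    return swap_rate_mb_s(before, after, interval, os.sysconf("SC_PAGE_SIZE"))
```

Calling `print(sample())` on a Linux machine while it is swapping should show a `pswpin` rate near zero if there really are no swap-ins.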
Is it possible to tell Linux to swap out in bigger contiguous chunks, say, by determining the oldest 10 MB of pages and swapping them out in one chunk?
A bit like saying the page frame size of swap is not 4K but 10M.
For my system I think an algorithm along these lines would be ideal:

    dirty_pct = dirty pages / all RAM
    if dirty_pct > 50%:
        # Half of memory is dirty, slowly start swapping out
        if busy_time(swapdevice) < dirty_pct:
            # if RAM is 60% dirty, start swapping if disk is less than 60% busy
            # if RAM is 90% dirty, start swapping if disk is less than 90% busy
            identify the next 10 MB that would be swapped if dirty_pct had been 100%
            save the 10 MB to swap as a single chunk
            mark the pages as clean
This way my system would start swapping out at 50% dirty without affecting performance, because it would only write to a drive that was sitting idle anyway. Maybe the swapped-out copy will never be needed, but then we have only wasted I/O capacity that was sitting idle anyway.
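To make the proposed policy concrete, here is a small user-space model of the decision logic in Python. It is only a sketch of the idea, not kernel code and not something Linux is known to implement: all names (`should_swap_out`, `pick_chunk`, `step`) are hypothetical, the page size is assumed to be 4 KiB, and "dirty" is used as in the pseudocode, meaning pages that have no copy in swap yet.

```python
CHUNK_BYTES = 10 * 1024 * 1024   # write 10 MiB per chunk (assumed)
PAGE_BYTES = 4096                # assumed 4 KiB pages
START_THRESHOLD = 0.50           # start swapping once half of RAM is dirty


def should_swap_out(dirty_pages, total_pages, swap_busy_fraction):
    """Swap only above the threshold, and only if the swap device is
    less busy (0.0-1.0) than the fraction of RAM that is dirty."""
    dirty_pct = dirty_pages / total_pages
    if dirty_pct <= START_THRESHOLD:
        return False
    return swap_busy_fraction < dirty_pct


def pick_chunk(dirty_by_age, chunk_bytes=CHUNK_BYTES, page_bytes=PAGE_BYTES):
    """Take the oldest pages, up to one chunk. dirty_by_age is oldest first."""
    return dirty_by_age[: chunk_bytes // page_bytes]


def step(dirty_by_age, total_pages, swap_busy_fraction):
    """Run one policy iteration.

    Returns (chunk, remaining_dirty): the pages that would be written to
    swap as one contiguous chunk and then marked clean, and the pages
    that stay dirty.
    """
    if not should_swap_out(len(dirty_by_age), total_pages, swap_busy_fraction):
        return [], dirty_by_age
    chunk = pick_chunk(dirty_by_age)
    return chunk, dirty_by_age[len(chunk):]
```

With 60% of RAM dirty, `should_swap_out` returns true while the swap device is less than 60% busy and false once it is busier, which matches the comments in the pseudocode.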
$ uname -a
Linux r815 4.15.0-99-generic #100-Ubuntu SMP Wed Apr 22 20:32:56 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux