I would like to know if there is a way to prevent Linux from periodically syncing mmap'd files to disk, while still allowing the OS to write back when physical memory gets tight.
I am writing applications that process large images, so large that they can need several times the available swap space. This can result in unexpected OOM crashes once swap is exhausted.
A simple way to allocate large objects without using swap is to back them with a file via mmap(). This call is very simple to use and works correctly, but it has one major problem which greatly saps performance: the operating system periodically writes out dirty pages from the mmap region.
For my application, this has the effect of reducing CPU utilization from about 2900% to around 700%, making the process 4 times slower.
In the past, Red Hat allowed this behaviour to be turned off by setting the kernel parameter vm.flush_mmap_pages to 0, but this setting no longer exists. Even if it did, changing it system-wide could have unexpected side effects, and I would prefer not to tune OS parameters just to make one process work properly.
FreeBSD allows this behaviour to be turned off by passing the MAP_NOSYNC flag to mmap(), but this flag is not available on Linux.
Here is how I am creating my mmap buffer, with code to handle errors and EINTR removed:
size_t n = 1048576*(size_t)1024*16; // 16G
int fd = open("mem.bin", O_TRUNC | O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
ftruncate(fd, n);
void* data = mmap(nullptr, n, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_HUGE_2MB, fd, 0);
close(fd);
unlink("mem.bin");
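For completeness, here is a sketch of the same sequence with the error and EINTR handling put back, in case the details matter. The function name map_scratch_file is hypothetical. The sketch leaves out MAP_HUGE_2MB because, as far as I understand the man page, that flag only takes effect together with MAP_HUGETLB. Only open() and ftruncate() are retried on EINTR; close() is deliberately not retried.

```cpp
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: creates `path`, sizes it to n bytes, maps it
// shared and read/write, then closes and unlinks the file so that only
// the mapping keeps it alive. Returns nullptr on failure, with errno
// set by the call that failed.
void* map_scratch_file(const char* path, std::size_t n) {
    int fd;
    do {
        fd = open(path, O_TRUNC | O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
    } while (fd == -1 && errno == EINTR);
    if (fd == -1) return nullptr;

    int rc;
    do {
        rc = ftruncate(fd, static_cast<off_t>(n));
    } while (rc == -1 && errno == EINTR);

    void* data = MAP_FAILED;
    int saved_errno = 0;
    if (rc == 0) {
        data = mmap(nullptr, n, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (data == MAP_FAILED) saved_errno = errno;
    } else {
        saved_errno = errno;
    }

    // Clean up the descriptor and the directory entry in every case.
    close(fd);
    unlink(path);

    if (data == MAP_FAILED) {
        errno = saved_errno;
        return nullptr;
    }
    return data;
}
```

With this helper, the snippet above would reduce to something like `void* data = map_scratch_file("mem.bin", n);`.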
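Is there a way to make this mapping efficient to use on Linux, that is, to stop the periodic write-back while still letting the kernel write pages out under memory pressure?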