I'm trying to understand the relation between huge page size and how data is actually being written to RAM.
What happens when a process uses a 1GB huge page - does writing occur in 1GB chunks? I'm guessing I'm completely wrong with this assumption?
Huge pages are for allocating chunks of memory, not writing them.
Normally when applications need large amounts of memory, they have to allocate many "pages". A page is simply a chunk of physical memory, and normally this chunk is only a few KB. So when an application does a lot of memory-intensive operations that span many pages, it's expensive for the MMU (and for the kernel, which manages the page tables) to translate all those virtual memory pages to physical memory.
To optimize this, the kernel offers huge pages, which are basically allocations larger than the default page size. So instead of having to allocate thousands of pages, it just gets a few. The reads and writes are still of whatever size is being read or written. If the application writes a 10 byte string into a huge page, it's still going to be a 10 byte write.
There is more than one definition of the chunk size for memory writes. You could consider it to be:

- the width of the store instruction the CPU executes (e.g. 1, 2, 4 or 8 bytes);
- the size of a cache line, the granularity at which the cache exchanges data with memory (typically 64 bytes);
- the size of a burst transaction on the memory bus.

None of these are related to the page size.
The page size is an attribute of a page in the MMU. The MMU translates virtual addresses (used by programs) into physical addresses (which designate a physical location in memory). The process to translate a virtual address into a physical address goes something like this:

1. The MMU first checks the TLB, a small cache of recent translations; on a hit, the physical address is available immediately.
2. On a miss, the top bits of the virtual address are used as an index into the top-level page table.
3. The entry found there points to the next-level table, which is indexed by the next group of address bits, and so on down the levels.
4. The final entry gives the base address of the physical page; the low bits of the virtual address (the offset within the page) are appended to it.
Common 32-bit architectures go through two table levels; common 64-bit architectures go through three. Linux supports up to four levels.
Some CPU architectures support making some pages larger, going through fewer levels of indirection. This makes accesses faster, and keeps the size of the page tables down, at the expense of less flexibility in memory allocation. The time gain is minimal for most applications, but can be felt with some performance-sensitive applications that don't benefit from the flexibility of small pages, such as databases. Huge pages are pages which go through fewer levels than the normal amount, and are correspondingly larger.
The software that is using large pages typically requests them specifically (via flags to mmap, see how is page size determined in virtual address space? for a few more details). After this initial request, it doesn't need to know or care about the page size. In particular, memory accesses are handled by the MMU: software is not involved at the time of access.