
I'm working on a program which needs to detect at runtime whether the system it's running on supports hugepages, and if so, what sizes are available. Ideally I'd like this to work for any POSIX platform, but a Linux-specific solution would be a start.

POSIX supports sysconf(_SC_PAGESIZE) to get the default page size on the platform, but doesn't seem to similarly support asking for any hugepage sizes. I could also potentially check by trying to mmap MAP_HUGE_2MB or MAP_HUGE_1GB arguments, but that would be slow, and in the case of 1GB huge pages, incredibly wasteful (and it could easily fail due to a lack of available memory).

joshlf
    Maybe Linux only, but `madvise` has `MADV_HUGEPAGE` that may help. It also has the file path `/sys/kernel/mm/transparent_hugepage/` that gives information. This article may also help you: https://access.redhat.com/solutions/46111 – Patrick Mevzek May 12 '17 at 20:00

2 Answers


For Linux, as a programmer, you probably want to use libhugetlbfs, which takes care of everything for you under the hood. (Note: it may not even be required; I do not know for sure, I still have to test.)

When preloaded, it replaces the functions that manage memory with its own versions, which automatically switch to huge pages if/when possible. This includes malloc() and mmap(), as well as others such as fork() when it allocates a stack.

With Ubuntu, you can install the library this way:

sudo apt-get install libhugetlbfs-dev

You can then use functions such as get_huge_pages() if you definitely want a large buffer allocated from huge pages. It is the equivalent of malloc(). With the library installed you also have the manual pages, so for details you can run:

man get_huge_pages

There is also a system call, mincore(), which lets you determine how many huge pages and how many small (4K) pages back a region of memory. There is a sample use in the BSD code below (the function originated on BSD). This is probably closest to what you are looking for on Unices, although it may not be available on every Unix (IRIX, HP-UX, AIX... you would have to check). It does not require the libhugetlbfs library. Note that under Linux the kernel does not appear to set anything other than bit 0 of each vector byte, meaning you only learn whether a page is resident, not its size as on BSD.

I work almost exclusively under Linux, but for MS-Windows the equivalent feature is called "Large Pages". It is documented here: https://docs.microsoft.com/en-us/windows/win32/memory/large-page-support

Finally, the various BSD implementations call these "super pages" (macOS, being BSD-derived, uses that interface). I found a page with a code sample showing how many super pages get used by a large malloc():

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

int 
main(int argc, char **argv)
{
        size_t size, i, super = 0, ordinary = 0;
        void *addr;
        char *vec;

        if (argc != 2)
                return (0);

        size = atoi(argv[1]);
        addr = malloc(size);
        vec = malloc((size + 4095) / 4096);  /* one byte per 4K page, rounded up */

        memset(addr, 0, size);               /* touch every page so it is committed */

        if (mincore(addr, size, vec) != 0)
                return (1);

        printf("addr %p len %zu:\n", addr, size);
        for (i = 0; i < (size + 4095) / 4096; i++)
                if (vec[i] & MINCORE_SUPER)
                        super++;
                else
                        ordinary++;
        printf("%zu 4K blocks super-mapped, %zu ordinary 4K pages\n",
            super, ordinary);

        return (0);
}

And some output of that FreeBSD code:

x23% ./a.out 1000000
addr 0x801006000 len 1000000:
0 4K blocks super-mapped, 245 ordinary 4K pages

x23% ./a.out 10000000
addr 0x801000000 len 10000000:
2048 4K blocks super-mapped, 394 ordinary 4K pages

x23% ./a.out 100000000000
addr 0x801000000 len 1215752192:
296448 4K blocks super-mapped, 367 ordinary 4K pages

As we can see, the first number counts 4K blocks that are "super-mapped", i.e. backed by memory allocated in larger blocks (2 MB, 1 GB, or other sizes depending on your OS and settings), handled automatically under the hood.

IMPORTANT: the memset() call matters because it commits ALL the pages of the allocated memory. Without it, only one or two pages may actually be mapped in.

Alexis Wilke

Linux

Parse the contents of /sys/kernel/mm/hugepages (which is what libhugetlbfs does):

[~]# ls -l /sys/kernel/mm/hugepages/
total 0
drwxr-xr-x 2 root root 0 Dec 30 16:38 hugepages-1048576kB
drwxr-xr-x 2 root root 0 Dec 30 16:38 hugepages-2048kB
[~]# 

In C (error checking omitted for clarity):

#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main( void )
{
    // 16 should be plenty for this example (should really do this dynamically)
    int size_count = 1;

    // use unsigned long long to match strtoull()
    unsigned long long page_sizes[ 16 ];

    // get the base page size
    page_sizes[ 0 ] = sysconf( _SC_PAGESIZE );

    // now get the huge page size(s); entries look like "hugepages-2048kB"
    DIR *dirp = opendir( "/sys/kernel/mm/hugepages" );
    for ( ;; )
    {
        struct dirent *dirent = readdir( dirp );
        if ( NULL == dirent ) break;

        if ( '.' == dirent->d_name[ 0 ] ) continue;

        char *p = strchr( dirent->d_name, '-' );
        if ( NULL == p ) continue;

        if ( size_count >= 16 ) break;
        page_sizes[ size_count++ ] = 1024ULL * strtoull( p + 1, NULL, 0 );
    }

    closedir( dirp );

    for ( int i = 0; i < size_count; i++ )
        printf( "%llu\n", page_sizes[ i ] );

    return 0;
}

If you have libhugetlbfs-dev installed:

#include <hugetlbfs.h>

int size_count = getpagesizes( NULL, 0 );

long page_sizes[ size_count ];

getpagesizes( page_sizes, size_count );

Link with -lhugetlbfs

Solaris

In C, use the getpagesizes() function:

#include <sys/mman.h>

int size_count = getpagesizes( NULL, 0 );

size_t page_sizes[ size_count ];

getpagesizes( page_sizes, size_count );

FreeBSD

FreeBSD copied the getpagesizes() function from Solaris 9:

#include <sys/mman.h>

int size_count = getpagesizes( NULL, 0 );

size_t page_sizes[ size_count ];

getpagesizes( page_sizes, size_count );
Andrew Henle