I sort of understand it now.
GLOBAL+HIDDEN normally happens in unlinked .o files (incl. static libs) for hidden functions (while unit-local static functions are LOCAL+DEFAULT there already). Then during linking, GLOBAL+HIDDEN converts to LOCAL+DEFAULT, always for normal functions.
When creating shared libraries, it also happens for the GCC/lib internals from the question. However, for non-library executables, these internals are not fully converted, remaining GLOBAL+HIDDEN for a reason that might be pure convenience. Since an executable is not a library that anything links against, this doesn't do any harm.
Maybe an overview of normal user-defined functions is helpful for other people too... this is about Linux behavior; other systems might be a bit different.
What needs to be considered (normal function)
Let's build a binary mybin. It has three files fila.c, filb.c and filc.c. In one of the code files there is a function func1.
mybin might become a static library (only compiled to .o and archived), a runnable program (linked too) or a shared library (also linked).
func1 in the code can be
- an ordinary function
- weak (__attribute__((weak)))
- hidden, meaning available within mybin but not in programs that use the shared lib (if it is one) (__attribute__ ((__visibility__ ("hidden"))))
- static, available only in the .c file where it is defined (static in C)
- (and protected and internal, but these are not really important)
This does not take optimizations (compile-time and/or LTO (Link-Time Optimization)) into account, which might remove symbols altogether.
Compile step (to .o)
In fila.c, which is the file that has func1, the function is obviously visible and usable for all types (normal/weak/hidden/static).
- An ordinary function:
  - is visible to other units like filb.c and filc.c too (also to static libs, if any)
  - There can be only one ordinary/strong function with that name in all units of this one executable / sharedlib / staticlib, otherwise the linker complains later (except with weird things like the linker option -z muldefs).
  - in the generated .o file, the function is GLOBAL+DEFAULT
- Weak function
  - is visible to other units too
  - There can be multiple weak functions around, in addition to one normal/strong one. In this case, later when linking, the strong one is used for all units and the weak ones are discarded.
  - if there are just weak ones and no strong one with that name, the first weak wins (i.e., the first file in the command line that has such a weak function)
  - in the generated .o file, the function is WEAK+DEFAULT
- hidden functions
  - become GLOBAL+HIDDEN (only here in the .o file)
  - otherwise behave like ordinary functions
- weak+hidden functions
  - become WEAK+HIDDEN
  - otherwise behave like weak functions
- static functions
  - become LOCAL+DEFAULT
  - not visible to other units; using them from another unit makes the linker complain later, because all LOCAL symbols are ignored for cross-unit usage
  - several units may each have their own static function with that name, and they can all be different
- LOCAL+HIDDEN doesn't exist in the .o file
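To make the list above a bit more concrete, here is a sketch of what fila.c could look like with one function of each kind. The names (func_ordinary, func_weak, ...) are made up for this example, and the symbol types in the comments are what the list above predicts for `gcc -c fila.c` followed by `readelf -s fila.o` on Linux without optimization, not output copied from a real run.

```c
/* fila.c: hypothetical example, one function per kind discussed above.
 * Assumed check: gcc -c fila.c && readelf -s fila.o
 */

/* ordinary: expected GLOBAL+DEFAULT in fila.o */
int func_ordinary(void) { return 1; }

/* weak: expected WEAK+DEFAULT */
__attribute__((weak)) int func_weak(void) { return 2; }

/* hidden: expected GLOBAL+HIDDEN in fila.o */
__attribute__((visibility("hidden"))) int func_hidden(void) { return 3; }

/* weak+hidden: expected WEAK+HIDDEN */
__attribute__((weak, visibility("hidden"))) int func_weak_hidden(void) { return 4; }

/* static: expected LOCAL+DEFAULT, usable only inside this file */
static int func_static(void) { return 5; }

/* Calls the static function so the compiler has a reason to keep it. */
int func_sum(void)
{
    return func_ordinary() + func_weak() + func_hidden()
         + func_weak_hidden() + func_static();
}
```

With optimization enabled, func_static would probably be inlined into func_sum and its symbol would disappear from the .o file, which is the kind of effect the note about optimizations refers to.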
Compile-time linking
If a static library is created, no linking happens.
For shared libs and executables:
- Anything LOCAL (i.e., static functions) is completely ignored: not visible to other units and not exported as an available shared lib function.
- GLOBAL/WEAK+DEFAULT are available as described above, and for shared libs they are entered in the available function list (dynsym) with the same type.
- GLOBAL/WEAK+HIDDEN are available to units of the same binary as described above, but during linking their type changes to LOCAL+DEFAULT (like static functions before linking). For shared libs and their available function list (dynsym), this means that either they are not entered at all, or that at runtime they are ignored.
Function names that get called usually have to exist, either in the binary that is being created, or in one of the shared libraries specified in the linker command line (and there as GLOBAL/WEAK+DEFAULT).
As explained before, any fully linked binary (shared lib or executable) won't have several GLOBAL/WEAK+DEFAULT functions with the same name, just max. one for each binary. However, different binaries still can have overlaps, i.e., two shared libs (or executable vs shared lib) might both have a func1. This will not cause errors; the lookup is as described in the interposition section below.
If during linking, a used function name is not implemented anywhere (neither in the executable nor in any used shared lib), this usually causes an error. However, if the declaration (in the .h file, etc., without code) is marked as WEAK, it links without error, and at runtime the function evaluates to a null pointer if it cannot be found (this can be checked with if). At runtime it might be found after all because the shared libraries changed in the meantime, or because LD_PRELOAD added more libraries.
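The weak-declaration check could look roughly like this. The name mybin_optional_hook is hypothetical and assumed not to be defined anywhere; the fallback value -1 is also just a choice for this sketch.

```c
#include <stddef.h>

/* Hypothetical optional function: declared weak, defined nowhere in this
 * binary. Linking succeeds anyway because the reference is weak. */
extern int mybin_optional_hook(int x) __attribute__((weak));

/* Calls the hook if some binary provides it at runtime, otherwise
 * returns a fallback value. */
int call_hook_or_fallback(int x)
{
    if (mybin_optional_hook != NULL)   /* null if nobody defines it */
        return mybin_optional_hook(x);
    return -1;                         /* fallback when the hook is absent */
}
```

If a shared lib (or an LD_PRELOAD lib) that defines mybin_optional_hook is present in a later run, the same binary should take the first branch without being rebuilt.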
Runtime interposition and optimizations
Function calls might generate binary instructions that are
- not interposable: These calls jump to some precalculated (absolute/relative) address of a function that exists in the same binary (same shared lib or same executable, maybe in a different .o unit or even in the same one). This also makes optimizations like function inlining possible.
- interposable: At runtime, the runtime linker ld.so is asked where the function with this name is, and different program runs might get different answers. This also means no function inlining, and possibly slightly worse performance. However, it makes it possible to override functions; see below.
Calls between binaries (executable → shared lib, or sharedlib1 → sharedlib2) are always interposable.
Calls within the same binary, between two "own" functions:
- LOCAL functions (static, hidden) are never interposable.
- For GLOBAL/WEAK functions it depends on the compiler/linker options used when this binary was created (e.g., -fPIC, -fPIE, -fno-semantic-interposition, -Bsymbolic, -rdynamic, etc.)
If a call in a running program asks ld.so to search for func1:
- The program's executable binary is always checked first.
- If not found in the main exe, LD_PRELOAD shared libraries are considered (if any),
- then, all shared libraries that the program uses are checked, in the order that was specified during linking (order in the command line).
Only GLOBAL/WEAK+DEFAULT functions are considered. At this stage, there is no difference between GLOBAL and WEAK anymore; WEAK in an earlier-checked place wins over a non-weak function in a later library. (There were historical differences long ago, but on Linux they are long gone.)
This implies, for example:
- Function calls between binaries can always be overridden by LD_PRELOAD libraries. If the main executable wants to call func1 from some shared lib, and the preloaded lib has a different func1, the preloaded one wins. The same goes for calls between shared libs; the preloaded one is always searched first.
- Functions that exist in the main executable (not in shared libs) can never be overridden, not even by LD_PRELOAD. For this reason, when compiling the program, it doesn't actually matter whether calls to functions within the program are compiled to ask ld.so or not, because the program itself always wins anyway. (The compilation mode does matter for shared libs.)
- Calls between sharedlib2 → sharedlib3 can be overridden by the earlier sharedlib1 or even by the main executable, so that sharedlib2 calls them instead of sharedlib3.
- Calls to global functions within sharedlib2 can, depending on how it was compiled, be either overridable by the main executable and earlier libs, or more optimized but not noticing overridings.
- To prevent overriding for specific functions in the lib without compiler flags (which also enables better optimization and even faster linking), a function can be declared hidden (or even static) in the code. However, this also means that the function cannot be used from the main executable (especially if not overridden). A middle way, which again works for specific functions without compiler options, is to declare the function hidden but add a global alias. The lib itself calls the optimized, non-overridable hidden function, and everything outside uses the alias (possibly overridden by other libs etc.)
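The hidden-plus-alias idea from the last point could be sketched like this, using GCC's alias attribute (ELF targets only). The names func1_internal and func2 and the function bodies are made up for this example; only func1 comes from the text above.

```c
/* Hypothetical library source, e.g. built with:
 *   gcc -shared -fPIC lib.c -o libmy.so
 */

/* The real implementation: hidden, so calls from inside the lib bind
 * directly to it and cannot be interposed. */
__attribute__((visibility("hidden")))
int func1_internal(int x)
{
    return x * 2;
}

/* Public alias with default visibility: a second name for the same code.
 * This is the name meant for users outside the lib, and the one that
 * other binaries could override. */
extern int func1(int x)
    __attribute__((alias("func1_internal"), visibility("default")));

/* Another function of the lib: it uses the hidden name, not the alias. */
int func2(int x)
{
    return func1_internal(x) + 1;
}
```

If the main executable or a preloaded lib defines its own func1, outside callers would get that one, while func2 would still use the lib's own implementation.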
A note about the executable's dynsym
All functions that should override overridable calls in other binaries need to be listed in the dynsym section of their own binary. For shared libs, all GLOBAL/WEAK functions are automatically there, but that is not necessarily true for the main executable.
readelf -s can display the list's contents. In my tests with GCC, it appeared that functions are added if the linker sees at link time that some of the shared libs have a function with that name too (independent of the code content), but are not added if the function appears to be unique (because then it would not override anything).
In some situations this might be a problem.
E.g., if during compiling only the main executable has a func1, but later one of the used shared libs gets changed to have a func1 too (without recompiling the main program), then the main func1 could not override the one of the library.
To avoid this by always listing all functions, use -rdynamic (only on the main executable; it is useless on libraries).
Finally, what is the problem with the PROTECTED mode that was briefly mentioned?
GLOBAL+PROTECTED on a library function should in principle work like an aliased hidden function as described above: The library has one hidden (LOCAL+DEFAULT) function, which is used within the library itself with optimizations but without the possibility to override it, plus a public alias that can be used (and overridden) from outside.
With PROTECTED, setting HIDDEN and creating an alias would not be necessary.
There are two problems, however:
- Because of details around ld.so and C ABI requirements (same address for same function), PROTECTED is slower than HIDDEN+alias.
- There are some bugs around, e.g., GCC bug 19520.