
Consider the shared object dependencies of /bin/bash, which include /lib64/ld-linux-x86-64.so.2 (the dynamic linker/loader):

ldd /bin/bash
    linux-vdso.so.1 (0x00007fffd0887000)
    libtinfo.so.6 => /lib/x86_64-linux-gnu/libtinfo.so.6 (0x00007f57a04e3000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f57a04de000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f57a031d000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f57a0652000)

Inspecting /lib64/ld-linux-x86-64.so.2 shows that it is a symlink to /lib/x86_64-linux-gnu/ld-2.28.so:

ls -la /lib64/ld-linux-x86-64.so.2 
lrwxrwxrwx 1 root root 32 May  1 19:24 /lib64/ld-linux-x86-64.so.2 -> /lib/x86_64-linux-gnu/ld-2.28.so

Furthermore, file reports that /lib/x86_64-linux-gnu/ld-2.28.so is itself dynamically linked:

file -L /lib64/ld-linux-x86-64.so.2
/lib64/ld-linux-x86-64.so.2: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=f25dfd7b95be4ba386fd71080accae8c0732b711, stripped

I'd like to know:

  1. How can the dynamic linker/loader (/lib64/ld-linux-x86-64.so.2) itself be dynamically linked? Does it link itself at runtime?
  2. /lib/x86_64-linux-gnu/ld-2.28.so is documented as handling a.out binaries (man ld.so), yet /bin/bash is an ELF executable. How do these fit together?

The program ld.so handles a.out binaries, a format used long ago; ld-linux.so* (/lib/ld-linux.so.1 for libc5, /lib/ld-linux.so.2 for glibc2) handles ELF, which everybody has been using for years now.

Jeff Schaller
Shuzheng
  • The kernel does not care about such subtle taxonomic subtleties (and neither should you ;-)). The kernel only makes the difference between ELFs which need an _interpreter_ and those which don't. And AFAIK, you cannot use an _interpreter_ which itself needs one. –  Sep 23 '19 at 14:43
  • @StephenKitt mine hasn't (`/lib/x86_64-linux-gnu/ld-2.28.so`, debian 10 buster) –  Sep 23 '19 at 14:50
  • @mosvy yeah, sorry, I got mixed up between `file`’s erroneous comment about how it defines static binaries, and the reality of `ld-2.28.so`... The differentiator is `PT_DYNAMIC`. – Stephen Kitt Sep 23 '19 at 14:53

2 Answers

  1. Yes, it links itself when it initialises. Technically the dynamic linker doesn’t need object resolution and relocation for itself, since it’s fully resolved as-is, but it does define symbols and it has to take care of those when resolving the binary it’s “interpreting”, and those symbols are updated to point to their implementations in the loaded libraries. In particular, this affects malloc — the linker has a minimal version built-in, with the corresponding symbol, but that’s replaced by the C library’s version once it’s loaded and relocated (or even by an interposed version if there is one), with some care taken to ensure this doesn’t happen at a point where it might break the linker.

    The gory details are in rtld.c, in the dl_main function.

    Note however that ld.so has no external dependencies. You can see the symbols involved with nm -D; none of them are undefined.

  2. The manpage only refers to entries directly under /lib, i.e. /lib/ld.so (the libc 5 dynamic linker, which supports a.out) and /lib*/ld-linux*.so* (the libc 6 dynamic linker, which supports ELF). The manpage is very specific, and ld.so is not ld-2.28.so.

    The dynamic linker found on the vast majority of current systems doesn’t include a.out support.

file and ldd report different things for the dynamic linker because they have different definitions of what constitutes a statically-linked binary. For ldd, a binary is statically linked if it has no DT_NEEDED entries in its dynamic section, i.e. it declares no required shared libraries. For file, an ELF binary is statically linked if it doesn’t have a PT_DYNAMIC program header (this will change in the release of file following 5.37; it now uses the presence of a PT_INTERP program header as the indicator of a dynamically-linked binary, which matches the comment in the code).

The GNU C library dynamic linker doesn’t have any DT_NEEDED entries, but it does have a PT_DYNAMIC program header (since it is technically a shared library). As a result, ldd (which is a wrapper around the dynamic linker) indicates that it’s statically linked, but file indicates that it’s dynamically linked. It doesn’t have a PT_INTERP program header, so the next release of file will also indicate that it’s statically linked.

$ ldd /lib64/ld-linux-x86-64.so.2
        statically linked

$ file $(readlink /lib64/ld-linux-x86-64.so.2)
/lib/x86_64-linux-gnu/ld-2.28.so: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=f25dfd7b95be4ba386fd71080accae8c0732b711, stripped

(with file 5.35)

$ file $(readlink /lib64/ld-linux-x86-64.so.2)
/lib/x86_64-linux-gnu/ld-2.28.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=f25dfd7b95be4ba386fd71080accae8c0732b711, stripped

(with the currently in-development version of file).
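The three definitions can be compared side by side with a small parser. What follows is a minimal sketch, not the code that ldd or file actually run: it assumes a little-endian ELF64 image, and the names `inspect_elf`, `verdicts` and `fake_elf` are made up for illustration. `fake_elf` only builds a tiny synthetic image so the sketch can be tried without touching real binaries.

```python
import struct

PT_DYNAMIC, PT_INTERP = 2, 3   # program header types (ELF spec)
DT_NULL, DT_NEEDED = 0, 1      # dynamic entry tags (ELF spec)
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 little-endian file header
PHDR = struct.Struct("<IIQQQQQQ")          # ELF64 program header
DYN = struct.Struct("<qQ")                 # ELF64 dynamic entry


def inspect_elf(data: bytes) -> dict:
    """Report PT_INTERP, PT_DYNAMIC and the DT_NEEDED count of an ELF64 LE image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    hdr = EHDR.unpack_from(data, 0)
    phoff, phentsize, phnum = hdr[5], hdr[9], hdr[10]
    info = {"interp": False, "dynamic": False, "needed": 0}
    for i in range(phnum):
        fields = PHDR.unpack_from(data, phoff + i * phentsize)
        p_type, p_offset, p_filesz = fields[0], fields[2], fields[5]
        if p_type == PT_INTERP:
            info["interp"] = True
        elif p_type == PT_DYNAMIC:
            info["dynamic"] = True
            # Walk the dynamic entries until the DT_NULL terminator.
            for off in range(p_offset, p_offset + p_filesz, DYN.size):
                tag, _ = DYN.unpack_from(data, off)
                if tag == DT_NULL:
                    break
                if tag == DT_NEEDED:
                    info["needed"] += 1
    return info


def verdicts(info: dict) -> dict:
    """Apply the three definitions described above (simplified)."""
    def word(dynamic: bool) -> str:
        return "dynamically linked" if dynamic else "statically linked"
    return {
        "ldd": word(info["needed"] > 0),     # any DT_NEEDED?
        "file_old": word(info["dynamic"]),   # PT_DYNAMIC present?
        "file_new": word(info["interp"]),    # PT_INTERP present?
    }


def fake_elf(interp: bool, needed: int) -> bytes:
    """Build a tiny synthetic ELF64 image with a PT_DYNAMIC segment (fixture only)."""
    dyn = b"".join(DYN.pack(DT_NEEDED, 0) for _ in range(needed)) + DYN.pack(DT_NULL, 0)
    phnum = 2 if interp else 1
    dyn_off = EHDR.size + phnum * PHDR.size
    phdrs = PHDR.pack(PT_DYNAMIC, 0, dyn_off, 0, 0, len(dyn), len(dyn), 8)
    if interp:
        phdrs += PHDR.pack(PT_INTERP, 0, 0, 0, 0, 0, 0, 1)
    ident = b"\x7fELF\x02\x01\x01" + bytes(9)
    ehdr = EHDR.pack(ident, 3, 62, 1, 0, EHDR.size, 0, 0,
                     EHDR.size, PHDR.size, phnum, 0, 0, 0)
    return ehdr + phdrs + dyn
```

An image shaped like the dynamic linker (PT_DYNAMIC, no PT_INTERP, no DT_NEEDED) can be checked with `verdicts(inspect_elf(fake_elf(interp=False, needed=0)))`. To try it on a real file, pass `open(path, "rb").read()` to `inspect_elf` instead.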

Stephen Kitt
  • Why is the word "interpreting" used in the context of dynamic linking? That word is usually used in the context of programming languages. – Shuzheng Sep 23 '19 at 17:16
  • What do you mean by "GNU C library dynamic linker"? Are you referring to `/lib*/ld-linux*.so*`, or a third dynamic linker? – Shuzheng Sep 23 '19 at 17:24
  • Where can you see that `ldd` reports the dynamic linker as statically linked? Because its list of shared object dependencies is empty? – Shuzheng Sep 23 '19 at 17:26
  • Dynamically-linked programs need some work done to them before they can be executed; that work is done by the dynamic linker, which ends up playing a similar role to an interpreter: it interprets the relocation tables etc. to produce something which the computer can run. – Stephen Kitt Sep 23 '19 at 17:28
  • When I say “GNU C library dynamic linker”, I am referring to the implementation included in the GNU C library, usually shipped as `/lib*/ld-linux*.so*`. I specified the origin of the dynamic linker because there are other implementations available for Linux. – Stephen Kitt Sep 23 '19 at 17:30
  • The `/lib64/ld-linux-x86-64.so.2` reported by `ldd` is the dynamic linker for `/bin/bash` (in my case)? Why does it point to `ld-2.28.so`, which handles the a.out format according to the man page? Wouldn't it be more appropriate to point to `ld-linux*.so*` under `/lib/`? – Shuzheng Sep 23 '19 at 17:32
  • No, it doesn’t handle a.out “according to the man page”. The man page isn’t referring to `ld-2.28.so`; it’s referring to `ld.so`. The fact that `ld-linux.so.2` is a symlink to `ld-2.28.so` is an implementation detail which doesn’t matter to the man page. `/lib64/ld-linux*.so*` is the 64-bit x86 linker, `/lib/ld-linux*.so*` is the 32-bit x86 linker (with the old ABI), `/libx32/ld-linux*.so*` is the 32-bit x86 linker for the x32 ABI. You can’t make `/lib64/ld-linux*.so*` point to `/lib/ld-linux*.so*`. – Stephen Kitt Sep 23 '19 at 17:34
  • Thanks a lot. I was assuming that the man page was describing `ld*.so`, in the same way that you insert an extra `*` in `ld-linux*.so.*`. Furthermore, even though the linker is found in `/lib64`, the `/lib` documentation of the man page still applies, right? – Shuzheng Sep 23 '19 at 17:39
  • Yes, the manpage applies to `/lib64`, `/libx32` etc. (hence `/lib*/ld-linux*.so*` in my answer). – Stephen Kitt Sep 23 '19 at 17:49
  • @Shuzheng: When handling a system call like `execve("/bin/ls", ...)`, the kernel's ELF binfmt handler sees the `PT_INTERP` field in the binary headers and actually runs `/lib/ld.so` instead (or whatever path is in the PT_INTERP field). So yes, a very similar mechanism as a `#!/bin/sh` shell script is used to actually invoke the dynamic linker when you exec a dynamically linked binary. The first machine instruction executed in user-space for the process will be at the entry point of the dynamic linker. It makes some system calls and *then* jumps to the binary's `_start` entry point. – Peter Cordes Sep 23 '19 at 20:49
  • @Shuzheng: you can see this by running `gdb /bin/ls` vs. a statically-linked executable and using the `starti` GDB command to stop at the first user-space instruction in the process. – Peter Cordes Sep 23 '19 at 20:51
  • @PeterCordes, so the dynamically-linked executable would stop at the first instruction of the dynamic linker, whereas the statically-linked executable would stop at its own first instruction (which could be in libc)? – Shuzheng Sep 24 '19 at 06:12
  • @Shuzheng: yes. Except that a statically linked executable's first user-space instruction will always be part of the binary itself. First of all, there are no other files mapped into its address space (that's what statically linked is all about), just the stack, the executable's segments, and the kernel's VDSO pages. And 2nd, the CRT startup files containing `_start` that gcc/clang link by default (unless you use `-nostartfiles` or `-nostdlib`) are separate from libc. `_start` is always in the executable binary even in a dynamically linked executable. In the static case nothing runs before. – Peter Cordes Sep 24 '19 at 06:20
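
The PT_INTERP lookup described in the comments above can be approximated in a few lines. This is only a rough sketch of the idea, not the kernel's binfmt code: it assumes a little-endian ELF64 image, and `elf_interpreter` is a made-up name used for illustration.

```python
import struct

PT_INTERP = 3  # program header type naming the requested interpreter


def elf_interpreter(data: bytes):
    """Return the PT_INTERP path of a little-endian ELF64 image, or None if absent."""
    if data[:6] != b"\x7fELF\x02\x01":
        raise ValueError("this sketch only handles little-endian ELF64")
    # Fixed offsets in the ELF64 file header.
    e_phoff, = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        p_type, = struct.unpack_from("<I", data, base)
        p_offset, = struct.unpack_from("<Q", data, base + 8)
        p_filesz, = struct.unpack_from("<Q", data, base + 32)
        if p_type == PT_INTERP:
            # The segment holds a NUL-terminated path string.
            return data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None
```

Calling `elf_interpreter(open("/bin/bash", "rb").read())` on a dynamically linked binary should return the interpreter path stored in the file, while an image without a PT_INTERP entry (such as the dynamic linker itself, per the answer above) yields `None`.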
  1. I suspect the file program is wrong about the dynamic linker/loader being dynamically linked itself. The ldd program does not agree, at least not on my system (Debian Stretch):

    ldd /lib/x86_64-linux-gnu/ld-2.24.so
        statically linked
    
  2. man ld.so also reads: "ld-linux.so* handles ELF". On your system (and mine too by the way) both are symlinks to the same binary which I deduce is able to handle both ELF and the (old obsolete) a.out format.

Hkoof