gcc uses the term "architecture" to mean the instruction set of a specific CPU, while "target" covers the combination of CPU and architecture, along with other variables such as ABI, libc, endianness and more (possibly including "bare metal"). A typical compiler has a limited set of target combinations (probably one ABI and one CPU family, but possibly both 32- and 64-bit). A cross-compiler usually means either a compiler with a target other than the system it runs on, or one with multiple targets or ABIs (see also this).
Are binaries portable across different CPU architectures?
In general, no. A binary in conventional terms is native object code for a particular CPU or CPU family. But there are several cases where binaries might be moderately to highly portable:
- one architecture is a superset of another (commonly x86 binaries target i386 or i686 rather than the latest and greatest x86, e.g. -march=core2)
- one architecture provides native emulation or translation of another (you might have heard of Crusoe), or provides compatible co-processors (e.g. PS2)
- the OS and runtime support multiarch (e.g. ability to run 32-bit x86 binaries on x86_64), or make the VM/JIT seamless (Android using Dalvik or ART)
- there is support for "fat" binaries which essentially contain duplicate code for each supported architecture
If you somehow manage to solve this problem, the other portable binary problem of myriad library versions (glibc I'm looking at you) will then present itself. (Most embedded systems save you from that particular problem at least.)
If you haven't already, now is a good time to run gcc -dumpspecs and gcc --target-help to see what you're up against.
Fat binaries have various drawbacks, but still have potential uses (EFI).
Two further considerations are missing from the other answers, however: ELF and the ELF interpreter, and the Linux kernel's support for arbitrary binary formats. I won't go into detail about binaries or bytecode for non-real processors here; though it is possible to treat these as "native" and execute Java or compiled Python bytecode binaries, such binaries are independent of the hardware architecture (but depend instead on the relevant VM version, which ultimately runs a native binary).
Any contemporary Linux system will use ELF binaries (technical details in this PDF). In the case of dynamic ELF binaries, the kernel is in charge of loading the image into memory, but it is the job of the "interpreter" set in the ELF headers to do the heavy lifting. Normally this involves making sure all the dependent dynamic libraries are available (with the help of the "Dynamic" section, which lists the libraries, and some other structures which list the required symbols); but this is almost a general-purpose indirection layer.
$ file /bin/ls
/bin/ls: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses \
shared libs), stripped
$ readelf -p .interp /bin/ls
String dump of section '.interp':
[ 0] /lib/ld-linux.so.2
(/lib/ld-linux.so.2 is also an ELF binary, it does not have an interpreter, and is native binary code.)
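You can watch the interpreter's dependency resolution from the outside; on glibc systems ldd is essentially a wrapper that asks the interpreter to trace rather than run:

```shell
# List the shared libraries the interpreter would load for /bin/ls:
ldd /bin/ls
# Roughly equivalent, talking to the interpreter directly:
LD_TRACE_LOADED_OBJECTS=1 /bin/ls
```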
The problem with ELF is that the header in the binary (readelf -h /bin/ls) marks it for a specific architecture, class (32- or 64-bit), endianness and ABI.
(Apple's "universal" fat binaries use an alternative binary format, Mach-O, which solves this problem; it originated on NeXTSTEP.)
This means that an ELF executable must match the system it is to be run on.
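You can see these fields for yourself; the first bytes of the e_ident array encode the magic, class and endianness (a sketch, the decoded values are system-dependent):

```shell
# Raw e_ident prefix: 7f 'E' 'L' 'F', then class (1=32-bit, 2=64-bit)
# and data encoding (1=LSB, 2=MSB):
od -An -tx1 -N6 /bin/sh
# The decoded header, including the Machine field that must match:
readelf -h /bin/sh | grep -E 'Class|Data|ABI|Machine'
```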
One escape hatch is the interpreter: this can be any executable (including one that extracts or maps architecture-specific subsections of the original binary and invokes them), but you are still constrained by the type(s) of ELF your system will allow to run. (FreeBSD has an interesting way of handling Linux ELF files; its brandelf modifies the ELF ABI field.)
There is (using binfmt_misc) support for Mach-O on Linux; there's an example there that shows you how to create and run a fat (32- & 64-bit) binary. Resource forks/ADS, as originally done on the Mac, could be a workaround, but no native Linux filesystem supports this.
More or less the same thing applies to kernel modules, .ko files are also ELF (though they have no interpreter set). In this case there's an extra layer which uses the kernel version (uname -r) in the search path, something that could theoretically be done instead in ELF with versioning, but at some complexity and little gain I suspect.
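A sketch of that extra layer; the module search path is keyed on the running kernel's version string:

```shell
# Where modprobe looks for this kernel's modules:
echo /lib/modules/$(uname -r)
# Each .ko under there is a relocatable ELF object for one architecture;
# on a typical system (may print nothing in a minimal container):
find /lib/modules/$(uname -r) -name '*.ko*' 2>/dev/null | head -n 3
```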
As noted elsewhere, Linux does not natively support fat binaries, but there
is an active fat-binary project: FatELF. It's been around for years, it was never integrated into the standard kernel partly due to (now expired) patent concerns.
At this time it requires both kernel and toolchain support.
It does not use the binfmt_misc approach; this side-steps the ELF header issues and also allows for fat kernel modules.
- If I have an application compiled to run on an 'x86 target, Linux OS version x.y.z', can I just run the same compiled binary on another system, 'ARM target, Linux OS version x.y.z'?
Not with ELF, it won't let you do this.
- If the above is not true, is the only way to get the application source code and rebuild/recompile it using the relevant toolchain (for example, arm-linux-gnueabi)?
The simple answer is yes. (Complicated answers include emulation, intermediate representations, translators and JIT; except for the case of "downgrading" an i686 binary to only use i386 opcodes they're probably not interesting here, and the ABI fixups are potentially as hard as translating native code.)
- Similarly, if I have a loadable kernel module (device driver) that works on an 'x86 target, Linux OS version x.y.z', can I just load/use the same compiled .ko on another system, 'ARM target, Linux OS version x.y.z'?
No, ELF won't let you do this.
- If the above is not true, is the only way to get the driver source code and rebuild/recompile it using the relevant toolchain (for example, arm-linux-gnueabi)?
The simple answer is yes. I believe FatELF lets you build a .ko that is multi-architecture, but at some point a binary version for every supported architecture has to be created. Things which require kernel modules often come with the source and are built as required; VirtualBox, for example, does this.
This is already a long, rambling answer, so there's only one more detour. The kernel already has a virtual machine built in, albeit a dedicated one: the BPF VM, which is used to match packets. The human-readable filter ("host foo and not port 22") is compiled to bytecode, and the kernel packet filter executes it. The new eBPF is not just for packets; in theory the VM code is portable across any contemporary Linux, and LLVM supports it, but for security reasons it's probably not going to be suitable for anything other than administrative rules.
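You can inspect such bytecode without touching a live interface by compiling a filter against an empty capture file (the bytes below are just the standard 24-byte little-endian pcap global header, link type 1 for Ethernet):

```shell
# Write an empty pcap file so tcpdump has a link type to compile against:
printf '\xd4\xc3\xb2\xa1\x02\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\x00\x00\x01\x00\x00\x00' > /tmp/empty.pcap
# Dump the compiled classic-BPF program for a filter expression:
command -v tcpdump >/dev/null &&
    tcpdump -r /tmp/empty.pcap -d 'host 192.0.2.1 and not port 22' || true
```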
Now, depending on how generous you are with the definition of a binary executable, you can (ab)use binfmt_misc to implement fat binary support with a shell script, and ZIP files as a container format:
#!/bin/bash
name=$1
prog=${1/*\//} # basename
prog=${prog%.woz} # remove trailing extension
root=/mnt/tmpfs
root=$(TMPDIR= mktemp -d -p ${root} woz.XXXXXX)
shift # drop the .woz path, keep remaining args
arch=$(uname -m) # i686
uname_s=$(uname -s) # Linux
glibc=$(getconf GNU_LIBC_VERSION) # glibc 2.17
glibc=${glibc// /-} # s/ /-/g
# test that the .woz archive unzips cleanly, then extract the matching arch subtree
unzip -tqq "${name}" && {
unzip -q -o -j -d ${root} "${name}" "${arch}/${uname_s}/${glibc}/*"
test -x ${root}/$prog && (
export LD_LIBRARY_PATH="${root}:${LD_LIBRARY_PATH}"
#readlink -f "${root}/${prog}" # for the curious
exec -a "${name}" "${root}/${prog}" "$@"
)
rc=$?
#rm -rf -- "${root}/${prog}" # for the brave
exit $rc
}
Call this "wozbin", and set it up with something like:
mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc
printf ":%s:%s:%s:%s:%s:%s:%s" \
"woz" "E" "" "woz" "" "/path/to/wozbin" "" > /proc/sys/fs/binfmt_misc/register
This registers .woz files with the kernel; the wozbin script is invoked instead, with its first argument set to the path of the invoked .woz file.
To get a portable (fat) .woz file, simply create a test.woz ZIP file with a directory hierarchy like so:
i686/
\- Linux/
\- glibc-2.12/
armv6l/
\- Linux/
\- glibc-2.17/
Within each arch/OS/libc directory (an arbitrary choice) place the architecture-specific test binary and components such as .so files.
When you invoke it, the required subdirectory is extracted into a tmpfs in-memory filesystem (on /mnt/tmpfs here) and the binary is invoked.
Update: see also the Cosmopolitan project, which has a similar idea and provides a solution for multi-platform executables using a polyglot format.