10

Long story short: how can I print, in a terminal, the binary digits that make up a file, e.g. a shared library (.so) or a plain text (.txt) file?


PC hardware works with electrical signals (basically ON/OFF behaviour), which map naturally onto the binary system (the digits 0 and 1). Visualizing the content of a file would be an interesting educational exercise, as would comparing a .txt file with an executable that prints the same text.

mattia.b89
  • Please provide sample input and output – Praveen Kumar BS Aug 07 '21 at 08:42
  • You should explain the exact purpose of such output. Today very few people can make sense of binary numbers; instead they use tools like `objdump`, `hexdump` or disassemblers. – U. Windl Aug 09 '21 at 07:14
  • If you want to do a lot of visualization of binary numbers, it would be useful for you to learn how and why hex numbers are used. – Rodney Aug 09 '21 at 10:14

2 Answers

19

xxd can give binary output. Example below.

$ cat foo
Hello World
$ xxd -b foo
00000000: 01001000 01100101 01101100 01101100 01101111 00100000  Hello
00000006: 01010111 01101111 01110010 01101100 01100100 00001010  World.
$
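
For a large binary such as a shared library, the full dump gets long quickly. A minimal sketch, assuming `xxd` is installed: `-l` limits how many bytes are dumped and `-c` sets the number of bytes per line. The helper name `peek_bits` is hypothetical and used only for illustration.

```shell
# peek_bits FILE [COUNT]: hypothetical helper, not part of xxd.
# Prints the first COUNT bytes (default 8) of FILE as bits, 4 bytes per line.
peek_bits() {
    xxd -b -l "${2:-8}" -c 4 "$1"
}
```

For instance, `peek_bits foo 4` would dump only the four bytes of `Hell` from the example file above.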
steve
  • Thanks! For the sake of completeness: it is included in the `vim` or `gvim` package on Arch Linux – mattia.b89 Aug 07 '21 at 10:21
  • BTW, that `.` at the end (00001010) is an ASCII newline `NL`. Since its first 4 bits are all zero, it's an unprintable ASCII control character so xxd just prints `.` – MSalters Aug 09 '21 at 10:30
  • Is this ASCII bits or UTF8 bits? – Jeremy Boden Aug 11 '21 at 17:21
  • The example file `foo` is ASCII encoded. – steve Aug 11 '21 at 17:51
  • @JeremyBoden the right side is ASCII. A byte is 8 bits: `00000000`. ASCII maps 7-bit numbers to characters, `00000000` - `01111111`; the first bit isn't used by ASCII and is always `0`. UTF-8 is a system that extends ASCII. Essentially it says "if the first bit is `0`, read the byte as ASCII; if the first bit is `1`, read the next 1, 2 or 3 bytes as well and use [this scheme](https://en.wikipedia.org/wiki/UTF-8#Encoding) to interpret that set of bytes as a number, which is then mapped to a character". So it's ASCII; characters outside of ASCII will be displayed as `..`, `...` or `....`. – Boris Verkhovskiy Feb 25 '23 at 09:24
13

With basenc (from coreutils)

$ echo 123 | basenc --base2msbf -w8
00110001
00110010
00110011
00001010
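
Where neither `basenc` nor `xxd` is available, the same one-byte-per-line output can be approximated with POSIX tools only. This is a rough sketch, not a replacement for the tools above: `od -An -v -tu1` prints each byte as an unsigned decimal number, and shell arithmetic converts it to 8 bits, most significant bit first. The function name `to_bits` is hypothetical.

```shell
# to_bits: hypothetical helper; reads stdin and prints each byte
# as 8 binary digits (most significant bit first), one byte per line.
to_bits() {
    # od emits space-separated decimal byte values; put one value per line.
    od -An -v -tu1 | tr -s ' ' '\n' | while read -r byte; do
        [ -n "$byte" ] || continue      # skip empty lines left by tr
        bits=''
        i=0
        while [ "$i" -lt 8 ]; do
            bits="$((byte % 2))$bits"   # prepend the lowest remaining bit
            byte=$((byte / 2))
            i=$((i + 1))
        done
        printf '%s\n' "$bits"
    done
}
```

Running `echo 123 | to_bits` should print the same four lines as the `basenc` example. It is much slower than the dedicated tools, so it is only practical for small files.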
rowboat