When formatting printf output for strings that contain multi-byte characters, I found that printf measures the field width in bytes rather than in characters, which breaks column alignment when single-byte and multi-byte characters are mixed. For example:
$ cat script
#!/bin/bash
declare -a a b
a+=("0")
a+=("00")
a+=("000")
a+=("0000")
a+=("00000")
b+=("0")
b+=("├─00")
b+=("├─000")
b+=("├─0000")
b+=("└─00000")
printf "%-15s|\n" "${a[@]}" "${b[@]}"
$ ./script
0              |
00             |
000            |
0000           |
00000          |
0              |
├─00       |
├─000      |
├─0000     |
└─00000    |
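The shift in the last four rows matches the difference between character count and byte count. A minimal sketch to check this, assuming the script file is saved as UTF-8 and a UTF-8 locale is active when it runs (the variable names are only illustrative):

```bash
#!/bin/bash
s="├─00"

# ${#s} counts characters when a UTF-8 locale is active.
chars=${#s}

# Inside the subshell the locale is switched to C, so ${#s} counts bytes.
bytes=$(LC_ALL=C; echo "${#s}")

echo "chars=$chars bytes=$bytes"
```

Each box-drawing character is three bytes in UTF-8, so the byte count comes out at 8, which is why printf adds only 7 spaces of padding for this string instead of 11.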
I found various suggested work-arounds (mainly wrappers using another language or utility to print the text). Are there any native bash solutions? None of the documented printf format strings appear to help. Would the locale settings be relevant in this situation, e.g., to use a fixed-width character encoding like UTF-32?
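One bash-only direction that might work is to compute the padding by hand from `${#s}` instead of relying on the `%-15s` width. The sketch below is untested beyond the example data and rests on two assumptions: a UTF-8 locale is active (so `${#s}` counts characters), and every character occupies exactly one terminal column (which does not hold for wide CJK or combining characters). `pad_right` is a hypothetical helper name, not a bash builtin.

```bash
#!/bin/bash
# pad_right WIDTH STRING
# Hypothetical helper: prints STRING followed by enough spaces to reach WIDTH
# characters. Assumes ${#s} counts characters (UTF-8 locale) and that each
# character is one column wide.
pad_right() {
    local width=$1 s=$2
    local pad=$(( width - ${#s} ))
    # A negative width passed to %*s would mean "left-justify", so clamp to 0.
    (( pad < 0 )) && pad=0
    printf '%s%*s' "$s" "$pad" ''
}

b=("0" "├─00" "├─000" "├─0000" "└─00000")
for s in "${b[@]}"; do
    # Command substitution strips only trailing newlines, so the padding survives.
    printf '%s|\n' "$(pad_right 15 "$s")"
done
```

If the locale is not UTF-8, `${#s}` falls back to counting bytes and the output is misaligned in the same way as with plain `%-15s`.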