Why did {} start appearing as äå in Terminal.app?

Question

I'm running CentOS 7.9 and today my terminal is showing weird characters. Some letters look fine, but some symbols appear as some non-English characters. For example I made a text file with the following contents:

!@#$%^&*()_+{}

When I open this in vi or nano it looks correct just as I have written above. But in the terminal it looks like this:

$ cat chars.txt 
!É#$%Ü&*()_+äå

$ od -An -vtx1 < chars.txt
 21 40 23 24 25 5e 26 2a 28 29 5f 2b 7b 7d 0a

These weird characters appear everywhere, even while I'm typing. This machine was working fine until today. I think it happened after I downloaded a binary file with curl and forgot to use the -O argument. Rebooting and logging in as a different user did not help. My locale settings are below; I'm not sure what else to look for. The shell I'm using is Bash version 4.2, nothing unusual.

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Edit: It would be more accurate to say the server’s shell is bash 4.2. I was logging in using Terminal.app on macOS Big Sur.

I added it to the question. But for some reason, the problem has fixed itself now! — Elliott B, Mar 06 '21 at 06:57
@ilkkachu, CentOS7 was released in 2014, based on RHEL7 released a few months earlier, so packages there would be based on what was current a few months before that (so probably around mid-2013). It's still maintained until 2024. — Stéphane Chazelas, Mar 06 '21 at 14:26
Wow, '{}' turning into 'äå' brings back memories, but that would be a problem we had in Sweden 30 years ago. I thought it was extinct! — pipe, Mar 06 '21 at 14:29
What kind of terminal are you on? Or is it a virtual terminal of some sort (e.g. Linux virtual console)? Please add that information to the question, as it's more likely to be a setting with your terminal (as shown in the first answer) than with the system it's connected to. — Toby Speight, Mar 07 '21 at 10:07
Your edit suggests that the terminal or terminal emulator you used was not CentOS's `xterm` or any other CentOS terminal emulator, but more likely one of macOS's. (Note that a shell is not a terminal emulator and has nothing to do with one, other than interacting with it when interactive; it has little relevance to the issue you're having.) Can you clarify which one it is? Was that `xterm` (which version?), or the native *Terminal* application of macOS? — Stéphane Chazelas, Mar 08 '21 at 07:01
https://opensource.apple.com/source/X11apps/X11apps-44/xterm/xterm-269/version.h.auto.html suggests the latest version of macos comes with an old version of xterm as well (269 from 2011 even older than CentOS'). — Stéphane Chazelas, Mar 08 '21 at 07:15
Interesting. That would suggest that's also a VT220 (or superset thereof) emulator and with the same bug xterm originally had, suggesting its code is based on xterm's. Can you reproduce the issue by running `printf '\e(H'` as is the guess in my answer? — Stéphane Chazelas, Mar 08 '21 at 07:23
https://invisible-island.net/ncurses/terminfo.src.html#toc-_Terminal_app has some interesting information about that software, nothing in terms of `xterm`'s code heritage AFAICS. — Stéphane Chazelas, Mar 08 '21 at 07:37
Yes I can reproduce it with `printf '\e(H'` in Terminal (version 440). I don't use xterm but I have 366 installed and that does not show the problem. — Elliott B, Mar 09 '21 at 00:21

Stéphane Chazelas · Accepted Answer · 2021-03-08T08:27:56.113

I can reproduce it with the xterm terminal emulator (version 366), if I do:

$ printf '\e[?42h\e(H'; cat chars.txt; printf '\e(B\e[?42l'
!É#$%Ü&*()_+äå

Where:

\e[?42h enables National Replacement Character sets.
\e(H selects a Swedish charset for the G0 set (CP1106).

The other ones undo the settings.

To restore the terminal to a sane state, you can also use the ncurses reset command. Here, I find that it's the \ec sequence it sends (the rs1/reset_1string capability, as sent by tput rs1 for instance) that takes care of restoring the default charsets.
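A minimal sketch of sending those sequences by hand, assuming a POSIX shell. The function names restore_charsets and full_reset are hypothetical, and \033 is the octal spelling of the \e used above:

```shell
# Hypothetical helpers; neither is a standard command.

# Undo only the two settings used in the reproduction.
restore_charsets() {
  # \033(B   designates US-ASCII for G0
  # \033[?42l turns National Replacement Character set mode off
  printf '\033(B\033[?42l'
}

# Send the \ec sequence that the answer observes reset/tput rs1 sending.
full_reset() {
  printf '\033c'
}
```

Running restore_charsets at the prompt touches only the character set state, while full_reset, like reset or tput rs1, resets more of the terminal and typically clears the screen as well.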

As to why nano, for instance, displays them normally: if you run nano inside a script command session and look at the resulting typescript afterwards, you'll find that nano sends a \e(B sequence (which selects US-ASCII for G0) after switching to the alternate screen with \e[?1049h, presumably as part of some ncurses initialisation. The original charset is restored when nano leaves that alternate screen upon exit.

Now, getting a \e(H sequence (the bytes 0x1b 0x28 0x48) in a binary file by chance is plausible. On average, one in about 16.8 million random 3-byte sequences is that one. Here, I find some in:

$ LC_ALL=C grep -rFl $'\e(H' /lib
/lib/x86_64-linux-gnu/libicui18n.so.66.1
/lib/x86_64-linux-gnu/ceph/libceph-common.so.2
[...]

for instance.
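The same grep can be pointed at a single file before dumping it to the terminal. This is only a sketch: has_swedish_designator is a hypothetical helper name, and it checks for \e(H alone, not for every sequence that can change the terminal's state.

```shell
# Hypothetical helper: succeeds (exit status 0) if the file contains
# the 3-byte sequence ESC ( H anywhere.
has_swedish_designator() {
  # LC_ALL=C makes grep treat the input as plain bytes,
  # -F matches a fixed string, -q suppresses output.
  LC_ALL=C grep -qF "$(printf '\033(H')" -- "$1"
}
```

For example, `has_swedish_designator somefile && echo "contains ESC ( H"` reports a hit without printing any of the file's bytes to the terminal.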

But finding \e[?42h, a 6-byte (48-bit) sequence, by chance would be far less likely (1 in about 281 trillion 6-byte sequences). It is even less likely for both that and \e(H to appear, in that order, in the random binary file you dumped onto the terminal.
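The odds quoted above come from counting how many distinct byte strings of a given length exist. A quick check in shell arithmetic, under the simplifying assumption that every byte is uniformly random and independent (real binaries are not, so treat this as a rough estimate):

```shell
# There are 256^n distinct byte strings of length n.
three_byte=$((256 * 256 * 256))          # \e(H is 3 bytes
six_byte=$((three_byte * three_byte))    # \e[?42h is 6 bytes

echo "3-byte sequences: $three_byte"     # 16777216
echo "6-byte sequences: $six_byte"       # 281474976710656
```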

But xterm on CentOS7 is an old version (295). And in that version, the \e[?42h sequence to enable that ISO2022 handling is not necessary. In that version of xterm, \e(H alone is enough to obtain that behaviour. That changed in version 297 released in September 2013. That explains why it's more likely to run into that in CentOS7 or any system from that era than in more recent systems.

As you indicated your workstation was running macOS and not CentOS, note that macOS seems to come with an even older version of xterm (269 as of 2021). I would expect that the Terminal.app terminal emulator you clarified you were actually using has the same bug as xterm used to have (in that it wasn't emulating the VT220 terminal properly, though maybe its intent was to emulate those old versions of xterm instead).

In even older days (up until xterm 182 where it was changed I believe), another common artefact when dumping binary files to the terminal was switching to the Special Character and Line Drawing Set when the 0x0e byte (SO / ^N control character) was sent to the terminal. SO still switches to the G1 set, but at the time that G1 set was initialised as the line drawing set. You get the same effect today (though it's less likely to occur from random binary) by sending the \e(0 sequence, which selects the line drawing set for G0:

$ printf '\e(0 blahblah \e(B\n'
 ␉┌▒␤␉┌▒␤

You can get back to the old behaviour where ^N/^O switches between ASCII and the line drawing set with the \e)0 sequence.
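A sketch of that old behaviour, assuming the terminal's default G1 set is US-ASCII (so that \e)B at the end puts it back). The function name g1_line_drawing_demo is hypothetical, and \033, \016 and \017 are the octal spellings of \e, ^N and ^O:

```shell
# Hypothetical helper: prints " blahblah " through the G1 set.
g1_line_drawing_demo() {
  # \033)0  designates the line drawing set for G1
  # \016    SO (^N): shift to G1
  # \017    SI (^O): shift back to G0
  # \033)B  designates US-ASCII for G1 again
  printf '\033)0\016 blahblah \017\n\033)B'
}
```

On a terminal that honours these sequences, calling g1_line_drawing_demo should show the same glyphs as the \e(0 example, with the prompt that follows displayed normally.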

As to why rebooting didn't help, bear in mind that it's your terminal emulator that was affected by that escape sequence. Rebooting the system you had sshed into from that terminal emulator would not help. Rebooting the system you ran xterm on would have helped, but so would have restarting just that terminal emulator, or running that reset command as seen above within the terminal (locally, or over ssh, it doesn't matter as long as the rs1 escape sequence is sent to the terminal emulator).

More info at:

https://vt100.net/docs/vt220-rm/chapter2.html (from the manual of the DEC VT220 terminal, which introduced these sequences that xterm emulates)
https://invisible-island.net/xterm/ctlseqs/ctlseqs.html
https://en.wikipedia.org/wiki/ISO/IEC_2022
https://en.wikipedia.org/wiki/ISO/IEC_646

I wish I had a second upvote for answers of this quality. Thank you. — studog, Mar 06 '21 at 15:33
If you ssh into a `B` system and that system gets rebooted, the `ssh` connection gets broken and needs to be restarted, correct? So: what would be the sequence of events that leads to a persistent terminal connection to `B` across `B` reboots? — , Mar 07 '21 at 03:56
@Isaac `ssh` is restarted, but the terminal ssh is running in is not necessarily restarted. — wastl, Mar 07 '21 at 04:50
@studog, you do have enough rep to award a bounty, which amounts to the same thing. You'll need to wait a couple of days for the question to become eligible, though. — Toby Speight, Mar 07 '21 at 10:04
Wow! Answers like this are why I keep coming back to this site. Thank you for the excellent explanation. — Elliott B, Mar 08 '21 at 03:08
@Isaac Yes, and then another `ssh` to get back to the machine. — wastl, Mar 08 '21 at 16:29
