Highest Voted 'unicode' Questions - Unix and Linux Stack Exchange

149

votes

11 answers

How can I remove the BOM from a UTF-8 file?

I have a UTF-8 encoded file with a BOM and want to remove the BOM. Are there any Linux command-line tools to remove the BOM from the file?
$ file test.xml
test.xml: XML 1.0 document, UTF-8 Unicode (with BOM) text, with very long lines

command-line files unicode

asked Jul 23 '17 at 10:05

m13r

2,635
2
17
14
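
As a rough illustration of the BOM question above, here is a minimal Python sketch, not one of the command-line tools the asker wants. It assumes the BOM is the three-byte UTF-8 sequence EF BB BF at the very start of the data. The helper name `strip_utf8_bom` and the sample bytes are made up for this example.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical sample: an XML declaration preceded by a BOM.
sample = codecs.BOM_UTF8 + b'<?xml version="1.0"?>'
cleaned = strip_utf8_bom(sample)
print(cleaned)
```

Data without a BOM is returned unchanged, so the helper is safe to apply more than once.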

94

votes

3 answers

Awesome symbols and characters in a bash prompt

I just ran across a screenshot of someone's terminal: Is there a list of all of the characters which can be used in a Bash prompt, or can someone get me the character for the star and the right arrow?

bash prompt unicode

asked Dec 01 '11 at 23:29

Naftuli Kay

38,686
85
220
311

71

votes

2 answers

How can I set Vim's default encoding to UTF-8?

I'd like to contribute to an open source project by providing translated strings. One of their requirements is that contributors must use UTF-8 as the encoding for the PO files. I'm using Vim 7.3 on Linux. How can I be sure that Vim's encoding is…

vim character-encoding unicode

asked Oct 27 '11 at 11:16

Paolo

16,955
11
31
40

61

votes

3 answers

Why is printf "shrinking" umlaut?

If I execute the following simple script:
#!/bin/bash
printf "%-20s %s\n" "Früchte und Gemüse" "foo"
printf "%-20s %s\n" "Milchprodukte" "bar"
printf "%-20s %s\n" "12345678901234567890" "baz"
It prints:
Früchte und Gemüse foo
Milchprodukte…

bash unicode printf

asked Mar 09 '17 at 11:44

René Nyffenegger

2,201
2
23
28
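
A likely explanation for the printf question above is that the field width is being counted in bytes rather than characters. The excerpt does not confirm this, so treat it as an assumption. The Python sketch below only compares the two counts for the first string in the script.

```python
# "Früchte und Gemüse", written with escapes so the ü is the precomposed U+00FC.
label = "Fr\u00fcchte und Gem\u00fcse"

char_count = len(label)                  # number of code points
byte_count = len(label.encode("utf-8"))  # number of UTF-8 bytes

# str.ljust pads by characters, so the padded field is 20 characters wide.
padded = label.ljust(20)

print(char_count, byte_count, repr(padded))
```

Each ü takes two bytes in UTF-8, so the string is 18 characters but 20 bytes long. A byte-counting formatter would treat it as already filling a 20-wide field.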

59

votes

6 answers

Filtering invalid utf8

I have a text file in an unknown or mixed encoding. I want to see the lines that contain a byte sequence that is not valid UTF-8 (by piping the text file into some program). Equivalently, I want to filter out the lines that are valid UTF-8. In other…

command-line text-processing character-encoding unicode

asked Jan 27 '11 at 00:13

Gilles 'SO- stop being evil'

807,993
194
1,674
2,175
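
The filtering described in the question above can be sketched in Python by trying to decode each line strictly. This is only an illustration under stated assumptions: lines are separated by `\n`, and the function name `invalid_utf8_lines` and the sample bytes are made up for this example.

```python
def invalid_utf8_lines(data: bytes):
    """Return (line_number, raw_line) pairs for lines that are not valid UTF-8."""
    bad = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")  # strict decoding raises on invalid sequences
        except UnicodeDecodeError:
            bad.append((number, line))
    return bad


# Hypothetical mixed-encoding input: line 3 holds a lone Latin-1 byte 0xE9.
sample = b"plain ascii\n" + "gr\u00fcn".encode("utf-8") + b"\n" + b"caf\xe9\n"
result = invalid_utf8_lines(sample)
print(result)
```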

46

votes

5 answers

Updated my Arch Linux server and now I get tmux: need UTF-8 locale (LC_CTYPE) but have ANSI_X3.4-1968

I recently updated my Arch Linux server and during that process tmux got updated. I was using tmux while the upgrade was going on and used it afterwards, but all during the same SSH session. Now, however, whenever I try to issue any tmux command I…

arch-linux tmux locale unicode

asked Apr 20 '16 at 19:30

RPiAwesomeness

980
2
8
10

46

votes

2 answers

What fonts are good for unicode glyphs

So I was looking at this answer on Stack Overflow and realized that my fonts don't cover much of the Unicode range (I get lots of squares). Does anyone know a font that will cover all of the characters in that post?

fonts unicode

asked May 30 '11 at 00:21

xenoterracide

57,918
74
184
250

43

votes

7 answers

Is there an alternative to sed that supports unicode?

For example:
sed 's/\u0091//g' file1
Right now, I have to run hexdump to get the hex bytes and put them into sed as follows:
$ echo -ne '\u9991' | hexdump -C
00000000 e9 a6 91 |...|
00000003
And then:
$ sed…

sed unicode hexdump

asked Apr 17 '15 at 08:38

A-letubby

699
2
6
6
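
To illustrate the hexdump step in the question above, this Python sketch reproduces the UTF-8 bytes of U+9991 and removes that character by code point rather than by raw bytes. It is not a sed replacement, and the helper name `remove_char` is made up for this example.

```python
char = "\u9991"

# UTF-8 encoding of U+9991, formatted like the hexdump output in the question.
utf8_bytes = char.encode("utf-8")
hex_dump = " ".join(f"{b:02x}" for b in utf8_bytes)
print(hex_dump)


def remove_char(text: str, target: str) -> str:
    """Delete every occurrence of target from text, working on code points."""
    return text.replace(target, "")


print(remove_char("a\u9991b", char))
```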

42

votes

2 answers

How to make tr aware of non-ASCII (Unicode) characters?

I'm trying to remove some characters from a file (UTF-8). I'm using tr for this purpose: tr -cs '[[:alpha:][:space:]]' ' '

linux text-processing unicode tr

asked Sep 09 '15 at 12:57

MatthewRock

6,826
6
31
54

40

votes

4 answers

How to specify characters using hexadecimal codes in `grep`?

I am using the following command to grep for the character range from hexadecimal code 0900 (instead of अ) to 097F (instead of व). How can I use hexadecimal codes in place of अ and व? bzcat archive.bz2 | grep -v '<[अ-व]*\s' | tr '[:punct:][:blank:][:digit:]'…

shell grep character-encoding unicode

asked Aug 26 '11 at 06:03

Dhrubo Bhattacharjee

501
1
4
8

39

votes

4 answers

gitk crashes when viewing commit containing emoji: X Error of failed request: BadLength (poly request too large or internal Xlib length error)

I'm able to open gitk, but it crashes as soon as I open a commit whose changes contain an emoji (not the commit message). Error
❯ gitk --all
X Error of failed request: BadLength (poly request too large or internal Xlib length error)
Major opcode…

x11 git unicode emoji

asked Jan 15 '21 at 09:57

Édouard Lopez

1,282
12
23

38

votes

7 answers

Convert between Unicode Normalization Forms on the unix command-line

In Unicode, some character combinations have more than one representation. For example, the character ä can be represented as "ä", that is the codepoint U+00E4 (two bytes c3 a4 in UTF-8 encoding), or as "ä", that is the two codepoints U+0061…

command-line text-processing conversion unicode

asked Sep 10 '13 at 18:47

glts

572
1
4
12
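
The two representations mentioned in the question above can be inspected with Python's standard `unicodedata` module. This is a small sketch of the conversion between forms NFC and NFD, not a command-line tool of the kind the asker is looking for.

```python
import unicodedata

composed = "\u00e4"  # ä as the single code point U+00E4

# NFD splits the character into a base letter plus a combining mark.
decomposed = unicodedata.normalize("NFD", composed)

# NFC puts the pair back together into the single code point.
recomposed = unicodedata.normalize("NFC", decomposed)

print([f"U+{ord(c):04X}" for c in decomposed])
print(composed.encode("utf-8"), recomposed == composed)
```

Both strings render the same way but compare as unequal until they are normalized to the same form.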

37

votes

1 answer

Should we use UTF-8 characters like ⏰ in bash/shell script?

The simple code here is working as expected on my machine if launched with bash: function ⏰(){ date } ⏰ Could there be a problem for other people using this, or is it universal? I'm wondering because I've never seen anything like this in other…

bash shell unicode

asked Nov 27 '18 at 10:34

bob dylan

1,832
3
20
31

37

votes

8 answers

How can I correctly decompress a ZIP archive of files with Hebrew names?

Someone sent me a ZIP file containing files with Hebrew names (and created on Windows, not sure with which tool). I use LXDE on Debian Stretch. The Gnome archive manager manages to unzip the file, but the Hebrew characters are garbled. I think I'm…

character-encoding zip unicode file-format

asked Dec 28 '15 at 17:47

einpoklum

8,772
19
65
129

33

votes

4 answers

Find the best font for rendering a codepoint

How can I find the appropriate font for rendering Unicode codepoints? gnome-terminal finds that characters like «⼼» can be rendered with fonts like Symbola rather than my terminal font or the codepoint-in-square fallback (). How?

fonts unicode

asked Oct 15 '14 at 16:54

Nope

461
4
5


Questions tagged [unicode]