Given a text file, or the output of a command, how can I truncate it so that every line longer than N characters (usually N=80 in a terminal) gets shortened to at most N characters?
95
Totor
See also: http://unix.stackexchange.com/q/175852 – don_crissti Mar 25 '15 at 15:16
1 Answer
141
You can use `cut` to achieve this (using N=80 here):
some-command | cut -c -80
or
cut -c -80 some-file.txt
Replace 80 with the number of characters you want to keep.
Note that:
- Multibyte characters may not be handled correctly, depending on your implementation;
- Characters that occupy several columns, such as tabs, may be counted as a single character (a separate question covers this case).
Dale Anderson suggests `some-command | cut -c -$COLUMNS`, which truncates to the current terminal width.
Libin Wen suggests that the equivalent `cut -c 1-80` may be easier to understand.
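As a minimal sketch of the two suggestions above, the `cut` call can be wrapped in a small shell function. The name `truncate_lines` is hypothetical, and falling back to 80 columns when `COLUMNS` is unset is an assumption rather than part of the original answer (`COLUMNS` is often unset in scripts and other non-interactive shells).

```shell
# Hypothetical helper: truncate every line of stdin to a maximum width.
# Width is taken from the first argument, else $COLUMNS, else 80 (assumed default).
# Uses the explicit 1-N range form, which is equivalent to -N.
truncate_lines() {
    cut -c "1-${1:-${COLUMNS:-80}}"
}

# Usage: lines shorter than the limit pass through unchanged.
printf '%s\n' 'short' 'a much longer line of text' | truncate_lines 10
# prints:
#   short
#   a much lon
```

The same caveats apply as for plain `cut -c`: what counts as one "character" depends on the `cut` implementation.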
Totor
23
- Fixed command syntax: you must supply a range, `some-command | cut -c 1-80`. – Bernard Hauzeur Jan 27 '20 at 09:43
- Without the `1`, it's easy to miss the `-` before the 80, so I agree it makes more sense. – Sridhar Sarnobat Feb 05 '20 at 01:05
- I like to use `some-command | cut -c -$COLUMNS`, which uses the entire terminal width, whatever that currently happens to be. – Dale C. Anderson Jan 18 '21 at 06:52
- @Kindred Commands usually detect their output type (terminal, pipe, file...) and automatically disable colouring when their output is not going to a terminal. There is usually an option to force coloured output, for example `grep --color=always` (same for `ls`). Check the `man` page for your command. Note, however, that your version of `cut` may not treat the extra colouring bytes as zero-width characters, and may therefore cut the same text at different points with and without colour. – Totor May 28 '22 at 12:42
- Using `... cut -c -${SOME_NUMBER} ...` may produce invalid strings, for example with emojis (which are multibyte characters). Example: take a string of emoji smileys, each a 4-byte character, starting with 😎. Piping it through `cut -b -7 | xargs touch` creates a file named `'😎'$'\360\237\230'`; that is, the first four bytes are correctly interpreted as the *Smiling Face with Sunglasses*, while the remaining three bytes form an invalid UTF-8 byte sequence. – Abdull Aug 25 '23 at 21:44
- Frustratingly, even in 2023, *GNU coreutils 9.3's cut* still doesn't handle UTF-8 correctly: it assumes a "character" is always one byte in size (see https://unix.stackexchange.com/a/163725/20230), so `cut -c` effectively behaves like `cut -b` ("b" for "byte"). To get rid of any invalid trailing bytes, the string can be piped through `iconv -c -t UTF-8`: with the same emoji string, `cut -b -7 | iconv -c -t UTF-8 | xargs touch` creates a file named `😎`, while `cut -b -8 | iconv -c -t UTF-8 | xargs touch` (one more byte) creates a file named with the first two emoji. – Abdull Aug 25 '23 at 21:45
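The byte-truncation-plus-cleanup approach described in the comment above could be sketched as a shell function. The name `truncate_bytes_utf8` is hypothetical, and the explicit `-f UTF-8` is an addition that assumes the input is UTF-8 regardless of the current locale. The sample input is also an assumption: two 4-byte emoji written as octal escapes so the snippet stays ASCII-only.

```shell
# Hypothetical helper: keep at most N bytes per line, then let iconv -c
# discard any partial UTF-8 sequence left behind at the cut point.
# Assumes the input is UTF-8 (hence the explicit -f UTF-8).
truncate_bytes_utf8() {
    cut -b "1-$1" | iconv -c -f UTF-8 -t UTF-8
}

# Sample input (an assumption): U+1F60E followed by U+1F600, 4 bytes each.
# Some iconv builds exit non-zero when -c discards bytes, hence "|| true".
printf '\360\237\230\216\360\237\230\200\n' | truncate_bytes_utf8 7 || true
```

With a limit of 7 bytes, only the first emoji should survive; with 8 bytes, both are kept. Note that this limits bytes per line, not displayed characters, so it is a workaround for a byte-oriented `cut` rather than a true character-based truncation.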