361

I would like to remove all leading and trailing spaces and tabs from each line in an output.

Is there a simple tool like trim I could pipe my output into?

Example file:

test space at back 
 test space at front
TAB at end  
    TAB at front
sequence of some    space in the middle
some empty lines with differing TABS and spaces:





 test space at both ends 
rubo77
  • 11
    To anyone looking here for a solution to remove newlines, that is a different problem. By definition a newline creates a new line of text. Therefore a line of text cannot contain a newline. The question you want to ask is how to remove a newline from the beginning or end of a string: https://stackoverflow.com/questions/369758, or how to remove blank lines or lines that are just whitespace: https://serverfault.com/questions/252921 – Tony Jun 25 '18 at 23:24

21 Answers

462
awk '{$1=$1;print}'

or shorter:

awk '{$1=$1};1'

Either command trims leading and trailing space or tab characters¹ and also squeezes sequences of tabs and spaces into a single space.

That works because when you assign something to one of the fields, awk rebuilds the whole record (as printed by print) by joining all fields ($1, ..., $NF) with OFS (space by default).

To also remove blank lines, change it to awk '{$1=$1};NF' (where NF tells awk to only print the records for which the Number of Fields is non-zero). Do not do awk '$1=$1' as sometimes suggested as that would also remove lines whose first field is any representation of 0 supported by awk (0, 00, -0e+12...)


¹ and possibly other blank characters depending on the locale and the awk implementation
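As a minimal sketch of the behaviour described above, the following wraps the command in a function and shows the pitfall of using the assignment itself as the pattern. The name `trim_squeeze` is a hypothetical helper, not part of awk, and `printf` stands in for whatever produces your output.

```shell
# Hypothetical helper: trim both ends and squeeze inner runs of blanks.
trim_squeeze() {
    awk '{$1=$1};1'
}

# Leading/trailing blanks go away; the inner " \t " becomes one space.
printf '  foo \t bar  \n' | trim_squeeze      # prints: foo bar

# Pitfall: with the assignment used as the pattern, its value decides
# whether the line is printed, so a line whose first field is 0 is lost.
printf ' 0 \n 1 \n' | awk '$1=$1'             # prints only: 1
```

If inner whitespace must be preserved, this approach is not suitable, since rebuilding the record always joins the fields with OFS.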

Stéphane Chazelas
  • 4
    Semicolon on second example is superfluous. Could use: `awk '{$1=$1}1'` – Brian Nov 03 '15 at 19:18
  • 21
    @Brian, [no, the `;` is required in the standard awk syntax](http://austingroupbugs.net/view.php?id=226#c2226) – Stéphane Chazelas Nov 03 '15 at 22:07
  • Interesting... No semicolon is supported by gawk, mawk and OS X's awk. (At least for my versions (1.2, 4.1.1, and 20070501, respectively) – Brian Nov 03 '15 at 22:21
  • Although, even though gawk supports it, it confirms the semicolon usage with: `Multiple statements may be put on one line by separating them with a ";". This applies to both the statements within the action part of a pattern-action pair (the usual case), and to the pattern-action statements themselves.` – Brian Nov 03 '15 at 22:22
  • 8
    The only thing I don't like about this approach is that you lose repeating spaces within the line. For example, `echo -e 'foo \t bar' | awk '{$1=$1};1'` – user.friendly Jun 23 '17 at 01:12
  • 7
    `echo ' hello ' | xargs` – JREAM Apr 03 '18 at 10:32
  • In my special context (called via the SQLite CLI `.shell` command) it was not working, but the sed solution was OK. – takacsot Apr 01 '20 at 16:44
  • How do I do this recursively? – Aaron Franke May 16 '20 at 02:53
  • How can I turn this into an alias in bash? I tried but receiving errors – ctrlbrk May 20 '20 at 16:56
  • @ctrlbrk I'm not sure an alias is what you need. Might be better to create a custom bash script called `trim` and add it to your path. Then you could do `??? | trim`. – Tom Jan 28 '21 at 15:27
  • I have looked for something like this literally for years! Thanks! – Dash83 Nov 06 '21 at 12:36
  • `$ awk '{1.txt=1.txt};1'` leads to `awk: cmd. line:1: {1.txt=1.txt};1` `awk: cmd. line:1: ^ syntax error`. How to fix? – pmor Jul 28 '22 at 15:39
  • 1
    @pmor, the file or stream to process must be fed as input to `awk`. Like with `awk '{$1=$1};1' < 1.txt` or `awk '{$1=$1};1' 1.txt`. – Stéphane Chazelas Jul 28 '22 at 16:14
  • Since which version of `awk` should the command `awk '{$1=$1;print}'` work correctly? – Pro Backup Dec 14 '22 at 13:08
  • Simple, straight to the point! Nice one :) – Oo.oO May 21 '23 at 05:40
109

The command can be condensed like so if you're using GNU sed:

$ sed 's/^[ \t]*//;s/[ \t]*$//' < file

Example

Here's the above command in action.

$ echo -e " \t   blahblah  \t  " | sed 's/^[ \t]*//;s/[ \t]*$//'
blahblah

You can use hexdump to confirm that the sed command is stripping the desired characters correctly.

$ echo -e " \t   blahblah  \t  " | sed 's/^[ \t]*//;s/[ \t]*$//' | hexdump -C
00000000  62 6c 61 68 62 6c 61 68  0a                       |blahblah.|
00000009

Character classes

You can also use character class names instead of literally listing the sets like this, [ \t]:

$ sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//' < file

Example

$ echo -e " \t   blahblah  \t  " | sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//'

Most of the GNU tools that make use of regular expressions (regex) support these classes (here with their equivalent in the typical C locale of an ASCII-based system (and there only)).

 [[:alnum:]]  - [A-Za-z0-9]     Alphanumeric characters
 [[:alpha:]]  - [A-Za-z]        Alphabetic characters
 [[:blank:]]  - [ \t]           Space or tab characters only
 [[:cntrl:]]  - [\x00-\x1F\x7F] Control characters
 [[:digit:]]  - [0-9]           Numeric characters
 [[:graph:]]  - [!-~]           Printable and visible characters
 [[:lower:]]  - [a-z]           Lower-case alphabetic characters
 [[:print:]]  - [ -~]           Printable (non-Control) characters
 [[:punct:]]  - [!-/:-@[-`{-~]  Punctuation characters
 [[:space:]]  - [ \t\v\f\n\r]   All whitespace chars
 [[:upper:]]  - [A-Z]           Upper-case alphabetic characters
 [[:xdigit:]] - [0-9a-fA-F]     Hexadecimal digit characters

Using these instead of literal sets always seems like a waste of space, but if you're concerned with your code being portable, or having to deal with alternative character sets (think international), then you'll likely want to use the class names instead.
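A minimal sketch of the portable form, assuming a POSIX `sed`: the character class avoids the `\t` escape, which not every sed understands inside a bracket expression. The function name `trim_keep_inner` is made up for illustration.

```shell
# Hypothetical helper: strip blanks at both ends of every line,
# leaving whitespace inside the line untouched.
trim_keep_inner() {
    sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//'
}

# od -c makes the surviving inner tab visible in the output.
printf ' \t  a \t b  \t \n' | trim_keep_inner | od -c
```

Unlike the awk approach, the space, tab, space sequence between `a` and `b` is left as it was.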

Stéphane Chazelas
slm
  • Note that `[[:space:]]` is not equivalent to `[ \t]` in the general case (unicode, etc). `[[:space:]]` will probably be much slower (as there are many more types of whitespaces in unicode than just `' '` and `'\t'`). Same thing for all the others. – Olivier Dulac Nov 21 '13 at 12:44
  • 1
    `sed 's/^[ \t]*//'` is not portable. Actually POSIX even requires that to remove a sequence of space, backslash or `t` characters, and that's what GNU `sed` also does when `POSIXLY_CORRECT` is in the environment. – Stéphane Chazelas Aug 11 '16 at 14:56
  • What if I want to trim newlines characters? '\n \n text \n \n' – Eugene Biryukov Jun 01 '18 at 08:54
  • 2
    I like the sed solution because of the lack of other side effects as in the awk solution. The first variation did not work when I tried it in bash on OSX just now, but the character class version does work: `sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//'` – Tony Jun 25 '18 at 23:13
  • @EugeneBiryukov see my comment on the original post – Tony Jun 25 '18 at 23:27
  • 1
    instead of `[ \t]*` why don't you use **`\s*`** to catch all white-spaces ? – Noam Manos Feb 19 '20 at 12:55
  • Sorry, it doesn't work if there is a tab at the beginning of a line. – Raymond Jan 21 '23 at 16:29
  • Can someone explain why `s,^[[:space:]]*,,` works, but `s,^[[:space:]]+,,` does not? – MichaelK Aug 30 '23 at 13:04
61

xargs without arguments does that.

Example:

trimmed_string=$(echo "no_trimmed_string" | xargs) 
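
The following is only an illustrative sketch of how `xargs` treats its input when used this way; the sample strings are made up. It shows that inner spaces are squeezed, that multiple lines are joined, and that quote characters are parsed rather than passed through.

```shell
# Inner runs of blanks are squeezed, not just the ends.
printf '  a   b  \n' | xargs        # prints: a b

# Separate input lines are joined into a single output line.
printf ' a \n b \n' | xargs         # prints: a b

# Quotes are interpreted by xargs; common implementations reject an
# unmatched one, so check the exit status before trusting the result.
printf " it's \n" | xargs 2>/dev/null || echo "xargs failed"
```

For input that may contain quotes or backslashes, or that spans several lines, a sed-based trim is the safer choice.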
Newton_Jose
  • 18
    This also contracts multiple spaces within a line, which was not requested in the question – roaima Sep 09 '15 at 16:04
  • 3
    @roaima - true but the accepted answer also squeezes spaces (which was not requested in the question). I think the real problem here is that `xargs` will fail to deliver if the input contains backslashes and single quotes. – don_crissti Sep 09 '15 at 18:28
  • 1
    @don_crissti that doesn't mean the accepted answer correctly answers the question as asked, though. But in this case here it wasn't flagged as a caveat whereas in the accepted answer it was. I've hopefully highlighted the fact in case it's of relevance to a future reader. – roaima Sep 09 '15 at 19:22
  • 1
    It also breaks on single quotes, double quotes, backslash characters. It also runs one or more `echo` invocations. Some echo implementations will also process options and/or backslashes... That also only works for single-line input. – Stéphane Chazelas May 21 '19 at 17:19
  • This is a clever (unorthodox) answer! I agree w/ @StéphaneChazelas, essentially invoking the default `xargs /bin/echo` command. – John Doe Aug 18 '21 at 13:40
  • I know this has shortcomings, but if you want something simple, with known-restricted input, this is beautifully succinct in 5 chars... Obtuse perhaps, but succinct! – spechter Apr 26 '22 at 05:53
  • for multi-line inputs you can ` | xargs -L1` – Nitsan Avni Feb 09 '23 at 21:55
40

As suggested by Stéphane Chazelas in the accepted answer, you can now
create a script /usr/local/bin/trim:

#!/bin/bash
awk '{$1=$1};1'

and give that file executable rights:

chmod +x /usr/local/bin/trim

Now you can pass every output to trim for example:

cat file | trim

(For the comments below: I used this before: while read i; do echo "$i"; done
which also works, but is slower.)

rubo77
  • 1
    Good luck if your file is huge and/or contains backslashes. – don_crissti Dec 31 '14 at 01:31
  • 2
    @don_crissti: could you comment a bit more?, which solution would be better fitting for huge files, and how could I modify my solution if the file contained backslashes? – rubo77 Dec 31 '14 at 10:42
  • 4
    You'll have to use `while read -r line` to preserve backslashes and [even then...](http://unix.stackexchange.com/questions/176490/echoing-stdin-when-running-an-ed1-script/176514#comment292069_176502). As to huge files / speed, really, you picked the worst solution. I don't think there's anything worse out there. See the answers on [Why is using a shell loop to process text bad practice ?](http://unix.stackexchange.com/q/169716) including my comment on the last answer where I added a link to a speed benchmark. The `sed` answers here are perfectly fine IMO and far better than `read`. – don_crissti Dec 31 '14 at 12:24
  • @don_crissti ...and/or has lines starting with `-` and followed by combinations of 1 or more e, E or n characters, and/or contains NUL characters. Also, a non-terminated line after the last newline will be skipped. – Stéphane Chazelas May 27 '15 at 14:52
  • 3
    You can also add an alias in /etc/profile (or your ~/.bashrc or ~/.zshrc etc...) alias trim="awk '{\$1=\$1};1'" – Jeff Clayton Nov 20 '15 at 16:26
  • 3
    No need for `bash`, you can make it `#! /usr/bin/awk -f` `{$1=$1};1`. (beware of file names containing `=` characters though) – Stéphane Chazelas Aug 11 '16 at 14:45
  • 2
    note that it has to be on 2 lines, one for the she-bang, one for the code (`{$1=$1};1`). – Stéphane Chazelas Aug 11 '16 at 15:37
  • @StéphaneChazelas: That doesn't seem to work. Please post it as a whole new answer, so we can see the line breaks as you intended. And we can discuss then if it works. (you mean a she-bang with just `#!` ?) – rubo77 Aug 11 '16 at 18:35
  • awk didn't work for me, still untrimmed spaces – pronebird Nov 29 '16 at 12:48
  • Would this work as an alias, or would it be better as a function defined in /etc/profile? – RonJohn Nov 21 '17 at 23:15
  • How do I use this to trim files in-place? – Aaron Franke May 16 '20 at 02:55
  • On Solaris awk doesn't work, you need gawk. – access_granted Mar 02 '21 at 21:51
  • I'm using zsh and found it very convenient to declare a function such as: function trim() { awk '{$1=$1};1' } No need for a file. – Andries Aug 06 '23 at 18:52
32

If you store lines as variables, you can use bash to do the job:

remove leading whitespace from a string:

shopt -s extglob
printf '%s\n' "${text##+([[:space:]])}"

remove trailing whitespace from a string:

shopt -s extglob
printf '%s\n' "${text%%+([[:space:]])}"

remove all whitespace from a string:

printf '%s\n' "${text//[[:space:]]}"
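
For trimming both ends at once without `extglob`, a sketch using only POSIX parameter expansion is shown below. The function name `trim_var` and the variable `s` are hypothetical, and `s` is global because POSIX sh has no `local`.

```shell
# Hypothetical helper: print $1 with leading and trailing whitespace removed.
trim_var() {
    s=$1
    # "${s%%[![:space:]]*}" expands to the leading whitespace run
    # (everything before the first non-space character); remove it
    # as a literal prefix.
    s=${s#"${s%%[![:space:]]*}"}
    # "${s##*[![:space:]]}" expands to the trailing whitespace run;
    # remove it as a literal suffix.
    s=${s%"${s##*[![:space:]]}"}
    printf '%s\n' "$s"
}

trim_var '   some   text   '    # prints: some   text
```

Whitespace inside the string is kept, and a string consisting only of whitespace becomes empty.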
Stéphane Chazelas
Łukasz Rajchel
  • 1
    Removing all white-space from a string is not same as removing both leading and trailing spaces (as in question). – catpnosis Mar 24 '18 at 16:04
  • 3
    Far the best solution - it requires only bash builtins and no external process forks. – peterh Jul 05 '18 at 13:56
  • 2
    Nice. Scripts run a LOT faster if they don't have to pull in outside programs (such as awk or sed). This works with "modern" (93u+) versions of ksh, as well. – user1683793 Jul 10 '18 at 22:54
  • Upon testing, your `ltrim` and `rtrm` solutions do not work because they are too greedy. Given a string such as ` \n\n\n\v\f\t\t\r\r Anthony Rutledge. \n\r\v\\f\n\n\n\t\t\t `, your solutions will wipe away the entire string. – Anthony Rutledge Jul 26 '21 at 14:40
29

To remove all leading and trailing spaces from a given line with a 'piped' tool, I can identify 3 different ways, which are not completely equivalent. The differences concern the spaces between words of the input line. Choose according to the behaviour you expect.

Examples

To explain the differences, let's consider this dummy input line:

"   \t  A   \tB\tC   \t  "

tr

$ echo -e "   \t  A   \tB\tC   \t  " | tr -d "[:blank:]"
ABC

tr is a really simple command. In this case, it deletes every space and tab character, including those between words.

awk

$ echo -e "   \t  A   \tB\tC   \t  " | awk '{$1=$1};1'
A B C

awk deletes leading and trailing spaces and squeezes every run of spaces between words into a single space.

sed

$ echo -e "   \t  A   \tB\tC   \t  " | sed 's/^[ \t]*//;s/[ \t]*$//'
A       B   C

In this case, sed deletes leading and trailing spaces without touching any spaces between words.

Remark:

In the case of one word per line, tr does the job.

anatoly techtonik
frozar
24
sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'

If you're reading a line into a shell variable, read does that already unless instructed otherwise.
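
To illustrate, here is a small sketch of what `read` does with the default `IFS` and what changes when `IFS` is emptied for the call. The helper names `show_default` and `show_raw` are made up for this example.

```shell
# Default IFS (space, tab, newline): read strips blanks at both ends
# of the line but keeps the ones in the middle.
show_default() { read -r line; printf '[%s]\n' "$line"; }

# Empty IFS for this one read: nothing is stripped.
show_raw() { IFS= read -r line; printf '[%s]\n' "$line"; }

printf '  a   b  \n' | show_default    # prints: [a   b]
printf '  a   b  \n' | show_raw        # prints: [  a   b  ]
```

The `-r` option keeps backslashes intact. This handles a single line; looping over `read` for a whole file is much slower than sed or awk.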

Gilles 'SO- stop being evil'
  • 1
    +1 for `read`. So if you pipe to while read it works: `cat file | while read i; do echo $i; done` – rubo77 Nov 21 '13 at 03:36
  • 1
    @rubo except that in your example the unquoted variable is also reprocessed by the shell. Use `echo "$i"` to see the true effect of the `read` – roaima Sep 09 '15 at 19:19
9

sed is a great tool for that:

                        # substitute ("s/")
sed 's/^[[:blank:]]*//; # parts of lines that start ("^")  with a space/tab 
     s/[[:blank:]]*$//' # or end ("$") with a space/tab
                        # with nothing (/)

You can use it for your case by either piping in the text, e.g.

<file sed -e 's/^[[...

or by acting on it 'inline' if your sed is the GNU one:

sed -i 's/...' file

but changing the source this way is "dangerous" as it may be unrecoverable when it doesn't work right (or even when it does!), so back up first (or use -i.bak, which also has the benefit of being portable to some BSD seds)!
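
Where `-i` is not available at all, one possible sketch is to write to a temporary file and copy the result back, keeping a backup. The function name `trim_in_place` and the `.bak` suffix are assumptions for illustration, and `mktemp` is widely available but not required by POSIX.

```shell
# Hypothetical helper: trim every line of the file named in $1,
# keeping the original as "$1.bak".
trim_in_place() {
    tmp=$(mktemp) || return 1
    # Only touch the original once sed has succeeded and the backup exists.
    sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//' < "$1" > "$tmp" &&
        cp "$1" "$1.bak" &&
        cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Writing back with `cat` rather than `mv` keeps the original file's permissions and inode; usage would be `trim_in_place file`.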

Stéphane Chazelas
Michael Durrant
9

An answer you can understand at a glance:

#!/usr/bin/env python3
import sys
for line in sys.stdin: print(line.strip()) 

Bonus: pass arbitrary characters to trim via str.strip([chars]), or use .lstrip() or .rstrip() as needed.

Like rubo77's answer, save as script /usr/local/bin/trim and give permissions with chmod +x.

qwr
  • 2
    I'm not normally a fan of python in my scripts but this is by far one of the most legible scripts in comparison to all the other incantations in these answers. – Victor Mar 22 '22 at 19:10
6

You will be adding this to your little Bash library. I can almost bet on it! This has the benefit of not adding a newline character to the end of your output, as happens with echo and throws off your expected output. Moreover, these solutions are reusable, do not require modifying the shell options, can be called in-line with your pipelines, and are POSIX compliant. This is the best answer, by far. Modify to your liking.

Output tested with od -cb, something some of the other solutions might want to do with their output.

BTW: The correct quantifier is the +, not the *, as you want the replacement to be triggered upon 1 or more whitespace characters!

ltrim (that you can pipe input into)

ltrim ()
{
    sed -E 's/^[[:space:]]+//'
}

rtrim (that you can pipe input into)

rtrim ()
{
    sed -E 's/[[:space:]]+$//'
}

trim (the best of both worlds and yes, you can pipe to it)

trim ()
{
    ltrim | rtrim
}
4

If the string one is trying to trim is short and continuous/contiguous, one can simply pass it as a parameter to any bash function:

    trim(){
        echo $@
    }

    a="     some random string   "

    echo ">>`trim $a`<<"
Output
>>some random string<<
Subrata Das
4

Using Raku (formerly known as Perl_6):

raku -ne '.trim.put;'

Or more simply:

raku -pe '.=trim;'

As a previous answer suggests (thanks, @Jeff_Clayton!), you can create a trim alias in your bash environment:

alias trim="raku -pe '.=trim;'"

Finally, to remove only leading or only trailing whitespace (e.g. unwanted indentation), you can use the appropriate trim-leading or trim-trailing method instead.

https://raku.org/

jubilatious1
3
trimpy () {
    python3 -c 'import sys
for line in sys.stdin: print(line.strip())'
}
trimsed () {
gsed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}
trimzsh () {
   local out="$(</dev/stdin)"
   [[ "$out" =~ '^\s*(.*\S)\s*$' ]] && out="$match[1]"  || out=''
   print -nr -- "$out"
}
# example usage
echo " hi " | trimpy

Bonus: pass arbitrary characters to trim via str.strip([chars]), or use .lstrip() or .rstrip() as needed.

HappyFace
1

The translate command (tr) would work, but note that it deletes every space and tab, including those inside a line:

cat file | tr -d '[:blank:]'
Jeff Schaller
Srinagesh
1

I wrote this shell function using awk

awkcliptor(){
    awk -e 'BEGIN{ RS="^$" } {gsub(/^[\n\t ]*|[\n\t ]*$/,"");print ;exit}' "$1" ; } 

BEGIN{ RS="^$" }:
before parsing starts, set the record separator so that
it never matches, i.e. treat the whole input as
a single record

gsub(this,that):
substitute every match of the regexp this with the string that

/^[\n\t ]*|[\n\t ]*$/:
match any run of newlines, tabs and spaces at the start
of the record, or at its end, and replace it with the
empty string

print;exit: then print and exit

"$1":
pass the first argument of the function to awk as the
file to process

how to use:
copy above code , paste in shell, and then enter to
define the function.
then you can use awkcliptor as a command with first argument as the input file

sample usage:

echo '
 ggggg    

      ' > a_file
awkcliptor a_file

output:

ggggg

or

echo -e "\n ggggg    \n\n      "|awkcliptor 

output:

ggggg
1

For those of us without enough space in the brain to remember obscure sed syntax, just reverse the string, cut the 1st field with a delimiter of space, and reverse it back again.

cat file | rev | cut -d' ' -f1 | rev
Stewart
1

My favorite is using perl: perl -n -e'/[\s]*(.*)?[\s]*/ms && print $1'

Take for example:

MY_SPACED_STRING="\n\n   my\nmulti-line\nstring  \n\n"

echo $MY_SPACED_STRING

Would output:



   my
multi-line
string  


Then:

echo $MY_SPACED_STRING | perl -n -e'/[\s]*(.*)?[\s]*/ms && print $1'

Would output:

my
multi-line
string 
tin
0

Example for bash:

alias trim="awk '{\$1=\$1};1'"

usage:

echo -e  "    hello\t\tkitty   " | trim | hexdump  -C

result:

00000000  68 65 6c 6c 6f 20 6b 69  74 74 79 0a              |hello kitty.|
0000000c
  • 1
    The `awk '{$1=$1};1'` answer was given long ago.  The idea of making an alias out of it was suggested in a comment almost as long ago.  Yes, you are allowed to take somebody else’s comment and turn it into an answer.  But, if you do, you should give credit to the people who posted the idea before you.  And this is such a trivial extension of the accepted answer that it’s not really worth the bother. – Scott - Слава Україні Sep 04 '20 at 04:08
  • The idea was to make an alias. I hadn't seen that answer before. – Marek Lisiecki Sep 05 '20 at 18:13
  • and second thing from stack: "Thanks for the feedback! Votes cast by those with less than 15 reputation are recorded, but do not change the publicly displayed post score." – Marek Lisiecki Sep 05 '20 at 18:25
0

Remove leading and trailing spaces and tabs:

alias strip='python3 -c "from sys import argv; print(argv[1].strip(\" \t\"))"'

Remove every space and tab:

alias strip='python3 -c "from sys import argv; print(argv[1].replace(\"\t\", \"\").replace(\" \", \"\"))"'

Pass the string as an argument to strip. To make it pipeable, use sys.stdin.read() instead of argv.

Machinexa
0

simple enough for my purposes was this:

_text_="    one    two       three         "

echo "$_text_" | { read __ ; echo ."$__". ; }

... giving ...

.one    two       three.

... if you want to squeeze the spaces then ...

echo .$( echo $_text_ ).

... gives ...

.one two three.
sol
0

Using sd (a find-and-replace tool written in Rust):

sd '^\s*(.*)\s*' '$1'

walkman