Input
testing on Linux [Remove white space] testing on Linux
Output
testing on Linux [Removewhitespace] testing on Linux
So, how can we remove all the white space between the brackets to get the output shown above?
If the [, ] are balanced and not nested, you could use GNU awk as in:
gawk -v RS='[][]' '
NR % 2 == 0 {gsub(/\s/,"")}
{printf "%s", $0 RT}'
That is, use [ and ] as the record separators instead of the newline character, and remove white space from every other record only (the even-numbered records, which hold the text between the brackets).
With sed, with the additional requirement that there be no newline character inside [...]:
sed -e :1 -e 's/\(\[[^]]*\)[[:space:]]/\1/g;t1'
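As a quick check, here is that sed command run on the sample line from the question. The wrapper name strip_in_brackets is only illustrative and not part of the original command.

```shell
# Hypothetical wrapper around the sed command above (the name is illustrative).
strip_in_brackets() {
  # Each pass deletes one white-space character per bracket group (the last
  # one, since the match is greedy); "t1" jumps back to label 1 until a pass
  # makes no substitution.
  sed -e :1 -e 's/\(\[[^]]*\)[[:space:]]/\1/g;t1'
}

printf '%s\n' 'testing on Linux [Remove white space] testing on Linux' |
  strip_in_brackets
# prints: testing on Linux [Removewhitespace] testing on Linux
```

The loop is needed because a single s///g pass cannot match the same opening bracket twice.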
If they are balanced but may be nested as in blah [blih [1] bluh] asd, then you could use perl's recursive regexp operators, as in:
perl -0777 -pe 's{(\[((?:(?>[^][]+)|(?1))*)\])}{$&=~s/\s//rsg}gse'
Another approach, which would scale to very large files, would be to use the (?{...}) perl regexp operator to keep track of the bracket depth, as in:
perl -pe 'BEGIN{$/=\8192}s{((?:\[(?{$l++})|\](?{$l--})|[^][\s]+)*)(\s+)}
{"$1".($l>0?"":$2)}gse'
Actually, you can also process the input one character at a time like:
perl -pe 'BEGIN{$/=\1}if($l>0&&/\s/){$_=""}elsif($_ eq"["){$l++}elsif($_ eq"]"){$l--}'
That approach can be implemented with POSIX tools:
od -A n -vt u1 |
tr -cs 0-9 '[\n*]' |
awk 'BEGIN{b[32]=""; b[10]=""; b[12]=""} # add more for every blank
!NF{next}; l>0 && $0 in b {next}
$0 == "91" {l++}; $0 == "93" {l--}
{printf "%c", $0}'
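It may help to look at what the awk script actually receives. The sketch below runs only the first two stages of that pipeline on a tiny input. The helper name byte_codes is an assumption for illustration.

```shell
# Hypothetical helper showing only the od and tr stages of the pipeline.
byte_codes() {
  od -A n -vt u1 |      # dump every input byte as an unsigned decimal number
    tr -cs 0-9 '[\n*]'  # turn each run of non-digits into a single newline
}

printf '[a b]' | byte_codes
# prints 91, 97, 32, 98 and 93, one per line: "[" is 91, "]" is 93 and the
# space is 32, which is why the awk script tests for those values.
```

The padding that od puts before the first number typically becomes an empty line, which is what the !NF{next} rule skips.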
With sed, for nested brackets (again assuming no newline inside the [...]):
sed -e 's/_/_u/g;:1' -e 's/\(\[[^][]*\)\[\([^][]*\)]/\1_o\2_c/g;t1' \
-e :2 -e 's/\(\[[^]]*\)[[:space:]]/\1/g;t2' \
-e 's/_c/]/g;s/_o/[/g;s/_u/_/g'
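Here is a sketch of that sed command applied to a nested example. The wrapper name strip_nested_sed is illustrative only. The sample line contains an underscore to show that the _u escaping round-trips.

```shell
# Hypothetical wrapper around the sed command above (the name is illustrative).
strip_nested_sed() {
  # 1. escape "_" as "_u" so that "_o" and "_c" are free to use as markers
  # 2. loop 1: rewrite inner "[...]" pairs as "_o..._c", innermost first
  # 3. loop 2: delete white space inside the remaining outer brackets
  # 4. turn the markers back into brackets and restore "_"
  sed -e 's/_/_u/g;:1' -e 's/\(\[[^][]*\)\[\([^][]*\)]/\1_o\2_c/g;t1' \
      -e :2 -e 's/\(\[[^]]*\)[[:space:]]/\1/g;t2' \
      -e 's/_c/]/g;s/_o/[/g;s/_u/_/g'
}

printf '%s\n' 'blah [blih [1] bluh] a_b' | strip_nested_sed
# prints: blah [blih[1]bluh] a_b
```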
In the commands above, white space means any horizontal (SPC, TAB) or vertical (NL, CR, VT, FF...) spacing character in the ASCII charset. Depending on your locale, others might get included.
Perl 5.14 solution (which is shorter and IMO easier to read, especially if you format it over multiple lines in a file instead of as a one-liner):
perl -pE 's{(\[ .*? \])}{$1 =~ y/ //dr}gex'
That works because in 5.14, the regular expression engine is re-entrant. Here it is, expanded out and commented:
s{
(\[ .*? \]) # search for [ ... ] block, capture (as $1)
}{
$1 =~ y/ //dr # delete spaces. you could add in other whitespace here, too
# d = delete; r = return result instead of modifying $1
}gex; # g = global (all [ ... ] blocks), e = replacement is perl code, x = allow extended regex
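Another Perl solution, which removes one white-space character per iteration until none is left inside a (non-nested) bracket pair: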
perl -pe 's/(\[[^]]*?)\s([^][]*\])/$1$2/ while /\[[^]]*?\s[^][]*\]/'