10

I have a file as below.

"ID" "1" "2"
"00000687" 0 1
"00000421" 1 0

I want to make it as below.

00000687 0 1
00000421 1 0

I want to

  1. remove the first line and
  2. remove double quotes from fields on any other lines.  FWIW, double quotes appear only in the first column.

I think cut -c would work, but I cannot get it to work.  What should I do?

user10345633
  • 1
    Just to make sure: are you trying to 1) remove the first line and 2) remove double quotes from fields on any other lines? Can double quotes appear anywhere else than the first field? And should they only be removed from the first field, as your question's title suggests, if they do? – fra-san May 21 '20 at 11:21
  • 1
    Thank you fra-san. I want to 1) remove the first line and 2) remove double quotes from fields on any other lines. Yes, double quotes appear only in the first column. – user10345633 May 21 '20 at 11:24

6 Answers

23

tail + tr:

tail -n +2 file | tr -d \"

tail -n+2 prints the file starting from line two to the end. tr -d \" deletes all double quotes.
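
As a minimal sketch, here is the same pipeline run against the sample data from the question. The temporary file created with mktemp and the variable names are only illustrative, not part of the original answer.

```shell
# Recreate the sample input from the question in a temporary file.
input=$(mktemp)
printf '%s\n' '"ID" "1" "2"' '"00000687" 0 1' '"00000421" 1 0' > "$input"

# Skip the header line, then delete every double quote that remains.
result=$(tail -n +2 "$input" | tr -d '"')
printf '%s\n' "$result"

# Clean up the temporary file.
rm -f "$input"
```

Note that tr removes double quotes from every column, which is fine here because the question states they only occur in the first one.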

Stéphane Chazelas
16

This should work:

sed -i '1d;s/"//g' filename

Explanation:

  • -i will modify the file in place
  • 1d will remove the first line
  • s/"//g will remove every " in the file

You can first try without -i and the output will be printed to stdout.
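
A cautious way to apply this might look like the sketch below: preview first, then edit in place while keeping a backup. The temporary file and the .bak suffix are assumptions made for illustration. The -i option is not specified by POSIX; the attached-suffix form -i.bak is accepted by GNU sed and, as far as I know, by BSD sed as well.

```shell
# Recreate the sample input from the question in a temporary file.
file=$(mktemp)
printf '%s\n' '"ID" "1" "2"' '"00000687" 0 1' '"00000421" 1 0' > "$file"

# Preview: without -i the result goes to stdout and the file is untouched.
preview=$(sed '1d;s/"//g' "$file")
printf '%s\n' "$preview"

# Apply in place, keeping the original content as "$file.bak".
sed -i.bak '1d;s/"//g' "$file"
edited=$(cat "$file")
backup_header=$(head -n 1 "$file.bak")

# Clean up both the edited file and its backup.
rm -f "$file" "$file.bak"
```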

Francesco
7

Solving the issue as it is presented in the title, i.e. removing double quotes from the first space-delimited column, only:

awk -F ' ' '{ gsub("\"", "", $1) }; NR > 1' file

This uses the gsub() function to remove all double quotes from the first field on each line. The NR > 1 condition at the end makes sure that the first line is not printed.

To remove the double quotes from the first field, but only if they appear as the first and last character of the field:

awk -F ' ' '$1 ~ /^".*"$/ { $1 = substr($1, 2, length($1) - 2) }; NR > 1' file

This uses a regular expression, ^".*"$, to detect whether there are double quotes at the start and end of the first field. If there are, the block runs and replaces the field with its inner part, extracted with substr(). Any internal double quotes in the field are retained.
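
To see how the two commands differ, the sketch below runs both on the sample data with one changed line. That line, with a double quote inside the first field, is made up for illustration and does not occur in the question's data.

```shell
input=$(mktemp)
# The last line is a hypothetical case with a quote inside the first field.
printf '%s\n' '"ID" "1" "2"' '"00000687" 0 1' '"00"421" 1 0' > "$input"

# Variant 1: delete every double quote in the first field.
all_removed=$(awk -F ' ' '{ gsub("\"", "", $1) }; NR > 1' "$input")

# Variant 2: strip only a leading and trailing double quote.
outer_only=$(awk -F ' ' '$1 ~ /^".*"$/ { $1 = substr($1, 2, length($1) - 2) }; NR > 1' "$input")

printf 'all quotes removed:\n%s\n' "$all_removed"
printf 'outer quotes only:\n%s\n' "$outer_only"

rm -f "$input"
```

The first variant turns the made-up field into 00421, while the second keeps the inner quote and yields 00"421.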

Kusalananda
4

Using Perl:

perl -ne ' { s/"//g; print if $. > 1 }' file

OR

perl -ne ' { if ($.>1) {s/"//g;print}  }' file

s/"//g; => Removes all the double quotes in the current line of the file (stored in $_ by default)

if $. > 1 => If the current line number is greater than 1

Peter Mortensen
2

There are many ways to get the desired output; one of them is cut -c. You need to define the ranges of character positions to extract, then pipe the output to tail --lines=+2 to remove the header (the first line). For example:

cut -c2-9,11-14 <your_file_name> | tail --lines=+2

The -c2-9,11-14 option selects character positions 2 to 9 (the ID digits between the double quotes) and 11 to 14 (the rest of the line after the closing double quote). This relies on every ID being exactly eight characters long.

The tail --lines=+2 command prints all lines from your file, but starting from line two.
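
The sketch below runs this on the question's sample data, and then on a made-up line with a shorter ID to show what happens when the fixed positions no longer line up. The temporary file and the short-ID line are assumptions for illustration. It uses tail -n +2, which is the POSIX spelling of the GNU long option --lines=+2.

```shell
# Recreate the sample input from the question in a temporary file.
input=$(mktemp)
printf '%s\n' '"ID" "1" "2"' '"00000687" 0 1' '"00000421" 1 0' > "$input"

# Positions 2-9 hold the 8-digit ID; 11-14 hold everything after the closing quote.
fixed=$(cut -c2-9,11-14 "$input" | tail -n +2)
printf '%s\n' "$fixed"

# Hypothetical shorter ID: the closing quote now falls inside 2-9 and survives.
short=$(printf '%s\n' '"421" 1 0' | cut -c2-9,11-14)
printf '%s\n' "$short"

rm -f "$input"
```

If the ID width can vary, one of the pattern-based approaches (tr, sed, or awk) is likely a safer choice than fixed character positions.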

For more information on the cut command, you can visit this site.

Rasulli
0

You can also do it manually in gedit: remove the first line, then replace " with nothing.

Adam