2

I am trying to use sed to change the format of decimal numbers in a large CSV file before importing it into a SQLite database. They all have two decimal places, may be negative use comma as a decimal separator and are therefore escaped with double quotes. I was trying the following:

sed 's/"(-?)([:digit:]+),([:digit:]{2})"/$1$2.$3/g' input.csv > output.csv

The regex seems to work on a text editor on a sample of the file, but when running it through sed, there are no changes to the original file. What am I doing wrong?

user1608941
  • 125
  • 1
  • 1
  • 5
  • 1
    Try adding -r flag to sed. – Capt. Crunch Mar 15 '14 at 08:54
  • 1
    Seconded - in the format you've presented `-r` is a requirement. Depending on `sed` version `[:digit:]` may be a problem as well - `[0-9]` works just as well. – mikeserv Mar 15 '14 at 09:04
  • 1
    Apart from adding `-r`, i.e. enable extended regex syntax, change `[:digit:]` to `[[:digit:]]` or `[0-9]`. – devnull Mar 15 '14 at 09:08
  • Thanks! I'm getting an error saying that `-r` is an illegal option. – user1608941 Mar 15 '14 at 10:11
  • 1
    `sed` only changes the original file if you use the "in-place" option `-i`. By default, it just prints the edited text to stdout. – n.st Mar 15 '14 at 11:14
  • Also, if you plan to use sed, you should probably go download and build the GNU version. You'll thank yourself later. – mikeserv Mar 15 '14 at 13:47
  • More generally, see [Why does my regular expression work in X but not in Y?](http://unix.stackexchange.com/questions/119905/why-does-my-regular-expression-work-in-x-but-not-in-y) – Gilles 'SO- stop being evil' Mar 16 '14 at 04:31

2 Answers2

1

Since -r is unavailable, use this leaning toothpick forest:

sed 's/"\(-?[[:digit:]]\+\),\([[:digit:]]\{2\}\)"/\1.\2/g' input.csv > output.csv

sed -r is a GNU extension. And sadly, most tools that use regular expressions implement the language slightly differently (grep/sed, awk, perl, ...)

glenn jackman
  • 84,176
  • 15
  • 116
  • 168
0

I find Perl's syntax simpler for such things (I am ignoring the quotes but you can add them if you wish):

perl -pe 's/(-*)(\d+),(\d{0,2})/$1$2.$3/g' input.csv > output.csv

You can also use the -i option to edit the original file directly.

terdon
  • 234,489
  • 66
  • 447
  • 667