5

I have a paste command like this:

paste -d , file1.csv file2.csv file3.csv

And file2.csv contains numbers like this:

0.2
0.3339
0.111111

I want the values in file2.csv to have 3 decimals, like this:

0.200
0.334
0.111

For one value this is working:

printf "%.3f" "0.3339" -> 0.334

But for multiple values in file2.csv this is not working:

paste -d , file1.csv <(printf %s "%.3f" "$(< file2.csv)") file3.csv

Maybe there is a good solution?

Jeff Schaller
R 9000

3 Answers

14

There is a GNU utility called numfmt, part of the GNU coreutils collection of tools, that looks as if it could be useful here. It allows you to format numerical values, and the following command would format all values in file2.csv using the printf format string %.3f ("floating point value with three decimals of precision"). The formatted values would be printed on standard output:

$ numfmt --format=%.3f <file2.csv
0.200
0.334
0.112

As you can see, it uses "from-zero" rounding by default, but this can be changed with e.g. --round=nearest:

$ numfmt --format=%.3f --round=nearest <file2.csv
0.200
0.334
0.111

You may slot this into your paste command with a process substitution like so:

paste -d , file1.csv <( numfmt --format=%.3f --round=nearest <file2.csv ) file3.csv
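For shells without process substitution, a roughly equivalent pipeline can feed the formatted column to paste on standard input, using - in place of the file name. This is only a minimal sketch: the contents of file1.csv and file3.csv are made up for illustration, and it assumes GNU numfmt is installed.

```shell
# Work in a scratch directory so that no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Made-up sample data; only file2.csv mirrors the question.
printf '%s\n' a b c > file1.csv
printf '%s\n' 0.2 0.3339 0.111111 > file2.csv
printf '%s\n' x y z > file3.csv

# numfmt formats the middle column; paste reads it from stdin via "-".
result=$(LC_ALL=C numfmt --format=%.3f --round=nearest < file2.csv |
  paste -d , file1.csv - file3.csv)
printf '%s\n' "$result"
```

The LC_ALL=C prefix is there so that numfmt writes a period as the decimal separator regardless of the user's locale.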

If your file is a CSV file that is not "simple", i.e. it may contain quoted fields, then you may want to use a CSV-aware tool, such as Miller (mlr) to process the data. The following recreates the second numfmt example from above using the fmtnum() function in a put expression using Miller (this function takes a printf format string):

$ mlr --csv -N put '$1 = fmtnum($1, "%.3f")' file2.csv
0.200
0.334
0.111

The --csv and -N options make Miller read the input (and write the output) as header-less CSV.

Kusalananda
5

You're close; you just need to tell printf to zero-pad to the right of the decimal point:

$ cat 736678.txt
0.2
0.3339
0.111111
$ for value in $( cat 736678.txt ); do printf "%.3f\n" "$value"; done
0.200
0.334
0.111

The format string %.3f means "a floating-point number with precisely three decimal places to the right of the point".
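If the values are already held in a shell variable rather than a file, the same format can be applied without a loop, since printf reuses its format string for each extra argument. The following is a minimal sketch assuming a POSIX-style shell (bash, dash, ksh) with the default IFS; the variable names values and formatted are only illustrative.

```shell
# The values could equally come from a file, e.g. values=$(cat 736678.txt);
# they are set inline here so that the sketch is self-contained.
values='0.2
0.3339
0.111111'

# $values is deliberately left unquoted so that the shell splits it into
# one argument per number; printf then repeats the format for each one.
formatted=$(LC_ALL=C printf '%.3f\n' $values)
printf '%s\n' "$formatted"
```

Quoting "$values" here would pass all three lines as a single argument, which printf cannot parse as one number.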

DopeGhoti
  • 2
    `printf "%.03f\n" $( cat 736678.txt )` also works. – Edgar Magallon Feb 24 '23 at 02:10
  • 2
    I keep forgetting that `printf` will iterate to consume extra input. – DopeGhoti Feb 24 '23 at 03:25
  • That's wrong; both `%.03f` and `%.3f` mean exactly 3 fraction digits. (For _nonfraction_ digits e.g. `%08.3f` and `%8.3f` do handle _leading_ zeros differently.) The differences that matter here are to have `\n` in the format string, which you added without any mention of it, and to _not_ put the `$(cat file2)` or `$(< file2)` in quotes, as per @Edgar's comment. – dave_thompson_085 Feb 26 '23 at 01:57
  • Thx.. but what can I do if "cat 736678.txt" is a variable? – R 9000 Feb 26 '23 at 03:18
  • If your numerical value is a variable, supply it to `printf`: `printf "%.03f" "$value"`. – DopeGhoti Feb 28 '23 at 23:11
4

You could use awk to do all of the reading, formatting and pasting:

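LC_ALL=C awk '
  {
    # getline leaves the variable untouched at end of file, so clear it
    # explicitly when a file runs out of lines.
    if ((getline f2 < "file2.csv") <= 0) f2 = ""
    if ((getline f3 < "file3.csv") <= 0) f3 = ""
    printf "%s,%.3f,%s\n", $0, f2, f3
  }' file1.csv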

You'll get as many lines on output as there are in file1.csv (with 0.000 for file2 or empty strings for file3 if those have fewer lines).

Beware that some implementations of awk, including GNU awk when there's a POSIXLY_CORRECT variable in the environment, honour the locale's decimal radix character for both input and output. For instance, in a French or German locale, where the decimal radix character is , instead of ., 1.2e5 would be interpreted as 1, because the .2e5 would not be recognised and would be treated as garbage, and you'd get 1,000 on output, breaking the CSV formatting.

Hence the LC_ALL=C above, which fixes the locale to C, where the decimal radix character is a period (.).
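To try the awk approach on small sample files, something like the following sketch could be used. The file contents are made up, and file2.csv is deliberately one line shorter than the others. The return value of getline is checked because getline leaves its variable unchanged at end of file; without the check, the last value read would be repeated.

```shell
# Work in a scratch directory so that no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Made-up sample data; file2.csv has only two lines.
printf '%s\n' a b c > file1.csv
printf '%s\n' 0.2 0.3339 > file2.csv
printf '%s\n' x y z > file3.csv

result=$(LC_ALL=C awk '
  {
    # Clear the variable when the file has no more lines to give.
    if ((getline f2 < "file2.csv") <= 0) f2 = ""
    if ((getline f3 < "file3.csv") <= 0) f3 = ""
    printf "%s,%.3f,%s\n", $0, f2, f3
  }' file1.csv)
printf '%s\n' "$result"
```

With this data, the third output line gets 0.000 in the middle column, because the empty string is treated as zero by %.3f.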

Stéphane Chazelas