3

I have a file whose columns contain simple arithmetic expressions that I would like to reduce to their arithmetic result.

Input sample (tab-separated columns):

+104-1+12   6   +3

I would like to compute the arithmetic sum within each column. If a column contains no arithmetic sign, I treat it as if it contained a + before the item. It would be easy to add a + sign with sed wherever a column starts without one (sed -E 's/(\t)([0-9]*)/\1\t+\2/g' would work, assuming that a row never begins with a digit, as in the example).

The output I would expect is the following:

115 6   3

How can I achieve this in unix? awk/sed solutions are preferred.

dovah
  • I made edits to the question text, including just the first line of the given sample. The tab format was not quite right in the code chunk, sorry for that :/ – dovah May 09 '17 at 12:51

5 Answers

6

You could use perl:

perl -pe 's/[\d+-]+/eval$&/ge' your-file

Or even:

perl -pe 's/[\d+-]+/$&/gee' your-file (thanks Rakesh)

Same with zsh:

set -o extendedglob # for the ## operator (same as ERE +)
while IFS= read -r line; do 
  printf '%s\n' ${line//(#m)[0-9+-]##/$((MATCH))}
done < your-file

Or:

zmodload zsh/mapfile
set -o extendedglob
printf %s ${mapfile[your-file]//(#m)[0-9+-]##/$((MATCH))}

In all four, we're looking for sequences of digits, - and + characters and passing them to the interpreter's arithmetic processor (eval in perl (or the ee flag that causes the expansion of the replacement to be evaluated as perl code), $((...)) in zsh).

We're not validating the expressions before passing to the interpreter, so it may cause failures (for instance on sequences like -+- or 3++) but at least, because we're only considering digits and -/+ characters, it shouldn't do much more harm than reporting an error message and aborting the command.
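To avoid the failures mentioned above, each field can be checked before it reaches the arithmetic evaluator. The following is a minimal sketch in POSIX sh, not part of the original answer; the helper names is_safe_sum and eval_field are made up for illustration. It accepts only digits joined by single + or - signs (with an optional leading sign) and leaves anything else untouched.

```shell
#!/bin/sh
# is_safe_sum (hypothetical helper): succeed only for strings such as
# 6, +3 or +104-1+12, i.e. digits joined by single + or - signs.
is_safe_sum() {
  case $1 in
    '' | *[!0-9+-]*) return 1 ;;     # empty, or a character other than digits and signs
    *[+-][+-]* | *[+-]) return 1 ;;  # two signs in a row, or a trailing sign
  esac
  return 0
}

# eval_field (hypothetical helper): print the sum of a valid field,
# or the field unchanged if it fails validation.
eval_field() {
  if is_safe_sum "$1"; then
    printf '%s' "$(( $1 ))"
  else
    printf '%s' "$1"
  fi
}
```

For example, eval_field '+104-1+12' prints 115, while eval_field '3++' prints 3++ unchanged instead of raising an error. Note that shell arithmetic reads numbers with a leading 0 as octal, so this sketch assumes the fields carry no zero padding.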

Stéphane Chazelas
  • set -o extendedglob requires what bash version? –  May 09 '17 at 13:38
  • @RakeshSharma, as stated, that's `zsh` code, not bash. If run from `bash`, wrap the code in `zsh -c 'the-code'` (though it would make more sense to switch to `zsh` for the whole script). – Stéphane Chazelas May 09 '17 at 14:08
3

I won't duplicate the Addition with 'sed' answer; nor did I find a way in awk, but here's a bash version:

while IFS= read -r line
do
  set -f; set -- $line
  for e in "$@"
  do
    printf "%d " "$(( e ))"
  done
  echo
done < input
Jeff Schaller
2
sed -E 's/(\t)([0-9])/\1+\2/g' data.file |
while IFS= read -r l; do
   set -f; IFS=$'\t'
   printf '0%s\n' $l | bc -l | paste -s -
done

sed -e 's/\t\([0-9]\)/\t+\1/g' data.file |
while IFS= read -r l; do
   set -f; IFS=$'\t'
   printf '0%s\n' $l | bc -c |
   sed -ne '
      $!{
         y/:@irKW/      /
         s/[^ 0-9]/ & /g
         s/[ ][ ]*/ /g;s/^[ ]*//;s/[ ]*$/p/p
      }
   ' | dc | paste -s -
done

In the second pipeline, bc -c generates a postfix representation of each expression. The sed script strips the non-arithmetic information from that output, and the cleaned-up result is then passed to the postfix calculator dc.

Result

115     6       3
1

Using awk getline from a pipe

awk '{
  for (i=1;i<=NF;i++) {
    sub(/^\+/,"",$i); 
    cmd = sprintf("echo %s | bc -l", $i); 
    cmd | getline $i; close(cmd);
  }
} 1' file
115 6 3
25 6 2 69 57
steeldriver
  • 1
    It also amounts to a command injection vulnerability if the content of the file is not tightly controlled. Also note that it runs one `sh` and one `bc` command per field. – Stéphane Chazelas May 09 '17 at 13:09
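
One way to narrow the injection risk raised in this comment is to check each field against a strict pattern before it is interpolated into the command string. The sketch below is an illustration rather than the answerer's code: the function name safe_eval_fields is made up, and it swaps bc for the shell's own $((...)) arithmetic so that it has no dependency beyond awk and sh. It still starts one sh per field, so the performance remark in the comment applies unchanged.

```shell
#!/bin/sh
# safe_eval_fields (hypothetical name): evaluate each whitespace-separated
# field, but only if it consists of digits joined by single + or - signs.
safe_eval_fields() {
  awk '{
    for (i = 1; i <= NF; i++) {
      # Fields that fail this check never reach the shell and stay as they are.
      if ($i ~ /^[+-]?[0-9]+([+-][0-9]+)*$/) {
        cmd = "echo $((" $i "))"      # assumption: sh arithmetic instead of bc
        if ((cmd | getline result) > 0)
          $i = result
        close(cmd)
      }
    }
  } 1' "$@"
}
```

Run as safe_eval_fields file, or with the data on standard input. A field such as 1;id fails the pattern and is printed unchanged rather than being executed.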
0

Here's an all-awk solution that takes advantage of awk's ability to marshal string representations of numbers into numeric values, with no use of external executables:

awk -F"\t" \
'BEGIN { OFS="\t" }
 { gsub(/-/,"|-")   ## mark each minus, keeping the sign with its number
   gsub(/\+/,"|")   ## mark each plus; the sign itself is dropped
   for(i=1; i<=NF; i++) { ## iterate over columns
     num_parts=split($i,parts,"|")
     sum=0          ## reset for every column of every row
     for(j=1; j<=num_parts; j++) ## iterate over arithmetic expression parts
       sum += parts[j]+0 ## Adding zero marshals the string into a numeric
     $i=sum         ## replace the expression with its result
   }
   print }' file
taltman