16

Each line contains text and a number in a single column. I need to calculate the sum of the numbers across all lines. How can I do that? Thanks.

example.log contains:

time=31sec
time=192sec
time=18sec
time=543sec

The answer should be 784

Gilles 'SO- stop being evil'
Jack
  • I tried `awk '{ sum += $1}; END { print sum }' example.log`, but that only works when each line consists of just a number – Jack May 27 '15 at 18:13
  • 2
    There is almost the same question in [SO]: [How can I quickly sum all numbers in a file?](http://stackoverflow.com/q/2702564/). Maybe time to have cross-site duplicates? – fedorqui May 28 '15 at 07:26
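
A minimal sketch of why the attempt in the comment above does not give 784, assuming a POSIX-style awk. The sample data is copied from the question into a shell variable (the names `sample`, `naive` and `fixed` are only for illustration):

```shell
# Sample data from the question, kept in a variable so the example is self-contained.
sample='time=31sec
time=192sec
time=18sec
time=543sec'

# With the default field separator, $1 is the whole string "time=31sec".
# It does not start with a digit, so awk converts it to 0.
naive=$(printf '%s\n' "$sample" | awk '{ sum += $1 } END { print sum }')
echo "$naive"   # 0

# Splitting on "=" makes $2 "31sec"; awk uses its leading numeric prefix, 31.
fixed=$(printf '%s\n' "$sample" | awk -F= '{ sum += $2 } END { print sum }')
echo "$fixed"   # 784
```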

10 Answers

18

If your grep supports the -o option, you can try:

$ grep -o '[[:digit:]]*' file | paste -sd+ - | bc
784

POSIXly:

$ printf %d\\n "$(( $(tr -cs 0-9 '[\n*]' <file | paste -sd+ -) ))"
784
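
To see what the POSIX pipeline above hands to the arithmetic expansion, here is a sketch of its intermediate stage. It assumes a tr that supports the `[c*]` repetition syntax and feeds the question's data from a variable instead of a file (`sample`, `joined` and `total` are illustrative names):

```shell
sample='time=31sec
time=192sec
time=18sec
time=543sec'

# tr turns every run of non-digits into a single newline, so the stream
# starts with an empty line; paste then joins the lines with "+".
joined=$(printf '%s\n' "$sample" | tr -cs 0-9 '[\n*]' | paste -sd+ -)
echo "$joined"   # +31+192+18+543

# The leading "+" is harmless in shell arithmetic, where it is a unary plus.
total=$(( $joined ))
echo "$total"    # 784
```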
cuonglm
16

With a newer version (4.x) of GNU awk:

awk 'BEGIN {FPAT="[0-9]+"}{s+=$1}END{print s}'

With other awks try:

awk -F '[a-z=]*' '{s+=$2}END{print s}'
Janis
  • 4
    You need `s+0` for the case where `s` is empty; it will then print `0` instead of an empty line. – cuonglm May 27 '15 at 18:20
  • Let me explain that. - There is just one case where `s` can be empty; if the input data contains **no lines** (i.e. if there is **no input at all**). In that case there are two behaviours possible; 1) no input => no output, or 2) always output something, if only 0. Both are sensible options depending on the application context. The `+0` is addressing option 2). To address option 1) you'd rather have to write `END {if(s) print s}`. - Therefore it makes no sense to assume either option (for this corner case of no data) until it is specified by the question. – Janis May 28 '15 at 12:40
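
The corner case discussed in these comments can be checked with empty input. This is only a sketch: it counts output bytes with `wc -c` so that an empty line and no output at all can be told apart (the variable names are illustrative):

```shell
# No input at all: s is never assigned, so it is still uninitialized in END.
plain_bytes=$(awk '{ s += $1 } END { print s }' </dev/null | wc -c)
forced=$(awk '{ s += $1 } END { print s + 0 }' </dev/null)
guarded_bytes=$(awk '{ s += $1 } END { if (s) print s }' </dev/null | wc -c)

printf 'print s        writes %d byte(s)\n' "$plain_bytes"    # 1: just a newline
printf 'print s + 0    writes "%s"\n' "$forced"                # 0
printf 'if (s) print s writes %d byte(s)\n' "$guarded_bytes"  # 0: nothing
```

Note that `if (s) print s` also stays silent when there is input but the sum happens to be 0.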
10

awk -F= '{sum+=$2};END{print sum}'
Stéphane Chazelas
snth
  • 2
    We prefer long form answers. Can you please elaborate on how this works? – slm May 28 '15 at 09:31
  • 2
    @slm, that answer is not any more or less verbose than the other answers here and is self-explanatory. It also has the advantage of working with input like `time=1.4e5sec` – Stéphane Chazelas May 28 '15 at 21:42
  • @StéphaneChazelas - agreed, but this is a new user and we do encourage users to provide more than single line answers. A bit of text explaining how it works would make it a much stronger answer than just code. – slm May 28 '15 at 21:47
  • 4
    @slm, this is a new user with one of the best answers (from a technical standpoint) and he gets two downvotes and a negative comment. Not a very warm welcome. – Stéphane Chazelas May 28 '15 at 21:52
  • @StéphaneChazelas I notice you made a slight rearrangement to the code. Any reason that the `;` needs to be in there at all? Shouldn't it just be removed entirely? – Tom Fenech May 29 '15 at 08:45
  • 1
    @TomFenech, the POSIX syntax for awk requires that those pattern/action items be separated by either ";" or "newline", so you may find awk implementations where it fails without this ";". – Stéphane Chazelas May 29 '15 at 09:28
  • @slm You want me to rewrite the manual? – snth May 30 '15 at 18:56
  • @snth - certainly not. But a short description of what code does is always the approach one should aim for when writing an answer. Someone that is providing guidance in the form of an answer should always take it as a teaching moment for the person asking the Q as well as anyone else that might find their answer when they search via Google in the future. Writing answers is voluntary and I don't think it's too much to ask for a sentence or two to go along with code. – slm May 31 '15 at 01:36
  • @slm depends on how much free time you have and how you value it, I guess. – snth Jun 05 '15 at 08:01
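
The `time=1.4e5sec` remark in the comments can be illustrated with a small sketch, assuming an awk that converts the leading numeric prefix of a string (as POSIX awks do) and a grep with -o. The single input line and the variable names are made up for the comparison:

```shell
line='time=1.4e5sec'

# awk -F= makes $2 "1.4e5sec" and converts its numeric prefix, 1.4e5.
by_awk=$(printf '%s\n' "$line" | awk -F= '{ sum += $2 }; END { print sum }')
echo "$by_awk"      # 140000

# Extracting runs of digits instead sees three separate integers.
by_digits=$(printf '%s\n' "$line" | grep -o '[0-9][0-9]*' | paste -sd+ -)
echo "$by_digits"   # 1+4+5
```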
7

Another GNU awk one:

awk -v RS='[0-9]+' '{n+=RT};END{print n}'

A perl one:

perl -lne'$n+=$_ for/\d+/g}{print$n'

A POSIX one:

tr -cs 0-9 '[\n*]' | grep . | paste -sd + - | bc
Stéphane Chazelas
6

sed 's/=/ /' file | awk '{ sum+=$2 } END { print sum}'
don_crissti
  • Awesome answer, but no need for `sed`: `awk --field-separator = '{ sum+=$2 } END { print sum}' data.dat` – user1717828 May 27 '15 at 23:45
  • @user1717828: you should rather use the (shorter, and more compatible!) `-F'='` instead of `--field-separator =` – Olivier Dulac May 29 '15 at 09:14
  • @OlivierDulac, weird, my `man awk` only gives `-F fs` and `--field-separator fs` – user1717828 May 29 '15 at 10:40
  • @user1717828: `-F'='` or `-F '='` are two ways of writing `-F fs` (fs is "=" in your case). I added the single quotes to ensure the fs is seen and interpreted by awk, not the shell (useful if the fs is ';', for example) – Olivier Dulac May 29 '15 at 11:56
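
A short sketch of the quoting point made in the last comment. The semicolon-separated data is hypothetical and not from the question:

```shell
# Hypothetical semicolon-separated data, used only for illustration.
data='a;1
b;2'

# Quoted, ";" reaches awk as the field separator.
total=$(printf '%s\n' "$data" | awk -F';' '{ s += $2 } END { print s }')
echo "$total"   # 3

# Unquoted, the shell treats ";" as a command terminator, so
#   awk -F; '{ s += $2 } END { print s }'
# runs "awk -F" and then tries to execute the awk program text as a command.
```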
4

You can try this:

awk -F"[^0-9]+" '{ sum += $2 } END { print sum+0; }' file
cuonglm
taliezin
4

Everyone has posted awesome awk answers, which I like very much.

A variation on @cuonglm's answer, replacing grep with sed:

sed 's/[^0-9]//g' example.log | paste -sd'+' - | bc
  1. sed strips everything except the digits from each line.
  2. paste -sd+ - joins all the lines into a single line, separated by +.
  3. bc evaluates the resulting expression.
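
The steps above can be sketched stage by stage on the question's data. To keep the example free of external calculators, the last step uses shell arithmetic as a stand-in for bc, which is an assumption of this sketch and not part of the answer:

```shell
sample='time=31sec
time=192sec
time=18sec
time=543sec'

# Step 1: delete every non-digit character, leaving one number per line.
digits=$(printf '%s\n' "$sample" | sed 's/[^0-9]//g')

# Step 2: join the lines with "+" to form a single expression.
expr_line=$(printf '%s\n' "$digits" | paste -sd+ -)
echo "$expr_line"   # 31+192+18+543

# Step 3: evaluate the expression (shell arithmetic standing in for bc).
total=$(( $expr_line ))
echo "$total"       # 784
```

Because sed deletes all non-digits, a line holding two separate numbers would have them fused into one (for example `a1b2` becomes `12`).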
Stephen Quan
3

Using Python 3:

import re

# Read the whole log and sum every run of digits found in it.
with open('example.log') as f:
    text = f.read()
numbers = re.findall(r'\d+', text)
print(sum(map(int, numbers)))
Avinash Raj
3

You should use a calculator.

{ tr = \ | xargs printf '[%s=]P%d+p' | dc; } <infile 2>/dev/null

With your four lines that prints:

time=31
time=223
time=241
time=784

And more simply:

tr times=c '    + p' <infile |dc

...which prints...

31
223
241
784

If speed is what you're after then dc is what you want. Traditionally, bc was a front end that compiled its input for dc, and it still is on many systems.

mikeserv
  • Not according to [my measurements](http://stackoverflow.com/a/18382280/7552): it depends how much work you have to do to generate the formula – glenn jackman May 28 '15 at 13:24
  • @glennjackman - your measurements don't include `dc` as near as I can tell. What are you talking about? – mikeserv May 28 '15 at 15:14
  • By the way, when comparing the old crew to the new crew - such as when you benchmark `perl` v the standard unix toolset - it really doesn't make much sense if you use GNU tools compiled on a GNU toolchain. All of the bloat that can negatively affect Perl's performance is *also* in *all* of those GNU-compiled GNU utils. Sad but true. You need a real, simply built, simple toolset to accurately judge the difference. Like an heirloom-toolchest set statically linked against musl libs for instance - in that way you can bench the one-tool/one-job paradigm vs the one-tool-to-rule-them-all one. – mikeserv May 28 '15 at 15:27
3

Pure bash solution (Bash 3+):

while IFS= read -r line; do               # Read stdin one line at a time.
    if [[ "$line" =~ [0-9]+ ]]; then      # If the line contains a number:
        ((counter+=BASH_REMATCH[0]))      # Add its first number to counter.
    fi
done

echo "Total number: $counter"             # Print the total.
unset counter                             # Clear counter for the next run.

Short version:

while IFS= read -r l; do [[ "$l" =~ [0-9]+ ]] && ((c+=BASH_REMATCH)); done; echo $c; c=0
cuonglm
0x2b3bfa0