How can I quickly sum all numbers in a file?

Question

Each line contains text and a number in one column. I need to calculate the sum of the numbers across all rows. How can I do that? Thanks

example.log contains:

time=31sec
time=192sec
time=18sec
time=543sec

The answer should be 784

I tried `awk '{ sum += $1}; END { print sum }' example.log`, but it only works when each line contains nothing but a number — Jack, May 27 '15 at 18:13
There is almost the same question in [SO]: [How can I quickly sum all numbers in a file?](http://stackoverflow.com/q/2702564/). Maybe time to have cross-site duplicates? — fedorqui, May 28 '15 at 07:26
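
A small sketch of why the attempt in the comment above does not work on this input (the function name `sum_first_field` is made up for illustration). awk converts a string to a number by reading its leading numeric prefix, and `time=31sec` has none, so every line contributes 0.

```shell
# Hypothetical helper: sum the first whitespace-separated field of each line.
sum_first_field() {
    awk '{ sum += $1 }; END { print sum }'
}

# Lines that are plain numbers are summed as expected.
printf '31\n192\n18\n543\n' | sum_first_field      # 784

# "time=31sec" does not start with a digit, so each field counts as 0.
printf 'time=31sec\ntime=192sec\n' | sum_first_field   # 0
```

The answers below differ mainly in how they isolate the digits before summing.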

cuonglm · Answer 1 · 2015-05-28T01:26:01.537

18

If your grep supports the -o option, you can try:

$ grep -o '[[:digit:]]*' file | paste -sd+ - | bc
784

POSIXly:

$ printf %d\\n "$(( $(tr -cs 0-9 '[\n*]' <file | paste -sd+ -) ))"
784
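
A rough sketch of what each stage of the POSIX pipeline produces, split into functions so the intermediate text is visible. The function names are made up for illustration, and the input is a shortened version of the sample from the question.

```shell
# Stage 1: turn every run of non-digits into a single newline,
# leaving one number per line (preceded by an empty first line).
split_numbers() {
    tr -cs 0-9 '[\n*]'
}

# Stage 2: join the lines with "+", giving an arithmetic expression.
build_expression() {
    split_numbers | paste -sd+ -
}

# Stage 3: let the shell's arithmetic expansion evaluate the expression.
sum_numbers() {
    echo "$(( $(build_expression) ))"
}

printf 'time=31sec\ntime=192sec\n' | build_expression   # +31+192
printf 'time=31sec\ntime=192sec\n' | sum_numbers        # 223
```

The leading `+` comes from the empty first line and is harmless, because the shell reads it as a unary plus.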

edited May 28 '15 at 01:26

answered May 27 '15 at 18:23

cuonglm


Janis · Answer 2 · 2015-05-27T18:19:27.490

16

With a newer version (4.x) of GNU awk:

awk 'BEGIN {FPAT="[0-9]+"}{s+=$1}END{print s}'

With other awks try:

awk -F '[a-z=]*' '{s+=$2}END{print s}'

edited May 27 '15 at 18:19

answered May 27 '15 at 18:14

Janis


You need `s+0` for the case where `s` is empty; it then prints `0` instead of an empty line. – cuonglm May 27 '15 at 18:20
Let me explain that. - There is just one case where `s` can be empty; if the input data contains **no lines** (i.e. if there is **no input at all**). In that case there are two behaviours possible; 1) no input => no output, or 2) always output something, if only 0. Both are sensible options depending on the application context. The `+0` is addressing option 2). To address option 1) you'd rather have to write `END {if(s) print s}`. - Therefore it makes no sense to assume either option (for this corner case of no data) until it is specified by the question. – Janis May 28 '15 at 12:40
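
The two behaviours described in the comment above can be sketched as follows. The function names are hypothetical, and the sketch assumes the `-F=` field separator used in another answer in this thread rather than the FPAT form.

```shell
# Option 2 from the comment: always print something, 0 for empty input.
sum_or_zero() {
    awk -F= '{ s += $2 }; END { print s + 0 }'
}

# Option 1 from the comment: no input, no output.
# Note that "if (s)" also suppresses output when the total is exactly 0.
sum_or_nothing() {
    awk -F= '{ s += $2 }; END { if (s) print s }'
}

printf '' | sum_or_zero       # prints 0
printf '' | sum_or_nothing    # prints nothing
```

A plain `print s` sits between the two: on empty input it prints an empty line.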

score 10 · Answer 3 · edited May 28 '15 at 21:43

awk -F= '{sum+=$2};END{print sum}'

edited May 28 '15 at 21:43

Stéphane Chazelas


answered May 28 '15 at 04:38

snth


We prefer long form answers. Can you please elaborate on how this works? – slm May 28 '15 at 09:31
@slm, that answer is not any more or less verbose than the other answers here and is self explanatory. It also has the advantage of working with input like `time=1.4e5sec` – Stéphane Chazelas May 28 '15 at 21:42
@StéphaneChazelas - agreed, but this is a new user and we do encourage users to provide more than single line answers. A bit of text explaining how it works would make it a much stronger answer than just code. – slm May 28 '15 at 21:47
@slm, this is a new user with one of the best answers (from a technical stand point) and he gets two downvotes and a negative comment. Not a very warm welcome. – Stéphane Chazelas May 28 '15 at 21:52
@StéphaneChazelas I notice you made a slight rearrangement to the code. Any reason that the `;` needs to be in there at all? Shouldn't it just be removed entirely? – Tom Fenech May 29 '15 at 08:45
@TomFenech, the POSIX syntax for awk requires that those pattern/action items be separated by either ";" or "newline", so you may find awk implementations where it fails without this ";". – Stéphane Chazelas May 29 '15 at 09:28
@slm You want me to rewrite the manual? – snth May 30 '15 at 18:56
@snth - certainly not. But a short description of what code does is always the approach one should aim for when writing an answer. Someone that is providing guidance in the form of an answer should always take it as a teaching moment for the person asking the Q as well as anyone else that might find their answer when they search via Google in the future. Writing answers is voluntary and I don't think it's too much to ask for a sentence or two to go along with code. – slm May 31 '15 at 01:36
@slm depends on how much free time you have and how you value it, I guess. – snth Jun 05 '15 at 08:01
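
As a hedged illustration of the point made in the comments about input like `time=1.4e5sec`: splitting on `=` lets awk convert the leading numeric prefix of the second field, while approaches that extract runs of digits see three separate numbers. Both function names below are made up for this sketch.

```shell
# Sum the text after "=": awk converts the leading numeric prefix of
# the field, so "1.4e5sec" counts as 140000.
sum_after_equals() {
    awk -F= '{ sum += $2 }; END { print sum }'
}

# Sum every run of digits instead: "1.4e5" is seen as 1, 4 and 5.
sum_digit_runs() {
    tr -cs 0-9 '[\n*]' | awk '{ sum += $1 }; END { print sum }'
}

printf 'time=1.4e5sec\n' | sum_after_equals   # 140000
printf 'time=1.4e5sec\n' | sum_digit_runs     # 10
```

The `;` between the main action and `END` is kept, following the portability remark in the comments.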

Stéphane Chazelas · Answer 4 · 2015-09-28T20:33:52.020

7

Another GNU awk one:

awk -v RS='[0-9]+' '{n+=RT};END{print n}'

A perl one:

perl -lne'$n+=$_ for/\d+/g}{print$n'

A POSIX one:

tr -cs 0-9 '[\n*]' | grep . | paste -sd + - | bc

edited Sep 28 '15 at 20:33

answered May 27 '15 at 21:14

Stéphane Chazelas


score 6 · Answer 5 · edited May 27 '15 at 21:31

sed 's/=/ /' file | awk '{ sum+=$2 } END { print sum}'

edited May 27 '15 at 21:31

don_crissti


answered May 27 '15 at 21:07

user2570505


Awesome answer, but no need for `sed`: `awk --field-separator = '{ sum+=$2 } END { print sum}' data.dat` – user1717828 May 27 '15 at 23:45
@user1717828: you should rather use the (shorter, and more compatible!) `-F'='` instead of `--field-separator =` – Olivier Dulac May 29 '15 at 09:14
@OlivierDulac, weird, my `man awk` only gives `-F fs` and `--field-separator fs` – user1717828 May 29 '15 at 10:40
@user1717828: `-F'='` and `-F '='` are two ways of writing `-F fs` (fs is "=" in your case). I added the single quotes to ensure the fs is seen and interpreted by awk, not the shell (useful if the fs is ';', for example) – Olivier Dulac May 29 '15 at 11:56
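
A minimal sketch of the quoting point from the comments above. The helper `sum_second_field` and the `;`-separated sample data are assumptions made for illustration.

```shell
# Hypothetical helper: sum the second field, with the separator passed as $1.
sum_second_field() {
    awk -F "$1" '{ sum += $2 }; END { print sum }'
}

# "-F=" and "-F '='" are the same option; the quotes only matter to the shell.
printf 'time=31\ntime=192\n' | awk -F= '{ sum += $2 }; END { print sum }'     # 223
printf 'time=31\ntime=192\n' | awk -F '=' '{ sum += $2 }; END { print sum }'  # 223

# An unquoted ";" would end the shell command, so this separator must be quoted.
printf 'time;31\ntime;192\n' | sum_second_field ';'                           # 223
```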

score 4 · Answer 6 · edited May 27 '15 at 18:18

You can try this:

awk -F"[^0-9]+" '{ sum += $2 } END { print sum+0; }' file

edited May 27 '15 at 18:18

cuonglm


answered May 27 '15 at 18:17

taliezin


score 4 · Answer 7 · answered May 27 '15 at 23:45

Everyone has posted awesome awk answers, which I like very much.

A variation on @cuonglm's answer, replacing grep with sed:

sed 's/[^0-9]//g' example.log | paste -sd'+' - | bc

The sed command strips everything except the digits.
The paste -sd+ - command joins all the lines into a single line, separated by +.
The bc command evaluates the resulting expression.
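
The first two stages described above can be sketched as separate functions to make the intermediate expression visible. The function names are made up for illustration, and the final `bc` step is left out so the sketch only shows what `bc` would receive.

```shell
# Stage 1: delete every non-digit character on each line.
strip_non_digits() {
    sed 's/[^0-9]//g'
}

# Stage 2: join the remaining lines with "+" to form one expression.
sed_build_expression() {
    strip_non_digits | paste -sd+ -
}

printf 'time=31sec\ntime=192sec\n' | sed_build_expression   # 31+192

# Caveat: two numbers on one line are fused, because sed only deletes
# the characters between them.
printf 'a1b2\n' | sed_build_expression                      # 12
```

For input with exactly one number per line, as in the question, piping the expression to `bc` gives the total.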

Avinash Raj · Answer 8 · 2015-05-29T04:10:08.487

3

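With Python 3:

import re

path = 'example.log'  # input file from the question
with open(path) as f:
    text = f.read()
# re.findall returns strings, so convert each match to int before summing
print(sum(int(n) for n in re.findall(r'\d+', text)))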

edited May 29 '15 at 04:10

answered May 28 '15 at 04:14

Avinash Raj


`re.findall` returns a list of strings, this is not going to work – iruvar May 28 '15 at 22:28
@1_CR yes, I forgot that. Check it now. – Avinash Raj May 29 '15 at 04:10
Maybe `sum(int(e) for e in l)` is more pythonic. – cuonglm May 29 '15 at 15:18

mikeserv · Answer 9 · 2015-05-28T04:53:46.563

3

You should use a calculator.

{ tr = \ | xargs printf '[%s=]P%d+p' | dc; } <infile 2>/dev/null

With your four lines that prints:

time=31
time=223
time=241
time=784

And more simply:

tr times=c '    + p' <infile |dc

...which prints...

31
223
241
784

If speed is what you're after then dc is what you want. Traditionally bc was implemented as a front end that compiled its input to dc, and on many systems it still is.
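
A sketch of how the shorter `tr` command turns each input line into a dc program. The function name `to_dc_program` is made up for illustration, and only the `tr` stage is run here.

```shell
# Rewrite each line into a dc program: the characters t, i, m, e and "="
# become spaces, "s" becomes "+" and "c" becomes "p".
to_dc_program() {
    tr times=c '    + p'
}

printf 'time=31sec\ntime=192sec\n' | to_dc_program
# Output (note the leading spaces):
#      31+ p
#      192+ p
```

Fed to dc, each line pushes its number, `+` adds it to the running total on the stack, and `p` prints that total. On the first line there is only one value on the stack, so the first `+` is expected to produce a complaint on stderr while leaving the value in place.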

edited May 28 '15 at 04:53

answered May 28 '15 at 04:38

mikeserv


Not according to [my measurements](http://stackoverflow.com/a/18382280/7552): it depends how much work you have to do to generate the formula – glenn jackman May 28 '15 at 13:24
@glennjackman - your measurements don't include `dc` as near as I can tell. What are you talking about? – mikeserv May 28 '15 at 15:14
By the way, when comparing the old crew to the new crew - such as when you benchmark `perl` v the standard unix toolset - it really doesn't make much sense if you use GNU tools compiled on a GNU toolchain. All of the bloat that can negatively affect Perl's performance is *also* in *all* of those GNU-compiled GNU utils. Sad but true. You need a real, simply built, simple toolset to accurately judge the difference. Like an heirloom-toolchest set statically linked against musl libs for instance - in that way you can bench the one-tool/one-job paradigm vs the one-tool-to-rule-them-all one. – mikeserv May 28 '15 at 15:27

score 3 · Answer 10 · edited Jun 06 '15 at 07:26

Pure bash solution (Bash 3+):

while IFS= read -r line; do               # Read the input line by line.
    if [[ "$line" =~ [0-9]+ ]]; then      # If the line contains a number:
        ((counter+=BASH_REMATCH[0]))      # Add the first number matched to counter.
    fi                                    # End if.
done                                      # End loop.

echo "Total number: $counter"             # Print the total.
unset counter                             # Clear counter so a rerun starts from 0.

Short version:

while IFS= read -r l; do [[ "$l" =~ [0-9]+ ]] && ((c+=BASH_REMATCH)); done; echo $c; c=0

Maybe also: `PS4='$((x+=${time%s*}))' time=0 x=0 sh -x` – mikeserv May 31 '15 at 09:39
