
My current code is like so:

scan.sh:

#!/bin/bash
while IFS= read -r line;
do
    byte = $(stat -c%s "$line");
    echo "$line : $byte";
done< <(ls *.$1)

The output would be like this:

./scan.sh cpp
./scan.sh: line 4: byte: command not found
arraysum.cpp :
./scan.sh: line 4: byte: command not found
countLines.cpp :
./scan.sh: line 4: byte: command not found
createtext.cpp :
./scan.sh: line 4: byte: command not found
multiproc1.cpp :
./scan.sh: line 4: byte: command not found
myWc.cpp :
./scan.sh: line 4: byte: command not found
test.cpp :

Basically, my script takes one file extension as an argument and searches the current directory for files with that extension. The problem is that I want it to print "name of file" + "byte size of file", but I can't seem to get that working.

Li Wang

2 Answers


In the syntax of Bourne-like shells like bash, there must not be any space around the = sign in assignments.

byte=value
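
With only that fix applied, the original loop runs as intended (a sketch; it still parses the output of ls, which the next point advises against):

```shell
#!/bin/bash
# Same loop as in the question, with the assignment fixed: no spaces
# around "=" and no stray semicolon after the command substitution.
while IFS= read -r line; do
    byte=$(stat -c%s "$line")
    echo "$line : $byte"
done < <(ls *."$1")
```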

Here though, parsing the output of ls is a bad idea.

You can just write it:

#! /bin/sh -
stat -c '%n: %s' -- *."$1"

If you do need a loop, just write it:

#! /bin/zsh -
for file in *.$1; do
  stat -c '%n: %s' -- $file
done

Or if you have to use bash:

#! /bin/bash -
shopt -s failglob
for file in *."$1"; do
  stat -c '%n: %s' -- "$file"
done
Stéphane Chazelas

Here's a simple way to do this that should work with the Bourne shell and its descendants (including bash and ksh), if you don't care too much about the exact output format:

$ for file in *; do if [ -f "$file" ] && [ -r "$file" ]; then wc -c "$file"; fi; done
      23 HEAD
     111 config
      73 description

If you also don't care too much about errors and corner cases (in which case, good luck to you):

$ for file in *; do wc -c $file; done

Notes:

  • If you're writing this with bash or ksh, you're probably better off using (( )) or [[ ]] instead of [ ]. Also, below, consider using $(wc -c <"$file") instead of `wc -c <"$file"`.

  • -f tests to see if what you're looking at is an ordinary file (not a directory, device, pipe, socket, tty, or generally some weird thing that can't be said to have a size in bytes). -r tests that the file is readable, i.e., that wc has a chance of succeeding; if you're looking at huge files or files that you can't read, use stat as per your original version and Stéphane's answer.

  • The quotes ("$file") are necessary if any of the files have spaces or tabs in them (e.g., a file named my stuff.txt).

  • If you do care about the exact format, you should probably use some combination of `wc -c <"$file"` (which will not print the filename) and echo or echo -n (which will print whatever you'd like).

  • If the files of interest are arguments to a script, in that script use "$@".
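
Putting those notes together, here is a sketch of a loop with an explicit output format (printf and the "name: size" layout are my choices, not the only option):

```shell
#!/bin/sh
# Print "name: size" for each regular, readable file in the current
# directory. Reading the file on wc's stdin keeps wc from printing
# the filename, so we control the format ourselves.
for file in *; do
  if [ -f "$file" ] && [ -r "$file" ]; then
    printf '%s: %s\n' "$file" "$(wc -c <"$file")"
  fi
done
```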

I agree with @stéphane-chazelas that you shouldn't parse the output of ls; but if you do, you don't need process substitution. You can more simply read the output of the command:

ls | while IFS= read -r blah blah blah
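
Filled in concretely (a sketch; it inherits the usual problems with parsing ls, such as filenames that contain newlines):

```shell
# Read ls output one line at a time and count bytes in each file.
ls | while IFS= read -r file; do
  wc -c "$file"
done
```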

or, if you want to recurse through a directory:

find blah -type f -print | while IFS= read -r blah blah

or better still:

find blah -type f -print0 | xargs -0 blah blah

where -print0 and xargs -0 again properly handle filenames with spaces or tabs (and even newlines).
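
A concrete version of that pipeline, assuming GNU find and xargs:

```shell
# Recurse through the current directory and print a byte count per file;
# NUL delimiters let filenames with spaces or tabs pass through intact.
find . -type f -print0 | xargs -0 wc -c
```

Note that when xargs passes wc more than one filename at once, wc also appends a "total" line.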

user10543
    The problem with using `wc` here is that it will read the whole contents of the file, which can quickly get pretty expensive on large files... Using `stat` is the correct approach here. – filbranden Oct 07 '18 at 15:24
  • "pretty expensive on large files": I agree (and said so). But, what's a large file? To me, it's hundreds of megabytes. For anything smaller, performance optimization is unnecessary. Since @li-wang appears to be counting bytes in .cpp files, I think `wc` is appropriate. – user10543 Oct 07 '18 at 19:56