
I'm sure there are many ways to do this: how can I count the number of lines in a text file?

$ <cmd> file.txt
1020 lines
tshepang
Chris Smith

8 Answers


The standard way is with wc, which takes arguments to specify what it should count (bytes, chars, words, etc.); -l is for lines:

$ wc -l file.txt
1020 file.txt
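
As a quick illustration, wc can report several counts in one pass; -l, -w, and -c select lines, words, and bytes respectively (the demo file name here is made up):

```shell
# create a two-line demo file, then count lines, words, and bytes at once
printf 'hello world\nsecond line\n' > demo.txt
wc -l -w -c demo.txt    # 2 lines, 4 words, 24 bytes
```
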
Michael Mrozek
  • How do I count the lines in a file if I want to **ignore** comments? Specifically, I want to *not* count lines that begin with a +, some white space (could be no white space) and then a %, which is the way comment lines appear in a git diff of a MATLAB file. I tried doing this with grep, but couldn't figure out the correct regular expression. – Gdalya Jul 11 '13 at 01:36
  • @Gdalya I hope the following pipeline will do this (no tests were performed): `cat matlab.git.diff | sed -e '/^\+[ ]*.*\%$/d' | wc -l`. `/regexp/d` deletes a line if it matches `regexp`, and `-e` turns on an adequate (IMNSHO) syntax for `regexp`. – dbanet Nov 06 '13 at 21:29
  • Why not simply `grep -v '^+ *%' matlab.git.diff | wc -l`? – celtschk Jul 06 '14 at 19:51
  • @celtschk , as long as this is usual in comment lines: is it possible to modify your `grep` command in order to consider as comment cases like `" + Hello"` (note the space(s) before the `+`)? – Sopalajo de Arrierez Jan 18 '15 at 20:46
  • @SopalajodeArrierez: Of course it is possible: `grep -v '^ *+' matlab.git.diff | wc -l` (I'm assuming the quote signs were not actually meant to be part of the line; I also assume that both lines with and without spaces in front of the `+` are meant to be comments; if at least one space is mandatory, either replace the star `*` with `\+`, or just add another space in front of the star). Probably instead of matching only spaces, you'd want to match arbitrary whitespace; for this replace the space with `[[:space:]]`. Note that I've also removed matching the `%` since it's not in your example. – celtschk Feb 14 '15 at 15:02

Steven D forgot GNU sed:

sed -n '$=' file.txt

Also, if you want the count without outputting the filename and you're using wc:

wc -l < file.txt

Just for the heck of it:

cat -n file.txt | tail -n 1 | cut -f1
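
Just to sanity-check that these agree (the demo file name is made up):

```shell
printf 'a\nb\nc\n' > demo.txt

sed -n '$=' demo.txt                    # 3: '=' prints the line number, '$' selects the last line
wc -l < demo.txt                        # 3
cat -n demo.txt | tail -n 1 | cut -f1   # 3: cat -n separates the number with a tab, which cut keys on
```
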
Dennis Williamson
  • Or `grep -c ''`, or `tr -dc '\n' | wc -c`, or `nl -ba -nln | tail -n 1 | sed -e 's/[^0-9].*//'`... Is any of these useful in itself (as opposed to things to build upon to make a program that does more than counting lines), other than `wc -l` and pure (ba)sh? – Gilles 'SO- stop being evil' Dec 03 '10 at 01:58
  • @Gilles: I think the phrase "many ways" in the question triggered a challenge that Steve and I rose to. – Dennis Williamson Dec 03 '10 at 02:03
  • +1. Small note: `cat -n` is a GNU extension. – Steven D Dec 03 '10 at 02:03
  • @Dennis: I'm obviously not objecting... Follow-up challenge: can you do it by piping the data into `uniq`, with only POSIX tools to help you and no constraint on the line length? – Gilles 'SO- stop being evil' Dec 03 '10 at 02:21
  • @Steven: I'm pretty sure I've been using `cat -n` since before GNU existed. As a matter of fact, [HP-UX has it](http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02253215/c02253215.pdf) (PDF), [Solaris](http://docs.sun.com/app/docs/doc/816-5165/cat-1?l=en&a=view) does, [FreeBSD as early as 1.0](http://www.freebsd.org/cgi/man.cgi?query=cat&apropos=0&sektion=0&manpath=FreeBSD+1.0-RELEASE&format=html) does, even [2.10 BSD (1986)](http://www.freebsd.org/cgi/man.cgi?query=cat&apropos=0&sektion=0&manpath=2.10+BSD&format=html) ... – Dennis Williamson Dec 03 '10 at 02:27
  • ... and [AIX](http://publib.boulder.ibm.com/infocenter/aix/v6r1/topic/com.ibm.aix.cmds/doc/aixcmds1/cat.htm), too. – Dennis Williamson Dec 03 '10 at 02:27
  • @Gilles: `sed 's/.*//' file.txt | uniq -c` – Dennis Williamson Dec 03 '10 at 02:30
  • @Dennis: Yet another reason I should stop trusting wikipedia: http://en.wikipedia.org/wiki/Cat_%28Unix%29 – Steven D Dec 03 '10 at 02:32
  • @Steven: I believe what they mean by "GNU only" is the long option `--number`. – Dennis Williamson Dec 03 '10 at 02:35
  • @Dennis: Of course this works, but it doesn't answer my question (you're not piping the data into `uniq`, you're piping it into `sed`). – Gilles 'SO- stop being evil' Dec 03 '10 at 08:10
  • @Gilles: Oh, you meant *first*. `uniq -c -w 0 file.txt` and you can `cut -c -7` to keep only the number. Or, more POSIXly: `uniq -c file.txt | awk '{c+=$1}END{print c}'`. How about `dc` (even though it's not POSIX)? `uniq -c file.txt | cut -c -7 | sed '$alax' | dc -e '[pq]sb[+z1=blax]sa' -`. `bc` is POSIX: `uniq -c file.txt | cut -c -7 | sed -n ':a;${s/\n/ + /gp;b};N;ba' | bc`. The easy answer if you assume a limited line length: `uniq -c -f 100000 file.txt`. – Dennis Williamson Dec 03 '10 at 16:21
  • You have to quote the expression `'$='` so it doesn't get expanded by the shell - the command above as is fails with zsh. (Would have edited it myself, but it's just 2 characters of difference...) – Josip Rodin Oct 17 '15 at 20:19
  • @JosipRodin: Quotes added – Dennis Williamson Oct 18 '15 at 00:09
  • `sed '$='` is not only GNU `sed`, the `=` is a SUSv4 `sed` command and the address `$` is also specified as the last line by SUSv4. – Kusalananda Jul 05 '16 at 09:49

As Michael said, wc -l is the way to go. But, just in case you inexplicably have bash, perl, or awk but not wc, here are a few more solutions:

Bash-only

$ LINECT=0; while read -r LINE; do (( LINECT++ )); done < file.txt; echo $LINECT

Perl Solutions

$ perl -lne 'END { print $. }' file.txt

and the far less readable (the `}{` trick closes the implicit `while` loop that `-n` wraps around the code, so the `print` runs once after all input is read):

$ perl -lne '}{ print $.' file.txt

Awk Solution

$ awk 'END {print NR}' file.txt
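
In awk, NR holds the number of input records (lines) read so far, so in the END block it equals the total line count:

```shell
# NR = number of records (lines) read; in the END block it is the total
printf 'a\nb\nc\n' | awk 'END {print NR}'    # prints 3
```
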
Gilles 'SO- stop being evil'
Steven D

A word of warning when using

wc -l

Because wc -l works by counting \n characters, if the last line in your file doesn't end with a newline, the line count will effectively be off by one (hence the old convention of leaving a newline at the end of your file).

Since I can never be sure whether a given file follows the convention of ending its last line with a newline, I recommend any of these alternative commands, all of which include the last line in the count whether or not it ends with a newline.

sed -n '$=' filename
perl -lne 'END { print $. }' filename
awk 'END {print NR}' filename
grep -c '' filename
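
To see the discrepancy concretely, here is a small sketch (the file name is made up) comparing wc -l and grep -c '' on a file whose last line lacks a newline:

```shell
# last line deliberately NOT terminated with \n
printf 'one\ntwo\nthree' > no_newline.txt

wc -l < no_newline.txt      # 2: wc counts only the \n characters
grep -c '' no_newline.txt   # 3: the final, unterminated line is included
```
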
pretzels1337
  • nice summary. And welcome to *unix & linux* – Sebastian Sep 18 '14 at 15:29
  • Hm, is the last piece really a line? – gena2x Sep 18 '14 at 19:48
  • I'm sure it depends on everyone's use case; for me the 'last piece' is usually a line of text that someone didn't cap off with a newline. The use case I most often encounter is a file with a single string of text that does not end in a newline. wc -l would count this as "0", when I would otherwise expect a count of "1". – pretzels1337 Sep 23 '14 at 16:22
  • It is not the count of wc that will be off by one, but your count: While in Windows (and DOS before it), the CR/LF sequence is a line separator, on Unix the LF character is a line terminator. That is, without a newline at the end, you don't have a line (and strictly speaking, not a valid text file). – celtschk Feb 07 '20 at 09:15

You can always use the command grep as follows:

grep -c "^" file.txt

It counts all the actual lines of file.txt, whether or not the last line ends with an LF character.

Paolo

In case you only have bash and absolutely no external tools available, you could also do the following:

count=0
while read
do
  ((count=$count+1))
done <file.txt
echo $count

Explanation: the loop reads standard input line by line (read; since we do nothing with the input anyway, no variable is provided to store it in) and increments the variable count each time. Because of the redirection (<file.txt after done), the loop's standard input comes from file.txt.
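
As the comments point out, a bare read interprets backslashes and can trim whitespace. A more robust variant (just a sketch, using a made-up demo file) that also counts a final line lacking a trailing newline:

```shell
# demo file: a backslash in line 1, and no newline after the last line
printf 'a\\b\nc' > demo.txt

count=0
# IFS= keeps whitespace intact; -r disables backslash processing;
# '|| [ -n "$line" ]' still counts a final line without a trailing newline
while IFS= read -r line || [ -n "$line" ]
do
  count=$((count + 1))
done < demo.txt
echo "$count"    # 2
```
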

celtschk
  • This is a very inefficient way to do it. Remember, bash reads are slow. – codeforester Feb 07 '20 at 01:51
  • @codeforester: That's true, but (a) it was a solution for when you have no other tool available, and (b) slow doesn't mean unawaitable. I just tried with a text file of 125MB (taking an actual text file and concatenating it a thousand times) and more than 2.6 million lines, and it took slightly less than 14 seconds. Not nothing — the tools do it in a fraction of a second — but certainly awaitable. – celtschk Feb 07 '20 at 08:58
  • This would miscount if any line ended with a backslash. – Kusalananda Apr 11 '21 at 17:54

If you're looking to count lines in smaller files, a simple wc -l file.txt works fine.

While looking for an answer to this question myself, working with large files that are several gigabytes in size, I found the following tool:

https://github.com/crioux/turbo-linecount

Also, depending on your system configuration, if you're using an older version of wc you might be better off piping larger chunks through dd, like so:

dd if={file_path} bs=128M | wc -l

Henry Tseng

grep -c $ is very simple and works great.

I even saved it as an alias, since I use it a lot (lc stands for line count):

alias lc="grep -c $"

It can be used either this way:

lc myFile

Or that way:

cat myFile | lc

Note that this will not count the last line if it is empty. For my uses that is almost always OK though.

pitamer
  • If your `grep` implementation does not count the last line if it's empty, it's severely broken. On the contrary there are some `grep` implementations that count the extra bytes found after the last newline in non-text files as an extra line. For instance, `printf foo | grep -c $` outputs 1 with GNU `grep` even though `printf` outputs no line. `printf foo | wc -l` correctly outputs 0. – Stéphane Chazelas Oct 30 '22 at 10:38
  • @StéphaneChazelas But `wc -l` will be off by 1 for files without a newline at the end of the file (in those cases it will not count the last line, even if it has text). Isn't that sort of *more* broken? For my personal use, I'd rather skip counting the newline at EOF than skip counting a line with actual stuff in it. But I guess everybody counts lines for different purposes, and `wc -l` might be better for some! :) – pitamer Oct 30 '22 at 13:10
  • That's the point. A line has to delimited by a newline character. The characters after the last newline if any don't form part of a line. If there are such characters, by definition the file is not a text file. The output of `printf foo` does not form a text file. – Stéphane Chazelas Oct 30 '22 at 17:21