
I'm sure there are many ways to do this: how can I count the number of lines in a text file?

$ <cmd> file.txt
1020 lines
tshepang
Chris Smith

8 Answers


The standard way is with wc, which takes arguments to specify what it should count (bytes, chars, words, etc.); -l is for lines:

$ wc -l file.txt
1020 file.txt
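
As a quick illustration, wc can report several counts in one pass; -l, -w, and -c select lines, words, and bytes respectively (the demo file name here is made up):

```shell
# create a two-line demo file, then count lines, words, and bytes at once
printf 'hello world\nsecond line\n' > demo.txt
wc -l -w -c demo.txt    # 2 lines, 4 words, 24 bytes
```
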
Michael Mrozek
  • How do I count the lines in a file if I want to **ignore** comments? Specifically, I want to *not* count lines that begin with a +, some white space (could be no white space) and then a %, which is the way comment lines appear in a git diff of a MATLAB file. I tried doing this with grep, but couldn't figure out the correct regular expression. – Gdalya Jul 11 '13 at 01:36
  • @Gdalya I hope the following pipeline will do this (no tests were performed): `cat matlab.git.diff | sed -e '/^\+[ ]*.*\%$/d' | wc -l`. `/regexp/d` deletes a line if it matches `regexp`, and `-e` turns on an adequate (IMNSHO) syntax for `regexp`. – dbanet Nov 06 '13 at 21:29
  • Why not simply `grep -v '^+ *%' matlab.git.diff | wc -l`? – celtschk Jul 06 '14 at 19:51
  • @celtschk , as long as this is usual in comment lines: is it possible to modify your `grep` command in order to consider as comment cases like `" + Hello"` (note the space(s) before the `+`)? – Sopalajo de Arrierez Jan 18 '15 at 20:46
  • @SopalajodeArrierez: Of course it is possible: `grep -v '^ *+' matlab.git.diff | wc -l` (I'm assuming the quote signs were not actually meant to be part of the line; I also assume that both lines with and without spaces in front of the `+` are meant to be comments; if at least one space is mandatory, either replace the star `*` with `\+`, or just add another space in front of the star). Probably instead of matching only spaces, you'd want to match arbitrary whitespace; for this replace the space with `[[:space:]]`. Note that I've also removed matching the `%` since it's not in your example. – celtschk Feb 14 '15 at 15:02

Steven D forgot GNU sed:

sed -n '$=' file.txt

Also, if you want the count without outputting the filename and you're using wc:

wc -l < file.txt

Just for the heck of it:

cat -n file.txt | tail -n 1 | cut -f1
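
Just to sanity-check that these agree (the demo file name is made up):

```shell
printf 'a\nb\nc\n' > demo.txt

sed -n '$=' demo.txt                    # 3: '=' prints the line number, '$' selects the last line
wc -l < demo.txt                        # 3
cat -n demo.txt | tail -n 1 | cut -f1   # 3: cat -n separates the number with a tab, which cut keys on
```
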
Dennis Williamson
  • Or `grep -c ''`, or `tr -dc '\n' | wc -c`, or `nl -ba -nln | tail -n 1 | sed -e 's/[^0-9].*//'`... Is any of these useful in itself (as opposed to things to build upon to make a program that does more than counting lines), other than `wc -l` and pure (ba)sh? – Gilles 'SO- stop being evil' Dec 03 '10 at 01:58
  • @Gilles: I think the phrase "many ways" in the question triggered a challenge that Steve and I rose to. – Dennis Williamson Dec 03 '10 at 02:03
  • +1. Small note: `cat -n` is a GNU extension. – Steven D Dec 03 '10 at 02:03
  • @Dennis: I'm obviously not objecting... Follow-up challenge: can you do it by piping the data into `uniq`, with only POSIX tools to help you and no constraint on the line length? – Gilles 'SO- stop being evil' Dec 03 '10 at 02:21
  • @Steven: I'm pretty sure I've been using `cat -n` since before GNU existed. As a matter of fact, [HP-UX has it](http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02253215/c02253215.pdf) (PDF), [Solaris](http://docs.sun.com/app/docs/doc/816-5165/cat-1?l=en&a=view) does, [FreeBSD as early as 1.0](http://www.freebsd.org/cgi/man.cgi?query=cat&apropos=0&sektion=0&manpath=FreeBSD+1.0-RELEASE&format=html) does, even [2.10 BSD (1986)](http://www.freebsd.org/cgi/man.cgi?query=cat&apropos=0&sektion=0&manpath=2.10+BSD&format=html) ... – Dennis Williamson Dec 03 '10 at 02:27
  • ... and [AIX](http://publib.boulder.ibm.com/infocenter/aix/v6r1/topic/com.ibm.aix.cmds/doc/aixcmds1/cat.htm), too. – Dennis Williamson Dec 03 '10 at 02:27
  • @Gilles: `sed 's/.*//' file.txt | uniq -c` – Dennis Williamson Dec 03 '10 at 02:30
  • @Dennis: Yet another reason I should stop trusting wikipedia: http://en.wikipedia.org/wiki/Cat_%28Unix%29 – Steven D Dec 03 '10 at 02:32
  • @Steven: I believe what they mean by "GNU only" is the long option `--number`. – Dennis Williamson Dec 03 '10 at 02:35
  • @Dennis: Of course this works, but it doesn't answer my question (you're not piping the data into `uniq`, you're piping it into `sed`). – Gilles 'SO- stop being evil' Dec 03 '10 at 08:10
  • @Gilles: Oh, you meant *first*. `uniq -c -w 0 file.txt` and you can `cut -c -7` to keep only the number. Or, more POSIXly: `uniq -c file.txt | awk '{c+=$1}END{print c}'`. How about `dc` (even though it's not POSIX)? `uniq -c file.txt | cut -c -7 | sed '$alax' | dc -e '[pq]sb[+z1=blax]sa' -`. `bc` is POSIX: `uniq -c file.txt | cut -c -7 | sed -n ':a;${s/\n/ + /gp;b};N;ba' | bc`. The easy answer if you assume a limited line length: `uniq -c -f 100000 file.txt`. – Dennis Williamson Dec 03 '10 at 16:21
  • You have to quote the expression `'$='` so it doesn't get expanded by the shell - the command above as is fails with zsh. (Would have edited it myself, but it's just 2 characters of difference...) – Josip Rodin Oct 17 '15 at 20:19
  • @JosipRodin: Quotes added – Dennis Williamson Oct 18 '15 at 00:09
  • `sed '$='` is not only GNU `sed`, the `=` is a SUSv4 `sed` command and the address `$` is also specified as the last line by SUSv4. – Kusalananda Jul 05 '16 at 09:49

As Michael said, wc -l is the way to go. But, just in case you inexplicably have bash, perl, or awk but not wc, here are a few more solutions:

Bash-only

$ LINECT=0; while read -r LINE; do (( LINECT++ )); done < file.txt; echo $LINECT

Perl Solutions

$ perl -lne 'END { print $. }' file.txt

and the far less readable (the `}{` trick closes the implicit `while` loop that `-n` wraps around the code, so the `print` runs once after all input is read):

$ perl -lne '}{ print $.' file.txt

Awk Solution

$ awk 'END {print NR}' file.txt
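
In awk, NR holds the number of input records (lines) read so far, so in the END block it equals the total line count:

```shell
# NR = number of records (lines) read; in the END block it is the total
printf 'a\nb\nc\n' | awk 'END {print NR}'    # prints 3
```
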
Gilles 'SO- stop being evil'
Steven D

A word of warning when using

wc -l

Because wc -l works by counting \n characters, if the last line in your file doesn't end with a newline, the line count will effectively be off by one (hence the old convention of leaving a newline at the end of your file).

Since I can never be sure whether a given file follows the convention of ending its last line with a newline, I recommend any of these alternative commands, all of which include the last line in the count whether or not it ends with a newline.

sed -n '$=' filename
perl -lne 'END { print $. }' filename
awk 'END {print NR}' filename
grep -c '' filename
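
To see the discrepancy concretely, here is a small sketch (the file name is made up) comparing wc -l and grep -c '' on a file whose last line lacks a newline:

```shell
# last line deliberately NOT terminated with \n
printf 'one\ntwo\nthree' > no_newline.txt

wc -l < no_newline.txt      # 2: wc counts only the \n characters
grep -c '' no_newline.txt   # 3: the final, unterminated line is included
```
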
pretzels1337
  • nice summary. And welcome to *unix & linux* – Sebastian Sep 18 '14 at 15:29
  • Hm, is the last piece really a line? – gena2x Sep 18 '14 at 19:48
  • I'm sure it depends on everyone's use case; for me the 'last piece' is usually a line of text that someone didn't cap off with a newline. The use case I most often encounter is a file with a single string of text that does not end in a newline. wc -l would count this as "0", when I would otherwise expect a count of "1". – pretzels1337 Sep 23 '14 at 16:22
  • It is not the count of wc that will be off by one, but your count: While in Windows (and DOS before it), the CR/LF sequence is a line separator, on Unix the LF character is a line terminator. That is, without a newline at the end, you don't have a line (and strictly speaking, not a valid text file). – celtschk Feb 07 '20 at 09:15

You can always use the command grep as follows:

grep -c "^" file.txt

It counts all the actual lines of file.txt, whether or not the last line ends with an LF character.

Paolo

In case you only have bash and absolutely no external tools available, you could also do the following:

count=0
while read
do
  ((count=$count+1))
done <file.txt
echo $count

Explanation: the loop reads standard input line by line (read; since we do nothing with the input anyway, no variable is provided to store it in) and increments the variable count each time. Because of the redirection (<file.txt after done), the loop's standard input comes from file.txt.
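
As the comments point out, a bare read interprets backslashes and can trim whitespace. A more robust variant (just a sketch, using a made-up demo file) that also counts a final line lacking a trailing newline:

```shell
# demo file: a backslash in line 1, and no newline after the last line
printf 'a\\b\nc' > demo.txt

count=0
# IFS= keeps whitespace intact; -r disables backslash processing;
# '|| [ -n "$line" ]' still counts a final line without a trailing newline
while IFS= read -r line || [ -n "$line" ]
do
  count=$((count + 1))
done < demo.txt
echo "$count"    # 2
```
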

celtschk
  • This is a very inefficient way to do it. Remember, bash reads are slow. – codeforester Feb 07 '20 at 01:51
  • @codeforester: That's true, but (a) it was a solution for when you have no other tool available, and (b) slow doesn't mean unawaitable. I just tried with a text file of 125MB (taking an actual text file and concatenating it a thousand times) and more than 2.6 million lines, and it took slightly less than 14 seconds. Not nothing — the tools do it in a fraction of a second — but certainly awaitable. – celtschk Feb 07 '20 at 08:58
  • This would miscount if any line ended with a backslash. – Kusalananda Apr 11 '21 at 17:54

If you're looking to count lines in smaller files, a simple wc -l file.txt works fine.

While looking for an answer to this question myself, working with large files that are several gigabytes in size, I found the following tool:

https://github.com/crioux/turbo-linecount

Also, depending on your system configuration, if you're using an older version of wc you might be better off piping larger chunks through dd, like so:

dd if={file_path} bs=128M | wc -l

Henry Tseng

grep -c $ is very simple and works great.

I even saved it as an alias, since I use it a lot (lc stands for line count):

alias lc="grep -c $"

It can be used either this way:

lc myFile

Or that way:

cat myFile | lc

Note that this will not count the last line if it is empty. For my uses that is almost always OK though.

pitamer
  • If your `grep` implementation does not count the last line if it's empty, it's severely broken. On the contrary there are some `grep` implementations that count the extra bytes found after the last newline in non-text files as an extra line. For instance, `printf foo | grep -c $` outputs 1 with GNU `grep` even though `printf` outputs no line. `printf foo | wc -l` correctly outputs 0. – Stéphane Chazelas Oct 30 '22 at 10:38
  • @StéphaneChazelas But `wc -l` will be off by 1 for files without a newline at the end of the file (in those cases it will not count the last line, even if it has text). Isn't that sort of *more* broken? For my personal use, I'd rather skip counting the newline at EOF than skip counting a line with actual stuff in it. But I guess everybody counts lines for different purposes, and `wc -l` might be better for some! :) – pitamer Oct 30 '22 at 13:10
  • That's the point. A line has to delimited by a newline character. The characters after the last newline if any don't form part of a line. If there are such characters, by definition the file is not a text file. The output of `printf foo` does not form a text file. – Stéphane Chazelas Oct 30 '22 at 17:21