Output of locate txt | head:

/etc/brltty/brl-ba-all.txt
/etc/brltty/brl-bd-all.txt
/etc/brltty/brl-bl-18.txt
/etc/brltty/brl-bl-40_m20_m40.txt
/etc/brltty/brl-ec-all.txt
/etc/brltty/brl-ec-spanish.txt
/etc/brltty/brl-eu-all.txt
/etc/brltty/brl-lb-all.txt
/etc/brltty/brl-lt-all.txt
/etc/brltty/brl-mb-all.txt

Output of locate *.txt | head:

/home/abc/capital.txt
/home/abc/state.txt

Why is there such a huge difference in the output? The second command only seems to check my home folder, but the first command seems to check many directories. Why is that the case?

polemon

3 Answers

locate txt locates all files (of any type, including regular, symlink, directory, socket...) whose path contains txt¹, so the results include /foo/xtxty/bar, /foo/bar.txt, /foo/txt.bar, etc.

locate *.txt is wrong because the * is left unquoted: the shell first expands *.txt to all the filenames in the current directory that match the pattern, then passes the result to locate. For instance, if the current directory contained a.txt and b.txt, the command actually run would be locate a.txt b.txt, which locates paths that contain either a.txt or b.txt or, depending on the locate implementation, both of them, such as /foo/da.txtob.txt. Yours looks like it is in the first category, and you likely ran the command from within /home/abc.

If there's no .txt file in the current directory, depending on the shell you get either an error or locate is called with *.txt literally as argument.
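The expansion step can be observed without locate at all. The following is a minimal sketch, assuming a POSIX-like shell with mktemp available; show_args is a hypothetical helper introduced only for illustration, and the scratch directory stands in for the current directory described above.

```shell
# Work in a scratch directory so the glob results are predictable.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch a.txt b.txt

# Hypothetical helper: prints each argument it receives on its own line.
show_args() { printf '<%s>\n' "$@"; }

unquoted=$(show_args *.txt)   # the shell expands the glob before the call
quoted=$(show_args '*.txt')   # the quotes keep the pattern literal

printf 'unquoted:\n%s\n' "$unquoted"
printf 'quoted:\n%s\n' "$quoted"
```

The unquoted call receives two arguments (a.txt and b.txt), while the quoted call receives the single argument *.txt, which is what locate would need to see.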

To always call locate with a literal *.txt, which is what you want here, make sure the * is quoted for the shell: with "...", '...', or a backslash in a Bourne-like shell. Single quotes are the best choice, as they quote every character in Bourne-like shells and are the most portable among shells:

locate '*.txt'

Then, as that argument contains a wildcard (*), locate switches from a substring search to a pattern-matching search (using the same patterns as those recognised by the shell or find -name, for instance) and returns all the file paths that match that pattern, that is, all the file paths that end in .txt, like /foo/.txt or /foo/bar.txt.
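The difference between the two search modes can be approximated with the shell's own pattern matching. This is only a sketch: matches is a hypothetical helper, the shell's case statement stands in for locate's matcher, and the substring search is modelled as the pattern *txt*.

```shell
# Hypothetical helper: report whether the path in $2 matches the shell
# pattern in $1. The unquoted $1 inside "case" is treated as a pattern,
# not as a literal string.
matches() {
  case $2 in
    $1) echo yes ;;
    *)  echo no ;;
  esac
}

# Compare a substring-style search (*txt*) with the suffix pattern (*.txt).
for p in /foo/xtxty/bar /foo/bar.txt /foo/txt.bar /foo/.txt; do
  printf '%-16s *txt*: %-3s  *.txt: %s\n' \
    "$p" "$(matches '*txt*' "$p")" "$(matches '*.txt' "$p")"
done
```

All four paths match *txt*, but only /foo/bar.txt and /foo/.txt match *.txt.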

locate is not a standard command, and there are many incompatible implementations around, but those simple behaviours above are common to most if not all. Most implementations support various options to do the matching differently. Check your own locate documentation with man locate, not some random pages on the internet as they may very well document a different implementation and/or version.


¹ And that existed at the time the locate database was last updated.

Stéphane Chazelas

When you use globbing like that (<command> *.txt, for instance), the expansion is done by your shell, not by the command you're running, which in this case is locate.

If you want to ensure the globbing character gets passed on to the command you're running, put it in quotes:

locate "*.txt"

or escape it with \:

locate \*.txt

Furthermore, locate's default matching is based on shell-style wildcard patterns rather than on regular expressions. If you search for plain txt, it assumes you want txt to appear anywhere in the path, not necessarily at the end.

You can force regexp handling though:

locate -r "txt$"
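To see what anchoring with $ changes, the same regular expression can be tried on a fixed list of paths. This is a rough sketch only: grep stands in for locate -r (an assumption made for illustration, since the locate database differs from system to system), and the sample paths are made up.

```shell
# Made-up sample paths; a real locate database would supply these.
paths='/foo/xtxty/bar
/foo/bar.txt
/foo/txt.bar'

# grep stands in for locate -r here (an assumption for illustration only).
anchored=$(printf '%s\n' "$paths" | grep 'txt$')      # txt at the end only
unanchored=$(printf '%s\n' "$paths" | grep -c 'txt')  # count of txt anywhere

printf 'anchored match: %s\n' "$anchored"
printf 'unanchored match count: %s\n' "$unanchored"
```

Only /foo/bar.txt survives the anchored pattern, while all three paths contain txt somewhere.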

For further details, refer to the locate man page.

polemon

locate takes patterns as arguments. Since *.txt is not quoted, the shell tries to expand it first instead of passing it literally.

Also, note:

As a special case, a pattern containing no globbing characters (``foo'') is matched as though it were ``*foo*''.

So locate txt is equivalent to locate '*txt*'. Assuming the current directory only contains a.txt and b.txt, then locate *.txt expands to locate a.txt b.txt, and so is equivalent to locate '*a.txt*' '*b.txt*'.

chepner