22

The following command will tar all "dot" files and folders:

tar -zcvf dotfiles.tar.gz .??*

I am familiar with regular expressions, but I don't understand how to interpret .??*. I ran ls .??* and tree .??* and looked at the files that were listed. Why, for example, does this regular expression include all files within folders whose names start with a dot?

cjm
SabreWolfy
  • 1
    see also http://unix.stackexchange.com/questions/1168/how-to-glob-every-hidden-file-except-current-and-parent-directory – Lesmana Jul 23 '12 at 20:37

3 Answers

32

Globs are not regular expressions. In general, the shell will try to interpret anything you type on the command line that you don't quote as a glob. Shells are not required to support regular expressions at all, although in practice many of the more modern ones do (for example, the =~ regex match operator in the bash [[ construct).

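The .??* is a glob. It matches any file name that begins with a literal dot ., followed by any two (not necessarily the same) characters, ??, followed by the regular expression equivalent of [^/]*, i.e. 0 or more characters that are not /. The glob itself only matches names in the current directory; tar and ls then descend into any directories among those names, which is why the contents of dot folders show up.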
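As a rough illustration of the above, here is a minimal sketch that builds a scratch directory and checks what .??* expands to. The file names (.a, .ab, .abc, .config, visible) are made up for the demo, and it assumes a POSIX-style shell such as sh or bash with default globbing options.

```shell
# Scratch directory with hypothetical names, so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch .a .ab .abc visible
mkdir .config
touch .config/settings

# Unquoted: the shell expands the glob before echo ever runs.
# Only the directory name .config is matched, not .config/settings.
matches=$(echo .??*)
echo "$matches"

# Quoted: no expansion, the command receives the literal pattern.
literal=$(echo '.??*')
echo "$literal"

# case uses the same pattern syntax, which makes it easy to test single names.
for name in . .. .a .ab; do
    case $name in
        .??*) echo "$name matches" ;;
        *)    echo "$name does not match" ;;
    esac
done
```

The names . and .. are too short to match, which is the point of the pattern, but so is .a.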

For the full details of shell pathname expansion (the full name for "globbing"), see the POSIX spec.

dhag
jw013
  • 10
    Additional point: this is an attempt to write a glob which matches all of the dotfiles in a directory *except* the special entries `.` and `..`, which one normally does not want to do anything with. It's not quite right; it doesn't pick up anything named '`.X`' where X is some character other than dot. I don't think it's possible to write a single glob that matches *every* dotfile except `.` and `..`, but you can do it with two: `tar zcvf dotfiles.tar.gz .[!.] .??*` for instance. – zwol Jul 22 '12 at 21:45
  • @Zack: Thanks for the clarification. I posted a comment about that, but then deleted it. `ls .?` returned the same as `ls ..`, which meant there were no other entries in the folder matching the pattern `.?`. I would have done `.[^.]` for all `.?` files other than `..`. – SabreWolfy Jul 23 '12 at 17:27
  • 2
    @SabreWolfy If you read the POSIX spec carefully, that's actually an important difference between globs and regex: in bracket expressions, `[^abc]` in regex syntax means the same as `[!abc]` in glob syntax (i.e. `^` is replaced with `!` for globs). Using `[^abc]` style syntax in a glob is not very portable because POSIX does not specify what it means, so some shells interpret it using regex-like semantics while others treat `^` as just a literal character. – jw013 Jul 23 '12 at 17:31
  • 1
    @jw013: Thanks for the details. I must remember that glob != regexp :) – SabreWolfy Jul 23 '12 at 17:34
  • 1
    It just occurred to me that `.[!.]` may be left as a literal on the `tar` command line in the common case where there are no files that match that pattern. Some shells let you control that behavior, e.g. with bash, `shopt -s nullglob` will make it vanish from the command line if it doesn't match anything, but that's not a universal feature. – zwol Jul 23 '12 at 19:03
  • @Zack With `bash` you also get the `dotglob` and `GLOBIGNORE` features. With `GLOBIGNORE=.`, a simple `.*` should do. – jw013 Jul 23 '12 at 19:27
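
To make the points in these comments concrete, here is a small sketch of the two-glob approach (.[!.] plus .??*) that also guards against a pattern being left as a literal when nothing matches it. The file names are hypothetical, and it assumes sh or bash with default options (no nullglob or failglob).

```shell
# Scratch directory with hypothetical dotfiles; none has a two-character name,
# so .[!.] matches nothing here and stays on the command line as a literal.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch .bashrc .profile visible

found=""
for name in .[!.] .??*; do
    # Skip any pattern the shell left unexpanded because it had no matches.
    [ -e "$name" ] || [ -L "$name" ] || continue
    found="$found $name"
done
echo "dotfiles:$found"
```

The resulting list could then be handed to tar instead of the raw globs, so that an unmatched .[!.] never reaches it as a bogus file name.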
9

The .??* wildcard (not a regular expression, though it looks like one) matches filenames that start with a period (.), followed by any two characters (??), and then any number (zero or more) of other characters (*).

Maybe this page on Wildcards in Filenames will be helpful.

Levon
-1

To add to the other answers, a single ? matches filenames that are exactly one character long, ?? matches filenames that are exactly two characters long, and so on.

[root@mercy testdir_2]# ls
ion  it  r
[root@mercy testdir_2]# ls ?
r
[root@mercy testdir_2]# ls ??
it
[root@mercy testdir_2]# ls ???
ion
[root@mercy testdir_2]#
Sreeraj