-f, --canonicalize
canonicalize by following every symlink in every component of the given name recursively; all but the last component must exist
-e, --canonicalize-existing
canonicalize by following every symlink in every component of the given name recursively, all components must exist

I am not able to understand what -f and -e do. The wording is not at all clear. A canonical name is basically the shortest unique absolute path. But what does "component of a name" mean? Does it mean subdirectories? What does "recursively" mean here? My understanding is that it recursively searches every subdirectory of the given canonical name, but that doesn't make sense for a symbolic link.

Next, what does it mean for the "-e" option that all components must exist? What is a component here?

Can someone please help with a simple example? Thanks

jlas
  • One more confusion: why do we need readlink at all. "ls -l" shows the symlink details. – jlas Nov 14 '20 at 22:20
  • "_Canonical name is basically shortest unique absolute path_" no it's not. Consider `mkdir -p /a/b/c; touch /a/b/c/d; ln -s /a/b/c/d /e`. Shortest path is `/e` but canonical is `/a/b/c/d` – roaima Nov 15 '20 at 00:22
  • @roaima Then how do we define canonical name/path? – jlas Nov 15 '20 at 02:28
  • The canonical path to a file or directory is one that does not use symbolic links, @jlas – roaima Nov 15 '20 at 08:57
  • @roaima so "/home/user/temp/random_file.txt" and "/home/user/../user/temp/random_file.txt" both be canonical path. I did some research and found this link https://stackoverflow.com/questions/12100299/whats-a-canonical-path . It says it has to be unique and because both of these path are basically same, so the shortest "absolute" path is the canonical path. Is it wrong? – jlas Nov 15 '20 at 20:13
  • I've avoided using "_shortest path_" because of this situation `mkdir -p /a/b/c /f; touch /a/b/c/d; ln /a/b/c/d /f`, where `/a/b/c/d` and `/f/d` are the _same file_. Both `/a/b/c/d` and `/f/d` are canonical paths to `d`. However, if you would like to think of the canonical path as being "_the shortest path from `/` to the file without using symbolic links_" you will usually be sufficiently correct. – roaima Nov 16 '20 at 16:36

2 Answers


First, a component here means one element of the path. Example:

/home/user/.ssh => <component1>/<component2>/<component3>
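To make "component" concrete, here is a minimal sketch that splits the example path on `/` using only POSIX parameter expansion. The variable names (`path`, `rest`, `part`, `components`) are illustrative and not part of readlink.

```shell
#!/bin/sh
# Split a path into its components, one per "/"-separated element.
path=/home/user/.ssh
rest=${path#/}            # drop the leading slash
count=0
components=

while [ -n "$rest" ]; do
    part=${rest%%/*}      # everything before the first remaining slash
    count=$((count + 1))
    components="$components<$part>"
    case $rest in
        */*) rest=${rest#*/} ;;   # more components follow
        *)   rest= ;;             # that was the last component
    esac
done

echo "$count components: $components"
```

For `/home/user/.ssh` this reports three components: `home`, `user` and `.ssh`. With `-f`, only `.ssh` (the last one) would be allowed to be missing.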

1- Suppose we have a directory structure like this:

lols
├── lol
├── lol1 -> lol
└── lol2 -> lol1

The non-existent directory here will be lols/lol8. Now you can compare the output of each command:

readlink -f lols/lol1 : /lols/lol
readlink -e lols/lol1 : /lols/lol

The output here will be the same because all the components of the path exist.

readlink -f lols/lol8 : /lols/lol8
readlink -e lols/lol8 : <empty output>

The output here is different. With -f only the last component (lol8) is allowed to be missing, and every component before it (lols) exists, so a path is printed. With -e the output is empty because all path components must exist. (readlink -f always prints an absolute path; the outputs shown assume lols sits directly under /.)

The last case has multiple non-existent directories:

readlink -f lols/lol8/lol10 : <empty output>
readlink -e lols/lol8/lol10 : <empty output>

Here the output will be empty in both cases because, as described in the man page:
-f : all but the last component must exist => not respected (lol8 is missing and is not the last component)
-e : all components must exist => not respected
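The three cases above can be reproduced in a scratch directory. This is a sketch that assumes GNU coreutils `readlink`; the variable names are illustrative, and the printed paths will start with whatever temporary directory `mktemp` picks rather than `/`.

```shell
#!/bin/sh
# Rebuild the lols tree in a temporary directory and compare -f with -e.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
base=$(pwd -P)                 # physical path of the scratch directory

mkdir -p lols/lol
ln -s lol  lols/lol1
ln -s lol1 lols/lol2

# Case 1: every component exists, so -f and -e print the same path.
f_existing=$(readlink -f lols/lol2)
e_existing=$(readlink -e lols/lol2)

# Case 2: only the last component is missing. -f prints a path, -e fails.
f_missing_last=$(readlink -f lols/lol8)
e_status=0
e_missing_last=$(readlink -e lols/lol8) || e_status=$?

# Case 3: a non-final component is missing. Both fail with no output.
f_status=0
f_missing_two=$(readlink -f lols/lol8/lol10) || f_status=$?

printf '%s\n' "$f_existing" "$e_existing" "$f_missing_last"
printf 'readlink -e exit status: %s, readlink -f (two missing): %s\n' "$e_status" "$f_status"
```

Note that the "empty output" cases also return a non-zero exit status, which is what a script should check.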

2- Why not use only ls -l:

Suppose we create a file named file1, create a symlink to this file named link1, and then create another symlink, link2, that points to link1:

touch file1 : file1 
ln -s file1 link1 : link1 -> file1
ln -s link1 link2 : link2 -> link1

Then with ls -l link2 the output will be: link2 -> link1. If we use readlink link2 the output will be: link1, the same target ls -l shows. But if we use readlink -f link2 or readlink -e link2, the output will be the absolute path of file1, so it points to the source file.

So when should you use readlink instead of ls? When there are nested symlinks (they are resolved recursively), and when the files/directories are in different locations.

So it is better to use readlink instead of ls to avoid errors.
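A short sketch of the difference between a single hop and full resolution, again assuming GNU `readlink` and a throwaway directory from `mktemp` (variable names are illustrative):

```shell
#!/bin/sh
# Build the chain link2 -> link1 -> file1 and resolve it two ways.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
base=$(pwd -P)

touch file1
ln -s file1 link1
ln -s link1 link2

one_hop=$(readlink link2)        # reads only what link2 itself stores
resolved=$(readlink -f link2)    # follows link2 -> link1 -> file1

echo "readlink link2    : $one_hop"
echo "readlink -f link2 : $resolved"
```

Plain `readlink` reports the immediate target (`link1`), while `-f` keeps following links until it reaches `file1` and prints its absolute path.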

Reda Salih

This matters for links that take more than one hop to reach their final target. For example:

touch test_file

ln -s test_file test_link
ln -s non_existing_target dead_link

ln -s test_link link1
ln -s dead_link link2

In the above, link1 ultimately links to an existing file, through test_link, so -f and -e give the same result. link2 points to a dead link, and you see that:

> readlink -e link2
> readlink -f link2
/home/thanasis/temp/non_existing_target

dead_link is the "last component" in the expression "all but the last component must exist". -f is resolving to the target that doesn't exist, while -e is giving no output.
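The same setup can be checked non-interactively. This sketch assumes GNU `readlink`; it captures the exit status as well as the output, since `-e` signals the dead link through both. Variable names are illustrative.

```shell
#!/bin/sh
# Recreate the live and dead link chains in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
base=$(pwd -P)

touch test_file
ln -s test_file test_link
ln -s non_existing_target dead_link
ln -s test_link link1
ln -s dead_link link2

# Live chain: both options agree.
live_f=$(readlink -f link1)
live_e=$(readlink -e link1)

# Dead chain: -f prints the missing target's path, -e prints nothing and fails.
dead_f=$(readlink -f link2)
dead_e_status=0
dead_e=$(readlink -e link2) || dead_e_status=$?

echo "link1: -f=$live_f -e=$live_e"
echo "link2: -f=$dead_f -e=<${dead_e}> (exit $dead_e_status)"
```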


Note that man readlink recommends that

realpath is the preferred command to use for canonicalization functionality

For this example, realpath -m (--canonicalize-missing) would give the same output as readlink -f. In general, realpath -e is the way to test whether a link can be resolved to a final, existing target file; here it returns the expected error:

> realpath -e link2
realpath: link2: No such file or directory

ls -l, by contrast, lists every link whether or not it resolves, and probably colors dead links red. It also has the -L option to dereference links and show their targets. But its output is for humans to read: never use ls to decide anything about a link inside a script.
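In a script, the exit status of `realpath -e` can drive the decision instead of parsing `ls` output. A minimal sketch, assuming GNU `realpath`; the helper name `resolves` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the path resolves to an existing target.
resolves() {
    realpath -e -- "$1" >/dev/null 2>&1
}

tmp=$(mktemp -d)
cd "$tmp" || exit 1

touch test_file
ln -s test_file test_link
ln -s non_existing_target dead_link
ln -s test_link link1
ln -s dead_link link2

# Classify each link by exit status, not by reading listing output.
report=$(
    for l in link1 link2; do
        if resolves "$l"; then
            echo "$l: ok"
        else
            echo "$l: dangling"
        fi
    done
)
echo "$report"
```

The helper discards both output streams on purpose: only the exit status is needed to tell a resolvable link from a dangling one.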

thanasisp
  • great explanation. Defining "component" made the wordings very clear. On another note, I have noticed, many times "man" page and linux documentation lack clear explanation, for eg as component is not defined in this question, why is it so? I find Linux or virtualization difficult mainly because there is no good documentation or much information at one place, that's make it complex and make new learner discouraged. I wonder why it is so. – jlas Nov 15 '20 at 01:52
  • @jlas `man` pages assume an audience of experts (or at least not novices). This helps them stay concise and to the point. One could debate if this is the right trade-off, but that would not change the fact. – SethMMorton Nov 15 '20 at 02:13
  • I suggest you to focus into this: `realpath is the preferred command to use for canonicalization functionality` – thanasisp Nov 15 '20 at 08:51