readlink -f and -e option description not clear

Question

-f, --canonicalize
canonicalize by following every symlink in every component of the given name recursively; all but the last component must exist
-e, --canonicalize-existing
canonicalize by following every symlink in every component of the given name recursively, all components must exist

I can't work out what -f and -e actually do; the wording is not clear to me. As I understand it, a canonical name is basically the shortest unique absolute path. But what is a "component" of a name? Does it mean a subdirectory? And what does "recursively" mean here? I read it as recursively searching every subdirectory of the given name, but that doesn't make sense for a symbolic link.

Next, what does it mean for the -e option that all components must exist? What is a component here?

Can someone please help with a simple example? Thanks

One more confusion: why do we need readlink at all. "ls -l" shows the symlink details. — jlas, Nov 14 '20 at 22:20
"_Canonical name is basically shortest unique absolute path_" no it's not. Consider `mkdir -p /a/b/c; touch /a/b/c/d; ln -s /a/b/c/d /e`. Shortest path is `/e` but canonical is `/a/b/c/d` — roaima, Nov 15 '20 at 00:22
The canonical path to a file or directory is one that does not use symbolic links, @jlas — roaima, Nov 15 '20 at 08:57
@roaima so "/home/user/temp/random_file.txt" and "/home/user/../user/temp/random_file.txt" would both be canonical paths? I did some research and found this link: https://stackoverflow.com/questions/12100299/whats-a-canonical-path . It says the path has to be unique, and since these two paths are basically the same, the shortest "absolute" path is the canonical one. Is that wrong? — jlas, Nov 15 '20 at 20:13
I've avoided using "_shortest path_" because of this situation `mkdir -p /a/b/c /f; touch /a/b/c/d; ln /a/b/c/d /f`, where `/a/b/c/d` and `/f/d` are the _same file_. Both `/a/b/c/d` and `/f/d` are canonical paths to `d`. However, if you would like to think of the canonical path as being "_the shortest path from `/` to the file without using symbolic links_" you will usually be sufficiently correct. — roaima, Nov 16 '20 at 16:36

Reda Salih · Accepted Answer · 2020-11-14T23:16:51.350

First, a component here means one element of the path, i.e. one of the names between the slashes. For example:

/home/user/.ssh => <component1>/<component2>/<component3>
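
As a rough illustration (a sketch, not something readlink does for you), the components of that path can be listed by splitting it on the slashes. The variable names below are made up for the example:

```shell
path=/home/user/.ssh

# Split on "/" and drop the empty field produced by the leading slash.
components=$(printf '%s\n' "$path" | tr '/' '\n' | grep -v '^$')
printf '%s\n' "$components"

# Count the components: home, user and .ssh.
count=$(printf '%s\n' "$components" | grep -c .)
printf 'number of components: %s\n' "$count"
```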

1- Suppose we have a directory structure like this:

lols
├── lol
├── lol1 -> lol
└── lol2 -> lol1

Assume also that lols/lol8 does not exist. You can then compare the output of each command:

readlink -f lols/lol1 : /lols/lol
readlink -e lols/lol1 : /lols/lol

The output here will be the same because all the components of the path exist.

readlink -f lols/lol8 : /lols/lol8
readlink -e lols/lol8 : <empty output>

The output differs here. With -f the path is still printed, because only the last component (lol8) is missing and every component before it (lols) exists. With -e the output is empty, because all path components must exist.

The last case has more than one non-existent component:

readlink -f lols/lol8/lol10 : <empty output>
readlink -e lols/lol8/lol10 : <empty output>

Here the output will be empty in both cases because, as described in the man page:

-f : all but the last component must exist => not respected
-e : all components must exist => not respected
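
If you want to try the three cases yourself, here is a minimal sketch. It assumes GNU coreutils readlink and builds the tree in a throwaway directory from mktemp (my choice, not part of the example above), so the printed paths start with that temporary directory. It queries lol2 rather than lol1 to show that the chain lol2 -> lol1 -> lol is followed to the end.

```shell
# Work in a throwaway directory; resolve it first in case the temp path
# itself goes through a symlink.
tmp=$(readlink -f "$(mktemp -d)")
cd "$tmp" || exit 1

mkdir -p lols/lol
ln -s lol  lols/lol1    # lols/lol1 -> lol
ln -s lol1 lols/lol2    # lols/lol2 -> lol1

# Case 1: every component exists, so -f and -e agree.
f_existing=$(readlink -f lols/lol2)
e_existing=$(readlink -e lols/lol2)

# Case 2: only the last component is missing; -f still prints a path,
# -e prints nothing and exits non-zero.
f_missing_last=$(readlink -f lols/lol8)
e_missing_last=$(readlink -e lols/lol8) || echo "-e failed on lols/lol8"

# Case 3: two components are missing; both print nothing and fail.
f_missing_two=$(readlink -f lols/lol8/lol10) || echo "-f failed on lols/lol8/lol10"
e_missing_two=$(readlink -e lols/lol8/lol10) || echo "-e failed on lols/lol8/lol10"

printf 'case 1: -f=[%s] -e=[%s]\n' "$f_existing" "$e_existing"
printf 'case 2: -f=[%s] -e=[%s]\n' "$f_missing_last" "$e_missing_last"
printf 'case 3: -f=[%s] -e=[%s]\n' "$f_missing_two" "$e_missing_two"
```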

2- Why not use only ls -l :

Suppose we create a file named file1, create a symlink to it named link1, and then create another symlink, link2, that points to link1:

touch file1 : file1 
ln -s file1 link1 : link1 -> file1
ln -s link1 link2 : link2 -> link1

With ls -l link2 the output will be: link2 -> link1
With readlink link2 the output will be: link1 (the same information as ls -l)
With readlink -f link2 or readlink -e link2 the output will be the absolute path of file1, so it points to the final target file.

So when should you use readlink instead of ls? When symlinks are nested (one link pointing to another), so that they have to be followed recursively, and when the files or directories are in different locations.

So it is better to use readlink rather than ls to avoid errors.
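
The difference between one hop and full resolution can be checked with a short sketch like this one (GNU readlink assumed; the temporary directory and the variable names are only there for the demonstration):

```shell
tmp=$(readlink -f "$(mktemp -d)")
cd "$tmp" || exit 1

touch file1
ln -s file1 link1    # link1 -> file1
ln -s link1 link2    # link2 -> link1

# Plain readlink reads the link itself and stops after one hop.
one_hop=$(readlink link2)

# -f keeps following links and prints an absolute path to the final target.
final=$(readlink -f link2)

printf 'readlink link2    : %s\n' "$one_hop"
printf 'readlink -f link2 : %s\n' "$final"
```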

@jlas Keep following symbolic links until you haven't got a symbolic link any more. — wizzwizz4, Nov 15 '20 at 18:11

thanasisp · Answer 2 · 2020-11-15T02:16:53.790

This matters for links that take more than one hop to reach their final target. For example:

touch test_file

ln -s test_file test_link
ln -s non_existing_target dead_link

ln -s test_link link1
ln -s dead_link link2

In the above, link1 ultimately links to a file, through test_link, so -f and -e give the same result. link2 points to a dead link, and you can see that:

> readlink -e link2
> readlink -f link2
/home/thanasis/temp/non_existing_target

dead_link is the "last component" in the expression "all but the last component must exist". -f resolves to the target that doesn't exist, while -e gives no output.

Note that man readlink recommends that

realpath is the preferred command to use for canonicalization functionality

For this example, realpath -m (--canonicalize-missing) would give the same output as readlink -f. In general, realpath -e is the way to test whether a link resolves to a final, existing target; here it returns the expected error:

> realpath -e link2
realpath: link2: No such file or directory
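
To compare the commands side by side on the dead chain, something like the following sketch should do. It assumes GNU coreutils and a scratch directory from mktemp; the variable names are illustrative only.

```shell
tmp=$(readlink -f "$(mktemp -d)")
cd "$tmp" || exit 1

ln -s non_existing_target dead_link   # target does not exist
ln -s dead_link link2                 # link2 -> dead_link

via_readlink=$(readlink -f link2)
via_realpath=$(realpath link2)        # default: all but the last component must exist
via_realpath_m=$(realpath -m link2)   # no component needs to exist

# realpath -e insists on an existing target, so it fails here.
e_status=0
realpath -e link2 2>/dev/null || e_status=$?

printf 'readlink -f : %s\n' "$via_readlink"
printf 'realpath    : %s\n' "$via_realpath"
printf 'realpath -m : %s\n' "$via_realpath_m"
printf 'realpath -e exit status: %s\n' "$e_status"
```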

ls -l, by contrast, lists every link and will probably show dead links in red. It also has the -L option, which dereferences links and shows their targets. But that output is for humans to read: never use ls to decide anything about a link inside a script.
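
Inside a script, the exit status of realpath -e is easier to act on than any ls output. A minimal sketch, assuming GNU realpath; link_status is a hypothetical helper name, and the links are the ones from the example above, recreated in a scratch directory:

```shell
tmp=$(readlink -f "$(mktemp -d)")
cd "$tmp" || exit 1

touch test_file
ln -s test_file test_link
ln -s non_existing_target dead_link
ln -s test_link link1
ln -s dead_link link2

# Hypothetical helper: report whether a link resolves to an existing target,
# based only on the exit status of realpath -e.
link_status() {
    if realpath -e -- "$1" >/dev/null 2>&1; then
        echo resolved
    else
        echo dangling
    fi
}

status1=$(link_status link1)
status2=$(link_status link2)
printf 'link1: %s\n' "$status1"
printf 'link2: %s\n' "$status2"
```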

Great explanation. Defining "component" made the wording very clear. On another note, I have noticed that man pages and Linux documentation often lack clear explanations; for example, "component" is not defined anywhere in this one. Why is that? I find Linux and virtualization difficult mainly because there is no good documentation, or not much information in one place, which makes things complex and discourages new learners. I wonder why it is so. — jlas, Nov 15 '20 at 01:52
@jlas `man` pages assume an audience of experts (or at least not novices). This helps them stay concise and to the point. One could debate if this is the right trade-off, but that would not change the fact. — SethMMorton, Nov 15 '20 at 02:13
I suggest you focus on this: `realpath is the preferred command to use for canonicalization functionality` — thanasisp, Nov 15 '20 at 08:51
