I am searching for a simple way to perform a full system scan using ClamAV on a machine that also has Timeshift-based snapshotting enabled.
As suggested by this answer on the Ubuntu site, I was using a command like:
clamscan -r --bell -i --exclude-dir="^/sys" /
(note: the --exclude-dir="^/sys" parameter was suggested to me by another user, who pointed out that /sys is a virtual directory and is probably best excluded from scans to avoid possible read-access errors)
The command works as expected: it will 'check all files on the computer, but only display infected files and ring a bell when found'.
This has an evident problem: "all files on the computer" includes the "/timeshift" directory, which is where Timeshift stores its snapshot data.
Now, quoting from the official Timeshift page:
In RSYNC mode, snapshots are taken using rsync and hard-links. Common files are shared between snapshots which saves disk space. Each snapshot is a full system backup that can be browsed with a file manager.
To put it simply: Timeshift duplicates the changed files, and uses hard-links to reference the unchanged ones. As far as my understanding goes, this means that the "first" snapshot is probably a full copy of the filesystem (obviously excluding any file/path that Timeshift is configured to ignore), while each following snapshot only contains new copies of the changed files, plus hard-links to the unchanged ones.
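To illustrate what I mean by hard-links, here is a minimal sketch, assuming GNU coreutils (stat -c) and using made-up file names in a scratch directory. It shows that two hard-linked names share a single inode, so on disk neither of them is "the real one"; they are the same data under two paths.

```shell
# Create a scratch directory with one file and a hard link to it.
# The names "original" and "hardlink" are hypothetical, for illustration only.
tmpdir=$(mktemp -d)
echo "hello" > "$tmpdir/original"
ln "$tmpdir/original" "$tmpdir/hardlink"

# GNU stat: %i is the inode number, %h is the hard-link count.
inode_a=$(stat -c '%i' "$tmpdir/original")
inode_b=$(stat -c '%i' "$tmpdir/hardlink")
links=$(stat -c '%h' "$tmpdir/original")

# Both names report the same inode, and the link count is 2.
echo "inodes: $inode_a $inode_b, link count: $links"
```

If this is right, then any tool that walks the directory tree by path will see the same content once per name, which is exactly the redundancy I would like to avoid.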
The problem: under standard settings, clamscan will also scan EVERY file in the /timeshift folder. While I am fine with scanning the changed files, which are actual new copies, scanning the hard-linked ones seems a waste, since it basically means scanning the same file multiple times: once for the snapshot in which the file first appeared, and then once more for each link to the unchanged file in the following snapshots.
I am therefore searching for a simple way to exclude those hard-links from the scan. man clamscan shows the existence of a --follow-file-symlinks option, but even doing
clamscan -r --bell -i --exclude-dir="^/sys" --follow-file-symlinks=0 /
doesn't seem to work. After all, as far as my understanding goes, that option only affects symlinks, while Timeshift is using hard-links.
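The kind of thing I have in mind is de-duplicating by inode before scanning. Below is a rough, untested-on-a-real-system sketch of that idea, not something clamscan offers by itself as far as I know. It assumes GNU find (for -printf), awk, and file paths without embedded newlines; the function name unique_inode_paths and the list file /tmp/scanlist.txt are made up for illustration. The result would then be fed to clamscan through its --file-list option.

```shell
# Print one path per unique (device, inode) pair under the given directories.
# Assumptions: GNU find (-printf), paths contain no newline characters.
# -xdev keeps find on the filesystem of each starting directory, so virtual
# filesystems such as /sys and /proc are not entered when starting from /.
unique_inode_paths() {
    find "$@" -xdev -type f -printf '%D:%i\t%p\n' |
        # Field 1 is "device:inode"; keep only the first path seen for each,
        # and print everything after the first tab (the path itself).
        awk -F'\t' '!seen[$1]++ { print substr($0, length($1) + 2) }'
}

# Hypothetical usage (commented out so that sourcing this file scans nothing):
#   unique_inode_paths / > /tmp/scanlist.txt
#   clamscan --bell -i --file-list=/tmp/scanlist.txt
```

If /timeshift lives on a separate partition, it would have to be passed as an additional starting directory because of -xdev. I do not know whether this is a sane approach, or whether there is something simpler built into clamscan.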
So, my question is: is there any way to perform a full system scan while skipping scanning the hard-linked files in the /timeshift directory while at the same time scanning the real ones?
(as a bonus side-question: would the same be possible using the clamtk UI too?)