How to ensure all processes are killed before unmounting a filesystem?

Question

I am trying to unmount a busy filesystem that a multithreaded program is continuously reading from and writing to, which causes the umount command to fail.

root@ubuntu:~ # umount /mount/v1
umount: /mount/v1: target is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))

Now, I tried to kill all the processes using

/sbin/fuser -m /mount/v1 -k

But according to the fuser command documentation:

fuser -k or -K might not be able to detect and kill new processes that are created immediately after the program starts to run.

which is what happens in my case, as some of the threads may issue I/O requests at that same moment. When I try to unmount the filesystem again, it is still reported as busy, and this becomes a loop.

My question is: how do I ensure that no new processes can read from or write to the filesystem once the

/sbin/fuser -m /mount/v1 -k

command is issued so that the filesystem can be gracefully unmounted.

"by a multithreaded program". Do you know the process name? If so, you can use `pkill` to kill them all. — kaylum, Mar 08 '20 at 10:40
the process name is not available in my scenario, also there can be `n` number of such programs which might have pending IO requests for the disk mounted at `/mount/v1`. — Vikas Kumar, Mar 08 '20 at 11:05
Another option is lazy umount with -l then fuser to kill. That should prevent any new opens. — kaylum, Mar 08 '20 at 11:08

score 1 · Answer 1 · answered Mar 08 '20 at 13:58

Here are two methods. If you don't trust your applications, you can go directly to the second one.

1. `umount --lazy`

If the interfering processes don't chdir() into the mount point (for example, because they use absolute paths) and you know they will eventually give up, then on Linux you can use umount --lazy: the kernel detaches the filesystem from the directory tree, making it unreachable to new accesses, and automatically unmounts it when the last reference disappears. A reference can be an open fd, a cwd, or a mount point inside it (this last one would be bad).

But this does not affect processes or threads that keep a reference "inside", for example file descriptors that are still open or a cwd inside the filesystem. Things then get harder: since the filesystem is no longer reachable, it becomes more difficult to track down the offending processes and threads (e.g. fuser -m will not find them anymore). So for this method to work, the processes need to behave correctly.
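Because such references are hard to find once the filesystem is detached, it can help to list them before running umount --lazy. The following is only a minimal sketch: the function name pids_referencing and the PROC_ROOT override are hypothetical names introduced here for illustration. It matches on path prefixes in /proc, so unlike fuser -m it will not notice memory-mapped files or references reached through another path such as a bind mount.

```shell
#!/bin/sh
# Sketch: print the PID of every process whose cwd, root, or an open file
# descriptor points at or below the given directory.
# PROC_ROOT is a hypothetical override (defaults to /proc).
pids_referencing() {
    target=${1%/}                      # drop one trailing slash, if any
    proc_root=${PROC_ROOT:-/proc}
    for d in "$proc_root"/[0-9]*; do
        [ -d "$d" ] || continue
        for link in "$d/cwd" "$d/root" "$d"/fd/*; do
            # Unreadable or vanished entries are simply skipped.
            dest=$(readlink "$link" 2>/dev/null) || continue
            case $dest in
                "$target"|"$target"/*)
                    echo "${d##*/}"    # directory name is the PID
                    break              # one hit per process is enough
                    ;;
            esac
        done
    done
}
```

Running `pids_referencing /mount/v1` as root before the lazy unmount gives a list of PIDs that would keep the filesystem alive afterwards. Without root, the links of other users' processes are not readable and those processes are silently missed.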

2. freezer cgroup

An alternative solution that should work for badly behaving processes: group all these processes (and threads) in the same freezer cgroup. Freeze them all, which prevents them from adding resource usage to the mountpoint (by forking, cloning, or opening new fds). Even if you miss a few in the first loop, you'll eventually catch all the new ones in subsequent iterations. As they are now frozen, no new activity will happen: you can now kill them all.

For cgroups v1, already initialized by the OS (e.g. by systemd or cgmanager; otherwise you'll have to figure out how to mount the freezer cgroup subsystem), something like this should work.

Note that writing a PID to cgroup.procs should apparently move all of its threads, which should then appear in tasks, but there have been reports of unreliable behaviour. So, just in case, I also iterate over the threads even if this is probably not needed (i.e. tasks is probably already populated correctly with the threads, so the for t loop is redundant).

mkdir -p /sys/fs/cgroup/freezer/prepareumount

echo FROZEN > /sys/fs/cgroup/freezer/prepareumount/freezer.state

for i in $(seq 1 10); do
    for p in $(fuser -m /mount/v1 2>/dev/null); do
        echo $p > /sys/fs/cgroup/freezer/prepareumount/cgroup.procs
        for t in $(ps -L -o tid= -p $p); do
            echo $t > /sys/fs/cgroup/freezer/prepareumount/tasks
        done
    done
done

# give an overloaded kernel time to freeze everything (FREEZING -> FROZEN)
while ! grep -qx FROZEN /sys/fs/cgroup/freezer/prepareumount/freezer.state; do
    sleep 0.1
done

# kills, delayed
for i in $(cat /sys/fs/cgroup/freezer/prepareumount/cgroup.procs); do
    kill -KILL $i
done

# actual kills happen now, at once
echo THAWED > /sys/fs/cgroup/freezer/prepareumount/freezer.state

sleep 1
umount /mount/v1

This would have to be adapted if using cgroups v2.
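As a rough idea of what that adaptation could look like, here is a sketch of the cgroup v2 counterparts of the steps above. It is untested against a real workload and rests on several assumptions: the unified hierarchy is mounted at /sys/fs/cgroup, the group name prepareumount is reused, and the helper names cg2_* are hypothetical. The interface files themselves (cgroup.freeze, cgroup.events, cgroup.procs, cgroup.kill) are the ones described in the kernel's cgroup v2 documentation, which also states that processes in a frozen cgroup can be killed by a fatal signal, so no thaw step is sketched.

```shell
#!/bin/sh
# Sketch of cgroup v2 helpers. CG is an assumed location and can be
# overridden; create it first with: mkdir -p "$CG"
CG=${CG:-/sys/fs/cgroup/prepareumount}

# Freeze the group (cgroup.freeze exists since Linux 5.2).
cg2_freeze() { echo 1 > "$CG/cgroup.freeze"; }

# Succeeds once cgroup.events reports "frozen 1".
cg2_is_frozen() { grep -qx 'frozen 1' "$CG/cgroup.events"; }

# Move a process, together with all of its threads, into the group.
cg2_add() { echo "$1" > "$CG/cgroup.procs"; }

# SIGKILL every member of the group.
cg2_kill_all() {
    if [ -e "$CG/cgroup.kill" ]; then
        echo 1 > "$CG/cgroup.kill"     # available since Linux 5.14
    else
        while read -r pid; do
            kill -KILL "$pid"
        done < "$CG/cgroup.procs"
    fi
}

# Intended order, mirroring the v1 script:
#   cg2_freeze
#   repeat a few times: for p in $(fuser -m /mount/v1 2>/dev/null); do cg2_add "$p"; done
#   until cg2_is_frozen; do sleep 0.1; done
#   cg2_kill_all; sleep 1; umount /mount/v1
```

In v2 there is no per-thread tasks file for this purpose: writing a PID to cgroup.procs moves the whole process, so the inner thread loop of the v1 script has no direct equivalent here.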

Thanks. Can you please clarify what difference does using an absolute path and relative path make when the process uses `chdir()` in method1 (umount --lazy)? — Vikas Kumar, Mar 10 '20 at 17:16
If the process uses an absolute path, it doesn't *have* to use chdir(), but it still *could* have chosen to use chdir(), or to open a file descriptor on a directory that is part of the mounted filesystem. It's the chdir() or the directory kept open that counts, because probably all children will also have it open, whereas a regular file wouldn't be held open by all of them. If it uses chdir() or keeps a directory open and doesn't want to die (or respawns subprocesses/threads faster than they are killed), then the mountpoint probably can't be freed with the first method. — A.B, Mar 10 '20 at 18:09
