45

I have to copy files on a machine, and the data is immensely large. The servers need to keep serving normally, and there is usually a particular range of busy hours on them. Is there a way to run such commands so that if the server hits its busy hours, the process pauses, and when it gets out of that range, it resumes?

Intended result:

cp src dst

if time is between 9:00 and 14:00, pause the process
after 14:00, resume the cp command
Rui F Ribeiro
Sollosa
  • rsync can resume partial transfers – Thorbjørn Ravn Andersen Feb 21 '19 at 15:37
  • Do you *need* the actual data to be copied as a backup? If not, could you use `cp -al` to make a hardlink farm? Or use a filesystem that supports block-level reflinks with copy-on-write, using `cp -a --reflink=auto`? BTRFS and ZFS support that for copies within the same physical device. – Peter Cordes Feb 21 '19 at 17:05
  • Do any of the files in `src` change between 9:00 and 14:00? If so, simply pausing and resuming the `cp` process may result in corrupted files. It may be better to run `rsync` in combination with the `timeout` command. – Mark Plotnick Feb 21 '19 at 19:51
  • From and to where are the files being copied? Is this a virtual system? What is the source filesystem? What's the purpose of the copy? – Braiam Feb 24 '19 at 20:23
  • @Braiam I'm using rsync, and copying files from a remote machine onto the local machine. I just used the cp command as an example here, btw – Sollosa Feb 26 '19 at 06:35

6 Answers

84

You can pause execution of a process by sending it a SIGSTOP signal and then later resume it by sending it a SIGCONT.

Assuming your workload is a single process (it doesn't fork helpers that run in the background), you can use something like this:

# start copy in background, store pid
cp src dst &
echo "$!" >/var/run/bigcopy.pid

Then when busy time starts, send it a SIGSTOP:

# pause execution of bigcopy
kill -STOP "$(cat /var/run/bigcopy.pid)"

Later on, when the server is idle again, resume it.

# resume execution of bigcopy
kill -CONT "$(cat /var/run/bigcopy.pid)"

You will need to schedule this for the specific times when you want it executed; tools such as cron or systemd timers (or a variety of similar tools) can do the scheduling. Instead of scheduling based on a time interval, you might choose to monitor the server (perhaps looking at load average, CPU usage, or activity in the server logs) to decide when to pause or resume the copy.
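For example, a small script run from cron every few minutes could apply the busy-hours rule. This is only a sketch: the 9:00-14:00 window comes from the question, and the PID-file path is an assumption (moved to /tmp here, since the /var/run path used above needs root).

```shell
#!/bin/sh
# Pause the copy during busy hours (9-14), resume outside them.
# Does nothing if the PID file is absent.
PIDFILE=/tmp/bigcopy.pid   # assumed path; the example above uses /var/run

hour=$(date +%H)
hour=${hour#0}             # strip a leading zero so the comparison is decimal
if [ -f "$PIDFILE" ]; then
    if [ "$hour" -ge 9 ] && [ "$hour" -lt 14 ]; then
        kill -STOP "$(cat "$PIDFILE")"
    else
        kill -CONT "$(cat "$PIDFILE")"
    fi
fi
```

Run from cron (e.g. `*/5 * * * *`), this converges to the right state within a few minutes of the window boundaries.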

You also need to manage the PID file (if you use one): make sure your copy is actually still running before pausing it, and clean up by removing the PID file once the copy has finished, etc.
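That housekeeping might be sketched like this. Assumptions: the PID file from the example above (using /tmp here rather than /var/run), and a process-name check that is only best-effort, since PIDs can be recycled.

```shell
#!/bin/sh
# Signal the saved PID only if it still belongs to a running cp process.
PIDFILE=/tmp/bigcopy.pid   # assumed path; the example above uses /var/run

pid=$(cat "$PIDFILE" 2>/dev/null)
if [ -n "$pid" ] && [ "$(ps -p "$pid" -o comm= 2>/dev/null)" = "cp" ]; then
    kill -STOP "$pid"       # or -CONT, depending on the schedule
else
    rm -f "$PIDFILE"        # copy finished (or PID reused): clean up
fi
```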

In other words, you need a bit more around this to make it reliable, but the basic idea of using the SIGSTOP and SIGCONT signals to pause and resume execution of a process seems to be what you're looking for.

Stephen Kitt
filbranden
  • +1 See also https://utcc.utoronto.ca/~cks/space/blog/unix/SIGSTOPUsesAndCautions – bishop Feb 21 '19 at 16:50
  • Maybe add a reminder that you should be very careful that `/var/run/bigcopy.pid` still refers to the same process as you think it does. Randomly stopping other processes on the system may not be desirable. I know of no safe way to ensure that the PID refers to the program you think it does, though... – Evan Benn Feb 22 '19 at 02:27
  • @EvanBenn Yeah that's what I meant in a way with "make sure your copy is actually still running before pausing it" though your point is surely more explicit than that! Yeah checking PIDs is inherently race-y so it's sometimes not really possible to do it 100% reliably... – filbranden Feb 22 '19 at 03:02
  • @cat Not really, a process can't block SIGSTOP. See the link from the first comment: "SIGSTOP is a non-blockable signal like SIGKILL" (or just google it, you'll see that's the case.) – filbranden Feb 22 '19 at 03:42
  • @EvanBenn something like `ps -eo pid,command` could be used to ensure the pid is running and that the command line of the process matches what you expect: just store `command="full command"`, invoke as `$command`, then make pid file match what `ps -eo` you'd expect when you want to SIGCONT: `cmd="stuff here" ; $cmd & echo "$? $cmd" > psfile ;` then later `if ps -eo pid,command | grep -q $(cat psfile); then echo "it's alive!" fi` – Ajax Jan 07 '22 at 11:40
  • Tested code: `cmd="echo start ; sleep 1 ; echo end" ; ` `sh -c "$cmd" & echo "$! sh -c $cmd" > /tmp/psfile ; sleep 2 ; if ps -eo pid,command | grep -v grep | grep -q "$(cat /tmp/psfile)"; then echo "Running" ; else echo "Done"; fi` To see "Running", change the sleep in cmd= to 3 – Ajax Jan 07 '22 at 11:51
80

Instead of suspending the process, you could also give it lower priority:

renice 19 "$pid"

will give it the lowest priority (highest niceness), so that process will yield the CPU to other processes that need it most of the time.

On Linux, the same can be done with I/O with ionice:

ionice -c idle -p "$pid"

will put the process in the "idle" class, so that it only gets disk time when no other program has asked for disk I/O for a defined grace period.

Stéphane Chazelas
  • This is a typical case of an [XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). The question was how to pause a process, but this does not answer the question. While indeed lowering the priority is the better approach to the *actual* problem, it does not answer the question. I would [edit] the question to also include how to pause a process and why pausing might be a problem (e.g. file could be edited while paused). – MechMK1 Feb 21 '19 at 16:47
  • @DavidStockinger, technically, this answer tells how to tell the OS to pause the process when it (the OS, CPU, I/O scheduler) is busy (even if it's for fractions of seconds at a time). How to suspend the process manually has already been covered in other answers. This solution doesn't address the problem of files being modified whilst they are being copied. – Stéphane Chazelas Feb 21 '19 at 16:59
  • Changing the I/O priority isn't always the best solution. If you're copying from spinning disks, you may still incur a seek before each high-priority request which you wouldn't incur if you completely paused the low-priority operation. – Mark Feb 21 '19 at 22:37
  • Lower priority does not even solve the problem. Even if the box is completely idle for a few seconds or minutes, that does not mean that a huge copy process which will evict everything from the filesystem cache is going to be unobtrusive. As soon as there's a load again, it's going to be very slow paging everything back in. – R.. GitHub STOP HELPING ICE Feb 22 '19 at 19:00
  • @DavidStockinger the preferred way of dealing with XY problems is to give the _right_ solution, even if that's not what the question is asking for. When you know the approach described in the question is wrong, then a good answer doesn't give that wrong approach but instead proposes a better one. – terdon Feb 23 '19 at 15:00
  • Unfortunately that means that people searching for the thing literally discussed in the question find only answers to some other question. A good answer to an XY problem gives the preferred alternative _as a bonus part of the answer_, while still answering the question posed (or you can suggest an alternative question in the comments). Fortunately other answers here do do that in this case. – Lightness Races in Orbit Feb 24 '19 at 17:57
13

Yes, you need to acquire the process ID of the process to pause (via the ps command), then do:

$> kill -SIGSTOP <pid>

The process will then show up with status "T" in ps.

To continue, do:

$> kill -SIGCONT <pid>
Jeff Schaller
gerhard d.
8

Use rsync instead of cp for this scenario. It has parameters to limit bandwidth, and it can be killed/stopped and started again later in a way that continues where it left off. Search for rsync examples.
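A minimal sketch of such an invocation (the directories are scratch placeholders for your real paths, and the bandwidth figure is arbitrary):

```shell
# Skip gracefully on systems without rsync
command -v rsync >/dev/null 2>&1 || exit 0

# Scratch directories standing in for the real source/destination
src=$(mktemp -d); dst=$(mktemp -d)
echo "example data" > "$src/file"

# -a preserves attributes; --partial keeps half-transferred files so an
# interrupted run resumes where it left off; --bwlimit caps rate in KiB/s
rsync -a --partial --bwlimit=10000 "$src/" "$dst/"
cat "$dst/file"
```

Kill the rsync during busy hours and rerun the same command afterwards; with `--partial` it picks up roughly where it stopped instead of restarting each file from scratch.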

3

If you are going to do it by interrupting the running process, I suggest playing with the Screen program. I haven't used Linux in a while, but IIRC just pausing the command and resuming it later leaves you pretty vulnerable: if you accidentally get logged off, you won't be able to resume your session.

With screen I believe you can interrupt the session, then detach it and log out. Later you can go back in and reattach to that session. You'd have to play with it a bit, but it makes sessions much more robust.

You can also log out, go home, and log in remotely, reattach to the session you started in the office and resume it for the evening, then pick it up again the next day at work.
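A typical flow might look like this (a sketch; "bigcopy" is a made-up session name):

```
screen -S bigcopy      # start a named session and run the copy inside it
                       # press Ctrl-A then d to detach; logging out is now safe
screen -ls             # later: list detached sessions
screen -r bigcopy      # reattach from any login and check on the copy
```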

Bill K
  • I'm already using tmux for that. But I'm writing a script that would be self-aware, or preferably environment-aware, so it stops if the server gets high traffic and continues when it's normal. – Sollosa Feb 24 '19 at 13:42
1

If your shell supports it (almost all do), you can press ^Z (Ctrl+Z) to send a SIGTSTP signal to the foreground task, then continue it with fg (in the foreground) or bg (in the background).

If you do this with multiple tasks and want to return to them later, you can use the jobs command, then resume one with fg %# or bg %#, where # is the number shown in brackets by jobs.

Keep in mind that SIGTSTP is a bit different from SIGSTOP (which is used in the other answers), most importantly in that it can be caught or ignored (though I haven't seen a program other than sl ignore it). More details can be found in this answer on Stack Overflow.
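An interactive session might look like this (a sketch; the bracketed job numbers come from the shell):

```
$ cp src dst
^Z
[1]+  Stopped                 cp src dst
$ bg %1        # let it continue in the background
$ jobs
[1]+  Running                 cp src dst &
$ fg %1        # bring it back to the foreground
```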

ave