
I am running a docker server on Arch Linux (kernel 4.3.3-2) with several containers. Since my last reboot, both the docker server and random programs within the containers crash with a message about not being able to create a thread, or (less often) to fork. The specific error message differs depending on the program, but most of them seem to mention the specific error Resource temporarily unavailable. See the end of this post for some example error messages.

Now there are plenty of people who have had this error message, and plenty of responses to them. What’s really frustrating is that everyone seems to speculate about how the issue could be resolved, but no one seems to explain how to identify which of the many possible causes is actually present.

I have collected these five possible causes for the error, along with how to verify that they are not present on my system:

  1. There is a system-wide limit on the number of threads configured in /proc/sys/kernel/threads-max (source). In my case this is set to 60613.
  2. Every thread takes some space in the stack. The stack size limit is configured using ulimit -s (source). The limit for my shell used to be 8192, but I have increased it by putting * soft stack 32768 into /etc/security/limits.conf, so ulimit -s now returns 32768. I have also increased it for the docker process by putting LimitSTACK=33554432 into /etc/systemd/system/docker.service (source), and I verified that the limit applies by looking into /proc/<pid of docker>/limits and by running ulimit -s inside a docker container.
  3. Every thread takes some memory. A virtual memory limit is configured using ulimit -v. On my system it is set to unlimited, and 80% of my 3 GB of memory is free.
  4. There is a limit on the number of processes using ulimit -u. Threads count as processes in this case (source). On my system, the limit is set to 30306, and for the docker daemon and inside docker containers, the limit is 1048576. The number of currently running threads can be found by running ls -1d /proc/*/task/* | wc -l or by running ps -elfT | wc -l (source). On my system they are between 700 and 800.
  5. There is a limit on the number of open files, which according to some sources is also relevant when creating threads. The limit is configured using ulimit -n. On my system and inside docker, the limit is set to 1048576. The number of open files can be found using lsof | wc -l (source); on my system it is about 30000.
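For reference, the checks from the list above can be bundled into one script (a sketch; run it both on the host and inside a container to compare the limits — the lsof check is left out since lsof may not be installed in minimal containers):

```shell
#!/bin/bash
# Dump the limits discussed above in one go.
echo "threads-max:       $(cat /proc/sys/kernel/threads-max)"
echo "stack size (kB):   $(ulimit -s)"
echo "virtual mem (kB):  $(ulimit -v)"
echo "max processes:     $(ulimit -u)"
echo "open files limit:  $(ulimit -n)"
# Count currently running threads across all processes.
echo "running threads:   $(ls -1d /proc/[0-9]*/task/* 2>/dev/null | wc -l)"
```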

It looks like before the last reboot I was running kernel 4.2.5-1, now I’m running 4.3.3-2. Downgrading to 4.2.5-1 fixes all the problems. Other posts mentioning the problem are this and this. I have opened a bug report for Arch Linux.

What has changed in the kernel that could be causing this?


Here are some example error messages:

Crash dump was written to: erl_crash.dump
Failed to create aux thread


Jan 07 14:37:25 edeltraud docker[30625]: runtime/cgo: pthread_create failed: Resource temporarily unavailable


dpkg: unrecoverable fatal error, aborting:
 fork failed: Resource temporarily unavailable
E: Sub-process /usr/bin/dpkg returned an error code (2)


test -z "/usr/include" || /usr/sbin/mkdir -p "/tmp/lib32-popt/pkg/lib32-popt/usr/include"
/bin/sh: fork: retry: Resource temporarily unavailable
 /usr/bin/install -c -m 644 popt.h '/tmp/lib32-popt/pkg/lib32-popt/usr/include'
test -z "/usr/share/man/man3" || /usr/sbin/mkdir -p "/tmp/lib32-popt/pkg/lib32-popt/usr/share/man/man3"
/bin/sh: fork: retry: Resource temporarily unavailable
/bin/sh: fork: retry: No child processes
/bin/sh: fork: retry: Resource temporarily unavailable
/bin/sh: fork: retry: No child processes
/bin/sh: fork: retry: No child processes
/bin/sh: fork: retry: Resource temporarily unavailable
/bin/sh: fork: retry: Resource temporarily unavailable
/bin/sh: fork: retry: No child processes
/bin/sh: fork: Resource temporarily unavailable
/bin/sh: fork: Resource temporarily unavailable
make[3]: *** [install-man3] Error 254


Jan 07 11:04:39 edeltraud docker[780]: time="2016-01-07T11:04:39.986684617+01:00" level=error msg="Error running container: [8] System error: fork/exec /proc/self/exe: resource temporarily unavailable"


[Wed Jan 06 23:20:33.701287 2016] [mpm_event:alert] [pid 217:tid 140325422335744] (11)Resource temporarily unavailable: apr_thread_create: unable to create worker thread
cdauth
  • Did you recently upgrade to the 4.3 kernel? – Roni Choudhury Jan 07 '16 at 16:47
  • That’s very well possible. Why? – cdauth Jan 07 '16 at 16:54
  • Amazing, I downgraded to kernel 4.2.5-1 and everything is working again! Do you have any clue what is causing this and how to fix it with 4.3? – cdauth Jan 07 '16 at 17:43
  • No clue what's causing it. My method of fixing it is waiting for the Arch Linux forum threads on the topic to be marked "SOLVED" :-P. – Roni Choudhury Jan 07 '16 at 18:43
  • So how did you guess it? Did you come across similar problems? – cdauth Jan 07 '16 at 19:10
  • Sadly, I'm having the exact same problem (plus, Chrome no longer plays youtube videos - not sure if that's related) and found your SE question. [This thread](https://bbs.archlinux.org/viewtopic.php?id=207255) was just started today and describes the same problem, along with the solution of downgrading to 4.2.5. – Roni Choudhury Jan 07 '16 at 19:22
  • Interestingly, after downgrading to 4.2.5 myself, Chrome is behaving well again. Now if only I could find VirtualBox packages for arch that are compatible with this kernel... – Roni Choudhury Jan 07 '16 at 20:21
  • Funny that I am running 4.3.3-2 on my desktop computer and I have Chromium open with more than 100 tabs without a problem. – cdauth Jan 08 '16 at 01:08
  • [The behaviour is not in fact specific to Arch or to Docker, but is exhibited in a *lot* of things](https://news.ycombinator.com/item?id=11675133). There is an open systemd bug about that at the time that I write this. – JdeBP May 12 '16 at 05:43
  • +1 For being an excellently asked and researched question, even if I _didn't_ have the same problem – Roy Truelove Jul 08 '16 at 21:50

4 Answers


The problem is caused by the TasksMax systemd attribute. It was introduced in systemd 228 and makes use of the cgroups pid subsystem, which was introduced in the Linux kernel 4.3. A task limit of 512 is thus enabled in systemd if kernel 4.3 or newer is running. The feature was announced here and was introduced in this pull request, and the default values were set by this pull request. After upgrading my kernel to 4.3, systemctl status docker displays a Tasks line:

# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/etc/systemd/system/docker.service; disabled; vendor preset: disabled)
   Active: active (running) since Fri 2016-01-15 19:58:00 CET; 1min 52s ago
     Docs: https://docs.docker.com
 Main PID: 2770 (docker)
    Tasks: 502 (limit: 512)
   CGroup: /system.slice/docker.service

Setting TasksMax=infinity in the [Service] section of docker.service fixes the problem. docker.service is usually in /usr/share/systemd/system, but it can also be copied to /etc/systemd/system to avoid it being overridden by the package manager.
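As a sketch, a systemd drop-in achieves the same thing without editing the packaged unit file (the path below is one common convention; any .conf file in that directory is read by systemd):

```ini
# /etc/systemd/system/docker.service.d/override.conf
[Service]
TasksMax=infinity
```

After creating the file, run systemctl daemon-reload and restart the service for the new limit to take effect.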

A pull request is increasing TasksMax for the docker example systemd files, and an Arch Linux bug report is trying to achieve the same for the package. There is some additional discussion under way on the Arch Linux forum and in an Arch Linux bug report regarding lxc.

DefaultTasksMax can be used in the [Manager] section in /etc/systemd/system.conf (or /etc/systemd/user.conf for user-run services) to control the default value for TasksMax.

Systemd also applies a limit for programs run from a login shell. These default to 4096 per user (will be increased to 12288) and are configured as UserTasksMax in the [Login] section of /etc/systemd/logind.conf.
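Putting the two settings together, the relevant config fragments might look like this (the values are illustrative examples, not recommendations):

```ini
# /etc/systemd/system.conf — default TasksMax for all system services
[Manager]
DefaultTasksMax=infinity

# /etc/systemd/logind.conf — per-user limit for login sessions
[Login]
UserTasksMax=12288
```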

LB2
cdauth
  • FWIW, the service file was at `/lib/systemd/system/docker.service` in my Debian testing. – The Compiler Mar 31 '16 at 13:45
  • FWIW, saying `systemctl set-property docker.service TasksMax=4096` will set the property for a currently running service and persist the setting for subsequent reboots in the correct place for the docker installation in question. – Nakedible Apr 08 '16 at 12:04
  • [This is a common approach](https://news.ycombinator.com/item?id=11675133). But note that the Docker change that you proposed was reverted after you posted this answer, on 2016-02-09, this reversion being then released to the world in Docker version 1.10.1. – JdeBP May 12 '16 at 05:49
  • man thanks thanks thanks ! i have been looking for tooooo long for this – achabahe Aug 31 '17 at 12:24
  • If you make the change to the config file (mine was in `/etc/systemd/system/docker.service.d/50-TasksMax.conf` on Ubuntu 16), you need to run `systemctl daemon-reload`. Doing a `sudo service docker restart` will NOT work. – osman Oct 18 '17 at 21:02
  • thanks, this helped me with my apache being limited – Hayden Thring Jan 18 '23 at 03:03

cdauth's answer is correct, but there is another detail to add.

On my Ubuntu 16.04 system with systemd 229 and a 4.3 kernel, a 512 pid limit was enforced on session scopes by default even when UserTasksMax was set to the new, increased default of 12288. So any user session scope was limited to 512 threads.

The only way I found to remove the limit was to set DefaultTasksMax=unlimited in /etc/systemd/system.conf and run systemctl daemon-reexec (or reboot).

You can check if this is happening by running systemctl status, picking a session scope, and running cat /sys/fs/cgroup/pids/user.slice/user-${UID}.slice/session-FOO.scope/pids.max.
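A rough way to dump every enforced pids limit at once (this assumes the cgroup v1 pids controller layout used on the systems discussed here; on unified cgroup v2 the pids.max files sit directly under /sys/fs/cgroup):

```shell
#!/bin/bash
# Print every pids.max file under the cgroup hierarchy together with its value.
# A value of "max" means no limit is enforced for that cgroup.
find /sys/fs/cgroup -name pids.max 2>/dev/null | while read -r f; do
    printf '%s: %s\n' "$f" "$(cat "$f" 2>/dev/null)"
done
```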

  • I made the change to /etc/systemd/system.conf and rebooted. Docker still lists the limit on tasks as 512. Using @Nakedible's comment from above did update the available tasks. – Ben Mathews May 05 '16 at 21:01
  • Thanks Ryan! @BenMathews perhaps this was because **both** are valid issues on Ubuntu 16.04, you need to fix them **both** for things to work properly. This issue appears to apply to containers started by a daemon, not by a user in a shell. So everything appears fine, you add `@reboot lxc-autostart` to your crontab to autostart them on boot, and you suddenly get crippled containers after reboot. – qris May 29 '16 at 11:08

After reading this thread, this solution worked for me: docker -d --exec-opt native.cgroupdriver=cgroupfs. I actually added it to the OPTIONS in /etc/sysconfig/docker...

GAD3R
[3178:4:0817/094911.485035:ERROR:platform_thread_posix.cc(155)] pthread_create: Resource temporarily unavailable (11)

I was able to resolve this issue for podman run by using the --pids-limit=-1 argument.

AdminBee
GrabbenD