47

I have been using ssh to access remote servers for many months, but recently I haven't been able to establish a reliable connection. Sometimes I cannot log in and get the message "Connection reset by port 22". When I can log in, I get the error message "client_loop: send disconnect: Broken pipe" within a few minutes (even if the terminal is not idle).

My ~/.ssh/config file has:

Host *
    ServerAliveInterval 300
    ServerAliveCountMax 2
    TCPKeepAlive yes

My /etc/ssh/sshd_config file has:

#ClientAliveInterval 300
#ClientAliveCountMax 3

I recently upgraded my Xfinity plan to a faster speed, and the problem started then. Xfinity insists the issue is on my end, but note that my roommate has the same issue with ssh.

Is there something that I'm missing on my end? Any help would be greatly appreciated! (I'm running on a Mac)

Ashka Shah

4 Answers

56

I solved the same problem by editing the file ~/.ssh/config to have:

Host *
    ServerAliveInterval 20
    TCPKeepAlive no

Motivation:

TCPKeepAlive no means "do not send TCP keepalive messages to the server". With the opposite setting, TCPKeepAlive yes, the client sends TCP keepalive messages to the server and requires a response in order to maintain its end of the connection. This detects when the server goes down, reboots, etc. The trouble is that if the connection between the client and server is broken for a brief period (due to a flaky network connection), the keepalive messages fail and the client ends the connection with "Broken pipe".

Setting TCPKeepAlive no turns off the TCP-level probes, so the client assumes the connection is still good until traffic (a user request or a ServerAliveInterval probe) proves otherwise. Brief connection breakages while your ssh terminal sits idle in the background are then much less likely to kill the connection. Note that ServerAliveInterval 20 still sends probes through the encrypted channel; with the default ServerAliveCountMax of 3, the client gives up after roughly 60 seconds without a reply.
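To confirm which keepalive settings the client actually ends up with, something like the following may help. It is only a sketch: the function name show_keepalive_opts and the host name example-host are made up for illustration, and it relies on ssh -G, which prints the evaluated client configuration for a host and exits without connecting.

```shell
# Print the keepalive-related options ssh would use for a given host.
# show_keepalive_opts is a hypothetical helper name, not part of OpenSSH.
show_keepalive_opts() {
    # $1: host name or alias; any further arguments are passed on to ssh
    host=$1
    shift
    # -G evaluates Host/Match blocks and dumps the resulting options
    # (keywords are printed in lowercase) without opening a connection.
    ssh -G "$@" "$host" |
        grep -i -E '^(serveraliveinterval|serveralivecountmax|tcpkeepalive) '
}
```

For example, show_keepalive_opts example-host should list the three options as ssh sees them after reading ~/.ssh/config, which makes it easy to spot a Host block that overrides the values you meant to set.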

David Wickstrom
  • 4
    Is this setting for the server or client? – Morten May 18 '22 at 09:38
  • 6
    The setting is for the client. – David Wickstrom May 18 '22 at 12:51
  • 1
    Is the ServerAliveInterval still required if TCPKeepAlive is set to no? – ljden Jun 06 '22 at 02:07
  • 3
    @ljden Yes. Two different keep-alive mechanisms. From `man ssh_config`: `It is important to note that the use of server alive messages is very different from TCPKeepAlive (below). The server alive messages are sent through the encrypted channel and therefore will not be spoofable. The TCP keepalive option enabled by TCPKeepAlive is spoofable. The server alive mechanism is valuable when the client or server depend on knowing when a connection has become inactive. ` – Pod Jun 06 '22 at 15:57
  • 5
    My experience is that `TCPKeepAlive` is useless in most contexts (not just ssh connections). There are too many address translating firewalls/routers and load balancers that ignore TCP keepalive packets and drop "idle" connections anyway. TCP Keepalive was a good idea in the 90s and into the 2000s, but not anymore. The `ServerAliveInterval` is very good, though. It exchanges handshakes at the application layer rather than the TCP layer, and usually prevents the network devices from dropping connections. – Sotto Voce Jul 20 '22 at 23:01
  • `$ cat ~/.ssh/config` gives `cat: /Users/xgqfrms-mm/.ssh/config: No such file or directory`. Does this mean I need to create a `config` file? – xgqfrms Apr 13 '23 at 00:22
  • @DavidWickstrom This was very helpful, thank you. – GreNIX Jul 11 '23 at 19:12
11

David's answer is OK, but a more comprehensive solution is explained below.

You can address this problem on either the client or the server.

How do you know where to do it?

  • Set it on your machine if you connect to multiple servers via SSH.

  • If you are a sysadmin and several users complain about frequent SSH disconnects, you may set it on the server.

Client side

When connecting to a server, use the -o option:

ssh -o ServerAliveInterval=600 [email protected]

The value 600 represents 600 seconds; i.e., 10 minutes.

Alternatively, add it to your ssh config file:

  1. Create the ssh config file (if it does not exist):
    touch ~/.ssh/config
    
  2. Set the permissions:
    chmod 600 ~/.ssh/config
    
  3. Set the parameter in the config file. For example:
    echo "ServerAliveInterval 600" >> ~/.ssh/config
    
    … or use an editor.
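The three steps above can be wrapped in a small helper that is safe to run more than once. This is a sketch under a few assumptions: the function name ensure_server_alive is invented here, and it appends the setting under a "Host *" block at the end of the file so that it acts as a default for all hosts, whereas a bare echo would attach the line to whatever Host block happens to come last.

```shell
# Create the client config if needed and add ServerAliveInterval once.
# ensure_server_alive is a hypothetical helper name.
ensure_server_alive() {
    config=$1     # path to the client config, e.g. ~/.ssh/config
    interval=$2   # seconds between alive messages, e.g. 600

    # Steps 1 and 2: make sure the file exists and is private to the user.
    mkdir -p "$(dirname "$config")"
    touch "$config"
    chmod 600 "$config"

    # Step 3: append a default block only if no ServerAliveInterval
    # line is present yet, so repeated runs do not add duplicates.
    if ! grep -q -i '^[[:space:]]*ServerAliveInterval' "$config"; then
        printf '\nHost *\n    ServerAliveInterval %s\n' "$interval" >> "$config"
    fi
}
```

Calling ensure_server_alive ~/.ssh/config 600 would then reproduce the example values used in this answer. If the file already contains a ServerAliveInterval line, the helper leaves it alone, so an existing value has to be changed by hand.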

Server side

  1. Open the sshd config file located at /etc/ssh/sshd_config.
  2. Set the parameters ClientAliveInterval and ClientAliveCountMax to the desired values.

For example, ClientAliveInterval 200 and ClientAliveCountMax 3 mean that the server will send an alive message after 200 seconds without hearing from the client. If there is no response from the client, it will send another alive message at 400 seconds. If there is still no response, it will send a third alive message at 600 seconds. If there is still no response, the SSH connection will be disconnected.
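The arithmetic in that example can be sketched as a tiny shell function, which is handy for trying other values before editing sshd_config. The function name client_alive_schedule is made up for illustration; it only prints when each alive message would be sent to a silent client and does not talk to sshd at all.

```shell
# Print the times at which the server would send alive messages
# to a client that never answers.
# client_alive_schedule is a hypothetical helper name.
client_alive_schedule() {
    interval=$1    # ClientAliveInterval, in seconds
    count_max=$2   # ClientAliveCountMax
    i=1
    while [ "$i" -le "$count_max" ]; do
        echo "probe $i at $((interval * i))s"
        i=$((i + 1))
    done
}
```

Running client_alive_schedule 200 3 prints probes at 200s, 400s and 600s, matching the description above. After changing the real file, sshd -t checks the configuration for syntax errors and sshd -T prints the effective settings (both typically need root), and sshd has to be reloaded before the new values apply.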

Source: Fixing Broken Pipe Error With SSH Connection at the Linux Handbook.

Pablo Johnson
1

I had this issue when trying to SSH into a Windows laptop from my Mac. I tried the above solutions regarding ServerAliveInterval and TCPKeepAlive to no avail.

I'm not exactly sure why this worked, but I got past the issue by deleting my public SSH key from the remote laptop's ~/.ssh/authorized_keys file. This was especially weird, though, because according to my ssh -v log, my public key seemed to be working for authentication just fine:

debug1: Authentication succeeded (publickey).
Authenticated to 192.168.1.21 ([192.168.1.21]:22).
debug1: channel 0: new [client-session]
debug1: Requesting [email protected]
debug1: Entering interactive session.
debug1: pledge: filesystem full
client_loop: send disconnect: Broken pipe

In any case, deleting the key got me past the client_loop: send disconnect: Broken pipe error, and I am now able to log in.

Raj K
  • This workaround is essentially to use password rather than public key authentication. E.g., another convenient fix is to put "PubkeyAuthentication no" in the client-side SSH config file under the corresponding host. – Raj K Nov 11 '22 at 17:29
0

I was getting exactly the same error, but only for one particular user; call it X. After trying everything suggested here, I realized that I had added user X to user Y's group, and user Y is not allowed to log in via ssh. After removing X from Y's group, I could log in again.

Hope this can help someone too.

  • 1
    There is no evidence to suggest that this was the underlying cause of the issue that the user in the question had. – Kusalananda Dec 19 '22 at 13:44