I have two nodes hosting a replicated Gluster volume, and a third node that mounts that volume using the following line in /etc/fstab:
node1,node2:/gv0 /glustermount glusterfs defaults 0 0
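For context, the replicated volume on node1/node2 was set up with something like the following (the brick path /bricks/gv0 is an assumption, not necessarily my actual layout):

```shell
# Run on node1; brick paths are hypothetical
gluster peer probe node2
gluster volume create gv0 replica 2 node1:/bricks/gv0 node2:/bricks/gv0
gluster volume start gv0
```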
I also tried adding one or both servers as backup volfile servers like this:
node1,node2:/gv0 /glustershare glusterfs defaults,backup-volfile-servers=node2:/gv0:node1:/gv0 0 0
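For reference, the documented form of this option takes a colon-separated list of hostnames only, with no volume path, so the equivalent would presumably look like this (I am not certain my path-qualified variant above is even parsed):

```shell
# Command-line form; backup-volfile-servers takes bare hostnames
mount -t glusterfs -o backup-volfile-servers=node2 node1:/gv0 /glustershare
```

or in /etc/fstab:

```
node1:/gv0 /glustershare glusterfs defaults,backup-volfile-servers=node2 0 0
```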
The volume mounts fine, and I can list its contents from node3. My expectation is that, as long as I reboot node1 and node2 one at a time and give each enough time to come fully back online, node3 will never lose access.
If I reboot node1, I can continue listing the contents just fine. Once node1 has fully rebooted and is available again (I have waited a long time and verified it shows as online in the "gluster volume status" output, just to be sure), I reboot node2. I immediately lose the mount, and no matter how long I wait it never comes back. However, if I rerun "mount /glustermount", either immediately or later, it mounts fine with only node1 available (while node2 is shut down or still rebooting).
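The test sequence, roughly, is this (command placement is as I ran it; "sudo" usage is incidental):

```shell
# On node1: reboot, then wait until it has fully rejoined
sudo reboot

# On node2, once node1 is back: confirm both bricks show as online
gluster volume status gv0

# On node2: reboot; at this point node3 loses the mount
sudo reboot

# On node3: listing fails until the volume is remounted by hand
ls /glustermount || sudo mount /glustermount
```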
Is there something wrong with my configuration, or is this not the expected behavior? I thought the biggest touted advantage of the FUSE client over NFS was automatic failover, at the cost of some performance.
In the Gluster log on node3, when I reboot node1, I see expected entries like:
failed to connect with remote-host: node1 (No data available)
connecting to next volfile server node2
So when I later restart node2, I expect to see similar entries in reverse, but instead I see:
Exhausted all volfile servers
Unmounting '/glustermount'
So why did it not see node1 as an available volfile server, even though rerunning the mount command succeeds with only node1 available?