Hi,

I've added a "failover" test group to the buildbot. It mounts a
"regular" (non-scale-out) cluster and switches the fileserver to
another cluster node live. It looks like it's working: you can keep on
using the mount.

In non-scale-out, the fileserver has its own virtual IP that both nodes
share, so when you "move" the fileserver to a different node, its IP
doesn't actually change. After doing that we realized that this already
works without -o witness, since the client is reconnecting to the same
IP.

Now we need to add a scale-out cluster fileserver to the buildbot
where, IIUC (please correct me, Samuel), the fileserver uses the node
IP instead of this virtual IP shared by the nodes. That way, when we
move the fileserver, its IP address actually changes and we can test
this properly.

As for the code, I'm not an expert on reconnection, but it looks ready
for merging, I think. It doesn't handle multichannel, but multichannel
doesn't handle reconnection well anyway.

There is an issue which pops up in other parts of the code as well. If
you run commands too quickly after the transition, they will fail with
EIO, so it's not completely failing over. I think the same issue can
occur with DFS (Paulo, any ideas/comments?), which is why in the DFS
tests we run ls twice and ignore the result of the first one.

The DFS test code:

def io_reco_test(unc, opts, cwd, expected):
    try:
        lsdir = '.'
        cddir = os.path.join(ARGS.mnt, cwd)
        info(("TEST: mount {unc} , cd {cddir} , ls {lsdir}, expect:[{expect}]\n"+
              " disconnect {cddir} , ls#1 {lsdir} (fail here is ok), ls#2 (fail here NOT ok)").format(
                  unc=unc, cddir=cddir, lsdir=lsdir, expect=" ".join(['"%s"'%x for x in expected])
              ))

Cheers,
-- 
Aurélien Aptel / SUSE Labs Samba Team
GPG: 1839 CB5F 9F5B FB9B AA97 8C99 03C8 A49B 521B D5D3
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, DE
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 247165 (AG München)
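
Illustration of the "ls twice, ignore the first result" workaround mentioned above. This is only a minimal sketch, not the actual buildbot test code: the function name and the `lister` parameter are hypothetical, and tolerating only EIO on the first attempt is an assumption based on the error described in this mail.

```python
import errno
import os


def ls_tolerating_first_failure(path, lister=os.listdir):
    """List `path` twice; only the second attempt has to succeed.

    The first call may fail with EIO while the client is still
    reconnecting, so that particular failure is deliberately ignored.
    `lister` is a hypothetical hook so the logic can be exercised
    without a real mount.
    """
    try:
        lister(path)              # ls#1: an EIO failure here is acceptable
    except OSError as exc:
        if exc.errno != errno.EIO:
            raise                 # any other error is unexpected
    return sorted(lister(path))   # ls#2: a failure here is a real error
```

In a test, the returned listing would then be compared against the expected directory entries, so that a failure of the second ls still fails the test.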