On 04/11/12 04:00, Alex Florescu wrote:
> I am now playing with a test environment which almost mirrors the prod
> and I can always reproduce the problem.
> We have websites on two servers which use gluster for common usage
> files. We also use DNS round robin for request balancing (this is a main
> element in the scenario).
>
> Setup: two servers running Gentoo 2.0.3, kernel 3.0.6, glusterfs 3.2.5
> Gluster commands:
> gluster volume create vol-replication replica 2 transport tcp 10.0.2.14:/local 10.0.2.15:/local
> gluster volume start vol-replication
> gluster volume set vol-replication network.ping-timeout 1
> node1 (10.0.2.14): mount -t glusterfs 10.0.2.14:/vol-replication /a
> node2 (10.0.2.15): mount -t glusterfs 10.0.2.15:/vol-replication /a
>
> Now assume that connectivity between the two nodes has failed, but they
> can still be accessed from the outside world and files can be written on
> them through Apache.
> Request 1 -> 10.0.2.14 -> creates file howareyou
> Request 2 -> 10.0.2.15 -> creates file hello

So now you have a "split-brain" problem.

> At some point, connectivity between the two nodes recovers and disaster
> strikes:
> ls /a
> ls: cannot access /a: Input/output error

Which directory is the "source of truth"? Did "howareyou" exist on
10.0.2.15 and get deleted during the outage, or is it a new file? And
vice versa for "hello". So, when you look at the directory itself, which
state is correct? Gluster does not keep a per-brick transaction log that
it could replay to bring the two bricks back into sync.

> The only way to recover this was to delete the offending files. This was
> easy to do on the test environment because there were two files
> involved, but on the prod environment we had many more, and I managed to
> recover only after deleting the gluster volume and the local content,
> including the local storage directory itself! Nothing else that I tried
> (stopping the volume, recreating the volume, emptying the local storage
> directory, remounting, restarting gluster) worked.
>
> Any hint on how one could recover from this sort of situation?
> Thank you.

Tar replica1 and untar it on replica2. Then delete everything on
replica1. Self-heal should take care of the rest.

--
Mr. Flibble
King of the Potato People
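
A rough sketch of the steps above, assuming the bricks are at /local and
the volume is mounted at /a as in your setup (the /tmp archive path is
just an example, and it is safest to stop Apache or otherwise quiesce
writes first):

  # on node1 (10.0.2.14): archive the replica1 brick contents
  cd /local && tar -czf /tmp/replica1.tar.gz .

  # copy the archive to node2 (10.0.2.15) and unpack it over the replica2 brick
  scp /tmp/replica1.tar.gz 10.0.2.15:/tmp/
  ssh 10.0.2.15 'cd /local && tar -xzf /tmp/replica1.tar.gz'

  # back on node1: empty the replica1 brick (including hidden files),
  # so replica2 holds the only remaining copy
  find /local -mindepth 1 -delete

  # walk the mounted volume to trigger self-heal back onto replica1
  find /a -noleaf -print0 | xargs --null stat > /dev/null

If you want to see which files are actually in split-brain before
choosing a side, the trusted.afr.* extended attributes on each brick
show the pending-change counters, e.g.:

  getfattr -d -m . -e hex /local/hello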