op_ret setting in gd_commit_op_phase

Hi Devel-list,

The current implementation of gd_commit_op_phase sets op_ret to a non-zero
value if any of the commit operations fails, and the transaction then fails.
This can leave the CLI output inconsistent with the actual volume state.

Consider a cluster of 2 nodes:
1. Stop the volume at Node 1.
2. Start the volume at Node 1 and, while the volume is starting up, bring
down Node 2.
3. Volume start fails with the message "volume start: test-vol: failed:
Commit failed on 00000000-0000-0000-0000-000000000000. Please check log
file for details."
4. gluster volume status now shows the volume as started, although the
previous transaction failed.

In this case the local commit op succeeded, so the changes to volinfo were
made, but op_ret was non-zero because the remote commit op failed on the
other node (which went down at the same time).
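
As a rough illustration of that ordering, here is a toy model in Python. It
is not glusterd code: Node, commit_start and commit_phase_current are made-up
names, and the only behaviour taken from the description above is "local
commit first, then the remote commits, any failure makes op_ret non-zero".

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a glusterd node and its volinfo state."""
    name: str
    up: bool = True
    volume_started: bool = False

    def commit_start(self):
        """Apply 'volume start' on this node; fail if the node is down."""
        if not self.up:
            return -1, f"Commit failed on {self.name}"
        self.volume_started = True  # models the change to volinfo
        return 0, ""


def commit_phase_current(local, peers):
    """Local commit first, then peers; any failure makes op_ret non-zero."""
    op_ret, op_errstr = local.commit_start()
    if op_ret != 0:
        return op_ret, op_errstr
    for peer in peers:
        ret, err = peer.commit_start()
        if ret != 0:
            op_ret, op_errstr = ret, err
    return op_ret, op_errstr


node1 = Node("node1")
node2 = Node("node2", up=False)  # Node 2 went down during the start
op_ret, op_errstr = commit_phase_current(node1, [node2])
print(op_ret, op_errstr, node1.volume_started)
```

The call reports a failure even though node1 has already marked the volume
as started, which is the mismatch seen in steps 3 and 4.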

I was thinking of moving the local commit op code after the remote
commit ops and then overriding op_ret and op_errstr with the local
commit op's result. I know this fix can't solve the entire
inconsistency issue, as the current design doesn't have an UNDO
framework, but with it we can at least show a correct message in the CLI.
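
To make the idea concrete, here is one possible reading of it as a toy
Python model. Again this is not glusterd code: Node, commit_start and
commit_phase_proposed are hypothetical names, and I am assuming the local
commit still runs when a remote commit has failed.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a glusterd node and its volinfo state."""
    name: str
    up: bool = True
    volume_started: bool = False

    def commit_start(self):
        """Apply 'volume start' on this node; fail if the node is down."""
        if not self.up:
            return -1, f"Commit failed on {self.name}"
        self.volume_started = True  # models the change to volinfo
        return 0, ""


def commit_phase_proposed(local, peers):
    """Remote commits first, then local; op_ret follows the local commit."""
    remote_errors = []
    for peer in peers:
        ret, err = peer.commit_start()
        if ret != 0:
            remote_errors.append(err)  # kept for logging, not for op_ret
    # The local result overrides whatever the remote commits returned.
    op_ret, op_errstr = local.commit_start()
    return op_ret, op_errstr, remote_errors


node1 = Node("node1")
node2 = Node("node2", up=False)  # Node 2 went down during the start
op_ret, op_errstr, remote_errors = commit_phase_proposed(node1, [node2])
print(op_ret, repr(op_errstr), remote_errors, node1.volume_started)
```

Here the value returned to the CLI matches the local volinfo state, while
the remote failure is still available for the log. The peer itself stays
out of sync, which is the part that would need an UNDO framework.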

Your thoughts would be highly appreciated.

~Atin
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel