Race with volfile notification and stopping of brick

Hi,

I came across a situation where a few IOs were sent to a subvolume that was no longer available. It happens for the following reason.

During a remove-brick commit, three things happen: the brick is stopped, the new volfile is created, and the volfile change notification is sent to the clients.

The order in which this currently happens is:
1) the brick is stopped.
2) the volfiles are created and then the notification goes to the clients.
This leaves a window between the brick being stopped and the clients being notified that it has been stopped.
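
To make the window concrete, here is a toy Python model of the current ordering. It is only a sketch: the Brick and Client classes and current_order_commit() are hypothetical names made up for illustration, not glusterd or client code, and the client is assumed to send one write to every brick listed in its cached volfile.

```python
# Toy model, not GlusterFS code: all names here are hypothetical.

class Brick:
    def __init__(self, name):
        self.name = name
        self.running = True

    def write(self, data):
        if not self.running:
            raise ConnectionError(f"{self.name} is stopped")
        return len(data)


class Client:
    """Routes IO using its cached copy of the volfile (a list of brick names)."""

    def __init__(self, volfile):
        self.volfile = list(volfile)

    def notify(self, new_volfile):
        self.volfile = list(new_volfile)

    def write_all(self, bricks, data):
        """Send one write per brick in the cached volfile; return failed brick names."""
        failed = []
        for name in self.volfile:
            try:
                bricks[name].write(data)
            except ConnectionError:
                failed.append(name)
        return failed


def current_order_commit(bricks, client, removed):
    """Remove-brick commit in the current order: stop first, notify later."""
    bricks[removed].running = False                      # 1) brick is stopped
    failed_in_window = client.write_all(bricks, b"x")    # IO arriving in the window
    new_volfile = [n for n in bricks if n != removed]    # 2) volfile is created ...
    client.notify(new_volfile)                           #    ... and the client notified
    failed_after = client.write_all(bricks, b"x")
    return failed_in_window, failed_after


bricks = {"brick-0": Brick("brick-0"), "brick-1": Brick("brick-1")}
client = Client(bricks.keys())
in_window, after = current_order_commit(bricks, client, "brick-1")
print(in_window, after)
```

In this model the write issued inside the window fails on the stopped brick, while writes issued after the notification only go to the remaining brick.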

During this window the brick is unavailable, but IO keeps coming to the stopped brick because the client is not yet aware of the volfile change. This results in IO failures.

So I feel it's better to do it in the following order:
1) create the volfile.
2) notify the client.
3) stop the brick.

This way the clients are notified and IO starts going to the right subvol while the brick is still available. The brick is stopped only after that, which closes the window.
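
The proposed ordering can be sketched with a similar toy Python model. Again, this is only an illustration under assumptions: Brick, send_io() and proposed_order_commit() are hypothetical names, not glusterd code, and the notification is assumed to have taken effect on the client before the brick is stopped.

```python
# Toy model, not GlusterFS code: all names here are hypothetical.

class Brick:
    def __init__(self, name):
        self.name = name
        self.running = True

    def write(self, data):
        if not self.running:
            raise ConnectionError(f"{self.name} is stopped")
        return len(data)


def send_io(cached_volfile, bricks):
    """Send one write to each brick named in the client's cached volfile.
    Returns the names of bricks where the write failed."""
    failed = []
    for name in cached_volfile:
        try:
            bricks[name].write(b"x")
        except ConnectionError:
            failed.append(name)
    return failed


def proposed_order_commit(bricks, cached_volfile, removed):
    """Remove-brick commit in the proposed order; IO is sent between every step."""
    failures = []
    new_volfile = [n for n in bricks if n != removed]   # 1) create the volfile
    failures += send_io(cached_volfile, bricks)         # old volfile, brick still up
    cached_volfile[:] = new_volfile                     # 2) notify the client
    failures += send_io(cached_volfile, bricks)
    bricks[removed].running = False                     # 3) stop the brick
    failures += send_io(cached_volfile, bricks)
    return failures


bricks = {"brick-0": Brick("brick-0"), "brick-1": Brick("brick-1")}
client_volfile = list(bricks)
failures = proposed_order_commit(bricks, client_volfile, "brick-1")
print(failures)
print(client_volfile)
```

In this model no write ever reaches a stopped brick, because the removed brick stays up until the client has switched to the new volfile.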

As this change touches basic functionality, I thought of bringing it to everyone's notice here.
If you find anything that could break because of this change, or feel there is a better way to handle this, do let me know.

Thanks to Du, Atin, Kaushal and Nithya for helping me with this.

Regards,
Hari.
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel
