On 09/03/2013 10:45 AM, Anirban Ghoshal wrote:
> We are using GlusterFS 3.4.0 and we have a replicated volume with one
> brick each on two real-time servers. For certain maintenance purposes,
> it may be desirable to periodically reboot them. During said reboots,
> we wish to umount the brick residing on the rebooting server. However,
> umount fails (as expected), because of the GlusterFS threads that are
> using it. We thought of the following ways to counter this:
>
> a) Stop the volume, thereby causing its GlusterFS threads to
> terminate. However, this would mean that the other server could not
> access the volume, which would be a problem.
>
> b) Kill the GlusterFS threads on the volume, thereby allowing umount
> to proceed. However, I am given to understand that this method is not
> very graceful and may lead to data loss in case some local
> modifications have not synced onto the other server.
>
> c) Delete the brick from the volume, remove its
> "trusted.glusterfs.volume-id" xattr, and then re-add it once the
> server comes back up.
>
> Could you help me with some advice on what would be the best way to
> do it?

The brick service is glusterfsd, so that's what will need to be killed.
What I like to do is:

Kill the brick process for that brick. I personally use
"pkill -f $brick_path", since glusterfsd is the only application I have
running whose command line contains the brick path. Do not use
"pkill -9": a SIGKILL terminates glusterfsd without shutting down its
TCP connections, leaving your clients hanging for ping-timeout seconds.

Perform your maintenance.

Start the brick(s) for that volume again with
"gluster volume start $vol force". Any files that were changed during
the downtime will be self-healed. The other replica keeps serving the
volume the whole time. A sketch of the full sequence follows below.
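For reference, here is a minimal shell sketch of that sequence. The
volume name (testvol) and brick mount point (/export/brick1) are
hypothetical placeholders; substitute your own. It assumes the brick
lives on its own filesystem that you want to unmount for maintenance.

    #!/bin/sh
    # Sketch only -- VOL and BRICK are placeholder names.
    VOL=testvol
    BRICK=/export/brick1

    # Confirm that only glusterfsd matches the brick path before
    # killing anything.
    pgrep -l -f "$BRICK"

    # 1. Stop the brick process cleanly (SIGTERM, not -9) so its
    #    TCP connections are closed and clients fail over at once.
    pkill -f "$BRICK"

    # 2. The brick filesystem can now be unmounted.
    umount "$BRICK"

    # ... perform your maintenance, then remount the brick ...
    mount "$BRICK"

    # 3. Restart the missing brick process; the other replica has
    #    been serving the volume throughout.
    gluster volume start "$VOL" force

    # 4. Watch self-heal catch up on files changed while the brick
    #    was down.
    gluster volume heal "$VOL" info

If "pkill -f" feels too broad on your systems, keep the pgrep check;
it lists every matching process so you can verify nothing but
glusterfsd would be signalled.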