Re: [RFC] What if client fuse process crash?

On 2019/8/6 2:57 PM, Ravishankar N wrote:

On 06/08/19 11:44 AM, Changwei Ge wrote:
Hi Ravishankar,


Thanks for sharing; it's very useful to me.

I have been setting up a glusterfs storage cluster recently, and the umount/mount recovery process bothers me.
Hi Changwei,
Why do you need to remount frequently? If your gluster fuse client is crashing frequently, that should be investigated and fixed. If you have a reproducer, please raise a bug with all the details, such as the glusterfs version, core files, and log files.


Hi Ravi,

Actually, the glusterfs client fuse process runs well in my environment, but high availability and fault tolerance are also big concerns for me.

So I killed the fuse process to see what would happen. AFAIK, userspace processes can be killed or can crash for reasons that are not under our control. :-(

Another scenario is *software upgrade*. We have to upgrade the glusterfs client version from time to time to get new features and bug fixes, and it would be friendlier to applications if the upgrade were transparent.


Thanks,

Changwei


Regards,
Ravi


I happened to find some patches [1] online that aim to address this problem, but I have no idea why they were never merged into glusterfs mainline.

Do you know why?


Thanks,

Changwei


[1]:

https://review.gluster.org/#/c/glusterfs/+/16843/

https://github.com/gluster/glusterfs/issues/242


On 2019/8/6 1:12 PM, Ravishankar N wrote:
On 05/08/19 3:31 PM, Changwei Ge wrote:
Hi list,

If the glusterfs client fuse process somehow dies, all subsequent file operations fail with the error 'no connection'.

Is the only way to recover to umount and mount again?
Yes, this is pretty much the case with all FUSE-based file systems. You can use -o auto_unmount (https://review.gluster.org/#/c/17230/) to clean up automatically, so that you do not have to unmount manually.
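
As a rough illustration of the recovery described here, the following is a minimal Python sketch, not something taken from glusterfs itself. It assumes the 'no connection' failure surfaces to applications as ENOTCONN ("Transport endpoint is not connected"), which is what the kernel normally returns once the userspace daemon behind a FUSE mount is gone. The function names, the mount point, and the volume spec are hypothetical; the commands are only built as argument lists and are not executed.

```python
import errno
import os


def fuse_mount_is_dead(mount_point):
    """Return True if stat() on the mount point fails with ENOTCONN."""
    try:
        os.stat(mount_point)
    except OSError as exc:
        # Assumption: a dead fuse daemon shows up as ENOTCONN on any access.
        if exc.errno == errno.ENOTCONN:
            return True
        raise  # some other problem (missing path, permissions, ...)
    return False


def recovery_commands(mount_point, volume_spec, auto_unmount=True):
    """Build (but do not run) the umount and mount commands for a remount.

    volume_spec is a hypothetical "server:/volname" string.
    """
    mount_cmd = ["mount", "-t", "glusterfs"]
    if auto_unmount:
        # The option mentioned above; lets a later crash clean up the mount.
        mount_cmd += ["-o", "auto_unmount"]
    mount_cmd += [volume_spec, mount_point]
    return [["umount", mount_point], mount_cmd]
```

A supervisor script could call fuse_mount_is_dead("/mnt/gluster") periodically and, when it returns True, run the two commands from recovery_commands("/mnt/gluster", "server1:/vol0") in order. Applications holding open files on the old mount would still have to reopen them.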

If so, that means all processes working on top of glusterfs have to close their files, which is sometimes hard to accept.

There is https://research.cs.wisc.edu/wind/Publications/refuse-eurosys11.html, which claims to provide a framework for transparent failovers. I can't find any publicly available code though.

Regards,
Ravi


Thanks,

Changwei


_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel
