Re: kvm vm cephfs mount hangs on osd node (something like umount -l available?) (help wanted going to production)

Hi,

there have been several threads about hanging cephfs mounts. One quite long thread [1] describes a couple of debugging options, but it also advises against mounting cephfs on OSD nodes in a production environment.

Do you see blacklisted clients with 'ceph osd blacklist ls'? If so, try to remove that client from the blacklist [2]. The same option you use with nfs-ganesha ('umount -l') is also available on a cephfs client, so you can try that, too. Another option described in [1] is to trigger an MDS failover, but sometimes a reboot of that VM is the only solution left.
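
As a rough sketch of the blacklist check, something like the following could be run on a node with an admin keyring. The helper name blacklisted_addrs and the IP 192.168.10.44 are made up for illustration, and I am assuming the 'ceph osd blacklist ls' output has one "ADDR:PORT/NONCE EXPIRY" entry per line as in the docs example [2] (on newer releases the command is 'blocklist' instead of 'blacklist').

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph): read `ceph osd blacklist ls`
# style lines on stdin and print the entries that belong to one client IP.
# Assumption: each entry looks like "ADDR:PORT/NONCE EXPIRY-TIMESTAMP".
blacklisted_addrs() {
    # $1 = client IP; the trailing ":" keeps 10.0.0.1 from matching 10.0.0.10
    awk -v ip="$1" 'index($1, ip ":") == 1 { print $1 }'
}

# Possible usage on the cluster (commented out, needs a live cluster):
#   ceph osd blacklist ls | blacklisted_addrs 192.168.10.44 |
#       while read -r addr; do ceph osd blacklist rm "$addr"; done
#
# If the client is not blacklisted, the other options from this mail are:
#   umount -l /path/to/cephfs/mount   # lazy unmount on the client
#   ceph mds fail <rank>              # force an MDS failover
```

Removing a blacklist entry does not by itself make the old mount usable again; per [2] the client may still need to unmount and remount.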

Regards,
Eugen


[1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-August/028719.html
[2] https://docs.ceph.com/en/latest/cephfs/eviction/#advanced-un-blocklisting-a-client


Quoting Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx>:

Is there no genius out there who can shed some light on this? ;)
Currently I am not able to reproduce this, so it would be nice to have
some procedure at hand that resolves stale cephfs mounts nicely.


-----Original Message-----
To: ceph-users
Subject:  kvm vm cephfs mount hangs on osd node (something
like umount -l available?) (help wanted going to production)



I have a vm on an osd node (it can reach the host and the other nodes via
the macvtap interface, which is used by both host and guest). I just did a
simple bonnie++ test and everything seems to be fine. Yesterday, however,
the dovecot process apparently caused problems (cephfs is only used for an
archive namespace; the inbox is on rbd ssd, and the fs meta is also on ssd).

How can I recover from such a lock-up? If I have a similar situation with
an nfs-ganesha mount, I have the option to do a umount -l, and clients
recover quickly without any issues.

Having to reset the vm is not really an option. What is the best way to
resolve this?



Ceph cluster: 14.2.11 (the vm has 14.2.16)

I have nothing special in my ceph.conf, only these two settings in the mds section:

mds bal fragment size max = 120000
# maybe for nfs-ganesha problems?
# http://docs.ceph.com/docs/master/cephfs/eviction/
#mds_session_blacklist_on_timeout = false
#mds_session_blacklist_on_evict = false
mds_cache_memory_limit = 17179860387


All running:
CentOS Linux release 7.9.2009 (Core)
Linux mail04 3.10.0-1160.6.1.el7.x86_64 #1 SMP Tue Nov 17 13:59:11 UTC
2020 x86_64 x86_64 x86_64 GNU/Linux
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



