Re: MDS blocklist/evict clients during network maintenance

Hi Dan,

thanks for the link. I've been reading it over and over again but haven't come to a conclusion yet. IIRC, the maintenance windows are one hour long, currently every week. But it's not entirely clear whether the maintenance will even have an impact, because apparently nobody complained last time. On the other hand, there have been interruptions that caused stale clients in the last few weeks, so it's difficult to predict.

They mainly use RBD and CephFS for k8s clusters, but so far I haven't heard about RBD issues during these maintenance windows. They have Grafana showing a drop in MDS sessions when the network is interrupted, I think from around 130 active sessions to around 30, so not all sessions were dropped. After the maintenance, they failed the MDS and the number of sessions was restored. Since they don't have access to the k8s clusters themselves, they can't do much on that side. We're still wondering whether an MDS failover is really necessary or whether anything could be done on the client side. But I only have very limited details on this. The MDS log (I don't have a copy) shows that the session drops are caused by client evictions.

Do you think it could make sense to disable client eviction/blocklisting only during this maintenance window? Or could that be dangerous, because we can't predict which clients will actually be interrupted and how k8s will handle returning clients if they aren't evicted?

Thanks
Eugen

Quoting Dan van der Ster <dan.vanderster@xxxxxxxxx>:

Hi Eugen,

Disabling blocklisting on eviction is a pretty standard config. In my
experience it allows clients to resume their session cleanly without needing a
remount.

There are docs about this here:
https://docs.ceph.com/en/latest/cephfs/eviction/#advanced-configuring-blocklisting
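
As a rough sketch of what that page describes, the two options from the
original question could be toggled around a maintenance window along these
lines. The helper name set_mds_blocklisting and the CEPH override variable
are made up for illustration; only the two option names and the
"ceph config set" command come from the thread and the Ceph CLI. Whether
this is a good idea for a given cluster is exactly the open question here.

```shell
# Hypothetical helper, not part of Ceph. CEPH can be overridden
# (e.g. CEPH="echo ceph") to print the commands instead of running them.
CEPH="${CEPH:-ceph}"

set_mds_blocklisting() {
    # $1 must be "true" or "false"; anything else is rejected.
    case "$1" in
        true|false) ;;
        *) echo "usage: set_mds_blocklisting true|false" >&2; return 1 ;;
    esac
    # Apply the same value to both options mentioned in the thread.
    # $CEPH is deliberately unquoted so "echo ceph" splits into two words.
    $CEPH config set mds mds_session_blocklist_on_timeout "$1" &&
    $CEPH config set mds mds_session_blocklist_on_evict "$1"
}
```

One would presumably call "set_mds_blocklisting false" shortly before the
window and "set_mds_blocklisting true" afterwards, and check
"ceph osd blocklist ls" for entries that were added in the meantime.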

I don't have a good feeling for whether this will be useful for your network
intervention though... What are you trying to achieve? How long will
clients be unreachable?

Cheers, Dan


--
Dan van der Ster
CTO@CLYSO & CEC Member


On Thu, Nov 21, 2024, 10:15 Eugen Block <eblock@xxxxxx> wrote:

Hi,

can anyone share some experience with these two configs?

ceph config get mds mds_session_blocklist_on_timeout
true
ceph config get mds mds_session_blocklist_on_evict
true

If there's some network maintenance going on and the client connection
is interrupted, could it help to disable evicting and blocklisting MDS
clients? And what risks should we be aware of if we tried that? We're
not entirely sure yet if this could be a reasonable approach, but
we're trying to figure out how to make network maintenance less
painful for clients.
I'm also looking at some other possible configs, but let's start with
these two first.

Any comments would be appreciated!

Thanks!
Eugen
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


