On Fri, Oct 30, 2020 at 2:13 AM Frank Schilder <frans@xxxxxx> wrote:
>
> Dear cephers,
>
> I have a somewhat strange situation. I have the health warning:
>
> # ceph health detail
> HEALTH_WARN 3 clients failing to respond to capability release
> MDS_CLIENT_LATE_RELEASE 3 clients failing to respond to capability release
>     mdsceph-12(mds.0): Client sn106.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30716617
>     mdsceph-12(mds.0): Client sn269.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30717358
>     mdsceph-12(mds.0): Client sn009.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30749150
>
> However, these clients are not busy right now. Also, they hold almost nothing; see snippets from "session ls" below. It is possible that a very I/O-intensive application was running on these nodes and these release requests got stuck. How do I resolve this issue? Can I just evict the client?
>
> Version is mimic 13.2.8. Note that we execute a drop cache command after a job finishes on these clients. It's possible that the clients dropped the caps already before the MDS request was handled/received.

Can you share any config changes you've made on the MDS?

Also, Mimic is EOL as you probably know. Please upgrade :)

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
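
For anyone who wants to cross-check the flagged clients against "session ls", here is a minimal Python sketch that pulls the client IDs and host names out of the health detail text quoted above. The function name, the field names, and the regular expression are illustrative assumptions based only on the three lines shown in this thread; other Ceph releases may format these lines differently.

```python
import re

# Sample text copied from the `ceph health detail` output quoted in this thread.
HEALTH_DETAIL = """\
HEALTH_WARN 3 clients failing to respond to capability release
MDS_CLIENT_LATE_RELEASE 3 clients failing to respond to capability release
    mdsceph-12(mds.0): Client sn106.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30716617
    mdsceph-12(mds.0): Client sn269.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30717358
    mdsceph-12(mds.0): Client sn009.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30749150
"""

# Assumed line layout (inferred from the sample only):
#   <daemon>(<rank>): Client <host>:<mount> failing to respond ... client_id: <id>
LINE_RE = re.compile(
    r"(?P<daemon>[^\s(]+)\((?P<rank>mds\.\d+)\): "
    r"Client (?P<host>[^:\s]+):(?P<mount>\S+) "
    r"failing to respond to capability release "
    r"client_id: (?P<client_id>\d+)"
)


def late_release_clients(text):
    """Return one dict per client flagged in the health detail text."""
    return [
        {
            "daemon": m["daemon"],
            "rank": m["rank"],
            "host": m["host"],
            "mount": m["mount"],
            "client_id": int(m["client_id"]),
        }
        for m in LINE_RE.finditer(text)
    ]


if __name__ == "__main__":
    for client in late_release_clients(HEALTH_DETAIL):
        print(client["rank"], client["host"], client["client_id"])
```

The extracted IDs can then be compared by hand with the matching entries in the MDS "session ls" output to see how many caps each flagged client actually holds.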