Re: Failing to respond to cache pressure?

On 05/05/2015 18:17, Lincoln Bryant wrote:
> Hello all,
>
> I'm seeing some warnings regarding trimming and cache pressure. We're running 0.94.1 on our cluster, with erasure coding + cache tiering backing our CephFS.
>
>       health HEALTH_WARN
>              mds0: Behind on trimming (250/30)
>              mds0: Client 74135 failing to respond to cache pressure
>
> The trimming error popped up after restarting the mds, but then went away on its own. However, "failing to respond to cache pressure" persists.
>
> The cluster is basically idle at the moment (no reads/writes when watching ceph -w), so this is very confusing to me.

There are two ways you get this:
* You have a client which really is failing to respond to requests from the MDS to trim its cache (i.e. an older client from before we fixed some bugs in this area)
* Your client is fine, but you're hitting a case in which the health check itself is buggy (http://tracker.ceph.com/issues/11482)

Either way, you can clear it by unmounting and remounting the client.

> Is there any way to identify the hostname or IP address of client 74135, so I can check the client itself?

The command you're looking for is "ceph daemon mds.<id> session ls" on the host where the MDS is running. For nice recent clients, that'll give you metadata like the hostname, but even for older clients it'll tell you the IP address that you can hopefully resolve yourself.
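
If you'd rather not eyeball the output, here is a minimal Python sketch for picking one client out of it. It assumes the command prints a JSON list of sessions, each with an "id", an "inst" string of the form "client.<id> <address>", and (on recent clients) a "client_metadata" dict containing "hostname". Those field names are an assumption and may differ between versions, so check them against your own output first. The find_client helper and the sample data are made up for illustration.

```python
import json


def find_client(session_ls_json, client_id):
    """Return (hostname, address) for client_id, or None if no session matches.

    Assumes the field names "id", "inst" and "client_metadata"; verify
    these against the real "session ls" output on your cluster.
    """
    for session in json.loads(session_ls_json):
        if session.get("id") != client_id:
            continue
        # "inst" is assumed to look like "client.74135 192.0.2.10:0/3000000001"
        inst = session.get("inst", "")
        address = inst.split(" ", 1)[1] if " " in inst else None
        # Older clients may send no metadata, so hostname can be None
        hostname = session.get("client_metadata", {}).get("hostname")
        return hostname, address
    return None


# Made-up sample data, not real cluster output
SAMPLE = """
[
  {"id": 74135,
   "inst": "client.74135 192.0.2.10:0/3000000001",
   "client_metadata": {"hostname": "node01"}},
  {"id": 74200,
   "inst": "client.74200 192.0.2.11:0/3000000002",
   "client_metadata": {}}
]
"""

print(find_client(SAMPLE, 74135))
```

To use it for real, save the output of "ceph daemon mds.<id> session ls" to a file on the MDS host and pass the file's contents in place of SAMPLE.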

Cheers,
John
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



