Re: Locating CephFS clients in warn message

I found an option, `mds_health_summarize_threshold`, that should make the warning list the individual clients that are lagging.

I increased the default value. I ran `ceph daemon mds.<id> perf dump` to make sure `inodes < inodes_max`. The problem still persists. I'll try looking into the code for clues.
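
The `inodes < inodes_max` check can be scripted against the JSON that `perf dump` prints. This is only a minimal sketch: it assumes the counters sit under an `mds` section, takes the key names `inodes` and `inodes_max` from the message above (they are passed as parameters because the exact counter names may differ between releases), and uses made-up sample values in place of real output.

```python
import json

# Made-up sample standing in for the saved output of
# `ceph daemon mds.<id> perf dump`; section and key names are assumptions.
SAMPLE = '{"mds": {"inodes": 87342, "inodes_max": 100000}}'

def cache_headroom(perf_json, section="mds",
                   used_key="inodes", max_key="inodes_max"):
    """Return (free_inodes, fill_ratio) for the MDS cache counters."""
    counters = json.loads(perf_json)[section]
    used, limit = counters[used_key], counters[max_key]
    return limit - used, used / limit

free, ratio = cache_headroom(SAMPLE)
print(f"{free} inodes free, cache {ratio:.0%} full")
```

If the counters have different names on your release, pass them via `used_key` and `max_key` rather than editing the function.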

Thanks

On Fri, Nov 11, 2016 at 1:57 PM Goncalo Borges <goncalo.borges@xxxxxxxxxxxxx> wrote:
Doesn't the MDS log tell you which client IDs are having problems?

Does your MDS have enough RAM for you to increase the MDS cache size above its default value of 100000?

Cheers
G.


From: Yutian Li [lyt@xxxxxxxxxx]
Sent: 11 November 2016 14:03
To: Goncalo Borges; ceph-users@xxxxxxxxxxxxxx
Subject: Re: Locating CephFS clients in warn message

As of now, when I run `dump_ops_in_flight`, `ops` is empty and `num_ops` is 0.
But when I run `ceph status`, I still get 15 clients failing to respond to cache pressure.

Where should I start solving this problem?

On Thu, Nov 10, 2016 at 6:16 PM Goncalo Borges <goncalo.borges@xxxxxxxxxxxxx> wrote:
Hi

"ceph daemon mds.<id> session ls", executed on your MDS server, should give you the hostname and client ID of all your CephFS clients.

"ceph daemon mds.<id> dump_ops_in_flight" should give you the operations that are not completed, or are pending completion, for particular client IDs. If there are problems, the problematic clients will probably appear there.
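
To go from the `session ls` output to a short list of suspects, the JSON can be sorted by how many capabilities each client holds; clients holding many caps are plausible candidates, though not proven culprits. This is a minimal sketch, assuming each session entry carries `id`, `num_caps`, and a `client_metadata.hostname` field (field names may differ between Ceph releases). The sample data is made up for illustration; in practice the input would be the saved output of the command.

```python
import json

# Made-up sample standing in for the saved JSON output of
# `ceph daemon mds.<id> session ls`; the field names are assumptions.
SAMPLE = """
[
  {"id": 4305, "num_caps": 120, "client_metadata": {"hostname": "node-a"}},
  {"id": 4310, "num_caps": 98000, "client_metadata": {"hostname": "node-b"}},
  {"id": 4322, "num_caps": 45000, "client_metadata": {}}
]
"""

def top_cap_holders(session_json, limit=5):
    """Return (client_id, hostname, num_caps) tuples, largest cap count first."""
    sessions = json.loads(session_json)
    rows = [
        (s.get("id"),
         s.get("client_metadata", {}).get("hostname", "unknown"),
         s.get("num_caps", 0))
        for s in sessions
    ]
    return sorted(rows, key=lambda r: r[2], reverse=True)[:limit]

for client_id, hostname, caps in top_cap_holders(SAMPLE):
    print(f"client.{client_id}  host={hostname}  caps={caps}")
```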

Cheers
Goncalo


________________________________________
From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Yutian Li [lyt@xxxxxxxxxx]
Sent: 10 November 2016 15:21
To: ceph-users@xxxxxxxxxxxxxx
Subject: Locating CephFS clients in warn message

I get a HEALTH_WARN when I run `ceph status`. It says

     health HEALTH_WARN
            mds0: Many clients (17) failing to respond to cache pressure

I have 50 OSDs, 3 MONs, and 1 MDS. I only use CephFS, mounted on 20 to 30 clients with the kernel client.

I wonder how to locate those "many" clients that are failing to respond. I don't even see an ID of the lagging clients anywhere.

Thanks!