On Wed, Sep 14, 2016 at 7:02 AM, Heller, Chris <cheller@xxxxxxxxxx> wrote:
> I am making use of CephFS plus the cephfs-hadoop shim to replace HDFS in a
> system I’ve been experimenting with.
>
> I’ve noticed that a large number of my HDFS clients have a ‘num_caps’ value
> of 16385, as seen when running ‘session ls’ on the active MDS. This appears
> to be one larger than the default value for ‘client_cache_size’, so I presume
> the two are related, though I have not seen any documentation to corroborate
> this.
>
> What I was hoping to do is track down which Ceph client is actually holding
> all these caps, but since my system can have work scheduled dynamically and
> multiple clients can be running on the same host, it’s not obvious how to
> associate the client ‘id’ reported by ‘session ls’ with any one process on
> the given host.
>
> Are there steps I can follow to trace a client ‘id’ back to a process id?

Hmm, it looks like we no longer directly associate the process ID with the
client session.

There is a "client metadata" config option you can fill in with arbitrary
"key=value[,key2=value2]*" strings, if you can persuade Hadoop to set it to
something useful in each individual process (a sketch of what that could look
like is below).

If you have logging or admin sockets enabled, then you should also be able to
find them named by client ID and trace those back to the pid with standard
Linux tooling (example commands below).

I've created a ticket to put this back in as part of the standard metadata:
http://tracker.ceph.com/issues/17276
-Greg
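
A rough sketch of the "client metadata" approach (untested; the key names
"role", "host" and "pid" are made up, and it assumes the usual ceph.conf
metavariables such as $host and $pid are expanded in this option):

    # ceph.conf read by each Hadoop/cephfs-hadoop process
    [client]
        # arbitrary key=value pairs the MDS will record for this session
        client metadata = role=hadoop-worker,host=$host,pid=$pid

Those keys should then show up in the client_metadata part of the
'session ls' output for that session, which is enough to get from a session
back to a host and pid even when several clients share one host.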
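
And a sketch of the admin-socket route (again untested; the socket path is
only an example, $cctid is assumed to expand to a per-client-instance id so
sockets don't collide on shared hosts, and it assumes the client registers
the 'mds_sessions' asok command the way ceph-fuse/libcephfs clients do):

    # ceph.conf on the client hosts: give each client process its own socket
    [client]
        admin socket = /var/run/ceph/$cluster-$name.$cctid.asok

    # on a given host, list the sockets and ask which process holds one open
    ls /var/run/ceph/*.asok
    sudo lsof /var/run/ceph/ceph-client.admin.12345.asok    # or: fuser <path>

    # ask that client for its own session/global id, to match 'session ls'
    sudo ceph daemon /var/run/ceph/ceph-client.admin.12345.asok mds_sessions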