Re: Improving metadata throughput

Gregory Farnum <gfarnum@xxxxxxxxxx> · Thu, 30 Jun 2016 11:15:36 -0700



On Wed, Jun 29, 2016 at 2:02 PM, Daniel Davidson
<danield@xxxxxxxxxxxxxxxx> wrote:
> I am starting to work with and benchmark our ceph cluster.  While throughput
> looks good so far, metadata performance appears to be suffering.
> Is there anything that can be done to speed up the response time when
> traversing a lot of small files and folders?  Right now, I am running four
> metadata servers and the filesystem is mounted via fuse.
>
> We use module to manage environment variables for the applications on our
> cluster.  When I type "module avail" it takes about 30 minutes to get a
> response the first time, with a pair of my monitors running at 100% during
> this time.  Later runs are near instantaneous.

Can you describe your setup and your testing/experience in more detail?
There isn't much reason for your monitors (did you mean MDSes? or the
CPUs, which are colocated with something else?) to be using 100% CPU
as a result of metadata throughput.
I hope the 4 MDSes are 1 active and 3 standbys.
In general, metadata can take a bit of time to be read into the MDS
cache (a RADOS read, i.e. a disk seek, per directory) but should then be
quick to access, unless it is contending with another client changing
the directories in question (maintaining cache coherence can take
more time). I'm not familiar with Module, but from the man page it
seems to recurse arbitrarily deep once you specify a directory, so if
you've got it searching some large/deep paths, it could take a while.
30 minutes is a *really* long time though, which suggests you're
contending with something else that is slowing it down.
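
One way to see how much of this is just a cold cache would be to time
the same recursive walk twice from the client. The sketch below is only
an illustration and not something from Module or Ceph: the function
names are made up, and it assumes a plain recursive listing is a rough
stand-in for what "module avail" does.

```python
import os
import time


def timed_walk(root):
    """Walk root recursively; return (dirs_seen, files_seen, elapsed_seconds)."""
    dirs_seen = 0
    files_seen = 0
    start = time.perf_counter()
    # os.walk yields once per directory, so each iteration corresponds to
    # one directory listing that has to be served by the filesystem.
    for _dirpath, _dirnames, filenames in os.walk(root):
        dirs_seen += 1
        files_seen += len(filenames)
    return dirs_seen, files_seen, time.perf_counter() - start


def compare_passes(root, passes=2):
    """Run timed_walk several times so cold and warm passes can be compared."""
    return [timed_walk(root) for _ in range(passes)]
```

Calling compare_passes() on the directory that holds the modulefiles
(whatever path that is on your mount) and comparing the elapsed time of
the first and second result gives a rough cold-vs-warm number. A large
directory count with a slow first pass would fit the
one-read-per-directory cost; a slow second pass would point at
something else.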
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com