Re: CephFS unresponsive at scale (2M files,

Hi Kevin,

All the MDS tunables are listed (I think) on this page, each with a
short description: http://ceph.com/docs/master/cephfs/mds-config-ref/
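
If the bigger cache helps, here is a minimal sketch of making the
change persistent across MDS restarts in ceph.conf (assuming you keep
the 1000000 value Sage suggested; adjust it to whatever you end up
testing with):

[mds]
    mds cache size = 1000000

You should also be able to verify that an injected value took effect
over the admin socket, where mds.NAME is a placeholder for your MDS
daemon name:

ceph daemon mds.NAME config get mds_cache_size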

Can you tell us how your cluster behaves after the mds-cache-size
change? What is your MDS RAM consumption, before and after?
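
A rough way to collect those numbers, assuming the admin socket is in
its default location (again, mds.NAME is a placeholder for your MDS
daemon name):

# resident memory of the ceph-mds process, before and after the change
ps -o pid,rss,vsz,args -C ceph-mds

# MDS perf counters; the "mds" section should show inode/cache counts
ceph daemon mds.NAME perf dump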

Thanks!
-- 
Thomas Lemarchand
Cloud Solutions SAS - Information Systems Manager



On Mon, 2014-11-17 at 16:06 -0800, Kevin Sumner wrote:
> > On Nov 17, 2014, at 15:52, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> > 
> > On Mon, 17 Nov 2014, Kevin Sumner wrote:
> > > I've got a test cluster together with ~500 OSDs, 5 MONs, and
> > > 1 MDS.  All the OSDs also mount CephFS at /ceph.  I've got
> > > Graphite pointing at a space under /ceph.  Over the weekend, I
> > > drove almost 2 million metrics, each of which creates a ~3MB file
> > > in a hierarchical path, each sending a datapoint into the metric
> > > file once a minute.  CephFS seemed to handle the writes ok while
> > > I was driving load.  All files containing each metric are at
> > > paths like this:
> > > /ceph/whisper/sandbox/cephtest-osd0013/2/3/4/5.wsp
> > > 
> > > Today, however, with the load generator still running, reading
> > > metadata of files (e.g. directory entries and stat(2) info) in
> > > the filesystem (presumably MDS-managed data) seems nearly
> > > impossible, especially deeper into the tree.  For example, in a
> > > shell cd seems to work but ls hangs, seemingly indefinitely.
> > > After turning off the load generator and allowing a while for
> > > things to settle down, everything seems to behave better.
> > > 
> > > ceph status and ceph health both return good statuses the entire
> > > time.  During load generation, the ceph-mds process seems pegged
> > > at between 100% and 150%, but with load generation turned off,
> > > the process has some high variability from near-idle up to a
> > > similar 100-150% CPU.
> > > 
> > > Hopefully, I've missed something in the CephFS tuning.  However,
> > > I'm looking for direction on figuring out if it is, indeed, a
> > > tuning problem or if this behavior is a symptom of the "not ready
> > > for production" banner in the documentation.
> > 
> > My first guess is that the MDS cache is just too small and it is 
> > thrashing.  Try
> > 
> > ceph mds tell 0 injectargs '--mds-cache-size 1000000'
> > 
> > That's 10x bigger than the default, though be aware that it will
> > eat up 10x as much RAM too.
> > 
> > We've also seen the cache behave in a non-optimal way when
> > evicting things, making it thrash more often than it should.  I'm
> > hoping we can implement something like MQ instead of our two-level
> > LRU, but it isn't high on the priority list right now.
> > 
> > sage
> 
> 
> Thanks!  I’ll pursue mds cache size tuning.  Is there any guidance on
> setting the cache and other mds tunables correctly, or is it an
> adjust-and-test sort of thing?  Cursory searching doesn’t return any
> relevant documentation on ceph.com.  I’m plowing through some other
> list posts now.
> --
> Kevin Sumner
> kevin@xxxxxxxxx



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




