RE: BlueStore in-memory Onode footprint

On Fri, 16 Dec 2016, Allen Samuels wrote:
> I'm not sure what the conclusion from this is.
> 
> The point of the sharding exercise was to eliminate the need to 
> serialize/deserialize all 1024 Extents/Blobs/SharedBlobs on each I/O 
> transaction.
> 
> This shows that a fully populated onode with ALL of the shards present 
> in memory has a large footprint. But that ought to be a rare occurrence.
> 
> This test shows that each Blob is 248 bytes and that each SharedBlob is 
> 216 bytes. That matches the sizeof(...), so the MemPool logic got the 
> right answer! Yay!
> 
> Looking at the Blob I see:
> 
> bluestore_blob_t   72 bytes
> bufferlist         88 bytes
> extent ref map     64 bytes
> 
> That's most of the 248. I suspect that trying to fix this will require a 
> new strategy, etc.

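FWIW those three members account for 72 + 88 + 64 = 224 of the 248 bytes, 
so all but ~24 bytes of each Blob is already explained by those three 
fields alone.
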
Well, one thing we might consider: right now we prune cache content at 
the granularity of the onode.  We might want to put the shards in an LRU 
too and prune old shards...
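
Something like this sketch, just to hand-wave at the idea (the real type 
would be the in-memory extent map shard, and trim would have to write 
back any dirty shard before dropping it):

  #include <cstddef>
  #include <list>
  #include <unordered_map>

  struct Shard;  // stand-in for the in-memory extent map shard

  struct ShardLRU {
    std::list<Shard*> lru;   // front = hottest, back = coldest
    std::unordered_map<Shard*, std::list<Shard*>::iterator> pos;

    // move a shard to the front whenever an I/O touches it
    void touch(Shard *s) {
      auto p = pos.find(s);
      if (p != pos.end())
        lru.erase(p->second);
      lru.push_front(s);
      pos[s] = lru.begin();
    }

    // drop the coldest shards until we are within budget
    void trim(size_t max_shards) {
      while (lru.size() > max_shards) {
        Shard *victim = lru.back();
        lru.pop_back();
        pos.erase(victim);
        // re-encode victim if dirty, then free its Extents/Blobs
      }
    }
  };

That way a mostly-idle object keeps just its onode and the hot shard(s) 
in memory instead of all 1024 Blobs.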

sage

> 
> Allen Samuels
> SanDisk | a Western Digital brand
> 2880 Junction Avenue, San Jose, CA 95134
> T: +1 408 801 7030 | M: +1 408 780 6416
> allen.samuels@xxxxxxxxxxx
> 
> 
> > -----Original Message-----
> > From: ceph-devel-owner@xxxxxxxxxxxxxxx
> > [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Sage Weil
> > Sent: Friday, December 16, 2016 7:20 AM
> > To: Igor Fedotov <ifedotov@xxxxxxxxxxxx>
> > Cc: ceph-devel <ceph-devel@xxxxxxxxxxxxxxx>
> > Subject: Re: BlueStore in-memory Onode footprint
> > 
> > On Fri, 16 Dec 2016, Igor Fedotov wrote:
> > > Hey All!
> > >
> > > Recently I realized that I'm unable to fit all my onodes (32768
> > > objects / 4MB each / 4K alloc unit / no csum) into a 15G RAM cache.
> > >
> > > Hence I decided to estimate the in-memory Onode size.
> > >
> > > At first I filled the 4MB object with a single 4M write - mempools
> > > indicate ~5K mem usage for the total onode metadata. Good enough.
> > >
> > > Then I refilled that object with 4K writes. Resulting mem usage = 574K!!!
> > > The Onode itself is 704 bytes per object, and 4120 other metadata items
> > > occupy all the remaining space.
> > >
> > > Then I removed SharedBlob from the mempools. Resulting mem usage = 355K,
> > > with the same Onode size and 3096 other metadata objects. Hence we had
> > > 1024 SharedBlob instances that took ~220K.
> > >
> > >
> > > And finally I removed Blob instances from the measurements. Resulting
> > > mem usage = 99K, with 2072 other objects. Hence the Blob instances take
> > > another ~250K.
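
(To put those numbers in perspective: 574K of metadata for a 4M object is 
~574 bytes per 4K allocation unit, i.e. roughly 14% overhead, and across 
all 32768 objects that's 574K * 32768 ~= 18G of metadata, which is exactly 
why it no longer fits in the 15G cache.)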
> > 
> > Yikes!
> > 
> > BTW you can get a full breakdown by type with 'mempool debug = true' in
> > ceph.conf (-o 'mempool debug = true' on vstart.sh command line) without
> > having to recompile.  Do you mind repeating the test and including the full
> > breakdown?
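
(For anyone following along, that is an ordinary ceph.conf option; 
assuming it lives under the global section, something like:

  [global]
  mempool debug = true

and restart the daemon, or pass -o 'mempool debug = true' to vstart.sh 
as above.)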
> > 
> > > Yeah, that's the worst case (actually enabling csum will use even more
> > > memory), but shouldn't we revisit some Onode internals given such
> > > numbers?
> > 
> > Yep!
> > 
> > sage