RE: mempool

Not sure why it faults out at fork time. But that's probably easy to dig into.

The slabs are certainly orthogonal to the accounting.  FWIW, if you set the stackSize to 0 and the heapSize to 1, you'll pretty much get the same effect. 

>>> See below

Allen Samuels
SanDisk | a Western Digital brand
2880 Junction Avenue, San Jose, CA 95134
T: +1 408 801 7030 | M: +1 408 780 6416
allen.samuels@xxxxxxxxxxx

> -----Original Message-----
> From: Sage Weil [mailto:sweil@xxxxxxxxxx]
> Sent: Thursday, October 06, 2016 6:59 PM
> To: Allen Samuels <Allen.Samuels@xxxxxxxxxxx>
> Cc: ceph-devel@xxxxxxxxxxxxxxx
> Subject: Re: mempool
> 
> Hi Allen,
> 
> On Wed, 5 Oct 2016, Allen Samuels wrote:
> >
> > Ok, here's something to look at:
> >
> >
> > https://github.com/allensamuels/ceph/blob/master/src/include/mempool.h
> > https://github.com/allensamuels/ceph/blob/master/src/common/mempool.cc
> >
> > and the unit test
> >
> > https://github.com/allensamuels/ceph/blob/master/src/test/test_mempool.cc
> >
> > The simple comment is at the top of mempool.h
> 
> I've pulled your code into wip-mempool in github.com:liewegas/ceph.git and
> switched several bluestore types to use it.  The unit tests work fine, but I
> have 2 problems:
> 
> 1) when ceph-osd forks it asserts out in ~slab_container.  commenting out
> the asserts for now is enough to proceed.  I assume it's because the
> mempool is in use at fork() time.
> 
> 2) After a few thousand ops I crash here:
> 
> #4  0x000055c6180a360b in mempool::slab_allocator<BlueStore::Extent, 0ul,
> 4ul>::freeslot (
>     freeEmpty=true, s=0x55c6253f73c8, this=0x55c618a2f6a0
> <_factory_bluestore_extent>)
>     at /home/sage/src/ceph2/src/include/mempool.h:485
> #5  mempool::slab_allocator<BlueStore::Extent, 0ul, 4ul>::deallocate (s=0,
> p=0x55c6253f73d0,
>     this=0x55c618a2f6a0 <_factory_bluestore_extent>)
>     at /home/sage/src/ceph2/src/include/mempool.h:602
> #6  mempool::factory<(mempool::pool_index_t)2, BlueStore::Extent, 0ul,
> 4ul>::free (
>     p=0x55c6253f73d0, this=0x55c618a2f6a0 <_factory_bluestore_extent>)
>     at /home/sage/src/ceph2/src/include/mempool.h:1072
> #7  BlueStore::Extent::operator delete (p=0x55c6253f73d0)
>     at /home/sage/src/ceph2/src/os/bluestore/BlueStore.cc:37
> #8  0x000055c6180cb7b0 in BlueStore::ExtentMap::rm (p=...,
> this=0x55c6254416a8)
>     at /home/sage/src/ceph2/src/os/bluestore/BlueStore.h:673
> #9  BlueStore::ExtentMap::punch_hole (this=this@entry=0x55c6231dbc78,
>     offset=offset@entry=217088, length=length@entry=4096,
>     old_extents=old_extents@entry=0x7ff6a70324a8)
>     at /home/sage/src/ceph2/src/os/bluestore/BlueStore.cc:2140
> 
> (gdb)
> #4  0x000055c6180a360b in mempool::slab_allocator<BlueStore::Extent, 0ul,
> 4ul>::freeslot (
>     freeEmpty=true, s=0x55c6253f73c8, this=0x55c618a2f6a0
> <_factory_bluestore_extent>)
>     at /home/sage/src/ceph2/src/include/mempool.h:485
> 485              slab->slabHead.next->prev = slab->slabHead.prev;
> (gdb) p slab
> $1 = (mempool::slab_allocator<BlueStore::Extent, 0ul, 4ul>::slab_t *)
> 0x55c6253f7300
> (gdb) p slab->slabHead
> $2 = {
>   prev = 0x0,
>   next = 0x0
> }

>>> Nulls here indicate that this slab isn't on the list of slabs that have a free slot, i.e., this slab was completely allocated.

But that seems plausible. 

The freelist works as follows:

Each slab has a singly-linked list of free slots in that slab, linked through the "freeHead" member of slab_t.

Each slab that isn't fully allocated is put on a doubly-linked list hung off of the container itself (slab_t::slabHead).

Basically, if slab_t::freeSlots == 0, then slab_t::slabHead is null: because the slab is fully allocated, it's not on the freelist.
If slab_t::freeSlots != 0 (i.e., there's a free slot), then slab_t::slabHead shouldn't be null; the slab should be on the doubly-linked list off of the container head.

The transition from 0 to != 0 is handled in the code immediately preceding this. Plus, since slabSize == 4 here and we've satisfied slab->freeSlots == slab->slabSize, that code shouldn't have been triggered.
There's only one place in the code where freeSlots is incremented (a few lines before this), and that code clearly catches the transition from 0 to 1.
There's only one place in the code where freeSlots is decremented, and it handles the == 0 case pretty clearly.

So I'm stumped.

A print of *slab might be helpful.


> (gdb) list
> 480           }
> 481           if (freeEmpty && slab->freeSlots == slab->slabSize && slab !=
> &stackSlab) {
> 482              //
> 483              // Slab is entirely free
> 484              //
> 485              slab->slabHead.next->prev = slab->slabHead.prev;
> 486              slab->slabHead.prev->next = slab->slabHead.next;
> 487              assert(freeSlotCount >= slab->slabSize);
> 488              freeSlotCount -= slab->slabSize;
> 489              assert(slabs > 0);
> 
> Any bright ideas before I dig in?
> 
> BTW, we discussed this a bit last night at CDM and the main concern is that
> this approach currently wraps up a slab allocator with the mempools.
> It may be that these are both going to be good things, but they are
> conceptually orthogonal, and it's hard to tell whether the slab allocator is
> necessary without also having a simpler approach that does *just* the
> accounting.  Is there any reason these have to be tied together?
> 
> Thanks!
> sage



