Re: [PATCH RFC 0/2] skiplists for range indexes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Quoting Dave Chinner (2013-05-05 03:33:57)
> On Sat, May 04, 2013 at 07:11:51AM -0400, Chris Mason wrote:
> > Quoting Dave Chinner (2013-05-03 23:25:36)
> > > 
> > > I've got two cases I care about. The first is the buffer cache
> > > indexes which have a 1000:1 read:modify ratio and I'd really like the
> > > lookups to be lockless. The other case is the extent tree, where we
> > > do lots of inserts when the extent tree is first read, and after
> > > than it's typically 2 lookups for every insert/remove. Having one
> > > tree that works for both would be handy...
> > 
> > Ok, we're in a similar boat then.  I'll finish off some of the API and
> > test the pure RCU side harder.
> > 
> > For the extent tree, are you doing a lot of merging once things are in
> > the tree?  I'm not planning on doing pure-rcu for items that get merged
> > quiet yet.
> 
> Yes, we merge extents where ever possible. Almost all contiguous
> allocations and unwritten extent conversions merge extents in some
> manner...

Ok, I'll make sure those helpers are generic.  The helpers need to
search down to the leaf with the items we care about, take the
lock and then start merging things together.  For btrfs, the decision to
merge is pretty complex, so it'll end up driven by the FS code.

The skiplist doesn't do the copy part of rcu.  It carefully orders the
updates instead, but the merging should still be possible because I'm
making sure the keys in the leaf and the slot structure match before
trusting what I read.

> 
> > Also, I'm using unsigned longs right now.  My guess is we'll both want
> > u64s, which means I have to do an i_size_read/write trick in a few
> > spots.
> 
> Yup, definitely needs to be u64 for XFS...

Fair enough.  The i_size_read equiv will slow down searches some on
32 bit.  I think the hit is worth it though, much better than two trees.

Is your buffer cache radix now or rbtree?  It's worth mentioning that
radix is still 2x-3x faster than rbtree if you aren't doing range
searches. 

I'm not sure if that carries from the microbenchmark into something we
can notice, but I definitely wouldn't use skiplists for something
like the page cache unless we have a good case for a major range based
overhaul.


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux