Re: [PATCH RFC 0/2] skiplists for range indexes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Quoting Jan Kara (2013-05-03 05:19:24)
> Hi,
> 
> On Thu 02-05-13 22:02:11, Chris Mason wrote:
> > I've been working on skiplists for indexing off/on for a little while
> > now.  For btrfs, I really need a fast way to manage extents that doesn't
> > bog down in lock contention for reads or writes.  Dave Chinner has
> > mentioned this as well, so I've cc'd him.
>   But I guess you still need writer-writer exclusion, don't you? Or are you
> able to do some partial locking of the structure to allow parallel
> modifications?

Hi Jan,

Yes, insert/delete still lock, but they only take locks for the levels
required for the operation.  So if we're inserting at level 0, we'll
walk all the way down to level 0 before taking any locks.

If an insert happens at level 5, it'll start locking at level 5 and
build a list of nodes we'll have to update in order to do the insert.
Everything in the list is locked until the insert is done.

There's room for optimization there, since chances are good the
insertion point will have some free room and I won't need all those
locks.  The real goal is a little less fine grained, just because all
the locking does slow things down when there's no contention.

For the iommu code, the biggest difference between skiplists and rbtrees
is the rbtrees are trying to remember a spot they are likely to find
free ranges to hand out.  The skiplists pick a random starting point and
hope the locking spread saves us.

> 
> > Liu Bo started this skiplist code, but at the time it didn't make
> > a huge difference.  I've reworked things a bit and used the IOMMU
> > to test things.  I ran some benchmarks on a single socket
> > server with two SSD cards to get a baseline of how much it is helping.
> > 
> > I tried to keep the basic structure of the IOMMU code, and there are probably
> > many better ways to fix the IOMMU contention.  Really I'm just trying to
> > compare with rbtrees, not fix the IOMMU.
>   There also exist some research papers on RCU friendly RB-trees (or other
> trees). Maybe they would be interesting to try? The basic trick is to use a
> copy-on-write when you need to do a rotation. This is slightly impractical
> because it requires memory allocation (or alternatively doubles memory
> footprint of an rb node) but you need to do a similar stuff in skip lists
> anyway. Just an idea...

Definitely interested if you know of alternatives.  Most of the ones I
saw were RCU for reads but not very fine grained for writes.  For btrfs
at least, the indexes will be getting a lot of updates and I need higher
concurrency on writes.

-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux