Re: [RFC] RCU Judy array with distributed locking for FS extents

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



* Chris Mason (clmason@xxxxxxxxxxxx) wrote:
> Quoting Mathieu Desnoyers (2013-06-03 01:27:58)
> > Hi Chris,
> > 
> > I stumbled the LWN article "A kernel skiplist implementation (Part 1)",
> > and really thought I should tell you about the side-project I am
> > currently working on: RCU Judy array, with locking distributed within
> > the data structure. I developed my own design of this structure
> > specifically to be usable with RCU. (ref. my LPC2012 presentation:
> > https://www.efficios.com/lpc2012-scaling-rcu-judy-arrays-cache-efficient-compact-fast-and-scalable-trie)
> > 
> > I think it would fit your extents use-case perfectly, with excellent
> > cache usage, constant lookup time, near-linear scalability for lookups,
> > and pretty good update scability.
> > 
> > The code is available in a development branch of the Userspace RCU
> > library:
> > 
> > git://git.lttng.org/userspace-rcu.git branch: urcu/rcuja
> > https://git.lttng.org/?p=userspace-rcu.git;a=shortlog;h=refs/heads/urcu/rcuja
> > 
> > It runs entirely in user-space, although it can be ported easily to the
> > Linux kernel. I'd be curious to see how it behaves with your workload.
> > 
> > Relevant files:
> > - rcuja/design.txt: design document of the data structure,
> > - urcu/rcuja.h (public API)
> > - rcuja/*.[ch]: implementation
> > - tests/test_urcu_ja.c: stress-tests, example usage.
> > 
> > After reading the article on skiplists, I added a new API specifically
> > to handle non-overlapping ranges:
> > 
> > struct cds_hlist_head cds_ja_lookup_lower_equal(struct cds_ja *ja,
> >                 uint64_t key);
> > 
> > AFAIU, "extents", as far as keys are concerned, are nothing more than
> > segments, with start and end values.
> > 
> > Extents can be indexed by the Judy array in the following way: we use
> > the start of segment as node key, and keep the end of segment value
> > within the leaf node. We add those nodes into the judy array, indexed by
> > start-of-segment value. Then, when we want to figure out if a value
> > matches a segment, we do the following:
> > 
> > 1) lookup the possible segment match with cds_ja_lookup_lower_equal(),
> >    passing the key to lookup as parameter,
> > 
> > 2) check that our key fits within the range of the segment using the
> >    end-of-segment value within the leaf node.
> > 
> > Of course, this approach is limited to non-overlapping segments, so I
> > hope this is what you need.
> 
> Hi Mathieu,
> 
> One problem here is that XFS wants to allow duplicate keys in the tree.
> This is possible with some modifications to the skiplist code, but I'm
> not sure if it fits into your description above.

Are those segments that completely overlap, or partially overlap ?

> 
> Regardless, it is worth trying out.  I'll pull my current code back into
> userland and try out the userspace-rcu port (I had planned this anyway).
> That way we can have a proper head-to-head benchmark ;)

Cool! Don't hesitate to ping me if I can be of any help.

Thanks,

Mathieu

> 
> -chris
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux