Re: [RFC] RCU Judy array with distributed locking for FS extents

Quoting Mathieu Desnoyers (2013-06-11 21:12:31)
> * Chris Mason (clmason@xxxxxxxxxxxx) wrote:
> [...]
> > Ouch, ok.  In private email yesterday I talked with Mathieu about how
> > his current setup can't prevent the concurrent insertion of overlapping
> > extents.  He does have a plan to address this where the insertion is
> > synchronized by keeping placeholders in the tree for the free space.  I
> > think it'll work, but I'm worried about doubling the cost of the insert.
> 
> Hi Chris,
> 
> The weekend and early week has been productive on my side. My updated
> work is available on this new branch:
> 
> git://git.lttng.org/userspace-rcu.git
> branch: urcu/rcuja-range
> 
> Since last week, I managed to:
> 
> - expand the RCU Judy Array API documentation:
>   https://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/rcuja.h;h=82e272bd4ede1aec436845aef287754dd1dab8b6;hb=03a50ae89ec4d7f39e91d0d49c4639c4cf6e894c

Nice

> 
> - create an API for Judy Array Ranges, as discussed via email privately:
> 
> API:
> https://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/rcuja-range.h;h=63035a1660888aa5f9b20548046571dcb54ad193;hb=03a50ae89ec4d7f39e91d0d49c4639c4cf6e894c
> 
> Implementation:
> https://git.lttng.org/?p=userspace-rcu.git;a=blob;f=rcuja/rcuja-range.c;h=7e4585ef942d76f1811f3c958fff3138ac120ca3;hb=03a50ae89ec4d7f39e91d0d49c4639c4cf6e894c
> 
> Please keep in mind that this code has only been moderately
> stress-tested (with up to 24 cores, on small keyspaces of 3, 5, 10, and
> 100 keys, so races occur much more frequently). It should not be
> considered production-ready yet.

Ok, I'll definitely take a look.

> 
> The test code (and thus examples usage) is available here:
> https://git.lttng.org/?p=userspace-rcu.git;a=blob;f=tests/test_urcu_ja_range.c;h=12abcc51465b64a7124fb3e48a2150e225e145af;hb=03a50ae89ec4d7f39e91d0d49c4639c4cf6e894c
> https://git.lttng.org/?p=userspace-rcu.git;a=blob;f=tests/test_urcu_ja_range.h;h=e9bbdbc3ed7eb8f57e30c26b8789ba609a6bfdd9;hb=03a50ae89ec4d7f39e91d0d49c4639c4cf6e894c
> 
> So far, my benchmarks show near-linear read-side scalability (as
> expected from RCU). However, early results do not show the scalability
> I would have expected for concurrent updates. It's not as bad as, e.g.,
> a global lock making performance crawl due to cache-line ping-pong
> between processors, but roughly speaking, if I multiply the number of
> cores doing updates by e.g. 12, the per-core throughput of the update
> stress test gets divided by approximately 12. The number of updates
> system-wide therefore seems to stay constant as we increase the number
> of cores. I will try to get more info as I dig into more benchmarking,
> which may point at a memory-throughput bottleneck.

We're benchmarking different workloads, and I'm not sure how much of the
scalability difference comes from being in the kernel.  One test I have
here is a batch of random key deletions and reinsertions.

I'm running on a key space of 10 million keys.  If I run the same number
of random operations on 100,000 keys I get similar (but slightly faster)
numbers:

Batches of 100,000 random insertions and deletions:

skiplist: 3.01s per thread
rbtree:   2.1s  per thread

With 16 threads:

skiplist: 5.8s per thread
rbtree:   ~70s per thread (ranges from 15s to 76s)

The random part is crucial for scaling with the skiplists.  The locks
are per node, and as long as all the threads are working in different
places, things scale fairly well.

> 
> I stopped working on the range implementation, thinking that I should
> wait for some feedback before I start implementing more complex
> features like RCU-friendly range resize.

I really wanted to send out my code this morning, but I also wanted to
match rbtree's single-threaded performance first.  It's much closer now,
so I'm commenting and cleaning up what I have for posting tomorrow.

I'll talk with Liu Bo about putting the skiplists under the LGPL, but
I'd love some help getting numbers against liburcu.

-chris

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



