Quoting Dave Chinner (2013-06-27 01:19:18)
> On Wed, Jun 26, 2013 at 10:29:36PM -0400, Mathieu Desnoyers wrote:
> > > Also, my benchmarks were not just inserting keys but keys pointing to
> > > things. So a lookup walked the tree and found an object and then
> > > returned the object. radix can just return a key/value without
> > > dereferencing the value, but that wasn't the case in my runs.
> >
> > In the specific test I ran, I'm looking up the "range" object, which is
> > the dereferenced "value" pointer in terms of Judy lookup. My Judy array
> > implementation represents items as a linked list of structures matching
> > a given key. This linked list is embedded within the structures,
> > similarly to the linux/list.h API. Then, if the lookup succeeds, I take
> > a mutex on the range, and check if it has been concurrently removed.
>
> Does that mean that each "extent" that is indexed has a list head
> embedded in it? That blows the size of the index out when all I
> might want to store in the tree is a 64 bit value for a block
> mapping...

For the skiplists, it might make sense to take the optimizations a
little farther and put the start/len/value triplet directly in the leaf.
Right now I push the len/value part into the user object. For btrfs
this is always bigger than a single block mapping (some kind of flags
etc).

>
> FWIW, when a bunch of scalability work was done on xfs_repair years
> ago, judy arrays were benchmarked for storing extent lists that
> tracked free/used space. We ended up using a btree, because while it
> was slower than the original bitmap code, it was actually faster
> than the highly optimised judy array library and at the scale we
> needed there was no memory usage advantage to using a judy array,
> either...
>
> So I'm really starting to wonder if it'd be simpler for me just to
> resurrect the old RCU friendly btree code Peter Z wrote years ago
> (http://programming.kicks-ass.net/kernel-patches/vma_lookup/) and
> customise it for the couple of uses I have in XFS....

I did start with his rcu btree, but the problem for me was concurrent
updates.

For xfs, the skiplists need two things:

i_size_read() style usage of u64 for keys instead of unsigned long.

Helper to allow duplicate keys.

Both are pretty easy, but I'm trying things out in btrfs first to make
sure I've worked out any problems.

-chris
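
For anyone following along, here is a rough userspace sketch of what
"i_size_read() style" u64 keys could look like. On 32-bit SMP the kernel's
i_size_read() guards the 64-bit value with a sequence counter so readers
never see a torn value; the sketch below mimics that with C11 atomics.
The names (struct sl_key64, sl_key64_read, sl_key64_write) are made up for
illustration and are not from the skiplist patches, and it assumes writers
are already serialized by the caller, the same rule i_size_write() has.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical key type: a 64-bit key kept as two 32-bit halves, standing
 * in for what a 32-bit CPU can load in one instruction.
 */
struct sl_key64 {
	atomic_uint seq;	/* even: stable, odd: write in progress */
	_Atomic uint32_t lo;
	_Atomic uint32_t hi;
};

/* Assumption: the caller serializes writers (e.g. under the node lock). */
static void sl_key64_write(struct sl_key64 *k, uint64_t val)
{
	unsigned int seq = atomic_load_explicit(&k->seq, memory_order_relaxed);

	assert(!(seq & 1));	/* an odd count means another writer is active */

	/* Mark the write as in progress before touching the halves. */
	atomic_store_explicit(&k->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&k->lo, (uint32_t)val, memory_order_relaxed);
	atomic_store_explicit(&k->hi, (uint32_t)(val >> 32), memory_order_relaxed);

	/* Publish: the count is even again and both halves are visible. */
	atomic_store_explicit(&k->seq, seq + 2, memory_order_release);
}

/* Lockless read: retry until both halves come from the same write. */
static uint64_t sl_key64_read(struct sl_key64 *k)
{
	unsigned int s0, s1;
	uint32_t lo, hi;

	do {
		s0 = atomic_load_explicit(&k->seq, memory_order_acquire);
		lo = atomic_load_explicit(&k->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&k->hi, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s1 = atomic_load_explicit(&k->seq, memory_order_relaxed);
	} while ((s0 & 1) || s0 != s1);

	return ((uint64_t)hi << 32) | lo;
}
```

On 64-bit the read would collapse to a plain load, which is what
i_size_read() does there as well; the retry loop only matters where a u64
cannot be loaded atomically.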