Re: [RFC ] dictionary optimizations

Xavier Hernandez <xhernandez@xxxxxxxxxx> · Wed, 04 Sep 2013 15:37:36 +0200

On 04/09/13 14:05, Jeff Darcy wrote:
> On 09/04/2013 04:27 AM, Xavier Hernandez wrote:
>> I would also like to note that each node can store multiple elements.
>> Current implementation creates a node for each byte in the key. In my
>> implementation I only create a node if there is a prefix coincidence
>> between 2 or more keys. This reduces the number of nodes and the
>> number of indirections.
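
To make the "node only where keys share a prefix" idea concrete, here is
a minimal sketch in C. It is not the code from my patch and not the
current dict_t implementation; node_t, trie_insert, trie_contains and
trie_count are hypothetical names used only for illustration. Each node
holds a multi-byte label, and a node is split only when a new key
diverges in the middle of that label.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node: one node covers a run of bytes, not a single byte. */
typedef struct node {
    char *label;          /* bytes consumed by this node */
    size_t len;           /* number of bytes in label */
    int terminal;         /* 1 if a stored key ends at this node */
    struct node *child;   /* first node of the next level */
    struct node *sibling; /* next node on the same level */
} node_t;

static node_t *node_new(const char *s, size_t len, int terminal) {
    node_t *n = calloc(1, sizeof(*n));
    if (n == NULL) return NULL;
    n->label = malloc(len + 1);
    if (n->label == NULL) { free(n); return NULL; }
    memcpy(n->label, s, len);
    n->label[len] = '\0';
    n->len = len;
    n->terminal = terminal;
    return n;
}

static size_t common_prefix(const char *a, size_t alen,
                            const char *b, size_t blen) {
    size_t i = 0;
    while (i < alen && i < blen && a[i] == b[i]) i++;
    return i;
}

/* Insert a non-empty key. Returns 0 on success, -1 on error. */
int trie_insert(node_t **link, const char *key) {
    size_t klen = strlen(key);
    if (klen == 0) return -1;
    while (*link != NULL) {
        node_t *n = *link;
        size_t cp = common_prefix(n->label, n->len, key, klen);
        if (cp == 0) { link = &n->sibling; continue; }
        if (cp < n->len) {
            /* Key diverges inside the label: keep the shared prefix in n
             * and move the remainder into a new child node. */
            node_t *tail = node_new(n->label + cp, n->len - cp, n->terminal);
            if (tail == NULL) return -1;
            tail->child = n->child;
            n->child = tail;
            n->label[cp] = '\0';
            n->len = cp;
            n->terminal = 0;
        }
        key += cp;
        klen -= cp;
        if (klen == 0) { n->terminal = 1; return 0; }
        link = &n->child;
    }
    /* No node on this level shares a first byte: append a new leaf. */
    node_t *leaf = node_new(key, klen, 1);
    if (leaf == NULL) return -1;
    *link = leaf;
    return 0;
}

/* Returns 1 if key was inserted before, 0 otherwise. */
int trie_contains(const node_t *n, const char *key) {
    size_t klen = strlen(key);
    while (n != NULL) {
        if (n->len <= klen && memcmp(n->label, key, n->len) == 0) {
            key += n->len;
            klen -= n->len;
            if (klen == 0) return n->terminal;
            n = n->child;
        } else {
            n = n->sibling;
        }
    }
    return 0;
}

/* Counts all nodes reachable from n, including its siblings. */
size_t trie_count(const node_t *n) {
    size_t c = 0;
    for (; n != NULL; n = n->sibling) c += 1 + trie_count(n->child);
    return c;
}
```

With the keys "trusted.glusterfs.dht", "trusted.glusterfs.afr" and
"user.x", this sketch ends up with 4 nodes (the shared
"trusted.glusterfs." prefix, "dht", "afr" and "user.x"), where a
node-per-byte layout would need one node for each distinct byte position.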

> Whatever we do, we should try to make sure that the changes are profiled
> against real usage.  When I was making my own dict optimizations back in
> March of last year, I started by looking at how they're actually used.
> At that time, a significant majority of dictionaries contained just one
> item. That's why I only implemented a simple mechanism to pre-allocate
> the first data_pair instead of doing something more ambitious.  Even
> then, the difference in actual performance or CPU usage was barely
> measurable.  Dict usage has certainly changed since then, but I think
> you'd still be hard pressed to find a case where a single dict contains
> more than a handful of entries, and approaches that are optimized for
> dozens to hundreds might well perform worse than simple ones (e.g.
> because of cache aliasing or branch misprediction).
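
For readers unfamiliar with the pre-allocation trick mentioned above,
a minimal sketch in C of the general idea follows. It is an assumption
about the shape of such a mechanism, not a copy of the actual dict_t
code; mdict_t, mpair_t and the mdict_* functions are hypothetical names.
The first pair lives inside the dictionary itself, so a one-entry
dictionary needs no extra allocation for the pair.

```c
#include <stdlib.h>

/* Hypothetical pair and dictionary types, for illustration only. */
typedef struct mpair {
    struct mpair *next;
    const char *key;   /* not copied: caller keeps the string alive */
    void *value;
} mpair_t;

typedef struct {
    mpair_t *head;     /* list of entries, newest first */
    mpair_t first;     /* storage for the first entry, embedded in the dict */
    int first_used;    /* 1 once the embedded slot is taken */
    int count;
} mdict_t;

void mdict_init(mdict_t *d) {
    d->head = NULL;
    d->first_used = 0;
    d->count = 0;
}

/* Adds an entry. Returns 0 on success, -1 if malloc fails. */
int mdict_add(mdict_t *d, const char *key, void *value) {
    mpair_t *p;
    if (!d->first_used) {
        p = &d->first;            /* common case: no allocation at all */
        d->first_used = 1;
    } else {
        p = malloc(sizeof(*p));   /* only the 2nd and later entries allocate */
        if (p == NULL) return -1;
    }
    p->key = key;
    p->value = value;
    p->next = d->head;
    d->head = p;
    d->count++;
    return 0;
}

/* Releases heap-allocated pairs; the embedded one must not be freed. */
void mdict_clear(mdict_t *d) {
    mpair_t *p = d->head;
    while (p != NULL) {
        mpair_t *next = p->next;
        if (p != &d->first) free(p);
        p = next;
    }
    mdict_init(d);
}
```

The design choice is to pay a few extra bytes in every dictionary to
skip one malloc/free pair in the single-entry case.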

> If you're looking for other optimization opportunities that might
> provide even bigger "bang for the buck" then I suggest that stack-frame
> or frame->local allocations are a good place to start.  Or string
> copying in places like loc_copy.  Or the entire fd_ctx/inode_ctx
> subsystem.  Let me know and I'll come up with a few more.  To put a bit
> of a positive spin on things, the GlusterFS code offers many
> opportunities for improvement in terms of CPU and memory efficiency
> (though it's surprisingly still way better than Ceph in that regard).

Yes. Optimizing the dictionary structures does not bring a big
improvement in the overall performance of GlusterFS. I tried it in a
real situation and the benefit was only marginal. However, I didn't test
new features like an atomic "lookup and remove if found" operation
(because I would have had to review all the code). I think this kind of
functionality could improve the results I obtained a bit more.

However, this is not the only reason to make these changes. While
writing code I've found that some things are tedious to do just because
dict_t has no such functions. Some actions require multiple calls, each
with its own error check, which adds complexity and hurts the
readability of the code. Many of these situations could be solved using
functions similar to the ones I proposed.

On the other hand, if dict_t must truly be considered a concurrent
structure, there are a lot of race conditions that might appear when
doing some operations. It would require a great effort to take care of
all these possibilities everywhere. It would be better to pack most of
these situations into functions inside dict_t itself, where it is
easier to combine some operations.
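
As a rough illustration of what I mean by an atomic "lookup and remove
if found", here is a minimal sketch in C. It uses a hypothetical,
simplified dictionary (sdict_t, sdict_set, sdict_take are invented names,
not the real dict_t API) protected by a single pthread mutex. The point
is only that the lookup and the unlink happen inside one critical
section, so no other thread can modify the entry between the two steps,
and the caller checks a single return code.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified dictionary; not the real GlusterFS dict_t. */
typedef struct spair {
    struct spair *next;
    char *key;
    void *value;
} spair_t;

typedef struct {
    pthread_mutex_t lock;
    spair_t *head;
} sdict_t;

void sdict_init(sdict_t *d) {
    pthread_mutex_init(&d->lock, NULL);
    d->head = NULL;
}

/* Adds key -> value. Returns 0 on success, -1 on allocation failure. */
int sdict_set(sdict_t *d, const char *key, void *value) {
    size_t len = strlen(key) + 1;
    spair_t *p = malloc(sizeof(*p));
    if (p == NULL) return -1;
    p->key = malloc(len);
    if (p->key == NULL) { free(p); return -1; }
    memcpy(p->key, key, len);
    p->value = value;
    pthread_mutex_lock(&d->lock);
    p->next = d->head;
    d->head = p;
    pthread_mutex_unlock(&d->lock);
    return 0;
}

/* Looks up key and removes it in one critical section.
 * Returns 0 and stores the value in *out if found, -1 otherwise. */
int sdict_take(sdict_t *d, const char *key, void **out) {
    int ret = -1;
    pthread_mutex_lock(&d->lock);
    for (spair_t **pp = &d->head; *pp != NULL; pp = &(*pp)->next) {
        if (strcmp((*pp)->key, key) == 0) {
            spair_t *found = *pp;
            *pp = found->next;      /* unlink while still holding the lock */
            *out = found->value;
            free(found->key);
            free(found);
            ret = 0;
            break;
        }
    }
    pthread_mutex_unlock(&d->lock);
    return ret;
}
```

With separate get and delete calls, the caller would need two calls and
two error checks, and another thread could change the entry in between;
here a second sdict_take on the same key simply returns -1.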

By the way, I've run some tests with multiple bricks, and there seems to
be a clear speed loss on directory listings as the number of bricks
increases. Since bricks should be independent and can work in parallel,
I didn't expect such a big performance degradation. However, the tests
were neither exhaustive nor run under the best conditions, so they might
be misleading. Anyway, it seems to me that there might be a problem with
some mutexes that force too much serialization of requests (though I
have no real proof; it's only a feeling). Maybe some more asynchrony in
calls between translators could help.

Just some thoughts...

Best regards,

Xavi

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
https://lists.nongnu.org/mailman/listinfo/gluster-devel