> That's weird, since the only purpose of the mem-pool was precisely to
> improve performance of allocation of objects that are frequently
> allocated/released.

Very true, and I've long been an advocate of this approach.
Unfortunately, for this to work our allocator has to be more efficient
than the system's, and it's not - especially wrt locking. Overhead is
high and contention is even higher, heavily outweighing any advantage.
Unless/until we put in the work to make mem-pools perform better at
high thread counts, avoiding them seems like the practical choice.

> * Consider http://review.gluster.org/15036/. With all communications
> going through the same socket, the problem this patch tries to solve
> could become worse.

I'll look into this. Thanks!

> * We should consider the possibility of implementing a global thread
> pool, which would replace io-threads, epoll threads and maybe others.
> Synctasks should also rely on this thread pool. This has the benefit
> of better controlling the total number of threads. Otherwise when we
> have more threads than processor cores, we waste resources
> unnecessarily and we won't get a real gain. Even worse, it could start
> to degrade due to contention.

Also a good idea, though perhaps too hard/complex to tackle in the
short term. I did take a stab at making io-threads use a single global
set of queues instead of per instance, to address a similar concern.
To make a long story short, it didn't seem to make things any better
for this test. I still think it's a good idea, though.

> * There are *too many* mutexes in the code.

Hear, hear.

> We should drastically reduce its use. Sometimes by using better
> structures that do not require blocking at all or even introducing RCU
> and/or rwlocks. One case that I've always had doubts is dict_t. Why
> does it need locks? One xlator should not modify a dict_t once it
> has been passed to another xlator, and if we assume that a dict can
> only be modified by a single xlator at a time, it's very unlikely that
> it needs to modify it from multiple threads.

I think in general you're right about dicts, but I also think it would
be interesting to disable dict locking and see what breaks. I'll bet
there's something *somewhere* that tries to access dicts concurrently.
Callbacks for children of a cluster translator using the "fan out"
pattern seem particularly suspect. What worries me is the classic
problem with race conditions; it's easy to have something that
*appears* to work when things aren't running in parallel enough to hit
tiny timing windows, but it's a lot harder to be *sure* you're safe
even when they are. I think I'd lean toward a more conservative
approach of finding the particularly egregious high-contention cases,
examining those particular code paths carefully, and changing them to
use a lock-free dict variant or alternative.
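
For what it's worth, one way to run the "disable dict locking and see
what breaks" experiment without waiting for a crash is to replace the
lock with a cheap overlap detector that never blocks and only counts
the times two threads are inside a mutating call at once. Below is a
rough sketch, assuming C11 atomics are available. probe_dict_t and the
probe_* functions are made-up names for illustration, not anything in
the GlusterFS tree, and the counter merely stands in for the real dict
contents.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dict whose lock has been removed.
 * This is NOT GlusterFS's dict_t; all names here are illustrative. */
typedef struct {
    atomic_int  writers;     /* threads currently inside a mutating call */
    atomic_long violations;  /* number of times a caller found someone else inside */
    long        count;       /* unprotected payload, stands in for dict contents */
} probe_dict_t;

static void probe_dict_init(probe_dict_t *d)
{
    atomic_init(&d->writers, 0);
    atomic_init(&d->violations, 0);
    d->count = 0;
}

/* Mark entry into a mutating section. Never blocks.
 * Returns true if the caller is alone, false if it overlapped with
 * another caller (in which case the overlap is also counted). */
static bool probe_enter(probe_dict_t *d)
{
    if (atomic_fetch_add(&d->writers, 1) != 0) {
        atomic_fetch_add(&d->violations, 1);
        return false;
    }
    return true;
}

/* Mark exit from a mutating section; must pair with probe_enter(). */
static void probe_leave(probe_dict_t *d)
{
    atomic_fetch_sub(&d->writers, 1);
}

/* A mutating operation with the lock replaced by the detector. */
static void probe_dict_set(probe_dict_t *d)
{
    probe_enter(d);
    d->count++;              /* deliberately unlocked */
    probe_leave(d);
}

static long probe_dict_violations(probe_dict_t *d)
{
    return atomic_load(&d->violations);
}
```

A zero violation count proves nothing, for the timing-window reason
above, but a non-zero count on some code path is direct evidence that
the path does touch a dict concurrently, and so a candidate for the
careful examination.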