> What level of such change do we expect in the 3.x development stream?
> There are problems with glusterd that even reader-writer locks or RCU
> won't solve, which is why there's already work in progress toward a 4.x
> version. Perhaps it's selfish of me, but I'd like to see as much of our
> effort as possible directed toward the longer-term solution. Perhaps a
> more detailed list of problems we have or anticipate in 3.x would help
> us reason about how much effort there is justified.

This effort is an attempt to solve the shared-memory consistency issues within a single process. It doesn't overlap with our longer-term effort to solve configuration store consistency across distributed processes.

At this juncture, I don't think it is wise to rewrite parts or the whole of glusterd in a higher-level language (HLL). We should definitely consider this later. We are starting many small efforts to stabilise and modularise glusterd. This is to make longer-term improvements less difficult. It could also make rewriting parts of glusterd in an HLL easier and more incremental.

We are still exploring how well RCU would fit as a synchronization solution for the concurrency-related issues in glusterd. We will soon share our estimate of the changes RCU would bring with it. I hope that will help us decide whether we should do this alongside the longer-term efforts. Does that make sense?

> I agree that finer-grain locking is not the answer. Computing history
> is full of stories about reducing lock granularity only to find that
> the result is both slower and more prone to deadlock.

One important example of these concurrency-related issues: when a glusterd comes back up (or reconnects to the cluster), it receives, from all available peers, information about the current view of the cluster (crudely, the list of volumes and the list of peers) as seen by each of those peers. glusterd initiates the transition to the view provided by a peer while still receiving 'newer' updates.
As part of the transition to the 'newer' view, glusterd destroys data structures associated with the 'older' view. If the 'older' view is in use, i.e. if a thread is spawning a brick in the 'older' view, glusterd may crash (a classic "use after free").

RCU's "copy and update" could be used to prevent premature freeing of data structures associated with the 'older' view. We expect fewer code changes with RCU than with other granular locking mechanisms, where we would need to do a whole lot* by ourselves.

* - we would need to make all our shared data structures ref-counted.
  - we would need to add code nearly everywhere to ref-count properly.

It is only harder, not impossible.

Does that provide a better context to this proposal?
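
To make the "do it by ourselves" alternative concrete, here is a minimal sketch of ref-counting a shared view so that a thread still working in the 'older' view keeps it alive across the switch. This is not glusterd code: `cluster_view_t`, `view_get`, `view_put`, `view_publish` and the `views_freed` counter are hypothetical names made up for illustration, and a real view would carry volume and peer lists rather than a bare generation number.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for glusterd's view of the cluster. */
typedef struct cluster_view {
    atomic_int refs;       /* one ref for "current", plus one per active user */
    int        generation; /* placeholder for the volume and peer lists */
} cluster_view_t;

static pthread_mutex_t current_lock = PTHREAD_MUTEX_INITIALIZER;
static cluster_view_t *current_view = NULL;
static atomic_int      views_freed  = 0; /* illustration only: counts frees */

static cluster_view_t *view_new(int generation)
{
    cluster_view_t *v = malloc(sizeof(*v));
    if (!v)
        return NULL;
    atomic_init(&v->refs, 1); /* the reference owned by current_view */
    v->generation = generation;
    return v;
}

/* Take a reference on the current view. The lock makes "read the pointer"
 * and "bump the count" one step with respect to view_publish(). */
cluster_view_t *view_get(void)
{
    pthread_mutex_lock(&current_lock);
    cluster_view_t *v = current_view;
    if (v)
        atomic_fetch_add(&v->refs, 1);
    pthread_mutex_unlock(&current_lock);
    return v;
}

/* Drop a reference; whoever drops the last one frees the view. */
void view_put(cluster_view_t *v)
{
    if (!v)
        return;
    int prev = atomic_fetch_sub(&v->refs, 1);
    assert(prev > 0);
    if (prev == 1) {
        atomic_fetch_add(&views_freed, 1);
        free(v);
    }
}

/* Switch to a newer view. The older view is not destroyed here; only the
 * reference held by current_view is dropped. */
int view_publish(int generation)
{
    cluster_view_t *fresh = view_new(generation);
    if (!fresh)
        return -1;
    pthread_mutex_lock(&current_lock);
    cluster_view_t *old = current_view;
    current_view = fresh;
    pthread_mutex_unlock(&current_lock);
    view_put(old);
    return 0;
}
```

A thread spawning a brick would call `view_get()` before it starts and `view_put()` when it is done, so a `view_publish()` in between cannot free the view under it. Every user of every shared structure needs that same pairing, which is the "add code nearly everywhere" cost mentioned above. With a userspace RCU library, the explicit get/put would instead become read-side critical sections, and freeing the older view would be deferred until the readers have finished.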