On Fri, Sep 21, 2007 at 11:15:42PM -0400, Steven Rostedt wrote:
> On Fri, 21 Sep 2007, Paul E. McKenney wrote:
> > On Fri, Sep 21, 2007 at 09:15:03PM -0400, Steven Rostedt wrote:
> > > On Fri, 21 Sep 2007, Paul E. McKenney wrote:
> > > > On Fri, Sep 21, 2007 at 10:40:03AM -0400, Steven Rostedt wrote:
> > > > > On Mon, Sep 10, 2007 at 11:34:12AM -0700, Paul E. McKenney wrote:

[ . . . ]

> > > Are we sure that adding all these grace periods stages is better than just
> > > biting the bullet and put in a memory barrier?
> > >
> > Good question.  I believe so, because the extra stages don't require
> > much additional processing, and because the ratio of rcu_read_lock()
> > calls to the number of grace periods is extremely high.  But, if I
> > can prove it is safe, I will certainly decrease GP_STAGES or otherwise
> > optimize the state machine.
> >
> But until others besides yourself understand that state machine (doesn't
> really need to be me) I would be worried about applying it without
> barriers.  The barriers may add a bit of overhead, but it adds some
> confidence in the code.  I'm arguing that we have barriers in there until
> there's a fine understanding of why we fail with 3 stages and not 4.
> Perhaps you don't have a box with enough cpus to fail at 4.
>
> I don't know how the higher ups in the kernel command line feel, but I
> think that memory barriers on critical sections are justified.  But if you
> can show a proof that adding extra stages is sufficient to deal with
> CPUS moving memory writes around, then so be it.  But I'm still not
> convinced that these extra stages are really solving the bug instead of
> just making it much less likely to happen.
>
> Ingo praised this code since it had several years of testing in the RT
> tree.  But that version has barriers, so this new version without the
> barriers has not had that "run it through the grinder" feeling to it.

Fair point...
Though the -rt variant has its shortcomings as well, such as being
unusable from NMI/SMI handlers.

How about this: I continue running the GP_STAGES=3 run on the pair of
POWER machines (which are both going strong), and I also get a document
together describing the new version (and of course apply the changes we
have discussed, and merge with recent CPU-hotplug changes -- Gautham
Shenoy is currently working this), work out a good answer to "how big
exactly does GP_STAGES need to be", test whatever that number is, assuming
it is neither 3 nor 4, and figure out why the gekko-lp1 machine choked
on GP_STAGES=3.

Then we can work out the best path forward from wherever that ends up
being.

[ . . . ]

						Thanx, Paul