On Mon, 2008-12-08 at 17:12 -0500, Theodore Tso wrote: > On Sun, Dec 07, 2008 at 08:52:50PM -0800, Andrew Morton wrote: > > > > The first patch which was added (pre-2.6.27) was "percpu_counter: new > > function percpu_counter_sum_and_set". This added the broken-by-design > > percpu_counter_sum_and_set() function, **and used it in ext4**. > > > > Mea culpa, I was the one who reviewed Mingming's patch, and missed > this. Part of the problem was that percpu_counter.c isn't well > documented, and I so saw the spinlock, but didn't realize it only > protected reference counter, and not the per-cpu array. I should have > read through code more thoroughly before approving the patch. > > I suppose if we wanted we could add a rw spinlock which mediates > access to a "foreign" cpu counter (i.e., percpu_counter_add gets a > shared lock, and percpu_counter_set needs an exclusive lock) but it's > probably not worth it. rwlocks are utter suck and should be banished from the kernel - adding one would destroy the whole purpose of the code. > Actually, if all popular architectures had a hardware-implemented > atomic_t, I wonder how much ext4 really needs the percpu counter, > especially given ext4's multiblock allocator; with ext3, given that > each block allocation required taking a per-filesystem spin lock, > optimizing away that spinlock was far more important for improving > ext3's scalability. But with the multiblock allocator, it may that > we're going through a lot more effort than what is truly necessary. atomic_t is pretty good on all archs, but you get to keep the cacheline ping-pong. -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html