Ted Ts'o wrote:
On Mon, Aug 02, 2010 at 07:52:29PM -0700, john stultz wrote:
With the non-VFS-scalability patched kernels, we see that the
j_state_lock and atomic changes pull start_this_handle out of the top
contender spot, but there is still quite a large amount of contention
on the dput paths.
So yeah, the change does help, but it's just not the top cause of
contention when we aren't using the VFS patches, so we don't see as
much benefit at this point.
Great, thanks for uploading the lockstats. Since dbench is so
metadata heavy, it makes a lot of sense that further jbd2
optimizations probably won't make much difference until the VFS
bottlenecks can be solved.
Other benchmarks, such as the FFSB benchmarks used by Steven Pratt and
Eric Whitney, would probably show more of a difference.
In any case, I've just sent two more patches which completely remove
any exclusive spinlocks from start_this_handle() by converting
j_state_lock to a rwlock_t, and dropping the need to take
t_handle_lock. This will add more cache line bouncing, so on NUMA
workloads this may make things worse, but I guess we'll have to see.
Anyone have access to an SGI Altix? I'm assuming the old Sequent NUMA
boxes are long gone by now...
- Ted
Ted:
The 48-core system I'm running on is an eight-node NUMA box with
three-hop worst-case latency, and it tends to let you know if you're bouncing
cache lines too enthusiastically. If someone's got access to a larger
system with more hops across the topology, that would naturally be even
better.
I'm taking my 2.6.35 ext4 baseline now. It takes me about 36 hours of
running time to get a complete set of runs in, so with luck I should
have data and lockstats to post in a few days.
Eric
P.S. And yes, I think I'll make a set of non-accidental no-journal runs
this time as well...