On Tue, 3 Jun 2014, Namjae Jeon wrote:

> Date: Tue, 03 Jun 2014 15:04:32 +0900
> From: Namjae Jeon <namjae.jeon@xxxxxxxxxxx>
> To: 'Theodore Ts'o' <tytso@xxxxxxx>
> Cc: 'Lukáš Czerner' <lczerner@xxxxxxxxxx>,
>     'linux-ext4' <linux-ext4@xxxxxxxxxxxxxxx>,
>     'Ashish Sangwan' <a.sangwan@xxxxxxxxxxx>
> Subject: RE: [PATCH 1/2] ext4: introduce new i_write_mutex to protect fallocate
>
> >
> > On Sat, May 31, 2014 at 03:45:36PM +0900, Namjae Jeon wrote:
> > > ext4 file write is already serialized with the inode mutex.
> >
> > Right, I had forgotten about that. The case where we really care
> > about parallel writes is the direct I/O case, and eventually I'd
> > like for us to be able to support non-overwriting/non-isize-extending
> > writes in parallel, but we're not there yet.
> Okay.
> >
> > > So I think the impact of adding another lock will be very small.
> > > When I ran a parallel-write test with fio to check, I could not see
> > > any difference with or without i_write_mutex.
> >
> > If there is an impact, it won't show up there. Where it will show up
> > is in high-scalability workloads. For people who don't have the
> > half-million-dollar (and up) RAID arrays, a fairly good facsimile is
> > to use a >16-core system, preferably one with at least 4 sockets,
> > and say 32 or 64 GB of memory, of which you can dedicate half to a
> > ramdisk. Then run the fio scalability benchmark in that scenario.
> > That way, things like cache-line bounces and lock contention will be
> > much more visible when the system is no longer bottlenecked by the
> > HDD.
> Yes, right. I agree the results should be measured on a high-end server,
> as you pointed out. Unfortunately I don't have such equipment yet.
> >
> > > Yes, right. We can use a shared lock to remove a bit of the lock
> > > contention in the ext4 file write path. I will share an rwsem lock
> > > patch. Could you please revert the i_write_mutex patch?
> >
> > So the shared lock will help somewhat (since writes will be far more
> > common than fallocate calls), but I suspect not all that much. And if
> > I revert the i_write_mutex patch now, we won't have time to replace it
> > with a different patch, since the merge window is already open.
> >
> > And since this patch is needed to fix an xfstests failure (although
> > it's for collapse range in data journalling mode, so not a common
> > case), if we can't really see a performance loss in the much more
> > common server configurations, I'm inclined to leave it in for now, and
> > we can try to improve performance in the next kernel revision.
> IMHO, if our goal is to fix the xfstests failure, we can take just the
> "ext4: fix ZERO_RANGE test failure in data journalling" patch without
> the i_write_mutex patch, and add locking for fallocate in the next
> kernel cycle after checking it with sufficient time.

I would rather go with this solution. The race is not terribly critical,
and this way we would have more time to come up with proper locking,
including proper locking for AIO/DIO, because from my measurements I can
see only about 50% of the performance that xfs achieves. I believe the
reason is that we're currently using the stock VFS locking, but we should
be able to do something smarter than that.

Thanks!
-Lukas

>
> Thanks!
> >
> > What do other people think?
> >
> >                                         - Ted
> >
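
For reference, below is a minimal userspace sketch of the shared-lock scheme
discussed above: ordinary writes take the lock shared, so they do not
serialize against one another, while a fallocate-style range operation takes
it exclusive. This is only an analogue built on POSIX rwlocks rather than the
kernel rwsem, and the thread counts, sizes, and helper names (writer,
range_op) are illustrative assumptions, not the actual ext4 patch.

/*
 * Userspace analogue of the shared-lock idea (not the ext4 rwsem patch):
 * writers take the lock shared and run concurrently; the rare range
 * operation takes it exclusive and waits for in-flight writes to drain.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

static pthread_rwlock_t i_write_lock = PTHREAD_RWLOCK_INITIALIZER;
static atomic_long bytes_written;

/* Common path: many writers can hold the lock shared at the same time. */
static void *writer(void *arg)
{
    for (int i = 0; i < 100000; i++) {
        pthread_rwlock_rdlock(&i_write_lock);
        /* stand-in for a buffered write */
        atomic_fetch_add(&bytes_written, 4096);
        pthread_rwlock_unlock(&i_write_lock);
    }
    return NULL;
}

/* Rare path: a collapse/zero range operation must exclude all writers. */
static void *range_op(void *arg)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };

    pthread_rwlock_wrlock(&i_write_lock);
    nanosleep(&ts, NULL);               /* stand-in for the range operation */
    pthread_rwlock_unlock(&i_write_lock);
    return NULL;
}

int main(void)
{
    pthread_t w[8], f;

    for (int i = 0; i < 8; i++)
        pthread_create(&w[i], NULL, writer, NULL);
    pthread_create(&f, NULL, range_op, NULL);

    for (int i = 0; i < 8; i++)
        pthread_join(w[i], NULL);
    pthread_join(f, NULL);

    printf("total bytes written: %ld\n", atomic_load(&bytes_written));
    return 0;
}

Built with gcc -pthread, scaling up the number of writer threads is where an
exclusive mutex on the write path would be expected to start serializing;
with the shared lock, writers only wait while the rare exclusive operation
is running.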