On Tue, 3 Jun 2014, Namjae Jeon wrote:

> Date: Tue, 03 Jun 2014 15:04:32 +0900
> From: Namjae Jeon <namjae.jeon@xxxxxxxxxxx>
> To: 'Theodore Ts'o' <tytso@xxxxxxx>
> Cc: 'Lukáš Czerner' <lczerner@xxxxxxxxxx>,
>     'linux-ext4' <linux-ext4@xxxxxxxxxxxxxxx>,
>     'Ashish Sangwan' <a.sangwan@xxxxxxxxxxx>
> Subject: RE: [PATCH 1/2] ext4: introduce new i_write_mutex to protect fallocate
>
> >
> > On Sat, May 31, 2014 at 03:45:36PM +0900, Namjae Jeon wrote:
> > > ext4 file write is already serialized with the inode mutex.
> >
> > Right, I had forgotten about that. The case where we really care
> > about parallel writes is the direct I/O case, and eventually I'd
> > like for us to be able to support non-overwriting/non-isize-extending
> > writes in parallel, but we're not there yet.
> Okay.
> >
> > > So I think the impact of adding another lock will be very small.
> > > When I ran a parallel-write test with fio to check, I could not see
> > > any difference with or without i_write_mutex.
> >
> > If there is an impact, it won't show up there. Where it will show up
> > is in high-scalability workloads. For people who don't have the
> > half-million-dollar (and up) RAID arrays, a fairly good facsimile is
> > to use a >16-core system, preferably one with at least 4 sockets,
> > and say 32 or 64 GB of memory, of which you can dedicate half to a
> > ramdisk. Then run the fio scalability benchmark in that scenario.
> > That way, things like cache-line bounces and lock contention will be
> > much more visible when the system is no longer bottlenecked by the
> > HDD.
> Yes, right. I agree the results should be measured on a high-end server,
> as you pointed out. Unfortunately I don't have such equipment yet.
> >
> > > Yes, right. We can use a shared lock to remove a bit of the lock
> > > contention in the ext4 file write path. I will share an rwsem lock
> > > patch. Could you please revert the i_write_mutex patch?
> >
> > So the shared lock will help somewhat (since writes will be far more
> > common than fallocate calls), but I suspect not all that much. And if
> > I revert the i_write_mutex patch now, we won't have time to replace it
> > with a different patch, since the merge window is already open.
> >
> > And since this patch is needed to fix an xfstests failure (although
> > it's for collapse range in data journalling mode, so not a common
> > case), if we can't really see a performance loss in the much more
> > common server configurations, I'm inclined to leave it in for now, and
> > we can try to improve performance in the next kernel revision.
> IMHO, if our goal is to fix the xfstests failure, we can take just the
> "ext4: fix ZERO_RANGE test failure in data journalling" patch without
> the i_write_mutex patch, and add locking for fallocate in the next
> kernel cycle after checking it with sufficient time.

I would rather go with this solution. The race is not terribly critical,
and this way we would have more time to come up with proper locking,
including proper locking for AIO/DIO, because from my measurements I can
see only about 50% of the performance that xfs achieves. I believe the
reason is that we're currently using the stock VFS locking, but we should
be able to do something smarter than that.

Thanks!
-Lukas

>
> Thanks!
> >
> > What do other people think?
> >
> >                                         - Ted
> >
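
For reference, below is a minimal userspace sketch of the shared-lock scheme
discussed above: ordinary writes take the lock shared, so they do not
serialize against one another, while a fallocate-style range operation takes
it exclusive. This is only an analogue built on POSIX rwlocks rather than the
kernel rwsem, and the thread counts, sizes, and helper names (writer,
range_op) are illustrative assumptions, not the actual ext4 patch.

/*
 * Userspace analogue of the shared-lock idea (not the ext4 rwsem patch):
 * writers take the lock shared and run concurrently; the rare range
 * operation takes it exclusive and waits for in-flight writes to drain.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

static pthread_rwlock_t i_write_lock = PTHREAD_RWLOCK_INITIALIZER;
static atomic_long bytes_written;

/* Common path: many writers can hold the lock shared at the same time. */
static void *writer(void *arg)
{
    for (int i = 0; i < 100000; i++) {
        pthread_rwlock_rdlock(&i_write_lock);
        /* stand-in for a buffered write */
        atomic_fetch_add(&bytes_written, 4096);
        pthread_rwlock_unlock(&i_write_lock);
    }
    return NULL;
}

/* Rare path: a collapse/zero range operation must exclude all writers. */
static void *range_op(void *arg)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };

    pthread_rwlock_wrlock(&i_write_lock);
    nanosleep(&ts, NULL);               /* stand-in for the range operation */
    pthread_rwlock_unlock(&i_write_lock);
    return NULL;
}

int main(void)
{
    pthread_t w[8], f;

    for (int i = 0; i < 8; i++)
        pthread_create(&w[i], NULL, writer, NULL);
    pthread_create(&f, NULL, range_op, NULL);

    for (int i = 0; i < 8; i++)
        pthread_join(w[i], NULL);
    pthread_join(f, NULL);

    printf("total bytes written: %ld\n", atomic_load(&bytes_written));
    return 0;
}

Built with gcc -pthread, scaling up the number of writer threads is where an
exclusive mutex on the write path would be expected to start serializing;
with the shared lock, writers only wait while the rare exclusive operation
is running.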