On Mon 14-11-11 14:19:54, Andrew Morton wrote:
> On Mon, 14 Nov 2011 17:15:24 +0100
> Jan Kara <jack@xxxxxxx> wrote:
> 
> > Currently write(2) to a file is not interruptible by a signal. Sometimes
> > the ability to interrupt it is desirable (e.g. when you want to quickly
> > kill a process hogging your disk, or when some process gets blocked in
> > balance_dirty_pages() indefinitely due to a filesystem being in an error
> > condition).
> > 
> > Reported-by: Kazuya Mio <k-mio@xxxxxxxxxxxxx>
> > Tested-by: Kazuya Mio <k-mio@xxxxxxxxxxxxx>
> > Signed-off-by: Jan Kara <jack@xxxxxxx>
> > ---
> >  mm/filemap.c |   11 +++++++++--
> >  1 files changed, 9 insertions(+), 2 deletions(-)
> > 
> > diff --git a/mm/filemap.c b/mm/filemap.c
> > index c0018f2..166b30e 100644
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -2407,7 +2407,6 @@ static ssize_t generic_perform_write(struct file *file,
> >  						iov_iter_count(i));
> >  
> >  again:
> > -
> >  		/*
> >  		 * Bring in the user page that we will copy from _first_.
> >  		 * Otherwise there's a nasty deadlock on copying from the
> > @@ -2463,7 +2462,15 @@ again:
> >  		written += copied;
> >  
> >  		balance_dirty_pages_ratelimited(mapping);
> > -
> > +		/*
> > +		 * We check the signal independently of balance_dirty_pages()
> > +		 * because we need not wait and check for signal there although
> > +		 * this loop could have taken significant amount of time...
> > +		 */
> > +		if (fatal_signal_pending(current)) {
> > +			status = -EINTR;
> > +			break;
> > +		}
> >  	} while (iov_iter_count(i));
> >  
> >  	return written ? written : status;
> 
> Will this permit the interruption of things like fsync() or sync()?  If
> so, considerable pondering is needed.
  No, fsync() or sync() are not affected.

> Also I worry about stuff like the use of buffered write to finish off
> loose ends in direct-IO writing.  Sometimes these writes MUST complete,
> to prevent exposing unwritten disk blocks to a subsequent read.  Will a
> well-timed ^C break this?  If "no" then does this change introduce risk
> that we'll later accidentally introduce such a security hole?
  That's a good question! A well-timed SIGKILL can indeed stop the buffered
write that finishes off bits of a direct IO write. But looking into the code
I fail to see where we rely on that buffered write completing. I would
consider it close to a filesystem bug to return from direct IO with a
potential for stale data exposure in the first place. For writes beyond
i_size we increase i_size only by the amount that was successfully written,
so there is no risk of exposure there. Otherwise direct IO would have to
fill a hole by allocating a block - ext? filesystems, reiserfs, ocfs2, and
other simple filesystems avoid doing that, and XFS does fill holes but takes
care to initialize everything. So I don't think any filesystem really
expects the buffered write to cover its back.

  To sum up, the only consequence of an interrupted buffered write after a
direct IO write seems to be that some other application could see only part
of the write completed. That is already possible with multipage buffered
writes, and e.g. with ext3/4 such a situation can arise even without these
patches (due to conditions like ENOSPC or a system crash - admittedly more
serious conditions than someone sending SIGKILL).

  Thanks for your comments.

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
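
The "return written ? written : status;" convention quoted above means a
write cut short by a fatal signal reports however many bytes it did manage
to write, and falls back to -EINTR only when nothing was written. The
sketch below is purely illustrative and not part of the patch (the
write_all() helper is hypothetical); it shows the usual way userspace
already copes with short or interrupted write(2) calls, which is why
partial completion is not a new class of problem for applications:

#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the patch: write the whole buffer,
 * retrying after short writes and after EINTR from caught signals.
 */
static ssize_t write_all(int fd, const void *buf, size_t count)
{
	const char *p = buf;
	size_t done = 0;

	while (done < count) {
		ssize_t ret = write(fd, p + done, count - done);

		if (ret < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a caught signal, retry */
			return -1;		/* real error, errno is set */
		}
		done += (size_t)ret;		/* short write: keep looping */
	}
	return (ssize_t)done;
}

Note that a process hit by SIGKILL never gets to run the EINTR branch at
all - it simply dies, possibly leaving a partially written file behind,
which is the situation the mail argues is already possible for other
reasons (ENOSPC, a system crash).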