On Mon 28-11-11 11:33:40, Wu Fengguang wrote:
> On Mon, Nov 28, 2011 at 11:08:42AM +0800, Wu Fengguang wrote:
> > On Wed, Nov 23, 2011 at 11:06:56PM +0800, Theodore Ts'o wrote:
> > >
> > > On Nov 23, 2011, at 8:27 AM, Wu Fengguang wrote:
> > > >
> > > > Reading Ted's information feed, I tend to disregard the partial write
> > > > issue: since the "broken" applications will already fail and get
> > > > punished in various other cases, I don't mind adding one more penalty
> > > > case for them :-P
> > >
> > > Just wait until you have a bunch of rabid application programmers,
> > > questioning your judgement, your morality, and even your paternity.
> > > :-)
> >
> > Ah OK, that sounds frightening. Hmm, so far everyone has acknowledged
> > the possibility of data corruption; people just have different
> > perceptions of its severity.
> >
> > Let's rethink things this way: "Is it a _worthwhile_ risk at all?"
> > I'm afraid not. Considering the origin of this patch
> >
> >   [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
> >   http://www.spinics.net/lists/linux-ext4/msg28464.html
> >
> > I *think* Jan's first patch is already enough to fix the bug. IOW,
> > the patch we worried about and discussed so much is really an optional
> > one. I would imagine the easy and safe solution is to just drop it.
> > Any objections?
>
> Here is the replacement patch.
>
> ---
> Subject: writeback: add a safety limit to the SIGKILL abort
> Date: Mon Nov 28 11:16:54 CST 2011
>
> This adds a safety limit to the SIGKILL abort introduced in commit
> 499d05ecf990 ("mm: Make task in balance_dirty_pages() killable"). It
> keeps the number of dirty pages from rushing arbitrarily high when a
> task receives SIGKILL and hence becomes *unthrottled* in the middle of
> a huge write.
>
> The alternative would be to check for SIGKILL and return a partial
> write in generic_perform_write(). However, that can lead to data
> corruption, as put by Andrew Morton:
>
>   Previously if an app did write(file, 128k) and was hit with SIGKILL,
>   it would write either 0 bytes or 128k bytes. Now, it can write 36k
>   bytes, yes? If the target file consisted of a stream of 128k records
>   then the user will claim, with some justification, that Linux
>   corrupted it.
>
> Signed-off-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>

  The patch looks good. You can add:
Reviewed-by: Jan Kara <jack@xxxxxxx>

								Honza

> ---
>  mm/page-writeback.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> --- linux-next.orig/mm/page-writeback.c	2011-11-28 11:13:58.000000000 +0800
> +++ linux-next/mm/page-writeback.c	2011-11-28 11:16:52.000000000 +0800
> @@ -1136,7 +1136,8 @@ pause:
>  		if (task_ratelimit)
>  			break;
>
> -		if (fatal_signal_pending(current))
> +		if (fatal_signal_pending(current) &&
> +		    nr_dirty <= dirty_thresh + dirty_thresh / 2)
>  			break;
>  	}

-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
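
As a minimal illustration of the new exit condition (a userspace sketch, not the kernel code; may_abort_throttle() is a made-up helper name), the extra check lets a SIGKILLed task leave the throttling loop only while nr_dirty stays within 1.5x dirty_thresh:

/* sketch.c - the break condition from the patch above, in isolation. */
#include <stdbool.h>
#include <stdio.h>

/*
 * Would the throttling loop be allowed to exit early?  A task with a
 * fatal signal pending may bail out only while the dirty page count is
 * still at most 1.5x the dirty threshold; beyond that it keeps being
 * throttled, so dirty pages cannot rush arbitrarily high.
 */
static bool may_abort_throttle(bool fatal_signal_pending,
			       unsigned long nr_dirty,
			       unsigned long dirty_thresh)
{
	return fatal_signal_pending &&
	       nr_dirty <= dirty_thresh + dirty_thresh / 2;
}

int main(void)
{
	unsigned long dirty_thresh = 1000;	/* arbitrary example value */

	/* SIGKILL pending, dirty pages just under the 1.5x cap: may abort */
	printf("%d\n", may_abort_throttle(true, 1500, dirty_thresh));	/* 1 */
	/* SIGKILL pending, dirty pages past the cap: keep throttling */
	printf("%d\n", may_abort_throttle(true, 1501, dirty_thresh));	/* 0 */
	/* no fatal signal: the check never triggers */
	printf("%d\n", may_abort_throttle(false, 100, dirty_thresh));	/* 0 */
	return 0;
}

Past that 1.5x cap, even a task with a fatal signal pending keeps pausing in balance_dirty_pages(), so the dirty page count stays bounded no matter how large the aborted write was.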