I actually forgot to cc linux-fsdevel on this one. Vladimir found a corner case bug in the case of faulting source address, which is since fixed, but might be interesting to anyone else following development... ----- Forwarded message from Nick Piggin <npiggin@xxxxxxxxxxxxx> ----- Date: Wed, 16 May 2007 09:14:06 +0200 From: Nick Piggin <npiggin@xxxxxxxxxxxxx> To: "Vladimir V. Saveliev" <vs@xxxxxxxxxxx> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> Subject: Re: [patch 00/41] Buffered write deadlock fix and new aops for 2.6.21-mm2 In-Reply-To: <200705152200.38257.vs@xxxxxxxxxxx> User-Agent: Mutt/1.5.9i cc'ed linux-fsdevel again... On Tue, May 15, 2007 at 10:00:38PM +0400, Vladimir V. Saveliev wrote: > Hello > > On Tuesday 15 May 2007 02:42, Nick Piggin wrote: > > On Mon, May 14, 2007 at 10:28:45PM +0400, Vladimir V. Saveliev wrote: > > > Hello > > > > > > There is a problem with new write. > > > > > > If you expand an empty file with truncate and then write so that in one write file tail is overwritten and something is appended to the file - the write loops forever > > > writing to page containing file tail. Something wrong happens writing to uptodate last page of a file, I guess. > > > > > > I can send a simple program if necessary. > > > > Is this reiserfs with your reiserfs patch? > > No, this is common problem. I get it easily on ext3 and ext2. > > > Yes, please send a simple > > program and I'll have a look. Thanks, that was really helpful! What the program does is to write a non-faulted page into pagecache, which means we're relying on fault_in_pages_readable to bring it in for us. Usually it does, however your specific write pattern required that 2 pages be brought in, and _also_ that the total number of bytes to write went past that 2nd page. This caused the 1st and an Nth (>2) page to be faulted in, but the atomic copy_from_user really needed the 2nd page :) This fixes it for me. --- Index: linux-2.6/include/linux/fs.h =================================================================== --- linux-2.6.orig/include/linux/fs.h 2007-05-16 16:58:41.000000000 +1000 +++ linux-2.6/include/linux/fs.h 2007-05-16 16:58:47.000000000 +1000 @@ -419,7 +419,7 @@ size_t iov_iter_copy_from_user(struct page *page, struct iov_iter *i, unsigned long offset, size_t bytes); void iov_iter_advance(struct iov_iter *i, size_t bytes); -int iov_iter_fault_in_readable(struct iov_iter *i); +int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes); size_t iov_iter_single_seg_count(struct iov_iter *i); static inline void iov_iter_init(struct iov_iter *i, Index: linux-2.6/mm/filemap.c =================================================================== --- linux-2.6.orig/mm/filemap.c 2007-05-16 14:11:18.000000000 +1000 +++ linux-2.6/mm/filemap.c 2007-05-16 17:03:29.000000000 +1000 @@ -1806,11 +1806,10 @@ } EXPORT_SYMBOL(iov_iter_advance); -int iov_iter_fault_in_readable(struct iov_iter *i) +int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes) { - size_t seglen = min(i->iov->iov_len - i->iov_offset, i->count); char __user *buf = i->iov->iov_base + i->iov_offset; - return fault_in_pages_readable(buf, seglen); + return fault_in_pages_readable(buf, bytes); } EXPORT_SYMBOL(iov_iter_fault_in_readable); @@ -2102,7 +2101,7 @@ * to check that the address is actually valid, when atomic * usercopies are used, below. */ - if (unlikely(iov_iter_fault_in_readable(i))) { + if (unlikely(iov_iter_fault_in_readable(i, bytes))) { status = -EFAULT; break; } @@ -2276,7 +2275,7 @@ * to check that the address is actually valid, when atomic * usercopies are used, below. */ - if (unlikely(iov_iter_fault_in_readable(i))) { + if (unlikely(iov_iter_fault_in_readable(i, bytes))) { status = -EFAULT; break; } ----- End forwarded message ----- - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html