Re: Announce: new-aops-1 for 2.6.21-rc3

Nick Piggin <npiggin@xxxxxxx> · Sun, 1 Apr 2007 07:42:18 +0200

Hi,

Progress is coming along: as well as the several fixes Mark has posted, we've
got a GFS2 patch from Steve as well, which also enables us to get rid of the
AOPS_TRUNCATED_PAGE handling in the legacy ->prepare_write paths. And I've
been auditing code and doing stress and code coverage tests for various
error paths...

I was hoping to have a cont_prepare_write filesystem converted before doing
a next patchset, but there are enough fixes and new stuff now that I'll
probably do one tomorrow.

Anyway my recent testing involved things like poisioning the memory of new
buffers, and injecting failures into various places (eg. the atomic
usercopies), then running fsx-linux and stuff on top of ext2 with various
block sizes. This found one subtle problem (the first, below), and
inspection found a couple of others.

With these fixes, the stress tests now survive as long as I've cared to wait.
Any comments always appreciated.

Thanks,
Nick

--

- The first hunk fixes a case where new buffers that were only partially
  cleared (because the write partially overlaps a buffer, so that region is
not cleared), and then the write fails before filling up this portion of the
buffer. Because we have already cleared buffer_new, page_zero_new_buffers
does not clear it properly. This results in uninitialised data going onto
disk (and in pagecache).  The solution is to only clear_buffer_new when we
know everything has been initialised, which makes sense.

- The second hunk fixes a thinko.

- The last hunk fixes a problem where we turn any short write to a !uptodate
page into a zero-length write, for the reason explained in comments. However
it was not taking this contraction into account in page_zero_new_buffers and
__block_commit_write, which is inconsistent.

Index: linux-2.6/fs/buffer.c
===================================================================

--- linux-2.6.orig/fs/buffer.c
+++ linux-2.6/fs/buffer.c
@@ -1852,17 +1852,8 @@ static int __block_prepare_write(struct 
 		if (!buffer_uptodate(*wait_bh))
 			err = -EIO;
 	}
-	if (!err) {
-		bh = head;
-		do {
-			if (buffer_new(bh))
-				clear_buffer_new(bh);
-		} while ((bh = bh->b_this_page) != head);
-		return 0;
-	}
-
-	/* Error case: */
-	page_zero_new_buffers(page, from, to);
+	if (unlikely(err))
+		page_zero_new_buffers(page, from, to);
 	return err;
 }
 
@@ -1895,7 +1886,7 @@ void page_zero_new_buffers(struct page *
 					end = min(to, block_end);
 
 					kaddr = kmap_atomic(page, KM_USER0);
-					memset(kaddr+start, 0, block_end-end);
+					memset(kaddr+start, 0, end - start);
 					flush_dcache_page(page);
 					kunmap_atomic(kaddr, KM_USER0);
 					set_buffer_uptodate(bh);
@@ -2015,30 +2006,29 @@ int block_write_end(struct file *file, s
 
 	start = pos & (PAGE_CACHE_SIZE - 1);
 
-	if (unlikely(copied < len))
+	if (unlikely(copied < len)) {
+		/*
+		 * The buffers that were written will now be uptodate, so we
+		 * don't have to worry about a readpage reading them and
+		 * overwriting a partial write. However if we have encountered
+		 * a short write and only partially written into a buffer, it
+		 * will not be marked uptodate, so a readpage might come in and
+		 * destroy our partial write.
+		 *
+		 * Do the simplest thing, and just treat any short write to a
+		 * non uptodate page as a zero-length write, and force the
+		 * caller to redo the whole thing.
+		 */
+		if (!PageUptodate(page))
+			copied = 0;
+
 		page_zero_new_buffers(page, start+copied, start+len);
+	}
 	flush_dcache_page(page);
 
 	/* This could be a short (even 0-length) commit */
 	__block_commit_write(inode, page, start, start+copied);
 
-	/*
-	 * The buffers that were written will now be uptodate, so we don't
-	 * have to worry about a readpage reading them and overwriting
-	 * a partial write. However if we have encountered a short write
-	 * and only partially written into a buffer, it will not be marked
-	 * uptodate, so a readpage might come in and destroy our partial
-	 * write.
-	 *
-	 * Do the simplest thing, and just treat any short write to a non
-	 * uptodate page as a zero-length write, and force the caller to
-	 * redo the whole thing.
-	 */
-	if (unlikely(copied < len)) {
-		if (!PageUptodate(page))
-			copied = 0;
-	}
-
 	unlock_page(page);
 	mark_page_accessed(page); /* XXX: put this in caller? */
 	page_cache_release(page);
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html