Re: Fw: crash on x86_64 - mm related?

Linus Torvalds <torvalds@xxxxxxxx> · Thu, 5 Jan 2006 13:18:57 -0800 (PST)

On Thu, 5 Jan 2006, Ryan Richter wrote:
> 
> Another one.  I can't keep running this kernel - nearly all of our
> backup tapes are erased now.  If a drive were to fail today, we would
> lose hundreds of GB of irreplaceable data.  I'm going back to 2.6.11.3
> until we have a full set of backups again.

Yeah, don't trash your backups.

If/when you try something again, how about these two trivial one-liners?

I'm not 100% sure the mapcount sanity check is strictly speaking right (no 
locking between mapcount/pagecount comparison), but the page count really 
should never fall below the mapcount, so aside from races that I don't 
think can be triggered in practice, this might be very useful to find 
where those pesky page counts suddenly disappear..

Right now we get the oops "too late" - something has decremented the page
count way too far, but we don't know what it was, and the actual function 
that triggers it _seems_ to be harmless.
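
To illustrate the sanity check described above, here is a minimal userspace
sketch. It is not kernel code: "struct fake_page", "fake_put_ok" and
"fake_put_page_testzero" are hypothetical names invented for this example,
the counts are plain ints rather than atomics, and a violation is reported
through a flag instead of BUG_ON(). It only models the rule that a
reference may be dropped while the count is still strictly above the
mapcount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct page; not the kernel layout. */
struct fake_page {
	int count;	/* total references held on the page */
	int mapcount;	/* how many of those come from page-table mappings */
};

/*
 * True if dropping one reference is allowed: the count is non-zero and
 * still strictly above the mapcount, so it cannot end up below it.
 */
static bool fake_put_ok(const struct fake_page *p)
{
	assert(p != NULL);
	if (p->count == 0)
		return false;	/* already at zero: underflow */
	if (p->count <= p->mapcount)
		return false;	/* would fall below the mapcount */
	return true;
}

/*
 * Drop one reference if the check passes. Sets *violated when the check
 * fails (and leaves the count alone); returns true when the count hits 0.
 */
static bool fake_put_page_testzero(struct fake_page *p, bool *violated)
{
	*violated = !fake_put_ok(p);
	if (*violated)
		return false;
	return --p->count == 0;
}
```

The point of checking at every drop is that the caller doing the one
decrement too many is flagged at that decrement, rather than some later and
unrelated path tripping over the damaged count.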

The other part of the patch is to clear sgl[i].page when the page is being
freed, in case somebody tries to free the same thing twice. I don't see how
that could happen either, but...
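
The clear-on-release idea can be sketched in userspace as well. Again this
is an illustration under stated assumptions, not the st.c code:
"struct fake_sg" and "fake_unmap_pages" are made-up names, and free()
stands in for releasing the page. In the kernel a second pass would
presumably fault on the NULL pointer right away; the sketch skips the empty
slot and counts what it released, so the effect is easy to observe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a scatterlist entry; only the pointer matters. */
struct fake_sg {
	void *page;
};

/*
 * Release every entry, clearing each slot before the release so that a
 * second pass over the same array finds NULL instead of a stale pointer.
 * Returns the number of entries actually released.
 */
static size_t fake_unmap_pages(struct fake_sg *sgl, size_t nr_pages)
{
	size_t released = 0;

	assert(sgl != NULL || nr_pages == 0);
	for (size_t i = 0; i < nr_pages; i++) {
		void *page = sgl[i].page;

		sgl[i].page = NULL;	/* same idea as the st.c one-liner */
		if (page == NULL)
			continue;	/* already released: no double free */
		free(page);
		released++;
	}
	return released;
}
```

Calling fake_unmap_pages() twice on the same array releases each buffer
once; the second call sees only NULL slots and returns 0.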

The patch is untested in every way. No guarantees.

		Linus

---

diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
index c4aade8..18e60e1 100644
--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -4481,6 +4481,7 @@ static int sgl_unmap_user_pages(struct s
 	for (i=0; i < nr_pages; i++) {
 		struct page *page = sgl[i].page;
 
+		sgl[i].page = NULL;
 		if (dirtied)
 			SetPageDirty(page);
 		/* FIXME: cache flush missing for rw==READ
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a06a84d..daf504d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -299,6 +299,7 @@ struct page {
 #define put_page_testzero(p)				\
 	({						\
 		BUG_ON(page_count(p) == 0);		\
+		BUG_ON(page_count(p) <= page_mapcount(p));	\
 		atomic_add_negative(-1, &(p)->_count);	\
 	})
 