On Thu, 5 Jan 2006, Ryan Richter wrote: > > Another one. I can't keep running this kernel - nearly all of our > backup tapes are erased now. If a drive were to fail today, we would > lose hundreds of GB of irreplacible data. I'm going back to 2.6.11.3 > until we have a full set of backups again. Yeah, don't trash your backups. If/when you try something again, how about these two trivial one-liners? I'm not 100% sure the mapcount sanity check is strictly speaking right (no locking between mapcount/pagecount comparison), but the page count really should never fall below the mapcount, so aside from races that I don't think can be triggered in practice, this might be very useful to find where those pesky page counts suddenly disappear.. Right now we get the oops "too late" - something has decremented the page count way too far, but we don't know what it was, and the actual function that triggers it _seems_ to be harmless. The other part of the patch is to clear "page" when it's being free'd, in case somebody tries to free the same thing twice. I don't see how that could happen either, but... The patch is untested in every way. No guarantees. Linus --- diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c index c4aade8..18e60e1 100644 --- a/drivers/scsi/st.c +++ b/drivers/scsi/st.c @@ -4481,6 +4481,7 @@ static int sgl_unmap_user_pages(struct s for (i=0; i < nr_pages; i++) { struct page *page = sgl[i].page; + sgl[i].page = NULL; if (dirtied) SetPageDirty(page); /* FIXME: cache flush missing for rw==READ diff --git a/include/linux/mm.h b/include/linux/mm.h index a06a84d..daf504d 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -299,6 +299,7 @@ struct page { #define put_page_testzero(p) \ ({ \ BUG_ON(page_count(p) == 0); \ + BUG_ON(page_count(p) <= page_mapcount(p)); \ atomic_add_negative(-1, &(p)->_count); \ }) - : send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html