On Sun, Nov 14, 2010 at 05:00:18PM -0500, Christoph Hellwig wrote:
> On Sun, Nov 14, 2010 at 09:42:06PM +0100, Andrea Arcangeli wrote:
> > btrfs misses this:
> >
> > +	.migratepage	= btree_migratepage,
> >
> > It's a bug that can trigger upstream too (not only with THP) if there
> > are hugepage allocations (like while increasing nr_hugepages). Chris
> > already fixed it with an experimental patch.
>
> If the lack of an obscure method causes data corruption something
> is seriously wrong with THP.  At least from the 10.000 foot view

I just wrote above that it can happen upstream without THP. It's not
THP related at all. THP is the consumer, this is a problem in migrate
that will trigger as well with migrate_pages or all other possible
migration APIs.

If more people were using hugetlbfs they would have noticed without
THP.

> I can't quite figure what the exact issue is, though.
> fallback_migrate_page seems to do the right thing to me for that
> case.
>
> Btw, there's also another issue with the page migration code when used
> for filesystem pages.  It directly calls into ->writepage instead
> of using the flusher threads.  On most filesystems this will
> "only" cause nasty I/O patterns, but on ext4 for example it will
> be more nasty as ext4 doesn't do conversions from delayed allocations to
> real ones.  So unless you're doing a lot of overwrites it will be
> hard to make any progress in writeout().

+static int btree_migratepage(struct address_space *mapping,
+			struct page *newpage, struct page *page)
+{
+	/*
+	 * we can't safely write a btree page from here,
+	 * we haven't done the locking hook
+	 */
+	if (PageDirty(page))
+		return -EAGAIN;

fallback_migrate_page would call writeout() which is apparently not ok
in btrfs for locking issues leading to corruption.

> Btw, what codepath does THP call migrate_pages from?  If you don't
> use an explicit thread writeout will be a no-op on btrfs and XFS, too.

THP never calls migrate_pages, it's memory compaction that calls it
from inside alloc_pages(order=9). It got noticed only with THP because
it makes more frequent hugepage allocations than nr_hugepages in
hugetlbfs (and maybe there are more THP users already).
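
To make the difference between the two paths concrete, here is a minimal
userspace sketch. It is not kernel code: struct model_page and every
model_* name are hypothetical stand-ins invented for illustration, and
the fallback is simplified to "write the dirty page out, then ask the
caller to retry", which is only how this thread describes it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct model_page {
	bool dirty;     /* models PageDirty(page) */
	int writeouts;  /* times the model "wrote" this page */
};

/* Models an optional ->migratepage method; NULL means "not provided". */
typedef int (*model_migrate_fn)(struct model_page *page);

/*
 * Simplified model of the generic fallback as described in the thread:
 * a dirty page gets written out directly and the caller has to retry.
 */
static int model_fallback_migrate(struct model_page *page)
{
	if (page->dirty) {
		page->writeouts++;      /* stands in for writeout() */
		page->dirty = false;
		return -EAGAIN;
	}
	return 0;                       /* stands in for the actual move */
}

/* Models the btree_migratepage check quoted above: never write from here. */
static int model_btree_migrate(struct model_page *page)
{
	if (page->dirty)
		return -EAGAIN;         /* leave the page alone, retry later */
	return 0;
}

/* Use the filesystem's method if there is one, else the fallback. */
static int model_move_page(model_migrate_fn method, struct model_page *page)
{
	return method ? method(page) : model_fallback_migrate(page);
}

/* Self-check contrasting the two paths on a dirty page. */
static void model_selfcheck(void)
{
	struct model_page a = { .dirty = true, .writeouts = 0 };
	struct model_page b = { .dirty = true, .writeouts = 0 };

	assert(model_move_page(NULL, &a) == -EAGAIN);
	assert(a.writeouts == 1);       /* fallback wrote the page */

	assert(model_move_page(model_btree_migrate, &b) == -EAGAIN);
	assert(b.writeouts == 0);       /* method refused to write */
	assert(b.dirty);
}
```

Both paths return -EAGAIN for a dirty page in this model. The only
difference is whether a write was issued from the migration context,
which is the part that is apparently unsafe for btrfs btree pages.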