Re: Oops while rebalancing, now unmountable.

On Mon, Nov 15, 2010 at 07:46:57PM +0100, Andrea Arcangeli wrote:
> I've been reading the writeout() in mm/migrate.c and I wonder if maybe
> that should have been WB_SYNC_ALL or if we miss a
> wait_on_page_writeback in after ->writepage() returns? Can you have a
> look there? We check the PG_writeback bit when the page is not dirty
> (well before fallback_migrate_page is called), but after calling
> writeout() we don't return to wait on PG_writeback. We make sure to
> hold the page lock after ->writepage returns but that doesn't mean
> PG_writeback isn't still set.

I didn't even notice that, but the WB_SYNC_NONE does indeed seem
buggy to me.  If we set the sync_mode to WB_SYNC_NONE, filesystems
can and frequently do use trylock operations, and might skip writing
the page out altogether.

So we definitely do need to change writeout to do a WB_SYNC_ALL
writeback.  In addition to that we'll also need the
wait_on_page_writeback call to make sure we actually wait for the I/O
to finish.
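
To make the two failure modes concrete, here is a small userspace toy
model, not kernel code.  Every toy_* name is made up for illustration,
and the "filesystem" is an assumption that simply skips the page when
an internal lock is busy under the non-blocking mode.  It only shows
why both changes are needed before the page can be treated as clean
and idle.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for WB_SYNC_NONE / WB_SYNC_ALL; not the kernel definitions. */
enum toy_sync_mode { TOY_SYNC_NONE, TOY_SYNC_ALL };

struct toy_page {
	bool dirty;        /* models PG_dirty */
	bool writeback;    /* models PG_writeback */
	bool fs_lock_busy; /* some filesystem-internal lock is contended */
};

/* Models ->writepage(): returns true if I/O was submitted. */
static bool toy_writepage(struct toy_page *page, enum toy_sync_mode mode)
{
	if (page->fs_lock_busy) {
		if (mode == TOY_SYNC_NONE)
			return false;       /* trylock failed: skip, page stays dirty */
		page->fs_lock_busy = false; /* blocking mode: wait for the lock */
	}
	page->dirty = false;
	page->writeback = true;             /* I/O is now in flight */
	return true;
}

/* Models wait_on_page_writeback(): returns once the I/O has completed. */
static void toy_wait_on_page_writeback(struct toy_page *page)
{
	page->writeback = false;
}

/*
 * Models writeout() in the migration path.  Returns true only if the
 * page ends up neither dirty nor under writeback.
 */
static bool toy_writeout(struct toy_page *page, enum toy_sync_mode mode,
			 bool wait)
{
	assert(page->dirty);
	toy_writepage(page, mode);
	if (wait && page->writeback)
		toy_wait_on_page_writeback(page);
	return !page->dirty && !page->writeback;
}
```

In this model only the combination of TOY_SYNC_ALL and wait == true
leaves the page clean with no I/O in flight; the non-blocking mode can
leave it dirty, and skipping the wait leaves writeback pending.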

Also, what protects the page from being updated while we write it out?
PG_writeback on many filesystems doesn't stop writers from modifying
the in-flight buffer, and just locking the page after ->writepage
is racy without a check that nothing changed.

> Compaction practically only happens in the context of the task
> allocating memory (in my tree it is also used by kswapd). Not
> immediate to ask a separate daemon to invoke it. Not sure why this
> should screw delalloc. Compaction isn't freeing any memory at all,
> it's not reclaim. It just defragments and moves stuff around and it
> may have to write dirty pages to do so.

kswapd is fine.  Other tasks allocating memory are direct reclaimers.
Direct reclaim through the filesystem delalloc conversion and the I/O
stack guarantees you stack overflows, which is why filesystems refuse
to do anything in ->writepage for this case.  btrfs and XFS have
explicit checks for PF_MEMALLOC (with a carve-out for kswapd in XFS),
and ext4 only writes already allocated blocks in ->writepage but never
does delalloc conversions.
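
As a rough userspace illustration of those two styles of check (again
not kernel code): the TOY_PF_* values are arbitrary stand-ins for the
task flags, the function names are hypothetical, and the model assumes
kswapd runs with both flags set while a direct reclaimer has only the
memalloc flag.

```c
#include <stdbool.h>

/* Arbitrary stand-in values; not the kernel's PF_MEMALLOC / PF_KSWAPD. */
#define TOY_PF_MEMALLOC 0x1u
#define TOY_PF_KSWAPD   0x2u

/* btrfs-style check in this model: refuse from any reclaim context. */
static bool toy_refuse_any_reclaim(unsigned int task_flags)
{
	return (task_flags & TOY_PF_MEMALLOC) != 0;
}

/*
 * XFS-style check in this model: refuse direct reclaim, but let
 * kswapd through because it runs on its own, mostly empty stack.
 */
static bool toy_refuse_direct_reclaim(unsigned int task_flags)
{
	return (task_flags & (TOY_PF_MEMALLOC | TOY_PF_KSWAPD)) ==
	       TOY_PF_MEMALLOC;
}
```

A ->writepage that refuses would redirty the page and return without
starting I/O, which is exactly the case where a caller that assumes
the page got written is wrong.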