Re: [REGRESSION] 998ef75ddb and aio-dio-invalidate-failure w/ data=journal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/05/2015 08:22 AM, Theodore Ts'o wrote:
...
> I've bisected it down to commit 998ef75ddb: "fs: do not prefault
> sys_write() user buffer pages".  I've confirmed that 4.3-rc2 fails as
> detailed below, but with 998ef75ddb reverted, the problem goes away.
...
> Before commit 998ef75ddb, if we need to prefault in the page, we do so
> before we attempt the copy.  After this commit, we attempt the copy
> and if it fails because pagefaults have been turned off, we call
> write_end(), the unlock the page, prefault in the pages, and then
> retry the commit.

That's nasty.  Thanks for the bug report!

I'll go see if I can reproduce this.

> What I think is going on is that when we do attempt the copy, we end
> up marking the page dirty before we notice that we need to page fault
> in the page, which ends up triggering the warning that jbd2
> buffer_head that is supposed to be journaled has been marked dirty
> without calling ext4_handle_dirty_metadata() --- which is handled by
> ext4_journalled_write_end(), but which is now happening out of order
> given this commit.
> 
> Is it possible that we can change iov_iter_copy_from_user_atomic(), to
> check for the error case before it marks the page dirty?  Or can we
> create a light-weight function which checks to see if the page needs
> to be faulted in which is lighter weight than
> iov_iter_fault_in_readable?

Maybe I'm not following the macro magic, but I don't see where
iov_iter_copy_from_user_atomic() is setting 'page' dirty.  It'll set the
dirty bit in the PTEs of course, but I don't see it touching 'page'
except to kmap() it.

I do see some ->write_end() implementations doing set_page_dirty(), though.

Could we have just been confused and dirtied a page under ->write_end()
when we had copied=0 and it wasn't _really_ dirtied?
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux