Re: [Bug 199931] New: systemd/rtorrent file data corruption when using echo 3 >/proc/sys/vm/drop_caches

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 6 Jun 2018, at 9:38, Liu Bo wrote:

On Wed, Jun 6, 2018 at 8:18 AM, Chris Mason <clm@xxxxxx> wrote:


On 5 Jun 2018, at 16:03, Andrew Morton wrote:

(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Tue, 05 Jun 2018 18:01:36 +0000 bugzilla-daemon@xxxxxxxxxxxxxxxxxxx
wrote:

https://bugzilla.kernel.org/show_bug.cgi?id=199931

            Bug ID: 199931
Summary: systemd/rtorrent file data corruption when using echo
                    3 >/proc/sys/vm/drop_caches


A long tale of woe here. Chris, do you think the pagecache corruption
is a general thing, or is it possible that btrfs is contributing?

Also, that 4.4 oom-killer regression sounds very serious.


This week I found a bug in btrfs file write with how we handle stable pages.
Basically it works like this:

write(fd, some bytes less than a page)
write(fd, some bytes into the same page)
    btrfs prefaults the userland page
    lock_and_cleanup_extent_if_need()   <- stable pages
                wait for writeback()
                clear_page_dirty_for_io()

At this point we have a page that was dirty and is now clean.  That's
normally fine, unless our prefaulted page isn't in ram anymore.

        iov_iter_copy_from_user_atomic() <--- uh oh

If the copy_from_user fails, we drop all our locks and retry. But along the way, we completely lost the dirty bit on the page. If the page is dropped
by drop_caches, the writes are lost.  We'll just read back the stale
contents of that page during the retry loop. This won't result in crc
errors because the bytes we lost were never crc'd.


So we're going to carefully redirty the page under the page lock, right?

I don't think we actually need to clean it. We have the page locked, writeback won't start until we unlock.


It could result in zeros in the file because we're basically reading a hole, and those zeros could move around in the page depending on which part of the
page was dirty when the writes were lost.


I got a question, while re-reading this page, wouldn't it read
old/stale on-disk data?

If it was never written we should be treating it like a hole, but I'll double check.

-chris




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux