Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9/20/24 3:54 PM, Chris Mason wrote:

[ ... ]

> xas_split_alloc() does the allocation and also shoves an entry into some of
> the slots.  When the tree changes, the entry we've stored is wildly 
> wrong, but xas_reset() doesn't undo any of that.  So when we actually
> use the xas->xa_alloc nodes we've setup, they are pointing to the
> wrong things.
> 
> Which is probably why the commits in 6.10 added this:
> 
> /* entry may have changed before we re-acquire the lock */
> if (alloced_order && (old != alloced_shadow || order != alloced_order)) {
> 	xas_destroy(&xas);
>         alloced_order = 0;
> }
> 
> The only way to undo the work done by xas_split_alloc() is to call
> xas_destroy().
> 
> To prove this theory, I tried making a minimal version that also
> called destroy, but it all ended up less minimal than the code
> that's actually in 6.10.  I've got a long test going now with
> an extra cond_resched() to make the race bigger, and a printk of victory.
> 
> It hasn't fired yet, and I need to hop on an airplane, so I'll just leave
> it running for now.  But long story short, I think we should probably
> just tag all of these for stable:
> 
> https://lore.kernel.org/all/20240415171857.19244-2-ryncsn@xxxxxxxxx/T/#mdb85922624c39ea7efb775a044af4731890ff776
> 
> Also, Willy's proposed changes to xas_split_alloc() seem like a good
> idea.

A few days of load later and some extra printks, it turns out that
taking the writer lock in __filemap_add_folio() makes us dramatically
more likely to just return EEXIST than go into the xas_split_alloc() dance.

With the changes in 6.10, we only get into that xas_destroy() case above
when the conflicting entry is a shadow entry, so I changed my repro to
use memory pressure instead of fadvise.

I also added a schedule_timeout(1) after the split alloc, and with all
of that I'm able to consistently make the xas_destroy() case trigger
without causing any system instability.  Kairui Song's patches do seem
to have fixed things nicely.

-chris





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux