Re: [PATCH] btrfs: remove btrfs_writepage_cow_fixup

On Fri 24-06-22 21:56:57, Qu Wenruo wrote:
> On 2022/6/24 21:40, Jan Kara wrote:
> > On Fri 24-06-22 21:19:04, Qu Wenruo wrote:
> > > 
> > > 
> > > On 2022/6/24 21:07, Jan Kara wrote:
> > > > On Fri 24-06-22 14:51:18, Christoph Hellwig wrote:
> > > > > On Fri, Jun 24, 2022 at 08:30:00PM +0800, Qu Wenruo wrote:
> > > > > > But from my previous feedback on subpage code, it looks like it's some
> > > > > > hardware archs (S390?) that can not do page flags update atomically.
> > > > > > 
> > > > > > > I have tested a similar thing, with an extra ASSERT() to make sure
> > > > > > > the cow fixup code never gets triggered.
> > > > > > 
> > > > > > At least for x86_64 and aarch64 it's OK here.
> > > > > > 
> > > > > > So I hope this time we can get a concrete reason on why we need the
> > > > > > extra page Private2 bit in the first place.
> > > > > 
> > > > > I don't think atomic page flags are a thing here.  I remember Jan
> > > > > > had chased a bug where we'd get into trouble in this area in
> > > > > ext4 due to the way pages are locked down for direct I/O, but I
> > > > > don't even remember seeing that on XFS.  Either way the PageOrdered
> > > > > check prevents a crash in that case and we really can't expect
> > > > > data to properly be written back in that case.
> > > > 
> > > > I'm not sure I get the context 100% right but pages getting randomly dirty
> > > > behind filesystem's back can still happen - most commonly with RDMA and
> > > > similar stuff which calls set_page_dirty() on pages it has got from
> > > > pin_user_pages() once the transfer is done.
> > > 
> > > > Just curious, things like RDMA can mark those pages dirty even without
> > > letting kernel know, but how could those pages be from page cache? By
> > > mmap()?
> > 
> > Yes, you pass virtual address to RDMA ioctl and it uses memory at that
> > address as a target buffer for RDMA. If the target address happens to be
> > mmapped file, filesystem has problems...
> 
> Oh my god, this is going to be a disaster.
> 
> RDMA is really almost a blackbox which can do anything to the pages.
> 
> If some RDMA drivers choose to screw up with Private2, the btrfs
> workaround is also screwed up.
> 
> Another problem is related to subpage.
> 
> Btrfs and iomap both use page->private to store extra bitmaps for
> subpage usage.
> If RDMA is changing page flags, it can easily lead to de-sync between
> the subpage bitmaps and the real page flags.

Well, RDMA could do this (in fact, any kernel code can do this, can't it? ;),
but RDMA is not expected to mess with page state arbitrarily. The only
thing it should be doing (and that is kind of the whole point of RDMA) is
letting the RDMA card alter page contents through DMA and then dirtying
those pages to tell the rest of the kernel that the page contents changed.

So practically we need to treat pages pinned by RDMA drivers as "writably
mapped to userspace" without a chance to unmap them.
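
To illustrate the sequence, here is a toy userspace model, not kernel code.
All toy_* names and the fs_prepared field are hypothetical and only stand in
for "the filesystem was notified of the write" (roughly what a write fault
gives it). It assumes writeback write-protects and cleans the page, and that
a pinned page can still be written by the device afterwards:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_page {
    bool dirty;        /* stands in for the page dirty flag */
    bool fs_prepared;  /* filesystem was told the page is about to be written */
    bool pinned;       /* stands in for a pin taken via pin_user_pages() */
};

/* A CPU write through a mapping faults first, so the filesystem is notified. */
static void toy_cpu_write_fault(struct toy_page *p)
{
    p->fs_prepared = true;
    p->dirty = true;
}

/* Writeback cleans the page; the next CPU write would have to fault again. */
static void toy_writeback(struct toy_page *p)
{
    p->dirty = false;
    p->fs_prepared = false;
}

/* DMA into a pinned page: no fault happens, the driver just dirties the page. */
static void toy_dma_complete(struct toy_page *p)
{
    assert(p->pinned);
    p->dirty = true;
}

/* True when the page is dirty although the filesystem was never notified. */
static bool toy_dirty_behind_fs_back(const struct toy_page *p)
{
    return p->dirty && !p->fs_prepared;
}

/* The problematic ordering: pin, writeback, then DMA completion. */
static bool toy_scenario(void)
{
    struct toy_page page = { .dirty = false, .fs_prepared = false, .pinned = false };

    toy_cpu_write_fault(&page);  /* normal, filesystem-visible dirtying */
    page.pinned = true;          /* RDMA registers the buffer */
    toy_writeback(&page);        /* filesystem believes the page is clean */
    toy_dma_complete(&page);     /* device writes, driver dirties the page */
    return toy_dirty_behind_fs_back(&page);
}
```

In this model toy_scenario() ends with a page that is dirty without the
filesystem having been notified, which is the state the fixup code has to
cope with. A page dirtied only through toy_cpu_write_fault() never gets there.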

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR


