Re: [PATCH] iomap: Address soft lockup in iomap_finish_ioend()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jan 03, 2022 at 11:08:54PM -0800, hch@xxxxxxxxxxxxx wrote:
> On Tue, Jan 04, 2022 at 12:22:15PM +1100, Dave Chinner wrote:
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -1098,6 +1098,15 @@ iomap_ioend_can_merge(struct iomap_ioend *ioend, struct iomap_ioend *next)
> >  		return false;
> >  	if (ioend->io_offset + ioend->io_size != next->io_offset)
> >  		return false;
> > +	/*
> > +	 * Do not merge physically discontiguous ioends. The filesystem
> > +	 * completion functions will have to iterate the physical
> > +	 * discontiguities even if we merge the ioends at a logical level, so
> > +	 * we don't gain anything by merging physical discontiguities here.
> > +	 */
> > +	if (ioend->io_inline_bio.bi_iter.bi_sector + (ioend->io_size >> 9) !=
> 
> This open codes bio_end_sector()

No, it doesn't. The ioend can have chained bios or have others merged
and concatenated to the ioend->io_list, so ioend->io_size != length
of the first bio in the chain....

> > +	    next->io_inline_bio.bi_iter.bi_sector)
> 
> But more importantly I don't think just using the inline_bio makes sense
> here as the ioend can have multiple bios.  Fortunately we should always
> have the last built bio available in ->io_bio.

Except merging chains ioends and modifies the head io_size to
account for the chained ioends we add to ioend->io_list. Hence
ioend->io_bio is not the last bio in a contiguous ioend chain.

> > +		return false;
> >  	return true;
> >  }
> >  
> > @@ -1241,6 +1250,13 @@ iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t offset,
> >  		return false;
> >  	if (sector != bio_end_sector(wpc->ioend->io_bio))
> >  		return false;
> > +	/*
> > +	 * Limit ioend bio chain lengths to minimise IO completion latency. This
> > +	 * also prevents long tight loops ending page writeback on all the pages
> > +	 * in the ioend.
> > +	 */
> > +	if (wpc->ioend->io_size >= 4096 * PAGE_SIZE)
> > +		return false;
> 
> And this stops making sense with the impending additions of large folio
> support.  I think we need to count the pages/folios instead as the
> operations are once per page/folio.

Agree, but I was looking at this initially as something easy to test
and backport.

UNfortunately, we hide the ioend switching in a function that can be
called many times per page/folio and the calling function has no
real clue when ioends get switched. Hence it's much more invasive to
correctly account for size based on variable sized folios attached
to bios in an ioend compared to hard coding a simple IO size limit.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux