Re: io_submit() blocks for writes for substantial amount of time


On Tue, Sep 19, 2017 at 07:31:04PM +0300, Avi Kivity wrote:
> 
> 
> On 09/19/2017 05:58 PM, Christoph Hellwig wrote:
> > On Tue, Sep 19, 2017 at 08:27:05AM -0400, Brian Foster wrote:
> > > > Please advise, is this a known bug? When can it happen? Is there a way
> > > > to work it around to avoid blocking?
> > > > 
> > > I'm not sure how either could be considered a bug based on the stack
> > > trace information alone. Allocations may require reading metadata and
> > > reads are synchronous. This all seems like pretty basic filesystem
> > > behavior.
> > > 
> > > I suppose performance may be a separate question. For the latter issue,
> > > I'd be curious whether leaving more free space available in the
> > > filesystem would help avoid running into busy extents. Perhaps having
> > > more memory and thus a larger buffer cache for btree blocks could help
> > > mitigate the former issue..? The deterministic workaround for both is to
> > > preallocate the associated file. If the file would be too large, another
> > > option may be to set an extent size hint to allocate the file in larger
> > > chunks and amortize the cost of the allocations over multiple writes.
> > Note that Linux 4.13 and later support a RWF_NOWAIT flag, that will
> > return -EAGAIN from io_submit for these conditions so they can be
> > handled by a thread pool.
> > 
> > Note that until a few years ago we performed all allocations from
> > a workqueue, this was changed by:
> > 
> > commit cf11da9c5d374962913ca5ba0ce0886b58286224
> > Author: Dave Chinner <dchinner@xxxxxxxxxx>
> > Date:   Tue Jul 15 07:08:24 2014 +1000
> > 
> >      xfs: refine the allocation stack switch
> > 
> > to only defer btree splits to a workqueue.  With that previous scheme
> > there might have been an option to defer AIO allocations to a workqueue,
> > but the main issue with that is that the worker thread which is then
> > going to do the actual data transfer would have to "borrow" the
> > mm_struct from the submitter.  That's the primary reason why something
> > like that was never implemented in mainline Linux.
> 
> For DIO, does it really need the mm_struct? It can just pin the pages and
> pass them to the workqueue function.
> 

I'm not sure what difference it makes regardless. We still have to wait
for an allocation to complete before we can issue an I/O. IIRC, the old
defer allocs to a wq thing was more about saving stack space than
providing async behavior.
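
For reference, the preallocation workaround suggested earlier in the thread
might look roughly like the sketch below. It assumes the final file size is
known up front. open_preallocated() is a made-up helper name, and
posix_fallocate() is just one way to reserve the space; the thread does not
say which call an application should use.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open (or create) `path` and reserve `len` bytes of
 * disk space for it, so that later writes inside [0, len) should not need
 * to allocate new blocks in the submission path.
 * Returns an open file descriptor, or -1 with errno set.
 */
int open_preallocated(const char *path, off_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* posix_fallocate() returns an error number; it does not set errno. */
    int err = posix_fallocate(fd, 0, len);
    if (err != 0) {
        close(fd);
        errno = err;
        return -1;
    }
    return fd;
}
```

A caller would do this once at file creation time, for example
open_preallocated("data.bin", 1 << 30), and then issue its AIO writes against
the returned descriptor.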

Brian
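
P.S. A rough sketch of the RWF_NOWAIT approach mentioned above, using the raw
AIO syscalls, in case it helps. The helper names are made up for illustration,
and the file is assumed to be opened with O_DIRECT. As far as I can tell from
io_submit(2), the -EAGAIN for a would-block request is reported in the res
field of the io_event, so the sketch checks the completion rather than only
the io_submit() return value.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/aio_abi.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef RWF_NOWAIT
#define RWF_NOWAIT 0x00000008 /* value from the Linux uapi headers (4.13+) */
#endif

/* Hypothetical helper: fill `cb` for a write that asks the kernel not to block. */
void prep_pwrite_nowait(struct iocb *cb, int fd, const void *buf,
                        size_t len, long long off)
{
    memset(cb, 0, sizeof(*cb));
    cb->aio_lio_opcode = IOCB_CMD_PWRITE;
    cb->aio_fildes = fd;
    cb->aio_buf = (uint64_t)(uintptr_t)buf;
    cb->aio_nbytes = len;
    cb->aio_offset = off;
    cb->aio_rw_flags = RWF_NOWAIT;
}

/* True if the completion says the request would have blocked. */
int would_have_blocked(const struct io_event *ev)
{
    return ev->res == -EAGAIN;
}

/*
 * Submit one NOWAIT write and wait for its completion.
 * Returns bytes written, or a negative errno. On -EAGAIN the caller
 * would hand the same write to a thread pool and retry it there
 * without RWF_NOWAIT.
 */
long long aio_pwrite_nowait(int fd, const void *buf, size_t len, long long off)
{
    aio_context_t ctx = 0;
    struct iocb cb;
    struct iocb *list[1] = { &cb };
    struct io_event ev;
    long long res;

    if (syscall(SYS_io_setup, 1, &ctx) < 0)
        return -errno;

    prep_pwrite_nowait(&cb, fd, buf, len, off);
    if (syscall(SYS_io_submit, ctx, 1L, list) != 1)
        res = -errno;                 /* request was not queued at all */
    else if (syscall(SYS_io_getevents, ctx, 1L, 1L, &ev, NULL) == 1)
        res = ev.res;                 /* -EAGAIN here means "would block" */
    else
        res = -errno;

    syscall(SYS_io_destroy, ctx);
    return res;
}
```

A real application would keep one AIO context alive and reap many events at a
time; setting up and tearing down a context per write is only done here to
keep the example short.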

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


