On Thu, May 06, 2021 at 12:34:06PM -0700, Darrick J. Wong wrote:
> On Tue, Oct 06, 2020 at 03:07:20PM +0100, Matthew Wilcox wrote:
> > On Mon, Oct 05, 2020 at 08:55:37PM -0700, Darrick J. Wong wrote:
> > > On Mon, Oct 05, 2020 at 11:21:02AM -0400, Brian Foster wrote:
> > > > We've had reports of soft lockup warnings in the iomap ioend
> > > > completion path due to very large bios and/or bio chains. Divert any
> > > > ioends with 256k or more pages to process to the workqueue so
> > > > completion occurs in non-atomic context and can reschedule to avoid
> > > > soft lockup warnings.
> > >
> > > Hmmmm... is there any way we can just make end_page_writeback faster?
> >
> > There are ways to make it faster. I don't know if they're a "just"
> > solution ...
> >
> > 1. We can use THPs. That will reduce the number of pages being operated
> > on. I hear somebody might have a patch set for that. Incidentally,
> > this patch set will clash with the THP patchset, so one of us is going to
> > have to rebase on the other's work. Not a complaint, just acknowledging
> > that some coordination will be needed for the 5.11 merge window.
>
> How far off is this, anyway? I assume it's in line behind the folio
> series?

Right. The folio series found all kinds of fun places where the
accounting was wrong (eg accounting for an N-page I/O as a single page),
so the THP work is all renamed folio now. The folio patchset I posted
yesterday [1] is _most_ of what is necessary from an XFS point of view.
There's probably another three dozen mm patches to actually enable
multi-page folios after that, and a lot more patches to optimise
the mm/vfs for multi-page folios, but your side of things is almost all
published and reviewable.

[1] https://lore.kernel.org/linux-fsdevel/20210505150628.111735-1-willy@xxxxxxxxxxxxx/

> > 2. We could create end_writeback_pages(struct pagevec *pvec) which
> > calls a new test_clear_writeback_pages(pvec). That could amortise
> > taking the memcg lock and finding the lruvec and taking the mapping
> > lock -- assuming these pages are sufficiently virtually contiguous.
> > It can definitely amortise all the statistics updates.
>
> /me kinda wonders if THPs arent the better solution for people who want
> to run large ios.

Yes, definitely. It does rather depend on their usage patterns,
but if they're working on a file-contiguous chunk of memory, this
can help a lot.

> > 3. We can make wake_up_page(page, PG_writeback); more efficient. If
> > you can produce this situation on demand, I had a patch for that which
> > languished due to lack of interest.
>
> I can (well, someone can) so I'll talk to you internally about their
> seeekret reproducer.

Fantastic. If it's something that needs to get backported to a
stable-ABI kernel ... this isn't going to be a viable solution.
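
For illustration only, the amortisation idea in option 2 can be sketched
in userspace C. This is not kernel code and not the proposed
end_writeback_pages() implementation: struct fake_page, struct
fake_mapping, mapping_lock() and the two end_writeback_*() helpers are
hypothetical stand-ins invented here. The sketch only shows that
completing pages as a batch takes the lock once and updates the
statistics once, instead of once per page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are kernel types or functions. */
struct fake_page {
    int writeback;                   /* 1 while "under writeback" */
};

struct fake_mapping {
    unsigned long lock_acquisitions; /* how many times the lock was taken */
    unsigned long nr_writeback;      /* pages still under writeback */
    unsigned long stat_updates;      /* how many times stats were touched */
};

/* Model a lock by counting acquisitions; a real lock would also serialise. */
void mapping_lock(struct fake_mapping *m)   { m->lock_acquisitions++; }
void mapping_unlock(struct fake_mapping *m) { (void)m; }

/* Per-page completion: one lock round trip and one stat update per page. */
void end_writeback_one(struct fake_mapping *m, struct fake_page *p)
{
    assert(m != NULL && p != NULL);
    mapping_lock(m);
    if (p->writeback) {
        p->writeback = 0;
        m->nr_writeback--;
        m->stat_updates++;
    }
    mapping_unlock(m);
}

/* Batched completion: one lock round trip and one stat update per batch. */
void end_writeback_batch(struct fake_mapping *m,
                         struct fake_page **pages, size_t n)
{
    unsigned long cleared = 0;

    assert(m != NULL && pages != NULL);
    mapping_lock(m);
    for (size_t i = 0; i < n; i++) {
        if (pages[i]->writeback) {
            pages[i]->writeback = 0;
            cleared++;
        }
    }
    /* Fold all per-page accounting into a single update. */
    m->nr_writeback -= cleared;
    m->stat_updates++;
    mapping_unlock(m);
}
```

With N pages under writeback, N calls to end_writeback_one() take the
lock N times, while a single end_writeback_batch() call takes it once.
The batch is only valid when every page belongs to the same mapping,
which is the "sufficiently contiguous" caveat above.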