[Bug 216007] XFS hangs in iowait when extracting large number of files

https://bugzilla.kernel.org/show_bug.cgi?id=216007

--- Comment #23 from Mel Gorman (mgorman@xxxxxxx) ---
(In reply to Peter Pavlisko from comment #21)
> (In reply to Mel Gorman from comment #20)
> > (In reply to Peter Pavlisko from comment #19)
> > > (In reply to Mel Gorman from comment #18)
> > > > Created attachment 301044 [details]
> > > > Patch to always allocate at least one page
> > > > 
> > > > Hi Peter,
> > > > 
> > > > Could you try the attached patch against 5.18 please? I was unable to
> > > > reproduce the problem but I think what's happening is that an array for
> > > > receiving a bulk allocation is partially populated and the bulk
> > > > allocator is returning without allocating at least one page. Allocating
> > > > even one page should hit the path where kswapd is woken.
> > > 
> > > Hi Mel,
> > > 
> > > I tried this patch and it does indeed work with 5.18.0-rc7. Without the
> > > patch it freezes, after I apply the patch the archive extracts
> > > flawlessly.
> > 
> > Thanks Peter, I'll prepare a proper patch and post it today. You won't be
> > cc'd as I only have the bugzilla email alias for you but I'll post a
> > lore.kernel.org link here.
> 
> Thank you very much.
> 
> I don't know if this is the proper place to discuss this, but I am curious
> about the cause. Was it an issue with the way XFS is calling the allocator
> in a for(;;) loop when it does not get the expected result? Or was it an
> issue with the allocator itself not working in some obscure edge case? Or
> was it my .config, stretching the kernel's ability to function in a bad way?

I think blaming XFS would be excessive even though the API says it only
attempts to allocate the requested number of pages with no guarantee the exact
number will be returned. A glance at the implementation would show it was
trying to return at least one page and the code flow of XFS hints that the XFS
developers expected that some progress would generally be made or kswapd would
be woken as appropriate. The original intention was that the caller did not
necessarily need all the pages, but that's not true for XFS or NFS. While I
could have altered XFS, it would have encouraged boilerplate code to be
created for NFS or any other user of the API that requires the exact number of
pages to be returned.

The allocator was working as intended but not necessarily working as desired
for callers.
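
To make the corner case concrete, here is a rough userspace sketch in C. It is
not the kernel code: struct mock_allocator, mock_bulk_alloc(),
fill_like_caller() and run_demo() are hypothetical names invented for this
illustration, and the "slow path" here simply always succeeds, whereas the
real one may reclaim memory or wake kswapd. The only thing it tries to model
is the behaviour described above: a caller that retries until the array is
full, and a bulk allocator whose "did we get at least one page?" check either
counts entries that were already populated (old behaviour, as I understand it)
or only pages allocated by the current call (behaviour with the patch).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; not a kernel type. */
struct fake_page { int id; };

#define POOL_SIZE 16

struct mock_allocator {
    struct fake_page pool[POOL_SIZE];
    size_t pool_used;
    size_t fast_budget;  /* pages the fast path can still hand out */
    bool at_least_one;   /* true: model the patched behaviour */
};

static struct fake_page *take_page(struct mock_allocator *a)
{
    if (a->pool_used >= POOL_SIZE)
        return NULL;
    a->pool[a->pool_used].id = (int)a->pool_used;
    return &a->pool[a->pool_used++];
}

/* Fill NULL slots in pages[0..nr); return how many slots are populated. */
static size_t mock_bulk_alloc(struct mock_allocator *a,
                              struct fake_page **pages, size_t nr)
{
    size_t populated = 0, allocated = 0;

    /* Entries left over from an earlier call count as populated. */
    for (size_t i = 0; i < nr; i++)
        if (pages[i])
            populated++;

    /* Fast path: hands out pages only while its budget lasts. */
    for (size_t i = 0; i < nr && a->fast_budget > 0; i++) {
        if (pages[i])
            continue;
        pages[i] = take_page(a);
        if (!pages[i])
            break;
        a->fast_budget--;
        populated++;
        allocated++;
    }

    /*
     * Old check: fall back only if the array is completely empty.
     * Patched check: fall back if this call allocated nothing.
     */
    bool no_progress = a->at_least_one ? (allocated == 0) : (populated == 0);

    if (no_progress && populated < nr) {
        /* Simplified slow path: one page, always succeeds in this model. */
        for (size_t i = 0; i < nr; i++) {
            if (!pages[i]) {
                pages[i] = take_page(a);
                if (pages[i])
                    populated++;
                break;
            }
        }
    }
    return populated;
}

/* Caller that needs every slot filled; bounded here so the demo ends. */
static size_t fill_like_caller(struct mock_allocator *a,
                               struct fake_page **pages, size_t nr,
                               int max_attempts)
{
    size_t filled = 0;

    assert(pages != NULL);
    for (int attempt = 0; attempt < max_attempts && filled < nr; attempt++)
        filled = mock_bulk_alloc(a, pages, nr);
    return filled;
}

/* Fast path delivers 2 pages, then fails on every later call. */
static size_t run_demo(bool at_least_one)
{
    struct mock_allocator a = { .fast_budget = 2, .at_least_one = at_least_one };
    struct fake_page *pages[8] = { NULL };

    return fill_like_caller(&a, pages, 8, 100);
}
```

In this model run_demo(false) stays stuck at 2 populated entries for all 100
attempts, because the partially populated array hides the lack of progress,
while run_demo(true) reaches all 8 entries one page at a time. An unbounded
retry loop in the first case would never finish.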

I don't think your .config is at fault. While I could not reproduce the
problem naturally, I could force it to occur by injecting failures. It's
possible that the problem is easier to trigger on your particular machine, but
the corner case has existed since 5.13. It's an interesting coincidence that a
similar problem was found on NFS at roughly the same time, which might indicate
that something else changed to make this corner case easier to trigger, but the
corner case was always there.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.


