Re: [PATCH v2] madvise: MADV_SOFT_OFFLINE requests can return -EBUSY

On Thu, Nov 28, 2024 at 12:35:48PM +0100, Alejandro Colomar wrote:
> Hi Tyonnchie,
> 
> On Tue, Nov 26, 2024 at 11:12:03AM -0500, tyberry@xxxxxxxxxx wrote:
> > If the page could not be offlined madvise will return -EBUSY. This might occur if the page is currently in use or locked.
> 
> Could you show this in a small example program (if possible)?
> Like 30 lines or so.  If not, it's okay.

Hi Alejandro!

Given the ongoing holidays, let me take the liberty of giving some context
in order to keep the conversation going.

We received reports of failed LTP madvise11[1] tests. The errors looked
like this:

    madvise11.c:409: TINFO: Spawning 4 threads, with a total of 640 memory pages
    madvise11.c:132: TFAIL: madvise failed: EBUSY (16)
    madvise11.c:163: TINFO: Thread  [0]  returned 16, failed.
    madvise11.c:191: TFAIL: thread  [0]  - exited with errors
    madvise11.c:163: TINFO: Thread  [2]  returned 0, succeeded.
    madvise11.c:163: TINFO: Thread  [3]  returned 0, succeeded.
    madvise11.c:163: TINFO: Thread  [1]  returned 0, succeeded.
    madvise11.c:361: TINFO: Restore 629 Soft-offlined pages
    madvise11.c:290: TWARN: write(3,0x7ffce114b8a0,8) failed: EBUSY (16)

Clearly the problem had to do with -EBUSY being returned by a madvise()
operation. The bug was initially reported on kernels with PREEMPT_RT
enabled, but we soon observed that it also happens with the stock
kernel, though it takes more repetitions to trigger the issue.

After debugging and investigating, we found that the -EBUSY return is a
valid case in the kernel code and was not being handled by the test. A fix
was sent to the LTP project by Li Wang[2], specifically for the madvise11 test.
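
To make the point concrete, here is a minimal sketch of how a caller might
treat EBUSY from MADV_SOFT_OFFLINE as a non-fatal outcome. It is only an
illustration, not the actual LTP fix: the helper names and the enum are
hypothetical, and the fallback value 101 is the asm-generic definition, used
only if the system headers do not provide MADV_SOFT_OFFLINE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_SOFT_OFFLINE
#define MADV_SOFT_OFFLINE 101	/* asm-generic value; fallback for older headers */
#endif

/* Hypothetical verdicts for one soft-offline attempt. */
enum offline_result { OFFLINE_OK = 0, OFFLINE_BUSY = 1, OFFLINE_ERROR = -1 };

/* Hypothetical helper: map a madvise() return value and errno to a verdict. */
enum offline_result classify_offline(int ret, int err)
{
	if (ret == 0)
		return OFFLINE_OK;
	/* EBUSY: the page could not be offlined right now; not a hard failure. */
	return (err == EBUSY) ? OFFLINE_BUSY : OFFLINE_ERROR;
}

/* Hypothetical helper: try to soft-offline the range [addr, addr+len). */
enum offline_result soft_offline_range(void *addr, size_t len)
{
	int ret;

	assert(addr != NULL);
	ret = madvise(addr, len, MADV_SOFT_OFFLINE);

	/* errno is only meaningful when madvise() returned -1. */
	return classify_offline(ret, errno);
}
```

A test loop could then skip or retry a page when soft_offline_range()
returns OFFLINE_BUSY and fail only on OFFLINE_ERROR. Keep in mind that
MADV_SOFT_OFFLINE needs CAP_SYS_ADMIN and a kernel built with
CONFIG_MEMORY_FAILURE, so an unprivileged run will see a different error.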

In this process, we noticed that the man pages did not mention -EBUSY as a
possible result of a failed offlining operation, as described by Tyonnchie.

I hope this helps!

Best regards,
Luis

[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise11.c
[2] https://lists.linux.it/pipermail/ltp/2024-May/038310.html


> Have a lovely day!
> Alex
> 
> > 
> > Signed-off-by: Tyonnchie Berry <tyberry@xxxxxxxxxx>
> > 
> > ---
> > 
> > diff --git a/man/man2/madvise.2 b/man/man2/madvise.2
> > index 4f2210ee2..c10dcd599 100644
> > --- a/man/man2/madvise.2
> > +++ b/man/man2/madvise.2
> > @@ -702,6 +702,13 @@ The map exists, but the area maps something that isn't a file.
> >  .BR MADV_COLLAPSE )
> >  Could not charge hugepage to cgroup: cgroup limit exceeded.
> >  .TP
> > +.B EBUSY
> > +(for
> > +.B MADV_SOFT_OFFLINE )
> > +If any pages within the add+length range could not be offlined,
> > +madvise will return -EBUSY.
> > +This might occur if the page is currently in use or locked.
> > +.TP
> >  .B EFAULT
> >  .I advice
> >  is
> > 
> 
> -- 
> <https://www.alejandro-colomar.es/>

