On Wed, Dec 04, 2024 at 09:35:20PM +0100, Alejandro Colomar wrote:
> Hi Luis, Tyonnchie,
>
> On Fri, Nov 29, 2024 at 06:43:39PM -0300, Luis Claudio R. Goncalves wrote:
> > On Thu, Nov 28, 2024 at 12:35:48PM +0100, Alejandro Colomar wrote:
> > > Hi Tyonnchie,
> > >
> > > On Tue, Nov 26, 2024 at 11:12:03AM -0500, tyberry@xxxxxxxxxx wrote:
> > > > If the page could not be offlined madvise will return -EBUSY.
> > > > This might occur if the page is currently in use or locked.
> > >
> > > Could you show this in a small example program (if possible)?
> > > Like 30 lines or so. If not, it's okay.
> >
> > Hi Alejandro!
> >
> > Given the ongoing holidays, let me take the liberty of giving some context
> > in order to keep the conversation going.
> >
> > We received reports of failed LTP madvise11[1] tests. The errors looked
> > like this:
> >
> > madvise11.c:409: TINFO: Spawning 4 threads, with a total of 640 memory pages
> > madvise11.c:132: TFAIL: madvise failed: EBUSY (16)
> > madvise11.c:163: TINFO: Thread [0] returned 16, failed.
> > madvise11.c:191: TFAIL: thread [0] - exited with errors
> > madvise11.c:163: TINFO: Thread [2] returned 0, succeeded.
> > madvise11.c:163: TINFO: Thread [3] returned 0, succeeded.
> > madvise11.c:163: TINFO: Thread [1] returned 0, succeeded.
> > madvise11.c:361: TINFO: Restore 629 Soft-offlined pages
> > madvise11.c:290: TWARN: write(3,0x7ffce114b8a0,8) failed: EBUSY (16)
> >
> > Clearly the problem had to do with -EBUSY being returned by a madvise()
> > operation. The bug was initially reported on kernels with PREEMPT_RT
> > enabled but we soon observed that the problem also happened with the stock
> > kernel, though requiring more repetitions to trigger issue.
> >
> > After debug and investigation we observed that the -EBUSY return was a valid
> > case in the kernel code and was not being handled by the test. A fix was
> > sent to the LTP project by Li Wang[2], specifically for the madvise11 test.
> >
> > In this process, we noticed that the man pages did not mention -EBUSY as a
> > possible result of a failed offlining operation, as described by Tyonnchie.
> >
> > I hope this helps!
>
> Thanks! I've applied the patch, with some tweaks:
> <https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=3205359a3a7079d9d40a50388e851874729a827a>
>
> I added an Acked-by on your behalf, Luis.

Thank you! You have all my respect for the great work you and many others
do with the man pages!

Luis

> Have a lovely night!
> Alex
>
> >
> > Best regards,
> > Luis
> >
> > [1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise11.c
> > [2] https://lists.linux.it/pipermail/ltp/2024-May/038310.html
> >
> > > Have a lovely day!
> > > Alex
> > >
> > > >
> > > > Signed-off-by: Tyonnchie Berry <tyberry@xxxxxxxxxx>
> > > >
> > > > ---
> > > >
> > > > diff --git a/man/man2/madvise.2 b/man/man2/madvise.2
> > > > index 4f2210ee2..c10dcd599 100644
> > > > --- a/man/man2/madvise.2
> > > > +++ b/man/man2/madvise.2
> > > > @@ -702,6 +702,13 @@ The map exists, but the area maps something that isn't a file.
> > > >  .BR MADV_COLLAPSE )
> > > >  Could not charge hugepage to cgroup: cgroup limit exceeded.
> > > >  .TP
> > > > +.B EBUSY
> > > > +(for
> > > > +.B MADV_SOFT_OFFLINE )
> > > > +If any pages within the add+length range could not be offlined,
> > > > +madvise will return -EBUSY.
> > > > +This might occur if the page is currently in use or locked.
> > > > +.TP
> > > >  .B EFAULT
> > > >  .I advice
> > > >  is
> > > >
> > >
> > > --
> > > <https://www.alejandro-colomar.es/>
> >
> > ---end quoted text---
>
> --
> <https://www.alejandro-colomar.es/>

---end quoted text---
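
For readers who want something concrete for the EBUSY case discussed in this thread, here is a minimal sketch in C. It is not taken from the thread, from the LTP fix, or from the man-pages commit. The helper names (advise_retry_ebusy, fake_madvise, fake_busy_left) and the retry-with-a-short-pause policy are assumptions made for illustration. Since MADV_SOFT_OFFLINE needs privileges and kernel support, the helper takes the advising function as a parameter, so a stub can simulate a page that is briefly in use or locked.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

/* Fallback for libc headers that lack the constant; 101 is the value
 * used by the kernel's uapi headers. */
#ifndef MADV_SOFT_OFFLINE
#define MADV_SOFT_OFFLINE 101
#endif

/* Hypothetical type: any function with madvise(2)'s signature. */
typedef int (*madvise_fn)(void *addr, size_t length, int advice);

/*
 * Call advise() up to max_tries times, retrying only while it fails
 * with EBUSY.  Returns 0 on success, otherwise the errno of the last
 * failure (EBUSY if every attempt found the range busy, or if
 * max_tries is not positive).  The retry policy is an assumption of
 * this sketch, not documented kernel or LTP behaviour.
 */
static int
advise_retry_ebusy(madvise_fn advise, void *addr, size_t len,
                   int advice, int max_tries)
{
	const struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000 };

	for (int i = 0; i < max_tries; i++) {
		if (advise(addr, len, advice) == 0)
			return 0;
		if (errno != EBUSY)
			return errno;	/* not the transient case */
		nanosleep(&pause, NULL);	/* let the page settle */
	}
	return EBUSY;
}

/* Stub for experiments without privileges: reports EBUSY for the next
 * fake_busy_left calls, then succeeds. */
static int fake_busy_left = 2;

static int
fake_madvise(void *addr, size_t length, int advice)
{
	(void)addr;
	(void)length;
	(void)advice;

	if (fake_busy_left > 0) {
		fake_busy_left--;
		errno = EBUSY;
		return -1;
	}
	return 0;
}
```

On a system where the caller has the required privileges, the real call would look like advise_retry_ebusy(madvise, addr, len, MADV_SOFT_OFFLINE, 5). Whether retrying is appropriate, or whether EBUSY should simply be reported and tolerated, depends on the caller.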