On Mon, 23 Oct 2023 at 09:35, Ian Kent <raven@xxxxxxxxxx> wrote: > > On 23/10/23 08:48, Ian Kent wrote: > > On 20/10/23 21:09, Ian Kent wrote: > >> On 20/10/23 19:23, Arnd Bergmann wrote: > >>> On Fri, Oct 20, 2023, at 12:45, Dan Carpenter wrote: > >>>> On Fri, Oct 20, 2023 at 11:55:57AM +0200, Anders Roxell wrote: > >>>>> On Fri, 20 Oct 2023 at 08:37, Arnd Bergmann <arnd@xxxxxxxx> wrote: > >>>>>> On Thu, Oct 19, 2023, at 17:27, Naresh Kamboju wrote: > >>>>>>> The qemu-x86_64 and x86_64 booting with 64bit kernel and 32bit > >>>>>>> rootfs we call > >>>>>>> it as compat mode boot testing. Recently it started to failed to > >>>>>>> get login > >>>>>>> prompt. > >>>>>>> > >>>>>>> We have not seen any kernel crash logs. > >>>>>>> > >>>>>>> Anders, bisection is pointing to first bad commit, > >>>>>>> 546694b8f658 autofs: add autofs_parse_fd() > >>>>>>> > >>>>>>> Reported-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx> > >>>>>>> Reported-by: Anders Roxell <anders.roxell@xxxxxxxxxx> > >>>>>> I tried to find something in that commit that would be different > >>>>>> in compat mode, but don't see anything at all -- this appears > >>>>>> to be just a simple refactoring of the code, unlike the commits > >>>>>> that immediately follow it and that do change the mount > >>>>>> interface. > >>>>>> > >>>>>> Unfortunately this makes it impossible to just revert the commit > >>>>>> on top of linux-next. Can you double-check your bisection by > >>>>>> testing 546694b8f658 and the commit before it again? > >>>>> I tried these two patches again: > >>>>> 546694b8f658 ("autofs: add autofs_parse_fd()") - doesn't boot > >>>>> bc69fdde0ae1 ("autofs: refactor autofs_prepare_pipe()") - boots > >>>>> > >>>> One difference that I notice between those two patches is that we no > >>>> long call autofs_prepare_pipe(). We just call autofs_check_pipe(). > >>> Indeed, so some of the f_flags end up being different. I assumed > >>> this was done intentionally, but it might be worth checking if > >>> the patch below makes any difference when the flags get put > >>> back the way they were. This is probably not the correct fix, but > >>> may help figure out what is going on. It should apply to anything > >>> from 546694b8f658 ("autofs: add autofs_parse_fd()") to the current > >>> linux-next: > >>> > >>> --- a/fs/autofs/inode.c > >>> +++ b/fs/autofs/inode.c > >>> @@ -358,6 +358,11 @@ static int autofs_fill_super(struct super_block > >>> *s, struct fs_context *fc) > >>> pr_debug("pipe fd = %d, pgrp = %u\n", > >>> sbi->pipefd, pid_nr(sbi->oz_pgrp)); > >>> + /* We want a packet pipe */ > >>> + sbi->pipe->f_flags |= O_DIRECT; > >>> + /* We don't expect -EAGAIN */ > >>> + sbi->pipe->f_flags &= ~O_NONBLOCK; > >>> + > >> > >> > >> That makes sense, we do want a packet pipe and that does also mean > >> > >> we don't want a non-blocking pipe, it will be interesting to see > >> > >> if that makes a difference. It's been a long time since Linus > >> > >> implemented that packet pipe and I can't remember now what the > >> > >> case was that lead to it. > > > > After thinking about this over the weekend I'm pretty sure my mistake > > > > is dropping the call to autofs_prepare_pipe() without adding the tail > > > > end of it into autofs_parse_fd(). > > > > > > To explain a bit of history which I'll include in the fix description. > > > > During autofs v5 development I decided to stay with the existing usage > > > > instead of changing to a packed structure for autofs <=> user space > > > > communications which turned out to be a mistake on my part. > > > > > > Problems arose and they were fixed by allowing for the 64 bit to 32 bit > > > > size difference in the automount(8) code. > > > > > > Along the way systemd started to use autofs and eventually encountered > > > > this problem too. systemd refused to compensate for the length difference > > > > insisting it be fixed in the kernel. Fortunately Linus implemented the > > > > packetized pipe which resolved the problem in a straight forward and > > > > simple way. > > > > > > So I pretty sure that the cause of the problem is the inadvertent > > dropping > > > > of the flags setting in autofs_fill_super() that Arnd spotted although I > > > > don't think putting it in autofs_fill_super() is the right thing to do. > > > > > > I'll produce a patch today which includes most of this explanation for > > > > future travelers ... > > So I have a patch. > > > I'm of two minds whether to try and use the instructions to reproduce this > > or not because of experiences I have had with other similar testing > automation > > systems that claim to provide a reproducer and end up a huge waste of > time and > > are significantly frustrating. > > > Can someone please perform a test for me once I provide the patch? Just tested it, and it passed our tests. Added a tested by flag in your patch. Thanks for the prompt fix. Cheers, Anders > > > Ian > > > > > > >> > >> > >> Ian > >> > >>> sbi->flags &= ~AUTOFS_SBI_CATATONIC; > >>> /*