Re: generic/650 makes v6.0-rc client unusable


On Wed, Nov 9, 2022 at 4:22 AM Shinichiro Kawasaki
<shinichiro.kawasaki@xxxxxxx> wrote:
>
> On Sep 04, 2022 / 21:15, Zorro Lang wrote:
> > On Sat, Sep 03, 2022 at 06:43:29PM +0000, Chuck Lever III wrote:
> > > While investigating some of the other issues that have been
> > > reported lately, I've found that my v6.0-rc3 NFS/TCP client
> > > goes off the rails often (but not always) during generic/650.
> > >
> > > This is the test that runs a workload while offlining and
> > > onlining CPUs. My test client has 12 physical cores.
> > >
> > > The test appears to start normally, but then after a bit
> > > the NFS server workload drops to zero and the NFS mount
> > > disappears. I can't run programs (sudo, for example) on
> > > the client. Can't log in, even on the console. The console
> > > has a constant stream of "can't rotate log: Input/Output
> > > error" type messages.
>
> I also observe this failure when I run fstests using btrfs on my HDDs.
> The failure can be recreated almost every time.

I'm wondering what you get in dmesg. Any traces?
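
If the machine is still reachable at that point, it might also help to
record which CPUs were left offline next to the dmesg output. Below is a
minimal sketch, assuming the standard sysfs files
/sys/devices/system/cpu/online and /sys/devices/system/cpu/offline are
present. The helper names (parse_cpu_list, read_cpu_set) are made up for
illustration and are not part of fstests.

```python
from pathlib import Path

# Standard sysfs location for CPU hotplug state (assumed present on the test box).
CPU_SYSFS = Path("/sys/devices/system/cpu")


def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-3,5" into [0, 1, 2, 3, 5]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means no CPUs in that set
        first, _, last = part.partition("-")
        cpus.extend(range(int(first), int(last or first) + 1))
    return cpus


def read_cpu_set(name, root=CPU_SYSFS):
    """Return the CPUs listed in root/name ("online" or "offline"), or [] if unreadable."""
    try:
        return parse_cpu_list((Path(root) / name).read_text())
    except OSError:
        return []


if __name__ == "__main__":
    print("online: ", read_cpu_set("online"))
    print("offline:", read_cpu_set("offline"))
```

Running it before and after the test gives a quick view of whether any
CPUs were left offline when things went wrong.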

I've excluded the test from my runs for over a year now, due to a
crash that I reported to the mm and CPU hotplug people here:

https://lore.kernel.org/linux-mm/CAL3q7H4AyrZ5erimDyO7mOVeppd5BeMw3CS=wGbzrMZrp56ktA@xxxxxxxxxxxxxx/

Unfortunately I got no reply from anyone who works on or maintains those
subsystems.

It didn't happen very often, and I haven't tested again with recent kernels.

>
> > >
> > > I haven't looked further into this yet. Actually I'm not
> > > quite sure where to start looking.
> > >
> > > I recently switched this client from a local /home to an
> > > NFS-mounted one, and that's where the xfstests are built
> > > and run from, fwiw.
> >
> > If most users complain about generic/650, I'd like to exclude g/650 from the
> > "auto" default run group. Any more points?
>
> +1. I wish to remove it from the "auto" group. Since I cannot log in to the test
> machine after the failure, I suggest putting it in the "dangerous" group.
>
> --
> Shin'ichiro Kawasaki


