Re: drop bfq scheduler, instead use mq-deadline across the board

Zbigniew Jędrzejewski-Szmek <zbyszek@xxxxxxxxx> · Tue, 30 Jun 2020 21:01:08 +0000

On Tue, Jun 30, 2020 at 07:28:53PM +0100, Ankur Sinha wrote:
> On Tue, Jun 30, 2020 17:23:16 +0000, Zbigniew Jędrzejewski-Szmek wrote:
> > On Tue, Jun 30, 2020 at 04:25:23PM +0100, Ankur Sinha wrote:
> > > On Mon, Jun 29, 2020 15:01:24 -0600, Chris Murphy wrote:
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=1851783
> > > > 
> > > > The main argument is that for typical and varied workloads in Fedora,
> > > > mostly on consumer hardware, we should use mq-deadline scheduler
> > > > rather than either none or bfq.
> > > > 
> > > > It may be true most folks with NVMe won't see anything bad with none,
> > > > but those who have heavier IO workloads are likely to be better off
> > > > with mq-deadline.
> > > > 
> > > > Further details are in the bug, but let's discuss it on list. Thanks!
> > > 
> > > There was this thread about our systems hanging, and the workaround was
> > > to revert to mq-deadline from bfq:
> > > 
> > > https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/thread/MJJFT5AOYUFZ3SO2EDVLJSDAZMZI4HAP/#DA7RCQFIAD4Z3Q7HQBW2ELPTLPYDKJMT
> > 
> > To clarify: you could reliably reproduce the issue when building steps in mock.
> > Did you verify that it is reliably fixed simply by switching bfq→mq-deadline?
> 
> Yes, that was the first change I had made and it had stopped the
> hanging. As a permanent fix, though, I switched to using isolation =
> simple in mock, and since that works, I've not changed it since.

OK, thanks.
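
For anyone who wants to repeat that comparison, here is a rough sketch in
Python of checking which scheduler a disk is actually using before and
after the switch. It assumes the usual sysfs layout, where
/sys/block/<device>/queue/scheduler lists the available schedulers and
marks the active one in square brackets. The function names are made up
for illustration.

```python
from pathlib import Path


def parse_scheduler_line(text):
    """Split the contents of a queue/scheduler file into (active, available).

    The kernel prints something like "mq-deadline kyber [bfq] none",
    with the active scheduler in square brackets.
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def active_scheduler(device, sysfs_root="/sys/block"):
    """Return the active I/O scheduler for a block device such as "sda".

    The sysfs_root parameter exists only to make this testable against
    a fake directory tree; it is an assumption of this sketch.
    """
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    active, _ = parse_scheduler_line(path.read_text())
    return active
```

Calling active_scheduler("sda") (the device name is only an example)
reports what is in effect right now. Writing a scheduler name such as
mq-deadline to the same file as root switches it for that device until
the next boot.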

> (I make it a point to provide the needed information for bugs, but this
> release my quota is currently being used up on getting Docker + minikube
> to work on F32 for $dayjob)
> 
> > > There are a few threads on AskFedora about systems hanging. They're not
> > > the easiest to debug but we did suggest people try switching to
> > > mq-deadline to see if it helps:
> > > 
> > > https://ask.fedoraproject.org/t/whole-os-freezes-watching-a-video-with-mpv/6770/10
> > > 
> > > I don't know enough about this to say if it's a bug and if it has been
> > > fixed.
> > 
> > There's a lot of noise in those bug reports. For heisenbugs, the fact
> > that something was an issue and after a flurry of half-random changes
> > to the system isn't, does not allow us to conclude _anything_. We need
> > somebody who understands what they are doing to isolate the issue. In
> > particular, if this is a kernel hang, then we need a proper traceback
> > from the kernel, and not just assume it's the scheduler.
> 
> There is a kernel trace in the related bug that was cited there:
> https://bugzilla.redhat.com/show_bug.cgi?id=1767097#c7
> 
> which links to another bfq bug here that's currently needinfo:
> https://bugzilla.redhat.com/show_bug.cgi?id=1767539
> 
> > (In particular, if this is a race condition, changing the scheduler
> > could be just making the condition less likely because the system is
> > slower or faster or just schedules processes in a different order,
> > without the scheduler being relevant to the bug).
> 
> Like I said, I don't know. I'm a fairly advanced Linux user but you can
> hardly expect me to also be a kernel hacker.  :)
> 
> For kernel bugs, I'd strongly suggest giving reporters step-by-step
> instructions or links on using a "serial console" or a "netconsole".
> These are not part of my working vocabulary (I cannot speak for others).

Thanks for the links. This seems to be a tough cookie and I hope it
gets resolved at some point. And to clarify: my comment about
debugging was not directed at you in particular, apart from the
question above which you have already answered.

Zbyszek