On Tue, Jun 30, 2020 17:23:16 +0000, Zbigniew Jędrzejewski-Szmek wrote:
> On Tue, Jun 30, 2020 at 04:25:23PM +0100, Ankur Sinha wrote:
> > On Mon, Jun 29, 2020 15:01:24 -0600, Chris Murphy wrote:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1851783
> > >
> > > The main argument is that for typical and varied workloads in Fedora,
> > > mostly on consumer hardware, we should use mq-deadline scheduler
> > > rather than either none or bfq.
> > >
> > > It may be true most folks with NVMe won't see anything bad with none,
> > > but those who have heavier IO workloads are likely to be better off
> > > with mq-deadline.
> > >
> > > Further details are in the bug, but let's discuss it on list. Thanks!
> >
> > There was this thread about our systems hanging, and the workaround was
> > to revert to mq-deadline from bfq:
> >
> > https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/thread/MJJFT5AOYUFZ3SO2EDVLJSDAZMZI4HAP/#DA7RCQFIAD4Z3Q7HQBW2ELPTLPYDKJMT
>
> To clarify: you could reliably reproduce the issue when building steps in mock.
> Did you verify that it is reliably fixed simply by switching bfq→mq-deadline?

Yes, that was the first change I made, and it stopped the hanging. As a
permanent fix, though, I switched to using isolation = simple in mock,
and since that works, I've not changed it since.

(I make it a point to provide the needed information for bugs, but for
this release my quota is currently being used up on getting Docker +
minikube to work on F32 for $dayjob)

> > There are a few threads on AskFedora about systems hanging. They're not
> > the easiest to debug but we did suggest people try switching to
> > mq-deadline to see if it helps:
> >
> > https://ask.fedoraproject.org/t/whole-os-freezes-watching-a-video-with-mpv/6770/10
> >
> > I don't know enough about this to say if it's a bug and if it has been
> > fixed.
>
> There's a lot of noise in those bug reports. For heisenbugs, the fact
> that something was an issue and after a flurry of half-random changes
> to the system isn't, does not allow us to conclude _anything_. We need
> somebody who understands what they are doing to isolate the issue. In
> particular, if this is a kernel hang, then we need a proper traceback
> from the kernel, and not just assume it's the scheduler.

There is a kernel trace in the related bug that was cited there:

https://bugzilla.redhat.com/show_bug.cgi?id=1767097#c7

which links to another bfq bug here that's currently needinfo:

https://bugzilla.redhat.com/show_bug.cgi?id=1767539

> (In particular, if this is a race condition, changing the scheduler
> could be just making the condition less likely because the system is
> slower or faster or just schedules processes in a different order,
> without the scheduler being relevant to the bug).

Like I said, I don't know. I'm a fairly advanced Linux user, but you can
hardly expect me to also be a kernel hacker. :)

For kernel bugs, I'd strongly suggest giving reporters step-by-step
instructions or links on using a "serial console" or a "netconsole".
These are not part of my working vocabulary (I cannot speak for others).

-- 
Thanks,
Regards,
Ankur Sinha "FranciscoD" (He / Him / His) | https://fedoraproject.org/wiki/User:Ankursinha
Time zone: Europe/London
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx