Ahmed, Karim Allah
karahmed@xxxxxxxxx

> On Apr 10, 2017, at 3:57 PM, Juergen Gross <jgross@xxxxxxxx> wrote:
>
> On 10/04/17 15:47, Boris Ostrovsky wrote:
>> On 04/07/2017 06:11 PM, Stefano Stabellini wrote:
>>> On Fri, 7 Apr 2017, Boris Ostrovsky wrote:
>>>> On 04/07/2017 01:36 PM, Stefano Stabellini wrote:
>>>>> On Fri, 7 Apr 2017, Boris Ostrovsky wrote:
>>>>>> On 04/07/2017 07:58 AM, Ian Jackson wrote:
>>>>>>> tl;dr:
>>>>>>> Please apply
>>>>>>>
>>>>>>> da72ff5bfcb02c6ac8b169a7cf597a3c8e6c4de1
>>>>>>> partially revert "xen: Remove event channel notification through
>>>>>>> Xen PCI platform device"
>>>>>>>
>>>>>>> to all stable branches which have a version of the original broken
>>>>>>> commit. This includes at least 4.9.y.
>>>>>>>
>>>>>>> Background:
>>>>>>>
>>>>>>> osstest service owner writes ("[linux-4.9 baseline test] 107238: tolerable FAIL"):
>>>>>>> ...
>>>>>>>> test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail never pass
>>>>>>> osstest doesn't consider this a regresion because it looks for
>>>>>>> regressions within a branch, and this is the first test of Linux 4.9.
>>>>>>> However, this is a regression from the kernel we are currently using.
>>>>>>>
>>>>>>> L1 dom0 console log:
>>>>>>> http://logs.test-lab.xenproject.org/osstest/logs/107238/test-amd64-amd64-qemuu-nested-intel/huxelrebe0---var-log-xen-osstest-serial-l1.guest.osstest.log
>>>>>>>
>>>>>>> It seems to have got stuck halfway through booting.
>>>>>>>
>>>>>>> The message
>>>>>>> (XEN) *** Serial input -> Xen (type 'CTRL-x' three times to switch input to DOM0)
>>>>>>> shows where osstest timed out on this test, and started its log
>>>>>>> capture process (including collecting debug key output).
>>>>>>>
>>>>>>> Complete logs for this job here:
>>>>>>> http://logs.test-lab.xenproject.org/osstest/logs/107238/test-amd64-amd64-qemuu-nested-intel/info.html
>>>>>>>
>>>>>>> Juergen Gross tells me that this is due to the lack of
>>>>>>> da72ff5bfcb02c6ac8b169a7cf597a3c8e6c4de1.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ian.
>>>>>>>
>>>>>>> PS: Stefano, Boris: did you already request a backport of this commit?
>>>>>>> If not, why not ?
>>>>>> No, but this should indeed be backported to 4.9+
>>>>> Boris, are you going to do that?
>>>> Is there anything that needs to be done beyond just applying it to 4.9
>>>> (4.10 apparently already has it).
>>> No, I don't think so. 4.9 already has the offending commit.
>>
>>
>> Looks like there will be a new version of the original patch
>> (72a9b186292) so we should hold off with backport request to 4.9:
>>
>> https://lists.xenproject.org/archives/html/xen-devel/2017-04/msg01468.html
>
> TBH: I'm not convinced by the reasoning why 72a9b186292 has to be
> reworked: Do we really care for Xen versions < 4.0 and a theoretical
> problem (after all the author admitted the bug isn't being hit in
> reality due to a short-circuit in the code)?

IMHO, even if 72a9b186292 is not reworked, we should revert it
completely, not just partially. Before this commit, kernel 4.9+ would
at least work on older Xen versions (< 4.0), whereas now it will not
even boot.

I do agree, however, that fixing INTx sounds completely useless, since
there is no combination of Xen+Linux that would lead to the bug by
default (unless you force the use of INTx even when vector injection is
supported, which is what I did while testing the original patch).

>
> And even if we do: I'd rather add another patch to stable later than
> keeping a real bug in Linux 4.9 which has been hit at least 3 times
> up to now (by Stefano, George and Ian).
>
>
> Juergen
>

Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
Main office: Krausenstr. 38, 10117 Berlin
Managing directors: Dr. Ralf Herbrich, Christian Schlaeger
VAT ID: DE289237879
Registered at Amtsgericht Charlottenburg, HRB 149173 B