Hi Guilherme,

On Mon, Mar 23, 2020 at 8:16 PM Guilherme G. Piccoli
<gpiccoli@xxxxxxxxxxxxx> wrote:
>
> On 22/03/2020 18:16, Bhupesh Sharma wrote:
> > Hello Guilherme,
> >
> > On Fri, Mar 20, 2020 at 9:10 PM Guilherme G. Piccoli
> > <gpiccoli@xxxxxxxxxxxxx> wrote:
> >
> > Thanks for writing again. I was caught up in trying several other
> > suggestions/code-snippets to further debug this.
> > I tried several combinations - turning iommu off, turning off swiotlb
> > in the kexec kernel and testing various combinations with
> > retain_initrd added to the kexec kernel's bootargs.
> >
> > But nothing seems to fix the nested repetitive kexec reboot attempts
> > on the aws t3 machines I have. It just becomes better on few instances
> > (i.e. the kexec reboots would survive around 10 nested repetitive
> > attempts), while on the other(s) the failure can be seen quite
> > frequently (approx ~3 kexec reboot attempts).
>
> Hi Bhupesh, thanks for the tests! Indeed, this problem is difficult to
> prevent with those parameters, and it's quite interesting to see how it
> may vary among instances.

Indeed.

> > [...]
> > This is a really good debug and resulting patch.
> > I ran almost ~60 kexec repetitive attempts last night and also
> > repeated the same today morning and
> > the issue seems to get fixed for me with upstream kernel 5.6.0-rc6+
> > with this patch.
> >
> > I am leaving a test running with RHEL kernel + this patch overnight
> > and will have more updates to share by tomorrow morning.
>
> Thanks a lot =)
> I couldn't fail to give due credit to my friend Gavin Shan for the great
> suggestion that resulted in the patch! Let me know your results with the
> patch Bhupesh, and your Tested-by on it is much appreciated.
>
> >
> >> Bhupesh, I've noticed that suddenly the Red Hat bugzilla got private -
> >
> > Oops. I will check.
> >
> >> is it okay to add me in CC list so I can see it?
> >
> > Sure. I tried doing it, but seems Bugzilla is not happy as it keeps
> > complaining that you are not registered on BZ,
> > I will try to find out internally how to get around the issue.
> >
>
> Great! If you need me to sign-up in Bugzilla, I can do it. Just let me
> know the steps and I'd be glad in doing that.

Yes, please. I checked internally. If you can sign-up for Bugzilla, I
can directly add you to the Cc field of the Bugzilla work-item.

> >> Thanks for all the collaboration, I hope the issue was figured and solved!
> >
> > Sure. Thanks a lot for your inputs and trying the suggestions I posted
> > on the Bugzilla ticket.
> > I will soon share an update with RHEL/Fedora kernel kexec tests with
> > this patch applied and also reply with a Tested-by for the upstream
> > patch in the relevant thread.
> >
> > Thanks,
> > Bhupesh
> >
>
> Thank you, I appreciate the tests and collaboration =)
> Cheers,

No problem. The good news is that two runs of approx. ~200 runs of
nested kexec reboots worked even with RHEL/Fedora + your patch on the
aws t3 instance for me.

So, this looks like a real good patch to have upstream. Thanks a lot
for sharing and working on it.

I will go ahead and add my Tested-by for the upstream patch as well.

Thanks for all your help,
Bhupesh

_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec