Re: About kexec issues in AWS nitro instances (RH bz 1758323)

Hi Guilherme,

On Sat, Feb 29, 2020 at 10:37 PM Guilherme G. Piccoli
<gpiccoli@xxxxxxxxxxxxx> wrote:
>
> Hi Bhupesh and Dave (and everybody CC'ed here), I'm Guilherme Piccoli
> and I'm working on the same issue observed in RH bugzilla 1758323 [0] -
> or at least, it seems to be the same heh

Ok.

> The reported issue in my case was that the 2nd kexec fails on Nitro
> instances, and indeed it's reproducible. More than that, it shows up
> as an initrd corruption. I've found 2 workarounds: using the "new"
> kexec syscall (by doing kexec -s -l) and keeping the initrd memory
> "un-freed" via the kernel parameter "retain_initrd".
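
For reference, the two workarounds boil down to something like the
following (a rough sketch on my side; the kernel/initrd paths are
assumptions, adjust them for your setup):

  # Workaround 1: load via the kexec_file_load syscall (-s), then exec
  kexec -s -l /boot/vmlinuz-$(uname -r) \
        --initrd=/boot/initrd.img-$(uname -r) --reuse-cmdline
  kexec -e

  # Workaround 2: boot the first kernel with "retain_initrd" on its
  # cmdline so the initrd memory is never freed; confirm it is active:
  grep -o retain_initrd /proc/cmdline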

I have a couple of questions:
- How do you conclude that you see initrd corruption across kexec? Do
you print the initial hex contents of the initrd across kexec?
- Also, do you try repeated/nested kexec invocations and see initrd
corruption after several kexec reboot attempts?
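
One rough way to check this (a sketch only; the initrd path is an
assumption, and it only catches corruption that the kernel's unpacker
actually notices) is to record the head/checksum of the image handed to
kexec and then look for unpacking complaints from the new kernel:

  INITRD=/boot/initrd.img-$(uname -r)      # assumed path
  xxd -l 64 "$INITRD"                      # first bytes of the image we load
  sha256sum "$INITRD"
  kexec -s -l /boot/vmlinuz-$(uname -r) --initrd="$INITRD" --reuse-cmdline
  kexec -e
  # ...then, once the kexec'd kernel is up, look for unpack complaints:
  dmesg | grep -i -e initramfs -e rootfs   # e.g. "Initramfs unpacking failed"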

I have the following observations on my Nitro instance:
- With an upstream kernel (5.6.0-rc3), I am seeing that repeated kexec
attempts, even with 'kexec -s -l' and 'retain_initrd' in the kernel
bootargs, can lead to kexec reboot failures, although the frequency of
the failure goes down drastically with these compared to a vanilla
'kexec -s' invocation.

Here are the AWS console logs from the Nitro console with kernel
5.6.0-rc3+ on an x86_64 instance when 'kexec -s -l' or 'kexec -l'
with 'retain_initrd' fails:

login: [   80.077578] Unregister pv shared memory for cpu 1
[   80.081755] Unregister pv shared memory for cpu 0
[   80.209953] kexec_core: Starting new kernel
        2020-02-29T19:20:16+00:00
<.. no console logs after this (even after adding earlycon) ..>

- Note that there are no further console logs from the kexec'd kernel
in the failure case, so I am not sure whether this was caused by some
other issue or by the initrd corruption alone.

- With the above, one needs to execute kexec reboots repeatedly;
normally a failure shows up around the 11th-15th kexec reboot run. A
sketch of how such a soak loop could be driven follows.
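
This is only a sketch (paths are assumptions), and it presumes
something like a oneshot service or rc script re-invokes it on every
boot so the chain keeps going:

  # Nested-kexec soak test: bump a boot counter, then kexec again.
  # A failure around iteration ~11-15 shows up as a stalled counter.
  COUNT_FILE=/var/tmp/kexec-count
  n=$(cat "$COUNT_FILE" 2>/dev/null || echo 0)
  echo $((n + 1)) > "$COUNT_FILE"
  sync
  kexec -s -l /boot/vmlinuz-$(uname -r) \
        --initrd=/boot/initrd.img-$(uname -r) --reuse-cmdline
  kexec -e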

> I've noticed that your interesting investigation in the BZ led to
> SWIOTLB as a potential culprit, but trying with "swiotlb=noforce" or
> even "iommu=off" didn't help me.
> Also, worth noticing a weird behavior: it seems Amazon Linux 2 (based
> on kernel 4.14) sometimes works - or, better said, it works on some
> instances. I have 2x t3.large instances, and in one of them I can make
> Amazon Linux work (and to isolate potential out-of-tree patches, I've
> used the Amazon Linux 2 config file and built a mainline 4.14, which
> also works on that particular instance).

That's good news. I am not sure about Amazon Linux, though (I don't
know whether its source is available without buying a license).

I can share that "swiotlb=noforce" worked for me on one instance, but
the result was not reproducible on other Nitro instances, so I think
the underlying issue is initrd corruption; I have not been able to
pinpoint the root cause of the corruption yet.
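
If you want to flip these knobs only for the kexec'd kernel rather
than the first boot, one way (a sketch; paths are assumptions) is to
append them to the inherited cmdline instead of using --reuse-cmdline:

  # Try "swiotlb=noforce" (or "iommu=off") just for the kexec'd kernel
  kexec -s -l /boot/vmlinuz-$(uname -r) \
        --initrd=/boot/initrd.img-$(uname -r) \
        --append="$(cat /proc/cmdline) swiotlb=noforce"
  kexec -e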

BTW, have you been able to try the following kexec-tools fix as well
(see [1]) and check whether it fixes the initrd corruption with 'kexec
-s -l' and 'kexec -l' (i.e. without using the 'retain_initrd' bootarg)?

[1]. http://lists.infradead.org/pipermail/kexec/2020-February/024531.html
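
In case it helps, a sketch of building and testing a patched
kexec-tools (the patch filename is hypothetical - save the fix from
[1] locally first; the build/sbin output path may vary by version):

  git clone https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
  cd kexec-tools
  git am ../kexec-initrd-fix.patch   # hypothetical filename for the fix in [1]
  ./bootstrap && ./configure && make
  sudo ./build/sbin/kexec -s -l /boot/vmlinuz-$(uname -r) \
       --initrd=/boot/initrd.img-$(uname -r) --reuse-cmdline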

> The reason for this email is to ask if you managed to figure out the
> issue's root cause, or have some leads. I'm continuing the debug here,
> but it's a bit difficult without access to the AWS hypervisor (and it
> seems like a hypervisor issue to me). The fact that preserving the
> initrd memory prevents the problem seems to indicate that after such
> high-address memory is freed, the hypervisor somehow manages to reuse
> it regardless of whether some other code is still using it...ending up
> corrupting the initrd.
>
> I've also looped in the kexec list in order to grow the audience;
> maybe somebody has already faced this kind of issue and has some
> ideas. Collaboration on this debug would be greatly appreciated - it's
> a quite interesting issue and I'm looking forward to understanding
> what's going on.
>
> Thanks in advance,

Thanks a lot for your email.
Let's continue discussing and hopefully we will have a fix for the issue soon.

Regards,
Bhupesh

