Re: Latest F-21 updates cause non-booting system on some Haswell systems + workaround

On Fri, Sep 26, 2014 at 3:29 PM, Andrew Lutomirski <luto@xxxxxxx> wrote:
> On Fri, Sep 26, 2014 at 12:16 PM, Josh Boyer <jwboyer@xxxxxxxxxxxxxxxxx> wrote:
>> On Fri, Sep 26, 2014 at 3:05 PM, Andrew Lutomirski <luto@xxxxxxx> wrote:
>>> On Fri, Sep 26, 2014 at 11:58 AM, Josh Boyer <jwboyer@xxxxxxxxxxxxxxxxx> wrote:
>>>> On Fri, Sep 26, 2014 at 2:53 PM, Andrew Lutomirski <luto@xxxxxxx> wrote:
>>>>> On Fri, Sep 26, 2014 at 11:37 AM, Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> Just spent some time debugging this and thought I should share it, see:
>>>>>>
>>>>>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=762195
>>>>>>
>>>>>> for details. I've filed a bug to track fixing this in Fedora:
>>>>>>
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1147062
>>>>>>
>>>>>> There are 2 ways this problem shows itself:
>>>>>>
>>>>>> 1) If using an initrd that was generated with the troublesome microcode
>>>>>> update included, things may already crash during the initrd, e.g. in my case
>>>>>> some LUKS volumes would not unlock because of this
>>>>>>
>>>>>> 2) When booting an older kernel (and thus an older initrd) things start crashing
>>>>>> (mostly systemd* processes, grinding everything to a halt) as soon as udev
>>>>>> from the rootfs loads the microcode update
>>>>>>
>>>>>> Case 2 will often still get you to an emergency shell, at which point one can
>>>>>> create a /etc/modprobe.d/no_microcode.conf file containing:
>>>>>>
>>>>>> blacklist microcode
>>>>>>
>>>>>> to work around the problem. Then regenerate the initrds for newer
>>>>>> kernels, and you should be good to go until bug 1147062 gets fixed properly.
>>>>>
>>>>> Yeah, sorry, I should have tried to notify the right Fedora people in advance.
>>>>
>>>> I knew about it.  This is basically a breakdown in communication
>>>> between the kernel people and the microcode_ctl owner.  It was
>>>> compounded by the fact that I didn't realize our main dracut
>>>> maintainer was on PTO, so I didn't fix dracut myself until this
>>>> morning.
>>>>
>>>>> This is a nasty issue, and no one knows how to solve it for real yet.
>>>>> See this long thread:
>>>>>
>>>>> http://thread.gmane.org/gmane.linux.kernel/1790211
>>>>
>>>> Yeah.  FWIW, I just filed an update to have Fedora use early microcode
>>>> by default.  A COPR of this was tested successfully by a few people
>>>> that had the Haswell issue.
>>>>
>>>> https://admin.fedoraproject.org/updates/dracut-038-29.git20140903.fc21,kernel-3.16.3-302.fc21
>>>
>>> Everyone will still crash their system once, though, unless
>>> microcode_ctl learns to blacklist this particular update.
>>
>> The microcode_ctl package that added this update was karma'd out.  So
>> it shouldn't be in updates-testing any longer, which will help
>> limit how widespread it is.
>>
>> Also "Everyone" here is "everyone with a Haswell CPU that has HLE
>> enabled and managed to get microcode_ctl-2.1-8 from updates-testing in
>> the window it was available", which fortunately is not anywhere near
>> "everyone in Fedora".  Please use care when throwing around
>> generalities.
>
> I missed the fact that the microcode_ctl update was karma'd out.
>
> Nonetheless, at some point it, or something like it, may return, and
> unless we're careful, it will crash the system.  With the dracut fix,
> the system will reboot successfully, so long as the system crashes
> *after* the new initramfs gets generated during microcode_ctl
> installation.

The kernel was switched to using early microcode loading.  If the
microcode isn't prepended to the initramfs, it isn't loaded
automatically at all.  (In this specific instance, that would mean the
broken TSX is still enabled, but users are no worse off than they were
before the ucode update.)
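
For anyone who wants to check whether a given initramfs actually has
microcode prepended, here is a minimal sketch.  It assumes the early
microcode sits in an uncompressed cpio archive in "newc" format at the
very start of the image, with the blobs stored under
kernel/x86/microcode/ (e.g. GenuineIntel.bin).  The function names are
hypothetical and only the first leading archive is inspected, so treat
it as an illustration rather than a replacement for proper tooling.

```python
from typing import List

# Directory inside the early cpio archive where the kernel looks for
# microcode blobs (assumption stated above).
MICROCODE_PREFIX = "kernel/x86/microcode/"

# A cpio "newc" header is a 6-byte magic followed by 13 fields of
# 8 ASCII hex digits each: 6 + 13 * 8 = 110 bytes.
HEADER_LEN = 110


def _align4(n: int) -> int:
    """Round n up to the next multiple of 4 (newc pads names and data)."""
    return (n + 3) & ~3


def leading_cpio_names(data: bytes) -> List[str]:
    """Return the entry names of the uncompressed cpio archive at the
    start of data, or an empty list if data does not start with one."""
    names = []
    offset = 0
    while data[offset:offset + 6] in (b"070701", b"070702"):
        header = data[offset:offset + HEADER_LEN]
        if len(header) < HEADER_LEN:
            break  # truncated header
        try:
            fields = [int(header[6 + 8 * i:14 + 8 * i], 16) for i in range(13)]
        except ValueError:
            break  # not valid hex, so not really a newc header
        filesize, namesize = fields[6], fields[11]
        name_start = offset + HEADER_LEN
        raw_name = data[name_start:name_start + namesize]
        name = raw_name.rstrip(b"\0").decode("ascii", "replace")
        if name == "TRAILER!!!":
            break  # end-of-archive marker
        names.append(name)
        data_start = _align4(name_start + namesize)
        offset = _align4(data_start + filesize)
    return names


def has_early_microcode(data: bytes) -> bool:
    """True if the leading cpio archive contains a microcode blob."""
    return any(n.startswith(MICROCODE_PREFIX) for n in leading_cpio_names(data))
```

To try it on a real system, read the image as bytes (for example
open(path, "rb").read() on an initramfs under /boot, which usually
needs root) and pass the result to has_early_microcode().  A False
result would suggest that the image carries no early microcode.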

> Or am I missing some reason that this will work better than I think it will?

With the kernel not loading it automatically, users should be able to
update the microcode_ctl package whenever they'd like.  It won't take
effect until they either:

1) Install a new kernel (which builds a new initramfs with the new microcode)
2) Regenerate their initramfs (which grabs the new microcode)
3) Manually load it via sysfs (this probably isn't a good idea in
light of this scenario)

So things should be fine.

josh
-- 
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct