On Wed Sep 11, 2024 at 3:21 PM EEST, James Bottomley wrote: > On Wed, 2024-09-11 at 10:53 +0200, Roberto Sassu wrote: > > On Tue, 2024-09-10 at 16:28 +0300, Jarkko Sakkinen wrote: > > > On Tue Sep 10, 2024 at 3:57 PM EEST, James Bottomley wrote: > > > > On Tue, 2024-09-10 at 15:48 +0300, Jarkko Sakkinen wrote: > > > > > On Tue Sep 10, 2024 at 3:39 PM EEST, Jarkko Sakkinen wrote: > > > > > > On Tue Sep 10, 2024 at 12:05 PM EEST, Roberto Sassu wrote: > > > > > > > On Tue, 2024-09-10 at 11:01 +0200, Linux regression > > > > > > > tracking > > > > > > > (Thorsten > > > > > > > Leemhuis) wrote: > > > > > > > > Hi, Thorsten here, the Linux kernel's regression tracker. > > > > > > > > > > > > > > > > James, Jarkoo, I noticed a report about a regression in > > > > > > > > bugzilla.kernel.org that appears to be caused by this > > > > > > > > change of > > > > > > > > yours: > > > > > > > > > > > > > > > > 6519fea6fd372b ("tpm: add hmac checks to > > > > > > > > tpm2_pcr_extend()") > > > > > > > > [v6.10-rc1] > > > > > > > > > > > > > > > > As many (most?) kernel developers don't keep an eye on > > > > > > > > the bug > > > > > > > > tracker, > > > > > > > > I decided to forward it by mail. To quote from > > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=219229 : > > > > > > > > > > > > > > > > > When secureboot is enabled, > > > > > > > > > the kernel boot time is ~20 seconds after 6.10 kernel. > > > > > > > > > it's ~7 seconds on 6.8 kernel version. > > > > > > > > > > > > > > > > > > When secureboot is disabled, > > > > > > > > > the boot time is ~7 seconds too. > > > > > > > > > > > > > > > > > > Reproduced on both AMD and Intel platform on ThinkPad > > > > > > > > > X1 and > > > > > > > > > T14. > > > > > > > > > > > > > > > > > > It probably caused autologin failure and micmute led > > > > > > > > > not > > > > > > > > > loaded on AMD platform. > > > > > > > > > > > > > > > > It was later bisected to the change mentioned above. See > > > > > > > > the > > > > > > > > ticket for > > > > > > > > more details. > > > > > > > > > > > > > > Hi > > > > > > > > > > > > > > I suspect I encountered the same problem: > > > > > > > > > > > > > > https://lore.kernel.org/linux-integrity/b8a7b3566e6014ba102ab98e10ede0d574d8930e.camel@xxxxxxxxxxxxxxx/ > > > > > > > > > > > > > > Going to provide more info there. > > > > > > > > > > > > I suppose you are going try to acquire the tracing data I > > > > > > asked? > > > > > > That would be awesome, thanks for taking the troube. Let's > > > > > > look > > > > > > at the data and draw conclusions based on that. > > > > > > > > > > > > Workaround is pretty simple: CONFIG_TCG_TPM2_HMAC=n to the > > > > > > kernel > > > > > > configuration disables the feature. > > > > > > > > > > > > For making decisions what to do with the we are talking > > > > > > about ~2 > > > > > > week window estimated, given the Vienna conference slows > > > > > > things > > > > > > down, so I hope my workaround is good enough before that. > > > > > > > > > > I can enumerate three most likely ways to address the issue: > > > > > > > > > > 1. Strongest: drop from defconfig. > > > > > 2. Medium: leave to defconfig but add an opt-in kernel command- > > > > > line > > > > > parameter. > > > > > 3. Lightest: if we can based on tracing data nail the > > > > > regression in > > > > > sustainable schedule, fix it. > > > > > > > > Actually, there's a fourth: not use sessions for the PCR extend > > > > (if > > > > we'd got the timings when I asked, this was going to be my > > > > suggestion > > > > if they came back problematic). This seems only to be a problem > > > > for > > > > IMA measured boot (because it does lots of extends). If > > > > necessary this > > > > could even be wrapped in a separate config or boot option that > > > > only > > > > disables HMAC on extend if IMA (so we still get security for > > > > things > > > > like sd-boot) > > > > > > I can buy that but with a twist that make it an opt-in kernel > > > command > > > line option. We don't want to take already existing functionality > > > away > > > from those who might want to use it (given e.g. hardening > > > requirements), > > > and with that basis opt-in (by default disabled) would be more > > > balanced > > > way to address the issue. > > > > > > Please do a send a patch! > > > > I made few measurements. I have a Fedora 38 VM with TPM passthrough. > > > > Kernels: 6.11-rc2+ (guest), 6.5.0-45-generic (host) > > > > QEMU: > > > > rc qemu-kvm 1:4.2- > > 3ubuntu6.27 > > ii qemu-system-x86 1:6.2+dfsg- > > 2ubuntu6.22 > > > > > > TPM2_PT_MANUFACTURER: > > raw: 0x49465800 > > value: "IFX" > > TPM2_PT_VENDOR_STRING_1: > > raw: 0x534C4239 > > value: "SLB9" > > TPM2_PT_VENDOR_STRING_2: > > raw: 0x36373000 > > value: "670" > > > > > > No HMAC: > > > > # tracer: function_graph > > # > > # CPU DURATION FUNCTION CALLS > > # | | | | | | | > > 0) | tpm2_pcr_extend() { > > 0) 1.112 us | tpm_buf_append_hmac_session(); > > 0) # 6360.029 us | tpm_transmit_cmd(); > > 0) # 6415.012 us | } > > > > > > HMAC: > > > > # tracer: function_graph > > # > > # CPU DURATION FUNCTION CALLS > > # | | | | | | | > > 1) | tpm2_pcr_extend() { > > 1) | tpm2_start_auth_session() { > > 1) * 36976.99 us | tpm_transmit_cmd(); > > 1) * 84746.51 us | tpm_transmit_cmd(); > > 1) # 3195.083 us | tpm_transmit_cmd(); > > 1) @ 126795.1 us | } > > 1) 2.254 us | tpm_buf_append_hmac_session(); > > 1) 3.546 us | tpm_buf_fill_hmac_session(); > > 1) * 24356.46 us | tpm_transmit_cmd(); > > 1) 3.496 us | tpm_buf_check_hmac_response(); > > 1) @ 151171.0 us | } > > Well, unfortunately, that tells us that it's the TPM itself that's > taking the time processing the security overhead. The ordering of the > commands in tpm2_start_auth_session() shows > > 37ms for context restore of null key > 85ms for start session with encrypted salt > 3ms to flush null key > ----- > 125ms > > If we context save the session, we'd likely only bear a single 37ms > cost to restore it (replacing the total 125ms). However, there's > nothing we can do about the extend execution going from 6ms to 24ms, so > I could halve your current boot time with security enabled (it's > currently 149ms, it would go to 61ms, but it's still 10x slower than > the unsecured extend at 6ms) > > James I'll hold for better benchmarks. BR, Jarkko