On Thu Sep 12, 2024 at 4:26 PM EEST, James Bottomley wrote: > On Thu, 2024-09-12 at 16:16 +0300, Jarkko Sakkinen wrote: > > On Wed Sep 11, 2024 at 3:21 PM EEST, James Bottomley wrote: > > > On Wed, 2024-09-11 at 10:53 +0200, Roberto Sassu wrote: > [...] > > > > I made few measurements. I have a Fedora 38 VM with TPM > > > > passthrough. > > > > > > > > Kernels: 6.11-rc2+ (guest), 6.5.0-45-generic (host) > > > > > > > > QEMU: > > > > > > > > rc qemu-kvm 1:4.2- > > > > 3ubuntu6.27 > > > > ii qemu-system-x86 1:6.2+dfsg- > > > > 2ubuntu6.22 > > > > > > > > > > > > TPM2_PT_MANUFACTURER: > > > > raw: 0x49465800 > > > > value: "IFX" > > > > TPM2_PT_VENDOR_STRING_1: > > > > raw: 0x534C4239 > > > > value: "SLB9" > > > > TPM2_PT_VENDOR_STRING_2: > > > > raw: 0x36373000 > > > > value: "670" > > > > > > > > > > > > No HMAC: > > > > > > > > # tracer: function_graph > > > > # > > > > # CPU DURATION FUNCTION CALLS > > > > # | | | | | | | > > > > 0) | tpm2_pcr_extend() { > > > > 0) 1.112 us | tpm_buf_append_hmac_session(); > > > > 0) # 6360.029 us | tpm_transmit_cmd(); > > > > 0) # 6415.012 us | } > > > > > > > > > > > > HMAC: > > > > > > > > # tracer: function_graph > > > > # > > > > # CPU DURATION FUNCTION CALLS > > > > # | | | | | | | > > > > 1) | tpm2_pcr_extend() { > > > > 1) | tpm2_start_auth_session() { > > > > 1) * 36976.99 us | tpm_transmit_cmd(); > > > > 1) * 84746.51 us | tpm_transmit_cmd(); > > > > 1) # 3195.083 us | tpm_transmit_cmd(); > > > > 1) @ 126795.1 us | } > > > > 1) 2.254 us | tpm_buf_append_hmac_session(); > > > > 1) 3.546 us | tpm_buf_fill_hmac_session(); > > > > 1) * 24356.46 us | tpm_transmit_cmd(); > > > > 1) 3.496 us | tpm_buf_check_hmac_response(); > > > > 1) @ 151171.0 us | } > > > > > > Well, unfortunately, that tells us that it's the TPM itself that's > > > taking the time processing the security overhead. The ordering of > > > the commands in tpm2_start_auth_session() shows > > > > > > 37ms for context restore of null key > > > 85ms for start session with encrypted salt > > > 3ms to flush null key > > > ----- > > > 125ms > > > > > > If we context save the session, we'd likely only bear a single 37ms > > > cost to restore it (replacing the total 125ms). However, there's > > > nothing we can do about the extend execution going from 6ms to > > > 24ms, so I could halve your current boot time with security enabled > > > (it's currently 149ms, it would go to 61ms, but it's still 10x > > > slower than the unsecured extend at 6ms) > > > > > > James > > > > I'll hold for better benchmarks. > > Well, yes, I'd like to see this for a variety of TPMs. > > This one clearly shows it's the real time wait for the TPM (since it > dwarfs the CPU time calculation there's not much optimization we can do > on the kernel end). The one thing that's missing in all of this is > what was the TPM? but even if it's an outlier that's really bad at > crypto what should we do? We could have a blacklist that turns off the > extend hmac (or a whitelist that turns it on), but we can't simply say > too bad you need a better TPM. > > James I'm pasting here my yesterday's one-liner ;-) sudo bpftrace -e 'k:tpm_transmit { @start[tid] = nsecs; } kr:tpm_transmit { @[kstack, ustack, comm] = sum(nsecs - @start[tid]); delete(@start[tid]); } END { clear(@start); }' If you have a fix candidate, snippet of the output before/after would work as rationale too. Looking into the data Roberto put me tomorrow. BR, Jarkko