On Wed Sep 11, 2024 at 11:53 AM EEST, Roberto Sassu wrote: > I made few measurements. I have a Fedora 38 VM with TPM passthrough. I was thinking more like sudo bpftrace -e 'k:tpm_transmit { @start[tid] = nsecs; } kr:tpm_transmit { @[kstack, ustack, comm] = sum(nsecs - @start[tid]); delete(@start[tid]); } END { clear(@start); }' For example when running "tpm2_createprimary --hierarchy o -G rsa2048 -c owner.txt", I get: Attaching 3 probes... ^C @[ tpm_transmit_cmd+46 tpm2_flush_context+120 tpm2_commit_space+197 tpm_dev_transmit.constprop.0+137 tpm_dev_async_work+102 process_one_work+374 worker_thread+614 kthread+207 ret_from_fork+49 ret_from_fork_asm+26 , , kworker/4:2]: 2860677 @[ tpm_dev_transmit.constprop.0+111 tpm_dev_async_work+102 process_one_work+374 worker_thread+614 kthread+207 ret_from_fork+49 ret_from_fork_asm+26 , , kworker/16:1]: 3890693 @[ tpm_transmit_cmd+46 tpm2_load_context+195 tpm2_prepare_space+410 tpm_dev_transmit.constprop.0+54 tpm_dev_async_work+102 process_one_work+374 worker_thread+614 kthread+207 ret_from_fork+49 ret_from_fork_asm+26 , , kworker/4:2]: 9058524 @[ tpm_transmit_cmd+46 tpm2_save_context+179 tpm2_commit_space+314 tpm_dev_transmit.constprop.0+137 tpm_dev_async_work+102 process_one_work+374 worker_thread+614 kthread+207 ret_from_fork+49 ret_from_fork_asm+26 , , kworker/4:2]: 11426260 @[ tpm_transmit_cmd+46 tpm2_load_context+195 tpm2_prepare_space+318 tpm_dev_transmit.constprop.0+54 tpm_dev_async_work+102 process_one_work+374 worker_thread+614 kthread+207 ret_from_fork+49 ret_from_fork_asm+26 , , kworker/4:2]: 14182972 @[ tpm_transmit_cmd+46 tpm2_save_context+179 tpm2_commit_space+155 tpm_dev_transmit.constprop.0+137 tpm_dev_async_work+102 process_one_work+374 worker_thread+614 kthread+207 ret_from_fork+49 ret_from_fork_asm+26 , , kworker/4:2]: 22597059 @[ tpm_dev_transmit.constprop.0+111 tpm_dev_async_work+102 process_one_work+374 worker_thread+614 kthread+207 ret_from_fork+49 ret_from_fork_asm+26 , , kworker/4:2]: 1958500581 This results stacks to compare with "real" time spent total in each stack (in nsecs). CPU time is relevant measure in the problem we're dealing. BR, Jarkko