On 2/01/24 12:41, Jorge Ramirez-Ortiz, Foundries wrote: > On 11/12/23 09:00:06, Jorge Ramirez-Ortiz, Foundries wrote: >> On 06/12/23 11:00:47, Jorge Ramirez-Ortiz, Foundries wrote: >>> On 06/12/23 07:02:43, Avri Altman wrote: >>>>> >>>>> On 4/12/23 17:01, Jorge Ramirez-Ortiz wrote: >>>>>> Requesting a retune before switching to the RPMB partition has been >>>>>> observed to cause CRC errors on the RPMB reads (-EILSEQ). >>>>> >>>>> There are still 2 concerns: >>>>> 1) We don't really know the root cause. Have you determined if here are >>>>> CRC errors in the main partition also? >>> >>> right, and I don't disagree with that. >>> >>> As a test I created a 4GB file from /dev/random which I then copied >>> several times (dd if= ....) >>> >>> root@uz3cg-dwg-sec:/sys/kernel/debug/mmc0# cat err_stats >>> # Command Timeout Occurred: 0 >>> # Command CRC Errors Occurred: 0 >>> # Data Timeout Occurred: 0 >>> # Data CRC Errors Occurred: 0 >>> # Auto-Cmd Error Occurred: 0 >>> # ADMA Error Occurred: 0 >>> # Tuning Error Occurred: 0 >>> # CMDQ RED Errors: 0 >>> # CMDQ GCE Errors: 0 >>> # CMDQ ICCE Errors: 0 >>> # Request Timedout: 0 >>> # CMDQ Request Timedout: 0 >>> # ICE Config Errors: 0 >>> # Controller Timedout errors: 0 >>> # Unexpected IRQ errors: 0 >>> >>> However as soon as I access RPMB and fails (it takes just a few tries) I see: >>> >>> I/TC: RPMB: Using generated key >>> [ 86.902118] sdhci-arasan ff160000.mmc: __mmc_blk_ioctl_cmd: data error -84 >>> E/TC:? 0 >>> E/TC:? 0 TA panicked with code 0xffff0000 >>> E/LD: Status of TA 22250a54-0bf1-48fe-8002-7b20f1c9c9b1 >>> E/LD: arch: aarch64 >>> E/LD: region 0: va 0xc0004000 pa 0x7e200000 size 0x002000 flags rw-s (ldelf) >>> E/LD: region 1: va 0xc0006000 pa 0x7e202000 size 0x008000 flags r-xs (ldelf) >>> E/LD: region 2: va 0xc000e000 pa 0x7e20a000 size 0x001000 flags rw-s (ldelf) >>> E/LD: region 3: va 0xc000f000 pa 0x7e20b000 size 0x004000 flags rw-s (ldelf) >>> E/LD: region 4: va 0xc0013000 pa 0x7e20f000 size 0x001000 flags r--s >>> E/LD: region 5: va 0xc0014000 pa 0x7e22c000 size 0x005000 flags rw-s (stack) >>> E/LD: region 6: va 0xc0019000 pa 0x818ea9ba8 size 0x002000 flags rw-- (param) >>> E/LD: region 7: va 0xc001b000 pa 0x818e97ba8 size 0x001000 flags rw-- (param) >>> E/LD: region 8: va 0xc004f000 pa 0x00001000 size 0x014000 flags r-xs [0] >>> E/LD: region 9: va 0xc0063000 pa 0x00015000 size 0x008000 flags rw-s [0] >>> E/LD: [0] 22250a54-0bf1-48fe-8002-7b20f1c9c9b1 @ 0xc004f000 >>> E/LD: Call stack: >>> E/LD: 0xc0051a14 >>> E/LD: 0xc004f31c >>> E/LD: 0xc0052d40 >>> E/LD: 0xc004f624 >>> >>> root@uz3cg-dwg-sec:/var/rootdirs/home/fio# cat /sys/kernel/debug/mmc0/err_stats >>> # Command Timeout Occurred: 0 >>> # Command CRC Errors Occurred: 0 >>> # Data Timeout Occurred: 0 >>> # Data CRC Errors Occurred: 1 >>> # Auto-Cmd Error Occurred: 0 >>> # ADMA Error Occurred: 0 >>> # Tuning Error Occurred: 0 >>> # CMDQ RED Errors: 0 >>> # CMDQ GCE Errors: 0 >>> # CMDQ ICCE Errors: 0 >>> # Request Timedout: 0 >>> # CMDQ Request Timedout: 0 >>> # ICE Config Errors: 0 >>> # Controller Timedout errors: 0 >>> # Unexpected IRQ errors: 0 >>> >>>>> 2) Forcing this on everyone >>>>> >>>>> The original idea was that because re-tuning cannot be done in RPMB, the >>>>> need to re-rune in RPMB could be avoided by always re-tuning before >>>>> switching to RPMB and then switching straight back. IIRC re-tuning should >>>>> guarantee at least 4MB more I/O without issue. >>>> Performance is hardly an issue in the context of RPMB access - >>>> For most cases it’s a single frame. >>> >>> Yes, the security use case typically stores hashes, variables >>> (bootcount, upgrade_available, versions, that sort of thing) and >>> certificates in RPMB. >>> >>> Since you mentioned, I am seeing that tuning before switching to RPMB >>> has an impact on performance. As a practical test, just reading a 6 byte >>> variable incurs in 50ms penalty in kernel space due to the need to >>> retune 5 times. Not great since the request is coming from a Trusted >>> Application via OP-TEE through the supplicant meaning this TEE thread >>> (they are statically allocated CFG_NUM_THREADS) will be reserved for >>> quite a bit of time. >>> >>> Roughly: >>> TA --> OP-TEE (core) --> TEE-supplicant --> Kernel (>50ms) --> OP-TEE --> TA >> >> To add more detail to the timing above, when using RPMB, OP-TEE stores >> the secure filesystem on RPMB as well, so accessing one of the variables >> stored in the filesystem consists on a number (~5) of individual RPMB >> requests (each one forcing a retune, each retune taking around 10ms). > > Adrian, please could you comment on the above. > > The current code is a performance drag for systems that implement their > secure filesystems on RPMB (i.e: OP-TEE) causing each read operation (of > variables consisting of a few bytes stored in such a filesystem) to > perform 5 consecutive retune requests. > > I am just thinking whether the original use case that forces a call to > retune prior to processing the RPMB request remains valid. I am not sure what you are asking. For some transfer modes, re-tuning is expected to deal with sampling point drift over time, mainly due to temperature changes. It is done either periodically (tuning timer) or after a CRC error. There is no reason to assume RPMB is immune from that. Certainly re-tuning before switching to RPMB is not optimal for performance, and we can leave that out, but a CRC error before or during RPMB operations will *still* result in re-tuning after switching back from RPMB. In your case, re-tuning makes things worse, which is a bit of a mystery. Running the new re-tuning test would tell us whether it makes things worse in general, or only for RPMB. > > Independently of the fact that not doing so fixes the problem I was > working on - and with the information I have - I dont think RPMB is > generally used to store larger files (maybe you have more information > about the average use case? are you aware of systems using RPMB to store > binaries or images?) > > I still I have to execute the test you shared some weeks ago. Bit of a > pain to NFS boot this system...will try to do it this week. Other options are to boot with an initrd only, or after boot switch to a RAM-based file system. I was waiting for this, since it is good to try to get closer to a root cause, but as you point out, the change is good for performance also, so I will Ack it. > > TIA > >> >> BTW, I also tried delaying the timing between those consecutive retunes >> (up to 1 second), but the issue still persisted. >> >>> >>> Adrian, I couldn't find the original performance justification for >>> enabling this feature globally. At which point do you think it becomes >>> beneficial to retune before accessing RPMB? >> >> How should we proceed with this patch then? can it be merged as I >> proposed? should I rewrite it differently? not sure what is next >> >> TIA >> Jorge