On Wed, 2025-03-05 at 12:03 +0100, AngeloGioacchino Del Regno wrote:
> On 05/03/25 10:46, Jason-JH Lin (林睿祥) wrote:
> > On Tue, 2025-03-04 at 10:35 +0100, AngeloGioacchino Del Regno wrote:
> > > On 18/02/25 06:41, Jason-JH Lin wrote:
> > > > When GCE executes instructions, the corresponding hardware register
> > > > can be found through the subsys ID. For hardware that does not support
> > > > subsys ID, its subsys ID will be set to invalid value and its physical
> > > > address needs to be used to generate GCE instructions.
> > > >
> > > > This commit adds a pa_base parsing flow to the cmdq_client_reg structure
> > > > for these unsupported subsys ID hardware.
> > > >
> > >
> > > Does this work only for the MMINFRA located GCEs, or does this work
> > > also for the legacy ones in MT8173/83/88/92/95 // MT6795/6893/etc?
> > >
> > > In order to actually review and decide, I do need to know :-)
> > >
> >
> > Yes, it's for the SoCs without subsys ID, it's not related to the
> > MMINFRA.
> >
> > This can also work on MT8173/83/92/95 // MT6795/6893/etc.
> > You can remove the `mediatek,gce-client-reg` properties in their dtsi
> > and cherry-pick this series to verify it. :-)
> >
>
> This is curious - and that brings more questions to the table (for curiosity
> more than anything else at this point).
>
> Since this is a way to make use of the CMDQ for address ranges that are not tied
> to any subsys id (hence no gce-client-reg and just physical address parsing for
> generating instructions), do you know what are the performance implications of
> using this, instead of subsys IDs on SoCs that do support them?
>

The main advantage of using subsys ID is to reduce the number of
instructions.

Without subsys ID, you need one more `ASSIGN` instruction to assign the
high bytes of the physical address.

E.g. in mt8195-gce.h:
#define SUBSYS_1c00XXXX 3

Say you want GCE to write the value 0x0000000f to 0x1c00_002c.

With subsys ID, a single instruction is enough:
1. WRITE value: 0x0000000f to subsys: 0x3 + offset: 0x002c
   - OP code: WRITE = 0x90
   - subsys ID: 0x1c00XXXX = 0x03
   - offset: 0x002c
   - value: 0x0000000f

Without subsys ID, you need 2 instructions:
1. ASSIGN address high bytes: 0x1c00 to GCE temp register: SPR0
   - OP code: LOGIC = 0xa0
   - arg_type: register, value, value = (0x8)
   - sub OP: ASSIGN = 0x0
   - register index to store the assign value: SPR0 = 0x0
   - value to assign: 0x1c00
2. WRITE value: 0x0000000f to temp register: SPR0 + offset: 0x002c
   - OP code: WRITE = 0x90
   - sub OP (temp register index): SPR0 = 0x0
   - offset for temp register: 0x002c
   - value: 0x0000000f

> Being clear: if we were to migrate a SoC like MT8195 to using this globally
> instead of using subsys ids, would the performance be degraded?
> And if yes, do you know by how much?
>

E.g. if the instruction count with subsys ID is N:

1. If CMDQ is implemented like this, the instruction count will be (N * 2):
  assign SPR0 = 0x1c00
  write A to SPR0 + offset: 0x2c
  assign SPR0 = 0x1c00
  write B to SPR0 + offset: 0x3c
  assign SPR0 = 0x1c00
  write C to SPR0 + offset: 0x4c
  ...

2. If CMDQ is implemented like this, the instruction count will be
   (N + 1 * n), where n is the number of times the base address is assigned:
  assign SPR0 = 0x1c00
  write A to SPR0 + offset: 0x2c
  write B to SPR0 + offset: 0x3c
  write C to SPR0 + offset: 0x4c
  ...

  When the same cmd buffer changes the base address n times:
  assign SPR0 = 0x1c00
  write A to SPR0 + offset: 0x2c
  assign SPR0 = 0x1c01
  write B to SPR0 + offset: 0x2c
  assign SPR0 = 0x1c02
  write C to SPR0 + offset: 0x2c
  assign SPR0 = 0x1c00
  write D to SPR0 + offset: 0x3c
  ...
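To put numbers on the two cases above, here is a minimal C sketch of the
counting only. It is not CMDQ helper driver code: the function names
(pa_high, count_with_subsys, count_naive, count_cached) are made up for
this mail, and it assumes the ASSIGN carries the upper 16 bits of the
physical address, as in the 0x1c00 example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: upper 16 bits of a PA, i.e. what ASSIGN puts in SPR0. */
static uint16_t pa_high(uint32_t pa)
{
	return (uint16_t)(pa >> 16);
}

/* With subsys ID: one WRITE per register write, so N instructions. */
static size_t count_with_subsys(size_t n_writes)
{
	return n_writes;
}

/* Without subsys ID, case 1: ASSIGN + WRITE for every write, so N * 2. */
static size_t count_naive(size_t n_writes)
{
	return 2 * n_writes;
}

/*
 * Without subsys ID, case 2: emit ASSIGN only when the high bytes differ
 * from what SPR0 already holds, so N + n (n = number of ASSIGNs emitted).
 */
static size_t count_cached(const uint32_t *pa, size_t n_writes)
{
	size_t count = 0;
	uint16_t cur = 0;
	int spr0_valid = 0;	/* SPR0 holds nothing useful before the first ASSIGN */

	assert(pa != NULL || n_writes == 0);

	for (size_t i = 0; i < n_writes; i++) {
		uint16_t hi = pa_high(pa[i]);

		if (!spr0_valid || hi != cur) {
			count++;	/* ASSIGN SPR0 = hi */
			cur = hi;
			spr0_valid = 1;
		}
		count++;		/* WRITE value to SPR0 + offset */
	}
	return count;
}
```

For the three writes to 0x1c00_002c/3c/4c this gives 3 with subsys ID, 6 for
case 1 and 4 for case 2. For the four-write sequence that switches between
0x1c00, 0x1c01, 0x1c02 and back to 0x1c00, case 2 gives 8, which is no better
than case 1, so the benefit depends on grouping writes by base address.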
So you can imagine the instruction count will increase, but maybe not by
much if we use it in the right way...
The exception is the old SoCs that don't support SPR and CPR; the reason
is explained below.

> What you're proposing almost looks like being too good to be true - and makes
> me wonder, at this point, why the subsys id was used in the first place :-)
>

That's because the old GCE version in the old SoCs only supports GPR; it
doesn't support SPR or CPR.

GPR: All 32 GCE threads share the same GPR0~GPR15, so a GPR can be
     affected by other GCE threads if they use it at the same time.
SPR: Each GCE thread has 4 SPRs, and an SPR is not affected by any other
     GCE thread.
CPR: All 32 GCE threads share the same CPRs. There are over 1000 CPRs
     available, and they need to be managed properly to avoid resource
     conflicts.

Due to the GPR resource restriction in the old GCE version, using subsys
ID avoids GPR conflicts that would occur if multiple GCE threads were
using GPRs to assign the physical address high bytes all the time.

I have simplified some complicated instruction rules, so the description
above may not match the CMDQ helper driver code 100%. But I think the main
concept is correct.

Hope this explanation helps :-)

Regards,
Jason-JH Lin

> Cheers!
> Angelo
>
> > Regards,
> > Jason-JH Lin
> >
> > > Thanks,
> > > Angelo
> > >
> > > >
> > > >