Re: [PATCH v4 3/8] mailbox: mtk-cmdq: Add driver data to support for MT8196

Jason-JH Lin (林睿祥) <Jason-JH.Lin@xxxxxxxxxxxx> · Thu, 6 Mar 2025 11:00:54 +0000

[snip]
> 
> > CPR_GSIZE is the setting for allocating the CPR SRAM size to each
> > VM.
> 
> Would be awesome if you could then clarify the comment that you have
> later in
> the code here, from...
> 
> /* config cpr size for host vm */
> 
> to
> 
> /* Set the amount of CPR SRAM to allocate to each VM */
> 
> ...that could be a way of more properly describing what the writel
> there is doing.
> 

OK, I'll change it.

> > 
> > > The GCE stuff isn't even properly described in datasheets - I do
> > > (probably!)
> > > understand what those are for, but asking people to get years of
> > > experience on
> > > MediaTek to understand what's going on would be a bit rude,
> > > wouldn't
> > > it? :-D
> > > 
> > 
> > I agree with you :-)
> > I'll put them in the VM patch and add some brief description for
> > them.
> > 
> 
> Thanks, much appreciated!
> 
> > > > +
> > > >    #define CMDQ_THR_ACTIVE_SLOT_CYCLES 0x3200
> > > >    #define CMDQ_THR_ENABLED            0x1
> > > >    #define CMDQ_THR_DISABLED           0x0
> > > > @@ -87,11 +98,24 @@ struct cmdq {
> > > >    struct gce_plat {
> > > >        u32 thread_nr;
> > > >        u8 shift;
> > > > +     dma_addr_t mminfra_offset;
> > > 
> > > It looks like this is exactly the DRAM's iostart... at least, I
> > > can
> > > see that in the
> > > downstream devicetree that's where it starts.
> > > 
> > >          memory: memory@80000000 {
> > >                  device_type = "memory";
> > >                  reg = <0 0x80000000 0 0x40000000>;
> > >          };
> > > 
> > > It doesn't really look like being a coincidence, but, for the
> > > sake of
> > > asking:
> > > is this just a coincidence? :-)
> > > 
> > 
> > As the confirmation with the hardware designer in previous reply
> > mail
> > for CK:
> > https://patchwork.kernel.org/project/linux-mediatek/patch/20250218054405.2017918-4-jason-jh.lin@xxxxxxxxxxxx/*26258463
> > 
> 
> That explanation was simply wonderful.
> 
> > Since the MMINFRA remap subtracting 2G is done in the hardware
> > circuit
> > and cannot be configured by software, the address +2G adjustment is
> > necessary to implement in the CMDQ driver.
> > 
> > So that might not be a coincidence.
> > But even if DRAM start address changes, this mminfra_offset is
> > still
> > subtracting 2G, so I think it is a better choice to define it as
> > the
> > driver data for MT8196.
> > 
> 
> ....so, this makes me think the following:
> 
> 1. The DRAM start address cannot *ever* be less than 2G, because
> otherwise the
>     MMINFRA HW would have a hole in the usable address range;
>     1a. If the start address changes to less than 2G, then also the
> IOMMU would
>         get limitations, not only the mminfra..!
>     2b. This makes it very very very unlikely for the start address
> to be changed
>         to less than 0x80000000
> 
> 2. If the DRAM start address changes to be ABOVE 2G (so more than
> 0x80000000),
>     there would be no point for MMINFRA to start a "config path"
> write (or read)
>     in the SMMU DRAM block, would it? ;-)
> 

GCE is using IOMMU in MT8196, so all the address put into the GCE
instruction or GCE register for GCE access should be IOVA.

The DRAM start address is 2G(PA=0x80000000, IOVA=0x0) currently, so
when GCE want to access the IOVA=0x0, it will need to +2G into the
instruction, then the MMINFRA will see it as data path(IOVA > 2G) and
subtract 2G for that IOVA, so GCE can finally access the IOVA=0x0.

I'm not sure if I've misunderstood what you mean by ABOVE 2G. :-)
If DRAM start address is changed to 3G(PA=0xc0000000) the IOVA is still
0x0, so GCE still need to + 2G to make MMINFRA go to the data path.

But if you mean PA=0x80000000 and IOVA start address is 3G(0xc0000000),
then MMINFRA will go to the data path without GCE +2G.
However, MMINFRA will -2G when going to the data path and that will
cause GCE access the wrong IOVA.
So GCE still need to +2G no matter IOVA start address is already can
make MMINFRA go to the data path(IOVA > 2G).

We have already complained to our hardware designer that MMINFRA -2G
con not be changed, which will make software operation very
troublesome.
So in the next few generations of SoC will change this MMINFRA -2G to
software configurable. Then we can just make IOVA start address to 2G
without adding the mminfra_offset to the IOVA for GCE.

> I get it - if the DRAM moves up, MMINFRA is still at 2G because
> that's hard baked
> into the hardware, but I foresee that it'll be unlikely to see a
> platform changing
> the DRAM start address arbitrarily, getting out-of-sync with MMINFRA.
> 
> I propose to just get the address from the memory node for now, and
> to add a nice
> comment in the code that explains that "In at least MT8196, the
> MMINFRA hardware
> subtracts xyz etc etc" (and that explanation from the previous email
> is again
> wonderful and shall not be lost: either use that in the comment, or
> add it to
> the commit description, because it's really that good).
> 
> Should a new SoC appear in the future requiring an offset from the
> DRAM start
> address, we will think about how to make that work in the best
> possible way: in
> that case we could either reference something else to get the right
> address or
> we can just change this driver to just use the 2G offset statically
> for all.
> 
> What I'm trying to do here is to reduce the amount of changes that
> we'd need for
> adding new SoCs: since that 2G MMINFRA offset -> 2G DRAM start is not
> a coincidence
> I think that, should the DRAM start vary on new SoCs, the MMINFRA
> offset will
> follow the trend and vary with it.
> 
> So what I think is:
> 1. If I'm right, adding a new SoC (with different MMINFRA + DRAM
> offset) will be
>     as easy as adding a compatible string in the bindings, no effort
> in changing
>     this driver with new pdata offsets;
> 2. If I'm wrong, adding a new SoC means adding compat string and
> adding pdata and
>     one variable in the cmdq struct.
> 
> Where N.2 is what we would do anyway if we don't go with my proposed
> solution...
> 
> All this is just to give you my considerations about this topic -
> you're left
> completely free to disagree with me.
> If you disagree, I will trust your judgement, no problem here.
> 

Yes, I think your are right. No matter the IOVA start address changing,
MMINFRA will still -2G(the start address of DRAM PA).
Do you mean we can get the mminfra_offset from the start address of
memory in DTS, rather than defining it in pdata?

> > > >        bool control_by_sw;
> > > >        bool sw_ddr_en;
> > > > +     bool gce_vm;
> > > > +     u32 dma_mask_bit;
> > > >        u32 gce_num;
> > > >    };
> > > > 
> > > > +static inline u32 cmdq_reg_shift_addr(dma_addr_t addr, const
> > > > struct gce_plat *pdata)
> > > > +{
> > > > +     return ((addr + pdata->mminfra_offset) >> pdata->shift);
> > > > +}
> > > > +
> > > > +static inline u32 cmdq_reg_revert_addr(dma_addr_t addr, const
> > > > struct gce_plat *pdata)
> > > > +{
> > > > +     return ((addr << pdata->shift) - pdata->mminfra_offset);
> > > > +}
> > > 
> > > I'm not sure that you really need those two functions... probably
> > > it's simply
> > > cleaner and easier to just write that single line every time...
> > > and
> > > I'm
> > > saying that especially for how you're using those functions, with
> > > some readl()
> > > passed directly as param, decreasing human readability by "a
> > > whole
> > > lot" :-)
> > > 
> > 
> > The reason why I use API wrapper instead of writing it directly in
> > readl() is to avoid missing the shift or mminfra_offset conversion
> > in
> > some places.
> > This problem is not easy to debug, and I have encountered it at
> > least
> > twice...
> > 
> > I think the advantage of using function is that it can be uniformly
> > modified to all places that need to handle DRAM address conversion.
> > What do you think? :-)
> > 
> 
> Eh, if you put it like that... it makes sense, so.. yeah, okay :-)
> 
> Still, please cleanup those instances of
> 
> `cmdq_reg_revert_addr(readl(something), pdata)`
> 
> those might be hard to read, so please just do something like:
> 
> regval = readl(something);
> curr_pa = cmdq_revert_addr(regval, pdata);
> 
> ...reword to your own liking, of course.
> 

OK, I'll refine that. Thanks.

> > > > +
> > > >    static void cmdq_sw_ddr_enable(struct cmdq *cmdq, bool
> > > > enable)
> > > >    {
> > > >        WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq-
> > > > >clocks));
> > > > @@ -112,6 +136,30 @@ u8 cmdq_get_shift_pa(struct mbox_chan
> > > > *chan)
> > > >    }
> > > >    EXPORT_SYMBOL(cmdq_get_shift_pa);
> > > > 
> > > > +dma_addr_t cmdq_get_offset_pa(struct mbox_chan *chan)
> > > > +{
> > > > +     struct cmdq *cmdq = container_of(chan->mbox, struct cmdq,
> > > > mbox);
> > > > +
> > > > +     return cmdq->pdata->mminfra_offset;
> > > > +}
> > > > +EXPORT_SYMBOL(cmdq_get_offset_pa);
> > > 
> > > I think I remember this get_offset_pa from the old times, then CK
> > > removed it (and I
> > > was really happy about that disappearing), or am I confusing this
> > > with something
> > > else?
> > > 
> > > (of course, this wasn't used for mminfra, but for something
> > > else!)
> > > 
> > 
> > I can't find any remove history in mtk-cmdq-mailbox.c.
> > 
> > Maybe you mean the patch in this series?
> > https://lore.kernel.org/all/171213938049.123698.15573779837703602591.b4-ty@xxxxxxxxxxxxx/
> > 
> 
> Uhm, I think I may have confused something here, but yes I was
> remembering the
> patch series that you pointed out, definitely.
> 
> At the end, that series is doing something else, so nevermind, was
> just confusion.
> 

OK, no problem.

> > > > +
> > > > +bool cmdq_addr_need_offset(struct mbox_chan *chan, dma_addr_t
> > > > addr)
> > > > +{
> > > > +     struct cmdq *cmdq = container_of(chan->mbox, struct cmdq,
> > > > mbox);
> > > > +
> > > > +     if (cmdq->pdata->mminfra_offset == 0)
> > > > +             return false;
> > > > +
> > > > +     /*
> > > > +      * mminfra will recognize the addr that greater than the
> > > > mminfra_offset
> > > > +      * as a transaction to DRAM.
> > > > +      * So the caller needs to append mminfra_offset for the
> > > > true
> > > > case.
> > > > +      */
> > > > +     return (addr >= cmdq->pdata->mminfra_offset);
> > > 
> > > 
> > > /**
> > >    * cmdq_is_mminfra_gce() - Brief description
> > >    * @args.....
> > >    *
> > >    * The MMINFRA GCE will recognize an address greater than DRAM
> > > iostart as a
> > >    * DRAM transaction instead of ....xyz
> > >    *
> > >    * In order for callers to perform (xyz) transactions through
> > > the
> > > CMDQ, those
> > >    * need to know if they are using a GCE located in MMINFRA.
> > >    */
> > > bool cmdq_is_mminfra_gce(...)
> > > {
> > >          return cmdq->pdata->mminfra_offset &&
> > >                 (addr >= cmdq->pdata->mminfra_offset)
> > > 
> > > > +}
> > > > +EXPORT_SYMBOL(cmdq_addr_need_offset);
> > > > +
> > > 
> > 
> > OK, I'll modify the API like this.
> > 
> > > ...but then, is there really no way of just handling the GCE
> > > being in
> > > MMINFRA
> > > transparently from the callers? Do the callers really *need* to
> > > know
> > > that they're
> > > using a new GCE?!
> > > 
> > 
> > Since the address subtracting is done in MMINFRA hardware, I think
> > GCE
> > users really need to handle it in driver.
> > 
> 
> Since the users of this infrastructure are multimedia related
> (disp/MDP3),
> I'd also like to get an opinion from MediaTek engineers familiar with
> that.
> 
> CK, Moudy, any opinion on that, please?
> 
> > > Another way of saying: can't we just handle the address
> > > translation
> > > in here instead
> > > of instructing each and every driver about how to communicate
> > > with
> > > the new GCE?!
> > > 
> > 
> > The DRAM address may not only be the command buffer to GCE, but
> > also
> > the working buffer provided by CMDQ users and being a part of GCE
> > instruction, so we need to handle the address translation in CMDQ
> > helper driver for the instruction generation.
> > E.g. ISP drivers may use GCE to write a hardware settings to a DRAM
> > as
> > backup buffer. The GCE write instruction will be:
> > WRITE the value of ISP register to DRAM address + mminfra_offset.
> > 
> > But most of the CMDQ users only need to use GCE to write hardware
> > register, so I only keep the translation in cmdq_pkt_mem_move(),
> > cmdq_pkt_poll_addr() and cmdq_pkt_jump_abs() at the latest series.
> 
> Yeah you're choosing the best of both worlds in that case, I do
> agree, but
> still - if there's a way to avoid drivers to have different handling
> for
> mminfra vs no-mminfra, that'd still be preferred.
> 
> Having the handling for something *centralized* somewhere, instead of
> it
> being sparse here and there, would make maintenance way easier...
> 
> ...and that's why I'm asking for CK and Moudy's opinion, nothing else
> :-)
> 

Yes, I totally agree with you. Thanks for the asking!

Regards,
Jason-JH.Lin 

> Cheers!
> Angelo
>