On Wed, Jul 17, 2024 at 3:27 AM Bibek Kumar Patro <quic_bibekkum@xxxxxxxxxxx> wrote: > > > > On 7/16/2024 1:37 AM, Rob Clark wrote: > > On Mon, Jul 15, 2024 at 4:00 AM Bibek Kumar Patro > > <quic_bibekkum@xxxxxxxxxxx> wrote: > >> > >> > >> > >> On 7/10/2024 10:31 PM, Rob Clark wrote: > >>> On Tue, Jul 9, 2024 at 12:43 PM Bibek Kumar Patro > >>> <quic_bibekkum@xxxxxxxxxxx> wrote: > >>>> > >>>> > >>>> > >>>> On 7/4/2024 9:28 PM, Rob Clark wrote: > >>>>> On Thu, Jul 4, 2024 at 7:46 AM Rob Clark <robdclark@xxxxxxxxx> wrote: > >>>>>> > >>>>>> On Wed, Jul 3, 2024 at 4:38 AM Bibek Kumar Patro > >>>>>> <quic_bibekkum@xxxxxxxxxxx> wrote: > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> On 7/2/2024 2:01 AM, Rob Clark wrote: > >>>>>>>> On Mon, Jul 1, 2024 at 4:01 AM Bibek Kumar Patro > >>>>>>>> <quic_bibekkum@xxxxxxxxxxx> wrote: > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 6/28/2024 9:14 PM, Rob Clark wrote: > >>>>>>>>>> On Fri, Jun 28, 2024 at 8:10 AM Bibek Kumar Patro > >>>>>>>>>> <quic_bibekkum@xxxxxxxxxxx> wrote: > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On 6/28/2024 7:47 PM, Rob Clark wrote: > >>>>>>>>>>>> On Fri, Jun 28, 2024 at 7:05 AM Bibek Kumar Patro > >>>>>>>>>>>> <quic_bibekkum@xxxxxxxxxxx> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>> Add an adreno-smmu-priv interface for drm/msm to call > >>>>>>>>>>>>> into arm-smmu-qcom and initiate the PRR bit setup or reset > >>>>>>>>>>>>> sequence as per request. > >>>>>>>>>>>>> > >>>>>>>>>>>>> This will be used by GPU to setup the PRR bit and related > >>>>>>>>>>>>> configuration registers through adreno-smmu private > >>>>>>>>>>>>> interface instead of directly poking the smmu hardware. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Suggested-by: Rob Clark <robdclark@xxxxxxxxx> > >>>>>>>>>>>>> Signed-off-by: Bibek Kumar Patro <quic_bibekkum@xxxxxxxxxxx> > >>>>>>>>>>>>> --- > >>>>>>>>>>>>> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 23 ++++++++++++++++++++++ > >>>>>>>>>>>>> drivers/iommu/arm/arm-smmu/arm-smmu.h | 2 ++ > >>>>>>>>>>>>> include/linux/adreno-smmu-priv.h | 6 +++++- > >>>>>>>>>>>>> 3 files changed, 30 insertions(+), 1 deletion(-) > >>>>>>>>>>>>> > >>>>>>>>>>>>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > >>>>>>>>>>>>> index bd101a161d04..64571a1c47b8 100644 > >>>>>>>>>>>>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > >>>>>>>>>>>>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > >>>>>>>>>>>>> @@ -28,6 +28,7 @@ > >>>>>>>>>>>>> #define PREFETCH_SHALLOW (1 << PREFETCH_SHIFT) > >>>>>>>>>>>>> #define PREFETCH_MODERATE (2 << PREFETCH_SHIFT) > >>>>>>>>>>>>> #define PREFETCH_DEEP (3 << PREFETCH_SHIFT) > >>>>>>>>>>>>> +#define GFX_ACTLR_PRR (1 << 5) > >>>>>>>>>>>>> > >>>>>>>>>>>>> static const struct actlr_config sc7280_apps_actlr_cfg[] = { > >>>>>>>>>>>>> { 0x0800, 0x04e0, PREFETCH_DEFAULT | CMTLB }, > >>>>>>>>>>>>> @@ -235,6 +236,27 @@ static void qcom_adreno_smmu_resume_translation(const void *cookie, bool termina > >>>>>>>>>>>>> arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg); > >>>>>>>>>>>>> } > >>>>>>>>>>>>> > >>>>>>>>>>>>> +static void qcom_adreno_smmu_set_prr(const void *cookie, phys_addr_t page_addr, bool set) > >>>>>>>>>>>>> +{ > >>>>>>>>>>>>> + struct arm_smmu_domain *smmu_domain = (void *)cookie; > >>>>>>>>>>>>> + struct arm_smmu_cfg *cfg = &smmu_domain->cfg; > >>>>>>>>>>>>> + struct arm_smmu_device *smmu = smmu_domain->smmu; > >>>>>>>>>>>>> + u32 reg = 0; > >>>>>>>>>>>>> + > >>>>>>>>>>>>> + writel_relaxed(lower_32_bits(page_addr), > >>>>>>>>>>>>> + smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR); > >>>>>>>>>>>>> + > >>>>>>>>>>>>> + writel_relaxed(upper_32_bits(page_addr), > >>>>>>>>>>>>> + smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR); > >>>>>>>>>>>>> + > >>>>>>>>>>>>> + reg = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR); > >>>>>>>>>>>>> + reg &= ~GFX_ACTLR_PRR; > >>>>>>>>>>>>> + if (set) > >>>>>>>>>>>>> + reg |= FIELD_PREP(GFX_ACTLR_PRR, 1); > >>>>>>>>>>>>> + arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg); > >>>>>>>>>>>>> + > >>>>>>>>>>>> > >>>>>>>>>>>> nit, extra line > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Ack, will remove this. Thanks for pointing out. > >>>>>>>>>>> > >>>>>>>>>>>> Also, if you passed a `struct page *` instead, then you could drop the > >>>>>>>>>>>> bool param, ie. passing NULL for the page would disable PRR. But I > >>>>>>>>>>>> can go either way if others have a strong preference for phys_addr_t. > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Oh okay, this looks simple to reset the prr bit. > >>>>>>>>>>> But since this page is allocated and is used inside gfx driver > >>>>>>>>>>> before being utilized for prr bit operation, would it be safe for > >>>>>>>>>>> drm/gfx driver to keep a reference to this page in smmu driver? > >>>>>>>>>>> > >>>>>>>>>>> Since we only need the page address for configuring the > >>>>>>>>>>> CFG_UADDR/CFG_LADDR registers so passed the phys_addr_t. > >>>>>>>>>> > >>>>>>>>>> I don't think the smmu driver needs to keep a reference to the page.. > >>>>>>>>>> we can just say it is the responsibility of the drm driver to call > >>>>>>>>>> set_prr(NULL) before freeing the page > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> That makes sense. If we go by this NULL page method to disable the PRR, > >>>>>>>>> we would have to set the address registers to reset value as well. > >>>>>>>>> > >>>>>>>>> The sequence would be like the following as per my understaning: > >>>>>>>>> - Check if it's NULL page > >>>>>>>>> - Set the PRR_CFG_UADDR/PRR_CFG_LADDR to reset values i.e - 0x0 for > >>>>>>>>> these registers > >>>>>>>>> - Reset the PRR bit in actlr register > >>>>>>>>> > >>>>>>>>> Similar to this snippet: > >>>>>>>>> > >>>>>>>>> #PRR_RESET_ADDR 0x0 > >>>>>>>>> > >>>>>>>>> -------------- > >>>>>>>>> reg = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR); > >>>>>>>>> reg &= ~GFX_ACTLR_PRR; > >>>>>>>>> arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg); > >>>>>>>>> > >>>>>>>>> if (!prr_page) { > >>>>>>>>> writel_relaxed(PRR_RESET_ADDR, > >>>>>>>>> smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR); > >>>>>>>>> writel_relaxed(PRR_RESET_ADDR), > >>>>>>>>> smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR); > >>>>>>>>> return; > >>>>>>>>> } > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> writel_relaxed(lower_32_bits(page_to_phys(prr_page)), > >>>>>>>>> smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR); > >>>>>>>>> > >>>>>>>>> writel_relaxed(upper_32_bits(page_to_phys(prr_page)), > >>>>>>>>> smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR); > >>>>>>>>> > >>>>>>>>> reg |= FIELD_PREP(GFX_ACTLR_PRR, 1); > >>>>>>>>> arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg); > >>>>>>>>> ----------------- > >>>>>>>>> > >>>>>>>>> If looks good, will implement the same in next version. > >>>>>>>> > >>>>>>>> yeah, that looks like it could work.. > >>>>>>>> > >>>>>>>> you probably don't need to zero out the PRR_CFG_*ADDR when disabling, > >>>>>>>> and probably could avoid double writing ACTLR, but that is getting > >>>>>>>> into bikeshedding > >>>>>>>> > >>>>>>> > >>>>>>> Actually Rob, since you rightly pointed this out. > >>>>>>> I crosschecked again on these registers. > >>>>>>> PRR_CFG_*ADDR is a global register in SMMU space but > >>>>>>> ACTLR register including PRR bit is a per-domain register. > >>>>>>> There might also be a situation where PRR feature need to be > >>>>>>> disabled or enabled separately for each domain. > >>>>>>> So I think it would be cleaner to have two apis, set_prr_addr(), > >>>>>>> set_prr_bit(). > >>>>>>> set_prr_addr() will be used only to set this PRR_CFG_*ADDR > >>>>>>> register by passing a 'struct page *' > >>>>>>> set_prr_bit() will be used as a switch for PRR feature, > >>>>>>> where required smmu_domain will be passed along with > >>>>>>> the bool value to set/reset the PRR bit depending on which this > >>>>>>> feature will be enabled/disabled for the selected domain. > >>>>>> > >>>>>> on a related note, adreno has been using arm-smmu for a number of > >>>>>> generations, I guess not all support PRR? The drm driver will need to > >>>>>> know whether PRR is supported (and expose that to userspace to let the > >>>>>> UMD know whether to expose certain extensions). How should this work? > >>>>> > >>>>> So, I noticed in the x1e80100.dtsi that there is a gpu_prr_mem > >>>>> reserved section.. maybe we should be connecting this to the smmu > >>>>> driver in dt, and using that to detect presence of PRR? Ie. the smmu > >>>>> driver would configure PRR_CFG_*ADDR based on the reserved mem, and > >>>>> the interface to drm would just be to enable/disable PRR, returning an > >>>>> error code if the reserved mem section isn't there. > >>>>> > >>>>> This simplifies the interface, and handles the question of how to > >>>>> detect if PRR is supported. > >>>>> > >>>>> BR, > >>>>> -R > >>>>> > >>>> > >>>> As I checked gpu_prr_mem reserved mem section is not used for mobile > >>>> targets hence not present for other DT only compute targets like > >>>> x1e80100.dtsi has the same. PRR looks to be smmu version specific > >>>> property. > >>> > >>> I only see it in gpu_prr_mem in x1e80100.dtsi, but not documented > >>> anywhere. I'm only assuming based on the name that it is intended to > >>> be for PRR (but not sure why it is larger than 0x1000). Are the > >>> PRR_CFG_*ADDR regs programmed by the fw (and access blocked in EL1) on > >>> this device? > >>> > >> > >> As I checked, if the drm/gfx driver allocates the page for drm, then > >> this reserved-memory region is not required. > >> > >> PRR_CFG_*ADDR regs have read and write access in EL1 only for this > >> device, behavior is same as other devices as well. These are not > >> programmed by fw. > > > > If there is any device which _doesn't_ have EL1 access to these regs, > > I think going the reserved memory route seems more future proof > Otherwise we later on have to deal with two different ways to do > > things. But I'm not sure if there is any such device or risk. > > > > PRR is a bit in ACTLR register which is in SMMU space, > so is the PRR_CFG_*ADDR registers - with EL1 having access > to both the registers in all targets released till now with MMU-500. > It's unlikely that this design would change in future > for MMU-500 based targets, so I feel this risk is somewhat negligible. I wasn't worried about the ACTLR register, but the PRR_CFG_*ADDR regs ;-) IIRC those were in the SMMU global space, why hyp tends to like to own. > Also would the reserved memory route look a bit hackish? > Because, since this reserved-memory node is not used when page is > allocated through drm - so it might turn out to be redundant. > If we are aiming for a device-tree handle/node for reference then I > think a better way would be to create a bool parameter inside smmu-node > indicating presence of PRR ? tbh, I don't think there is anything better or worse about having the reserved-memory node vs dynamically allocating it. (If we dynamically allocate, we should remove the reserved memory node from x1e80100.dtsi) The thing I was more concerned about was whether there was any chance that some existing or future SoC+fw combo _relied_ on a reserved memory node and the fw programming PRR_CFG_*ADDR. If there was any chance of that, and we went the dynamic allocation route, then we'd have some devices with a reserved memory node, and some without. That seems a bit ugly to me. If there is no chance of this, then we can go either route. > Personally,I feel since the PRR enablement mechanism is same for all > MMU-500 targets - compat string would be a robust approach. I guess if it is all mmu-500, then we can just pick based on compat string. If it turns out some subset of smmu-v2 have PRR, we can just have a list of compat strings in arm-smmu-qcom.c.. there would only be a finite # of them ;-) BR, -R