Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

Rob Clark <robdclark@xxxxxxxxx> · Tue, 28 Nov 2017 15:05:14 -0500



On Tue, Nov 28, 2017 at 8:43 AM, Vivek Gautam
<vivek.gautam@xxxxxxxxxxxxxx> wrote:
>
>
> On 11/28/2017 05:13 AM, Rob Clark wrote:
>>
>> On Mon, Nov 27, 2017 at 5:22 PM, Stephen Boyd<sboyd@xxxxxxxxxxxxxx>
>> wrote:
>>>
>>> On 11/15, Vivek Gautam wrote:
>>>>
>>>> Hi,
>>>>
>>>>
>>>> On Mon, Aug 7, 2017 at 5:59 PM, Rob Clark<robdclark@xxxxxxxxx>  wrote:
>>>>>
>>>>> On Mon, Aug 7, 2017 at 4:27 AM, Vivek Gautam
>>>>> <vivek.gautam@xxxxxxxxxxxxxx>  wrote:
>>>>>>
>>>>>> On Thu, Jul 13, 2017 at 5:20 PM, Rob Clark<robdclark@xxxxxxxxx>
>>>>>> wrote:
>>>>>>>
>>>>>>> On Thu, Jul 13, 2017 at 1:35 AM, Sricharan
>>>>>>> R<sricharan@xxxxxxxxxxxxxx>  wrote:
>>>>>>>>
>>>>>>>> Hi Vivek,
>>>>>>>>
>>>>>>>> On 7/13/2017 10:43 AM, Vivek Gautam wrote:
>>>>>>>>>
>>>>>>>>> Hi Stephen,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>>>>>>>>>>
>>>>>>>>>> On 07/06, Vivek Gautam wrote:
>>>>>>>>>>>
>>>>>>>>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct
>>>>>>>>>>> iommu_domain *domain, unsigned long iova,
>>>>>>>>>>>    static size_t arm_smmu_unmap(struct iommu_domain *domain,
>>>>>>>>>>> unsigned long iova,
>>>>>>>>>>>                     size_t size)
>>>>>>>>>>>    {
>>>>>>>>>>> -    struct io_pgtable_ops *ops =
>>>>>>>>>>> to_smmu_domain(domain)->pgtbl_ops;
>>>>>>>>>>> +    struct arm_smmu_domain *smmu_domain =
>>>>>>>>>>> to_smmu_domain(domain);
>>>>>>>>>>> +    struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>>>>>>>>>>> +    size_t ret;
>>>>>>>>>>>          if (!ops)
>>>>>>>>>>>            return 0;
>>>>>>>>>>>    -    return ops->unmap(ops, iova, size);
>>>>>>>>>>> +    pm_runtime_get_sync(smmu_domain->smmu->dev);
>>>>>>>>>>
>>>>>>>>>> Can these map/unmap ops be called from an atomic context? I seem
>>>>>>>>>> to recall that being a problem before.
>>>>>>>>>
>>>>>>>>> That's something which was dropped in the following patch merged in
>>>>>>>>> master:
>>>>>>>>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>>>>>>>>>
>>>>>>>>> Looks like we don't  need locks here anymore?
>>>>>>>>
>>>>>>>>   Apart from the locking, wonder why a explicit pm_runtime is needed
>>>>>>>>   from unmap. Somehow looks like some path in the master using that
>>>>>>>>   should have enabled the pm ?
>>>>>>>>
>>>>>>> Yes, there are a bunch of scenarios where unmap can happen with
>>>>>>> disabled master (but not in atomic context).
>>>>>>
>>>>>> I would like to understand whether there is a situation where an unmap
>>>>>> is
>>>>>> called in atomic context without an enabled master?
>>>>>>
>>>>>> Let's say we have the case where all the unmap calls in atomic context
>>>>>> happen
>>>>>> only from the master's context (in which case the device link should
>>>>>> take care of
>>>>>> the pm state of smmu), and the only unmap that happen in non-atomic
>>>>>> context
>>>>>> is the one with master disabled. In such a case doesn it make sense to
>>>>>> distinguish
>>>>>> the atomic/non-atomic context and add pm_runtime_get_sync()/put_sync()
>>>>>> only
>>>>>> for the non-atomic context since that would be the one with master
>>>>>> disabled.
>>>>>>
>>>>> At least drm/msm needs to hold obj->lock (a mutex) in unmap, so it
>>>>> won't unmap anything in atomic ctx (but it can unmap w/ master
>>>>> disabled).  I can't really comment about other non-gpu drivers.  It
>>>>> seems like a reasonable constraint that either master is enabled or
>>>>> not in atomic ctx.
>>>>>
>>>>> Currently we actually wrap unmap w/ pm_runtime_get/put_sync(), but I'd
>>>>> like to drop that to avoid powering up the gpu.
>>>>
>>>> Since the deferring the TLB maintenance doesn't look like the best
>>>> approach [1],
>>>> how about if we try to power-up only the smmu from different client
>>>> devices such as,
>>>> GPU in the unmap path. Then we won't need to add pm_runtime_get/put()
>>>> calls in
>>>> arm_smmu_unmap().
>>>>
>>>> The client device can use something like - pm_runtime_get_supplier()
>>>> since
>>>> we already have the device link in place with this patch series. This
>>>> should
>>>> power-on the supplier (which is smmu) without turning on the consumer
>>>> (such as GPU).
>>>>
>>>> pm_runtime_get_supplier() however is not exported at this moment.
>>>> Will it be useful to export this API and use it in the drivers.
>>>>
>>> I'm not sure pm_runtime_get_supplier() is correct either. That
>>> feels like we're relying on the GPU driver knowing the internal
>>> details of how the device links are configured.
>>>
>> what does pm_runtime_get_supplier() do if IOMMU driver hasn't setup
>> device-link?
>
>
> It will be a no-op.
>
>> If it is a no-op, then I guess the GPU driver calling
>> pm_runtime_get_supplier() seems reasonable, and less annoying than
>> having special cases in pm_resume path.. I don't feel too bad about
>> having "just in case" get/put_supplier() calls in the unmap path.
>>
>> Also, presumably we still want to avoid powering up GPU even if we
>> short circuit the firmware loading and rest of "booting up the GPU"..
>> since presumably the GPU draws somewhat more power than the IOMMU..
>> having the pm_resume/suspend path know about the diff between waking
>> up / suspending the iommu and itself doesn't really feel less-bad than
>> just doing "just in case" get/put_supplier() calls.
>
>
> If it sounds okay, then i can send a patch that exports the
> pm_runtime_get/put_suppliers() APIs.
>

sounds good to me

BR,
-R


>
> Best regards
> Vivek
>
>> BR,
>> -R
>>
>>> Is there some way to have the GPU driver know in its runtime PM
>>> resume hook that it doesn't need to be powered on because it
>>> isn't actively drawing anything or processing commands? I'm
>>> thinking of the code calling pm_runtime_get() as proposed around
>>> the IOMMU unmap path in the GPU driver and then having the
>>> runtime PM resume hook in the GPU driver return some special
>>> value to indicate that it didn't really resume because it didn't
>>> need to and to treat the device as runtime suspended but not
>>> return an error. Then the runtime PM core can keep track of that
>>> and try to power the GPU on again when another pm_runtime_get()
>>> is called on the GPU device.
>>>
>>> This keeps the consumer API the same, always pm_runtime_get(),
>>> but leaves the device driver logic of what to do when the GPU
>>> doesn't need to power on to the runtime PM hook where the driver
>>> has all the information.
>>>
>>> --
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>>> a Linux Foundation Collaborative Project
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-arm-msm"
>> in
>> the body of a message tomajordomo@xxxxxxxxxxxxxxx
>> More majordomo info athttp://vger.kernel.org/majordomo-info.html
>
>
> --
> The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>
> a Linux Foundation Collaborative Project
>
--
To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html