Hi Mike,
On 2020-06-03 19:21, Mike Leach wrote:
Hi,
On Wed, 3 Jun 2020 at 14:34, Robin Murphy <robin.murphy@xxxxxxx> wrote:
On 2020-06-03 14:22, Mike Leach wrote:
> Hi Sai,
>
> On Wed, 3 Jun 2020 at 13:14, Sai Prakash Ranjan
> <saiprakash.ranjan@xxxxxxxxxxxxxx> wrote:
>>
>> Hi Mike,
>>
>> On 2020-06-03 16:57, Mike Leach wrote:
>>> Hi,
>>>
>>> On Wed, 3 Jun 2020 at 11:24, Sai Prakash Ranjan
>>> <saiprakash.ranjan@xxxxxxxxxxxxxx> wrote:
>>>>
>>>> Hi Mike,
>>>>
>>>> Thanks again for looking at this.
>>>>
>>>> On 2020-06-03 03:42, Mike Leach wrote:
>>>> [...]
>>>>
>>>>>>
>>>>>> SMMU/IOMMU won't be able to do much here as it is the client's
>>>>>> responsiblity to
>>>>>> properly shutdown and SMMU device link just makes sure that
>>>>>> SMMU(supplier) shutdown is
>>>>>> called only after its consumers shutdown callbacks are called.
>>>>>
>>>>> I think this use case can be handled slightly differently than the
>>>>> general requirements for modular CoreSight drivers.
>>>>>
>>>>> What is needed here is a way of stopping the underlying ETR hardware
>>>>> from issuing data to the SMMU, until the entire device has been shut
>>>>> down, in a way that does not remove the driver, breaking existing
>>>>> references and causing a system crash.
>>>>>
>>>>> We could introduce a new mode to the ETR driver - e.g.
>>>>> CS_MODE_SHUTDOWN.
>>>>>
>>>>> At the end of the block tmc_shutdown(struct amba_device *adev), set
>>>>> drvdata->mode to CS_MODE_SHUTDOWN & remove the coresight_unregister().
>>>>> This new mode can be used to prevent the underlying hardware from
>>>>> being able to restart until the device is re-powered.
>>>>>
>>>>> This mode can be detected in the code that enables / disables the ETR
>>>>> and handled appropriately (updates to tmc_enable_etr_sink and
>>>>> tmc_disable_etr_sink).
>>>>> This mode will persist until the device is re-started - but because we
>>>>> are on the device shutdown path this is not an issue.
>>>>>
>>>>> This should leave the CoreSight infrastructure stable until the
>>>>> drivers are shut down normally as part of the device power down
>>>>> process.
>>>>>
>>>>
>>>> Sounds good to me, but if the coresight_unregister() is the trouble
>>>> point
>>>> causing these crashes, then can't we just remove that from
>>>> tmc_shutdown()
>>>> callback? This would be like maintaining the same behaviour as now
>>>> where
>>>> on reboot/shutdown we basically don't do anything except for disabling
>>>> ETR.
>>>
>>> No - the new mode prevents race conditions where the thread shutting
>>> down the SMMU does the ETR shutdown, but then another thread happens
>>> to be trying to start trace and restarts the ETR.
>>> It also prevents the condition Mathieu discussed where a thread might
>>> be attempting to shutdown trace - this could try to disable the
>>> hardware again re-releasing resources/ re-flushing and waiting for
>>> stop.
>>>
>>
>> I do not think there will a race between SMMU shutdown and ETR shutdown.
>> Driver core takes care of calling SMMU shutdown after its consumer
>> shutdown callbacks via device link, otherwise there would already be
>> bugs in all other client drivers.
>>
>
> I am not saying there could be a race between tmc_shutdowm and
> Smmu_shutdown - there may be a case if the coresight_disable_path
> sequence is running and gets to the point of disabling the ETR after
> the SMMU callback has disabled it.
I'm confused now - there is no "SMMU callback", we're talking about
the
system-wide cleanup from kernel_shutdown_prepare() or
kernel_restart_prepare(). As far as I'm aware userspace should be long
gone by that point, so although trace may have been left running ||
((offset >= TRCCIDCVRn(0)) && (offset <= TRCVMIDCVRn(7)),
the
chance of racing against other driver operations seems pretty
unlikely.
Sorry - bad choice of terminology. I was referring to the SMMU
ensuring that it had all its clients shut-down before if shut down. To
quote Sai...
SMMU device link just makes sure that
>>>>>> SMMU(supplier) shutdown is
>>>>>> called only after its consumers shutdown callbacks are called.
I agree it is unlikely, but if removing the device from the CoreSight
infrastructure via coresight_unregister() is a potential source of a
crash, it would seem that there is a potential path where some
CoreSight driver side work might be possible. therefore a mode to
prevent this crash, and ensure that the device hardware remains off
and not sending trace to SMMU until such time as shutdown / reboot
restart occurs, seemed prudent.
Actually I did not see any crash with coresight_unregister() during
reboot/shutdown as I mentioned previously to Mathieu's query on
this being similar to remove callback. I think the crash with
coresight_unregister() is only seen when we have coresight as module
and the userspace is pretty much there to enable/disable trace when
we try to bind/unbind. But here we only consider the system
reboot/shutdown
where pretty much everything is down by this point.
Thanks,
Sai
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member
of Code Aurora Forum, hosted by The Linux Foundation