Hi, On Wed, 3 Jun 2020 at 14:34, Robin Murphy <robin.murphy@xxxxxxx> wrote: > > On 2020-06-03 14:22, Mike Leach wrote: > > Hi Sai, > > > > On Wed, 3 Jun 2020 at 13:14, Sai Prakash Ranjan > > <saiprakash.ranjan@xxxxxxxxxxxxxx> wrote: > >> > >> Hi Mike, > >> > >> On 2020-06-03 16:57, Mike Leach wrote: > >>> Hi, > >>> > >>> On Wed, 3 Jun 2020 at 11:24, Sai Prakash Ranjan > >>> <saiprakash.ranjan@xxxxxxxxxxxxxx> wrote: > >>>> > >>>> Hi Mike, > >>>> > >>>> Thanks again for looking at this. > >>>> > >>>> On 2020-06-03 03:42, Mike Leach wrote: > >>>> [...] > >>>> > >>>>>> > >>>>>> SMMU/IOMMU won't be able to do much here as it is the client's > >>>>>> responsiblity to > >>>>>> properly shutdown and SMMU device link just makes sure that > >>>>>> SMMU(supplier) shutdown is > >>>>>> called only after its consumers shutdown callbacks are called. > >>>>> > >>>>> I think this use case can be handled slightly differently than the > >>>>> general requirements for modular CoreSight drivers. > >>>>> > >>>>> What is needed here is a way of stopping the underlying ETR hardware > >>>>> from issuing data to the SMMU, until the entire device has been shut > >>>>> down, in a way that does not remove the driver, breaking existing > >>>>> references and causing a system crash. > >>>>> > >>>>> We could introduce a new mode to the ETR driver - e.g. > >>>>> CS_MODE_SHUTDOWN. > >>>>> > >>>>> At the end of the block tmc_shutdown(struct amba_device *adev), set > >>>>> drvdata->mode to CS_MODE_SHUTDOWN & remove the coresight_unregister(). > >>>>> This new mode can be used to prevent the underlying hardware from > >>>>> being able to restart until the device is re-powered. > >>>>> > >>>>> This mode can be detected in the code that enables / disables the ETR > >>>>> and handled appropriately (updates to tmc_enable_etr_sink and > >>>>> tmc_disable_etr_sink). > >>>>> This mode will persist until the device is re-started - but because we > >>>>> are on the device shutdown path this is not an issue. > >>>>> > >>>>> This should leave the CoreSight infrastructure stable until the > >>>>> drivers are shut down normally as part of the device power down > >>>>> process. > >>>>> > >>>> > >>>> Sounds good to me, but if the coresight_unregister() is the trouble > >>>> point > >>>> causing these crashes, then can't we just remove that from > >>>> tmc_shutdown() > >>>> callback? This would be like maintaining the same behaviour as now > >>>> where > >>>> on reboot/shutdown we basically don't do anything except for disabling > >>>> ETR. > >>> > >>> No - the new mode prevents race conditions where the thread shutting > >>> down the SMMU does the ETR shutdown, but then another thread happens > >>> to be trying to start trace and restarts the ETR. > >>> It also prevents the condition Mathieu discussed where a thread might > >>> be attempting to shutdown trace - this could try to disable the > >>> hardware again re-releasing resources/ re-flushing and waiting for > >>> stop. > >>> > >> > >> I do not think there will a race between SMMU shutdown and ETR shutdown. > >> Driver core takes care of calling SMMU shutdown after its consumer > >> shutdown callbacks via device link, otherwise there would already be > >> bugs in all other client drivers. > >> > > > > I am not saying there could be a race between tmc_shutdowm and > > Smmu_shutdown - there may be a case if the coresight_disable_path > > sequence is running and gets to the point of disabling the ETR after > > the SMMU callback has disabled it. > > I'm confused now - there is no "SMMU callback", we're talking about the > system-wide cleanup from kernel_shutdown_prepare() or > kernel_restart_prepare(). As far as I'm aware userspace should be long > gone by that point, so although trace may have been left running || ((offset >= TRCCIDCVRn(0)) && (offset <= TRCVMIDCVRn(7)), the > chance of racing against other driver operations seems pretty unlikely. > Sorry - bad choice of terminology. I was referring to the SMMU ensuring that it had all its clients shut-down before if shut down. To quote Sai... >>>>> SMMU device link just makes sure that > >>>>>> SMMU(supplier) shutdown is > >>>>>> called only after its consumers shutdown callbacks are called. I agree it is unlikely, but if removing the device from the CoreSight infrastructure via coresight_unregister() is a potential source of a crash, it would seem that there is a potential path where some CoreSight driver side work might be possible. therefore a mode to prevent this crash, and ensure that the device hardware remains off and not sending trace to SMMU until such time as shutdown / reboot restart occurs, seemed prudent. Mike > Robin. -- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK