On Tue, Aug 03 2021, Vineeth Vijayan <vneethv@xxxxxxxxxxxxx> wrote: > On 7/24/21 3:24 PM, Christoph Hellwig wrote: >> On Tue, May 04, 2021 at 05:10:42PM +0200, Vineeth Vijayan wrote: > ...snip... >>> I just had a quick glance on the CIO layer drivers. And at first >>> look, you >>> are right. >>> It looks likewe need modifications in the event callbacks (referring css >>> here) >>> Let me go thoughthis thoroughly and update. >> Did this go anywhere? > Hello Christoph, > > Thank you for this reminder. Also, my apologies for the slow reply; This > was one of those item which really needed this reminder :-) > > Coming to the point, The event-callbacks are under sch->lock, which i > think is the right thing to do. But i also agree on your feedback about > the sch->driver accesses in the css_evaluate_known_subchannel() call. My > first impression was to add them under device_lock(). As Conny > mentioned, most of the drivers on the css-bus remained-stable during the > lifetime of the devices, and we never got this racy scenario. And then > having this change with device_lock(), as you mentioned,this code-base > would need significant change in the sch_event callbacks. I am not sure > if there is a straight forward solution for this locking-issue > scenario. Hm, I may have lost my way in the code, but I think ->sch_event is called _without_ the subchannel lock being held? It is only taken in e.g. io_subchannel_sch_event. ->chp_event is called with the subchannel lock held, though. > > Currently, i am trying to see the "minimal" change i can work on on the > event-callbacks and the css_evaluate_known_subchannel() call, to make > sure that, this racy condition can never occur. > > Conny, > > Please do let me know if you think i am missing something here. I would > like to concentrate more on the sch->driver() access scenario first and > would like to see how it can have minimal impact on the event-callbacks. > especially io_subchannel_sch_event. Given that the code changing sch->driver holds the device lock, but not the subchannel lock, you probably need to make sure that the device lock is held? It has been some time since I've done more complicated work in the common I/O layer, though, and I might be missing something.