On Tue, 2021-09-21 at 18:52 +0200, Halil Pasic wrote: > > > > lock already, > > > > > > I believe we have a misunderstanding here. I believe that Vineeth > > > is > > > trying to tell us, that online_store_handle_offline() and > > > online_store_handle_offline() are called under the a device lock > > > of > > > the ccw device. Right, Vineeth? > > Yes. I wanted to bring-out both the scenario.The > > set_offline/_online() > > calls and the unconditional-remove call. > > I don't understand the paragraph above. I can't map the terms > set_offline/_online() and unconditional-remove call to chunks of > code. > :( online_store() function can be used to set_online/set_offline manually from the sysfs entry. And an unconditional-remove call, for CIO, starts with a CRW which indicates there is a subchannel_event which needs to be taken care. This sch_event() (in device.c) then try to find the reason for this CRW and act accordingly. This would lead to device_del and end up calling the remove function of the driver. > > For the set_online The virtio_ccw_online() also invoked with > > ccwlock > > held. (ref: ccw_device_set_online) > > I don't think virtio_ccw_online() is invoked with the ccwlock held. I > think we call virtio_ccw_online() in this line: > https://elixir.bootlin.com/linux/v5.15-rc2/source/drivers/s390/cio/device.c#L394 > and we have released the cdev->ccwlock literally 2 lines above. My bad. I overlooked it! > > > > > Conny, I believe, by online/offline callbacks, you mean > > > virtio_ccw_online() and virtio_ccw_offline(), right? > > > > > > But the thing is that virtio_ccw_online() may get called (and is > > > typically called, AFAICT) with no locks held via: > > > virtio_ccw_probe() --> async_schedule(virtio_ccw_auto_online, > > > cdev) > > > -*-> virtio_ccw_auto_online(cdev) --> ccw_device_set_online(cdev) > > > --> > > > virtio_ccw_online() > > > > > > Furthermore after a closer look, I believe because we don't take > > > a reference to the cdev in probe, we may get > > > virtio_ccw_auto_online() > > > called with an invalid pointer (the pointer is guaranteed to be > > > valid > > > in probe, but because of async we have no guarantee that it will > > > be > > > called in the context of probe). > > > > > > Shouldn't we take a reference to the cdev in probe? > > We just had a quick look at the virtio_ccw_probe() function. > > Did you mean to have a get_device() during the probe() and > > put_device() > > just after the virtio_ccw_auto_online() ? > > Yes, that would ensure that cdev pointer is still valid when > virtio_ccw_auto_online() is executed, and that things are cleaned up > properly, I guess. But I'm not 100% sure about all the interactions. > AFAIR ccw_device_set_online(cdev) would bail out if !drv. But then > we have the case where we already assigned it to a new driver (e.g. > vfio for dasd). > > BTW I believe if we have a problem here, the dasd driver has the same > problem as well. The code looks very, very similar. You are right about that. I am trying to recreate that issue with DASD now. And working on the patch as well. > > And shouldn't this auto-online be common CIO functionality? What is > the > reason the char devices don't seem to have it? I am not sure about that. I dont understand why it should be a CIO functionality. > > Regards, > Halil