On Thu, 23 Apr 2020 16:52:24 +0200
Vineeth Vijayan <vneethv@xxxxxxxxxxxxxxxxxx> wrote:

> Hi Cornelia,
>
> few cents on this,
>
> 1. css_register_subchannel() is called only for valid subchannels, which
> is taken care of in css_alloc_subchannel(). So adding a second
> css_sch_is_valid() in css_register_subchannel() might not help us here.
> We still need to find a mechanism to avoid the performance impact
> because of the uevent-storm from IO-subchannels without any valid device
> on them.

Ah, I missed that.

But I'm wondering whether the number of non-operational devices that
will end up not being registered is actually that high in a normal
setup. The really bad case, as I understand it, is

0    ...    n      ...      m ... 0xffff
<nothing> <dev> <nothing> <dev> <nothing>

where we end up with large numbers of subchannels with !dnv prior to n
and between n and m. (On LPAR; z/VM and QEMU will usually have mostly
consecutive devices-on-subchannels, unless there has been a huge amount
of hotplug going on.)

In this case, the !dnv check already prevents us from even registering
the device, so the only problematic devices left are those where we
fail to successfully drive I/O to. Are these very common on sane
setups?

(The code has seen some revisions since I introduced that suppression
stuff, maybe I'm missing something.)

>
> 2. We will have to find a way to get the AVAILABLE-VALID-CCW-device
> information from css layer, which would help vfio-ccw drivers to work
> with the uevents when it is not suppressed.

But if we bind the subchannel to vfio-ccw, we do not have any ccw
device, right? Or am I misunderstanding?

> Then we could also change the way ccw_device_call_sch_unregister()
> works, where the subchannel-unregister is happening from an upper
> layer.

Hm, what's the problem here? This seems to be mostly a case of "we did
I/O to the device and it appeared not operational; so we go ahead and
unregister the subchannel"? Childless I/O subchannels are a bit useless.
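
To make the sparse 0 ... n ... m ... 0xffff case above concrete, here is
a toy userspace sketch. It is not kernel code: struct toy_sch,
toy_count_registered() and toy_sparse_example() are hypothetical names
made up for illustration, and only the dnv bit is modelled. It assumes
that a slot with !dnv is simply skipped and therefore never reaches
registration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_SCH 0x10000	/* subchannel numbers 0 .. 0xffff */

/* Hypothetical stand-in for per-subchannel info; only dnv is modelled. */
struct toy_sch {
	bool dnv;	/* device number valid */
};

/* Count the slots that pass the !dnv filter and would be registered. */
static size_t toy_count_registered(const struct toy_sch *sch, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (!sch[i].dnv)
			continue;	/* assumed: skipped before registration */
		count++;
	}
	return count;
}

/* Build the layout from the diagram: devices only at slots n and m. */
static size_t toy_sparse_example(size_t n, size_t m)
{
	static struct toy_sch sch[TOY_MAX_SCH];

	assert(n < TOY_MAX_SCH && m < TOY_MAX_SCH);

	for (size_t i = 0; i < TOY_MAX_SCH; i++)
		sch[i].dnv = false;
	sch[n].dnv = true;
	sch[m].dnv = true;

	return toy_count_registered(sch, TOY_MAX_SCH);
}
```

With devices at, say, slots 0x100 and 0x8000, toy_sparse_example()
counts only those two slots; every other slot is filtered out by the
dnv test, which is the point of the question above about how many
problematic subchannels actually remain after that check.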