On Mon, Jul 04, 2022 at 02:40:16PM +0800, Jason Wang wrote: > On Mon, Jul 4, 2022 at 2:22 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote: > > > > On Mon, Jul 04, 2022 at 12:23:27PM +0800, Jason Wang wrote: > > > > So if there are not examples of callbacks not ready after kick > > > > then let us block callbacks until first kick. That is my idea. > > > > > > Ok, let me try. I need to drain my queue of fixes first. > > > > > > Thanks > > > > If we do find issues, another option is blocking callbacks until the > > first add. A bit higher overhead as add is a more common operation > > but it has even less of a chance to introduce regressions. > > So I understand that the case of blocking until first kick but if we > block until add it means for drivers: > > virtqueue_add() > virtio_device_ready() > virtqueue_kick() > > We probably enlarge the window in this case. > > Thanks Yes but I don't know whether any drivers call add before they are ready to get a callback. The main thing with hardening is not to break drivers. Primum non nocere and all that. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >I couldn't ... except maybe bluetooth > > > > > > > > > > > > > but that's just maintainer nacking fixes saying he'll fix it > > > > > > > > > > > > > his way ... > > > > > > > > > > > > > > > > > > > > > > > > > > > And during remove(), we get another window: > > > > > > > > > > > > > > > > > > > > > > > > > > > > subsysrem_unregistration() > > > > > > > > > > > > > > /* the window */ > > > > > > > > > > > > > > virtio_device_reset() > > > > > > > > > > > > > > > > > > > > > > > > > > Same here. > > > > > > > > > > > > > > > > > > > > > > Basically for the drivers that set driver_ok before registration, > > > > > > > > > > > > > > > > > > > > I don't see what does driver_ok have to do with it. > > > > > > > > > > > > > > > > > > I meant for those driver, in probe they do() > > > > > > > > > > > > > > > > > > virtio_device_ready() > > > > > > > > > subsystem_register() > > > > > > > > > > > > > > > > > > In remove() they do > > > > > > > > > > > > > > > > > > subsystem_unregister() > > > > > > > > > virtio_device_reset() > > > > > > > > > > > > > > > > > > for symmetry > > > > > > > > > > > > > > > > Let's leave remove alone for now. I am close to 100% sure we have *lots* > > > > > > > > of issues around it, but while probe is unavoidable remove can be > > > > > > > > avoided by blocking hotplug. > > > > > > > > > > > > > > Unbind can trigger this path as well. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > so > > > > > > > > > > > we have a lot: > > > > > > > > > > > > > > > > > > > > > > blk, net, mac80211_hwsim, scsi, vsock, bt, crypto, gpio, gpu, i2c, > > > > > > > > > > > iommu, caif, pmem, input, mem > > > > > > > > > > > > > > > > > > > > > > So I think there's no easy way to harden the notification without > > > > > > > > > > > auditing the driver one by one (especially considering the driver may > > > > > > > > > > > use bh or workqueue). The problem is the notification hardening > > > > > > > > > > > depends on a correct or race-free probe/remove. So we need to fix the > > > > > > > > > > > issues in probe/remove then do the hardening on the notification. > > > > > > > > > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > > > So if drivers kick but are not ready to get callbacks then let's fix > > > > > > > > > > that first of all, these are racy with existing qemu even ignoring > > > > > > > > > > spec compliance. > > > > > > > > > > > > > > > > > > Yes, (the patches I've posted so far exist even with a well-behaved device). > > > > > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > patches you posted deal with DRIVER_OK spec compliance. > > > > > > > > I do not see patches for kicks before callbacks are ready to run. > > > > > > > > > > > > > > Yes. > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > MST > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >