Re: [PATCH v8 0/4] pci hotplug tracking

On Thu, Nov 02, 2023 at 04:28:43PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 02.11.23 15:12, Michael S. Tsirkin wrote:
> > On Thu, Nov 02, 2023 at 03:00:01PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > On 02.11.23 14:31, Michael S. Tsirkin wrote:
> > > > On Thu, Oct 05, 2023 at 12:29:22PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > > > Hi all!
> > > > > 
> > > > > The main thing this series does is add the DEVICE_ON event, a counterpart
> > > > > to DEVICE_DELETED: a guest-driven event indicating that the device is
> > > > > powered on. Details are in patch 2. The new event is paired with a
> > > > > corresponding command, query-hotplug.
> > > > 
> > > > Several things are questionable here:
> > > > 1. depending on guest activity you can get as many
> > > >      DEVICE_ON events as you like
> > > 
> > > No, I've made it so it may be sent only once per device
> > 
> > Maybe document that?
> 
> Right, my fault
> 
> > 
> > > > 2. it's just for shpc and native pcie - things are
> > > >      confusing enough for management, we should make sure
> > > >      it can work for all devices
> > > 
> > > Agree, I'm thinking about it
> > > 
> > > > 3. what about non-hotpluggable devices? do we want the event for them?
> > > > 
> > > 
> > > I think so, especially if we add an async=true|false flag to device_add, so that a successful device_add is always followed by DEVICE_ON, just as device_del is followed by DEVICE_DELETED.
> > > 
> > > Maybe, to generalize, it should be called not DEVICE_ON (which mostly relates to hotplug controller statuses) but DEVICE_ADDED, a full counterpart to DEVICE_DELETED.
> > > 
> > > > 
> > > > I feel this needs actual motivation so we can judge what's the
> > > > right way to do it.
> > > 
> > > My first motivation for this series was the fact that a successful device_add doesn't guarantee that the hard disk was successfully hotplugged into the guest. This relates to some problems with shpc/pcie hotplug we had in the past, which are mostly fixed. Still, it's good for a management tool to know that all actions related to the hotplug controller are done and we have a "green light".
> > 
> > what does "successfully" mean though? E.g. a bunch of guests will not
> > properly show you the device if the disk is not formatted properly.
> 
> Yes, I understand that we can only speak of "some degree of success".
> 
> But there is still some physical sense here: DEVICE_ON indicates that it's now safe to call device_del. Calling device_del before DEVICE_ON is a kind of unexpected behavior.
> 

Is that really true? I really don't think we should introduce new types
of undefined behavior.


> > 
> > > 
> > > Recently a new motivation came up, as I described in my "ping" letter <6bd19a07-5224-464d-b54d-1d738f5ba8f7@xxxxxxxxxxxxxx>: we have a performance degradation because of 7bed89958bfbf40df, which introduced drain_call_rcu() in device_add to make it more synchronous. So my suggestion is to make it more asynchronous instead (probably with a special flag) and rely on the DEVICE_ON event.
> > 
> > This one?
> > 
> > commit 7bed89958bfbf40df9ca681cefbdca63abdde39d
> > Author: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> > Date:   Tue Oct 6 14:38:58 2020 +0200
> > 
> >      device_core: use drain_call_rcu in in qmp_device_add
> >      Soon, a device removal might only happen on RCU callback execution.
> >      This is okay for device-del which provides a DEVICE_DELETED event,
> >      but not for the failure case of device-add.  To avoid changing
> >      monitor semantics, just drain all pending RCU callbacks on error.
> >      Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> >      Suggested-by: Stefan Hajnoczi <stefanha@xxxxxxxxx>
> >      Reviewed-by: Stefan Hajnoczi <stefanha@xxxxxxxxxx>
> >      Message-Id: <20200913160259.32145-4-mlevitsk@xxxxxxxxxx>
> >      [Don't use it in qmp_device_del. - Paolo]
> >      Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > 
> > diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
> > index e9b7228480..bcfb90a08f 100644
> > --- a/softmmu/qdev-monitor.c
> > +++ b/softmmu/qdev-monitor.c
> > @@ -803,6 +803,18 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
> >           return;
> >       }
> >       dev = qdev_device_add(opts, errp);
> > +
> > +    /*
> > +     * Drain all pending RCU callbacks. This is done because
> > +     * some bus related operations can delay a device removal
> > +     * (in this case this can happen if device is added and then
> > +     * removed due to a configuration error)
> > +     * to a RCU callback, but user might expect that this interface
> > +     * will finish its job completely once qmp command returns result
> > +     * to the user
> > +     */
> > +    drain_call_rcu();
> > +
> >       if (!dev) {
> >           qemu_opts_del(opts);
> >           return;
> > 
> > 
> > 
> > So maybe just move drain_call_rcu under if (!dev) then and be done with
> > it?
> > 
> 
> Hmm, I read the commit message thinking that it mentioned only device removal by mistake and actually meant to cover both device_add and device_del. But I was wrong?
> 
> Hmm, it explicitly says "just drain all pending RCU callbacks on error", but the code does that on the success path as well.
> 
> Yes, moving drain_call_rcu makes sense to me, and it will close the second "motivation". I can make a patch.
> 
> -- 
> Best regards,
> Vladimir
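
For reference, the change discussed above (draining pending RCU callbacks only when device creation fails) could look roughly like the sketch below. This is a standalone C model, not QEMU code: qdev_device_add_stub(), drain_call_rcu_stub(), the should_fail parameter, and the drain_calls counter are hypothetical stand-ins for qdev_device_add(), drain_call_rcu(), and test scaffolding, so the control flow can be compiled and checked in isolation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's DeviceState; not the real type. */
typedef struct DeviceState {
    int unused;
} DeviceState;

/* Counts how many times the simulated drain was invoked. */
static int drain_calls;

/* Stand-in for drain_call_rcu(): only records that a drain happened. */
static void drain_call_rcu_stub(void)
{
    drain_calls++;
}

/* Stand-in for qdev_device_add(): returns NULL to simulate a failure. */
static DeviceState *qdev_device_add_stub(bool should_fail)
{
    static DeviceState dev;
    return should_fail ? NULL : &dev;
}

/*
 * Model of the proposed control flow: the drain moves under the
 * "if (!dev)" branch, so the success path no longer pays for it.
 * Returns true when the device was added.
 */
static bool device_add_sketch(bool should_fail)
{
    DeviceState *dev = qdev_device_add_stub(should_fail);

    if (!dev) {
        /*
         * A failed add may have deferred the device removal to an RCU
         * callback; drain here so the command finishes its job before
         * returning the error. (The real code also frees opts here.)
         */
        drain_call_rcu_stub();
        return false;
    }

    /* Success path: no drain, the device is valid from here on. */
    assert(dev != NULL);
    return true;
}
```

With this shape, a successful add returns without draining, while a failed add still drains once before the error is reported, which matches the "on error" wording of the commit message.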