> -----Original Message-----
> From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> Sent: 20 February 2020 16:49
> To: Durrant, Paul <pdurrant@xxxxxxxxxxxx>
> Cc: Agarwal, Anchal <anchalag@xxxxxxxxxx>; Valentin, Eduardo
> <eduval@xxxxxxxxxx>; len.brown@xxxxxxxxx; peterz@xxxxxxxxxxxxx;
> benh@xxxxxxxxxxxxxxxxxxx; x86@xxxxxxxxxx; linux-mm@xxxxxxxxx;
> pavel@xxxxxx; hpa@xxxxxxxxx; tglx@xxxxxxxxxxxxx; sstabellini@xxxxxxxxxx;
> fllinden@xxxxxxxxxx; Kamata, Munehisa <kamatam@xxxxxxxxxx>;
> mingo@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx; Singh, Balbir
> <sblbir@xxxxxxxxxx>; axboe@xxxxxxxxx; konrad.wilk@xxxxxxxxxx;
> bp@xxxxxxxxx; boris.ostrovsky@xxxxxxxxxx; jgross@xxxxxxxx;
> netdev@xxxxxxxxxxxxxxx; linux-pm@xxxxxxxxxxxxxxx; rjw@xxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; vkuznets@xxxxxxxxxx; davem@xxxxxxxxxxxxx;
> Woodhouse, David <dwmw@xxxxxxxxxxxx>
> Subject: Re: [Xen-devel] [RFC PATCH v3 06/12] xen-blkfront: add callbacks
> for PM suspend and hibernation
>
> On Thu, Feb 20, 2020 at 04:23:13PM +0000, Durrant, Paul wrote:
> > > -----Original Message-----
> > > From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > > Sent: 20 February 2020 15:45
> > >
> > > On Thu, Feb 20, 2020 at 08:54:36AM +0000, Durrant, Paul wrote:
> > > > > -----Original Message-----
> > > > > From: Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx> On Behalf Of
> > > > > Roger Pau Monné
> > > > > Sent: 20 February 2020 08:39
> > > > > To: Agarwal, Anchal <anchalag@xxxxxxxxxx>
> > > > >
> > > > > Thanks for this work, please see below.
> > > > >
> > > > > On Wed, Feb 19, 2020 at 06:04:24PM +0000, Anchal Agarwal wrote:
> > > > > > On Tue, Feb 18, 2020 at 10:16:11AM +0100, Roger Pau Monné wrote:
> > > > > > > On Mon, Feb 17, 2020 at 11:05:53PM +0000, Anchal Agarwal wrote:
> > > > > > > > On Mon, Feb 17, 2020 at 11:05:09AM +0100, Roger Pau Monné wrote:
> > > > > > > > > On Fri, Feb 14, 2020 at 11:25:34PM +0000, Anchal Agarwal wrote:
> > > > > > > > Quiescing the queue seemed a better option here as we want to
> > > > > > > > make sure ongoing requests dispatches are totally drained.
> > > > > > > > I should accept that some of these notions are borrowed from
> > > > > > > > how nvme freeze/unfreeze is done, although it's not an
> > > > > > > > apples-to-apples comparison.
> > > > > > >
> > > > > > > That's fine, but I would still like to request that you use
> > > > > > > the same logic (as much as possible) for both the Xen and the
> > > > > > > PM initiated suspension.
> > > > > > >
> > > > > > > So you either apply this freeze/unfreeze to the Xen suspension
> > > > > > > (and drop the re-issuing of requests on resume) or adopt the
> > > > > > > same approach as the Xen initiated suspension. Keeping two
> > > > > > > completely different approaches to suspension / resume on
> > > > > > > blkfront is not suitable long term.
> > > > > > >
> > > > > > I agree with you that an overhaul of Xen suspend/resume wrt
> > > > > > blkfront is a good idea; however, IMO that is work for the
> > > > > > future and this patch series should not be blocked on it. What
> > > > > > do you think?
> > > > >
> > > > > It's not so much that I think an overhaul of suspend/resume in
> > > > > blkfront is needed, it's just that I don't want to have two
> > > > > completely different suspend/resume paths inside blkfront.
> > > > >
> > > > > So from my PoV I think the right solution is to either use the
> > > > > same code (as much as possible) as it's currently used by Xen
> > > > > initiated suspend/resume, or to also switch Xen initiated
> > > > > suspension to use the newly introduced code.
> > > > >
> > > > > Having two different approaches to suspend/resume in the same
> > > > > driver is a recipe for disaster IMO: it adds complexity by forcing
> > > > > developers to take into account two different suspend/resume
> > > > > approaches when there's no need for it.
> > > >
> > > > I disagree.
> > > > S3 or S4 suspend/resume (or perhaps we should call them power
> > > > state transitions to avoid confusion) are quite different from Xen
> > > > suspend/resume.
> > > > Power state transitions ought to be, and indeed are, visible to
> > > > the software running inside the guest. Applications, as well as
> > > > drivers, can receive notification and take whatever action they
> > > > deem appropriate.
> > > > Xen suspend/resume OTOH is used when a guest is migrated and the
> > > > code should go to all lengths possible to make any software running
> > > > inside the guest (other than Xen specific enlightened code, such as
> > > > PV drivers) completely unaware that anything has actually happened.
> > >
> > > So from what you say above PM state transitions are notified to all
> > > drivers, and Xen suspend/resume is only notified to PV drivers, and
> > > here we are speaking about blkfront which is a PV driver, and should
> > > get notified in both cases. So I'm unsure why the same (or at least
> > > very similar) approach can't be used in both cases.
> > >
> > > The suspend/resume approach proposed by this patch is completely
> > > different than the one used by a xenbus initiated suspend/resume, and
> > > I don't see a technical reason that warrants this difference.
> > >
> >
> > Within an individual PV driver it may well be ok to use common
> > mechanisms for connecting to the backend but issues will arise if any
> > subsequent action is visible to the guest. E.g. a network frontend
> > needs to issue gratuitous ARPs without anything else in the network
> > stack (or monitoring the network stack) knowing that it has happened.
> >
> > > I'm not saying that the approach used here is wrong, it's just that I
> > > don't see the point in having two different ways to do suspend/resume
> > > in the same driver, unless there's a technical reason for it, which I
> > > don't think has been provided.
> >
> > The technical justification is that the driver needs to know what kind
> > of suspend or resume it is doing, so that it doesn't do the wrong
> > thing. There may also be differences in the state of the system, e.g.
> > in Windows, at least some of the resume-from-xen-suspend code runs with
> > interrupts disabled (which is necessary to make sure enough state is
> > restored before things become visible to other kernel code).
> >
> > > I would be fine with switching xenbus initiated suspend/resume to also
> > > use the approach proposed here: freeze the queues and drain the shared
> > > rings before suspending.
> > >
> >
> > I think abstracting away at the xenbus level to some degree is probably
> > feasible, but some sort of flag should be passed to the individual
> > drivers so they know what circumstances they are operating under.
> >
> > > > So, whilst it may be possible to use common routines to, for
> > > > example, re-establish PV frontend/backend communication, PV frontend
> > > > code should be acutely aware of the circumstances it is operating
> > > > in. I can cite example code in the Windows PV drivers, which have
> > > > supported guest S3/S4 power state transitions since day 1.
> > >
> > > Hm, please bear with me, as I'm not sure I fully understand. Why isn't
> > > the current suspend/resume logic suitable for PM transitions?
> > >
> >
> > I don’t know the details for Linux but it may well be to do with
> > assumptions made about the system e.g. the ability to block waiting for
> > something to happen on another CPU (which may have already been
> > quiesced in a PM context).
> >
> > > As said above, I'm happy to switch xenbus initiated suspend/resume to
> > > use the logic in this patch, but unless there's a technical reason for
> > > it I don't see why blkfront should have two completely different
> > > approaches to suspend/resume depending on whether it's a PM or a
> > > xenbus state change.
> > >
> >
> > Hopefully what I said above illustrates why it may not be 100% common.
>
> Yes, that's fine. I don't expect it to be 100% common (as I guess
> that the hooks will have different prototypes), but I expect
> that routines can be shared, and that the approach taken can be the
> same.
>
> For example one necessary difference will be that xenbus initiated
> suspend won't close the PV connection, in case suspension fails. On PM
> suspend you seem to always close the connection beforehand, so you
> will always have to re-negotiate on resume even if suspension failed.
>
> What I'm mostly worried about is the different approach to ring
> draining. Ie: either xenbus is changed to freeze the queues and drain
> the shared rings, or PM uses the already existing logic of not
> flushing the rings and re-issuing in-flight requests on resume.
>

Yes, that needs consideration. I don’t think the same semantics can be
suitable for both. E.g. in a xen-suspend we need to freeze with as
little processing as possible to avoid dirtying RAM late in the
migration cycle, and we know that in-flight data can wait. But in a
transition to S4 we need to make sure that at least all the in-flight
blkif requests get completed, since they probably contain bits of the
guest's memory image and that's not going to get saved any other way.

  Paul
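
To make the distinction above concrete, here is a toy C sketch. It is
not taken from the patch series or from blkfront; every name in it
(toy_ring, toy_suspend, the TOY_SUSPEND_* reasons) is hypothetical, and
the "ring" is just a pair of arrays. It only illustrates the idea of
passing the suspend reason down to the driver: a PM transition drains
all in-flight requests, while a Xen suspend does minimal work and
records the in-flight requests so they can be re-issued on resume.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: why the frontend is being quiesced. */
enum toy_suspend_reason {
    TOY_SUSPEND_XEN, /* toolstack initiated, e.g. migration */
    TOY_SUSPEND_PM,  /* guest power state transition, e.g. S4 */
};

#define TOY_RING_SIZE 8

/* Toy stand-in for a frontend's view of its shared ring. */
struct toy_ring {
    int inflight[TOY_RING_SIZE]; /* ids of requests not yet completed */
    size_t nr_inflight;
    int reissue[TOY_RING_SIZE];  /* ids to submit again after resume */
    size_t nr_reissue;
    size_t nr_completed;
};

/* Pretend the backend completes the most recently issued request. */
static void toy_complete_one(struct toy_ring *ring)
{
    assert(ring->nr_inflight > 0);
    ring->nr_inflight--;
    ring->nr_completed++;
}

void toy_suspend(struct toy_ring *ring, enum toy_suspend_reason why)
{
    size_t i;

    switch (why) {
    case TOY_SUSPEND_PM:
        /* In-flight data may belong to the memory image: drain fully. */
        while (ring->nr_inflight > 0)
            toy_complete_one(ring);
        break;
    case TOY_SUSPEND_XEN:
        /* Do as little as possible: only note what to re-issue later. */
        for (i = 0; i < ring->nr_inflight; i++) {
            assert(ring->nr_reissue < TOY_RING_SIZE);
            ring->reissue[ring->nr_reissue++] = ring->inflight[i];
        }
        ring->nr_inflight = 0;
        break;
    }
}
```

A real driver would wait for the backend to complete requests rather
than loop synchronously, and connection teardown is left out entirely.
The point is only that a single entry point can serve both cases once
it is told which kind of suspend it is handling.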