On Jul 8, 2022, at 11:43 AM, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:

> On Fri, Jul 08, 2022 at 06:35:48PM +0000, Nadav Amit wrote:
>> On Jul 8, 2022, at 10:55 AM, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>>
>>> On Fri, Jul 08, 2022 at 04:45:00PM +0000, Nadav Amit wrote:
>>>> On Jul 8, 2022, at 5:56 AM, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>>>>
>>>>> And looking at the results above, it's not so much the PIO vs MMIO
>>>>> that makes a difference, it's the virtualisation. A mmio access goes
>>>>> from 269ns to 85us. Rather than messing around with preferring MMIO
>>>>> over PIO for config space, having an "enlightenment" to do config
>>>>> space accesses would be a more profitable path.
>>>>
>>>> I am unfamiliar with the motivation for this patch, but I just wanted to
>>>> briefly regard the advice about enlightenments.
>>>>
>>>> “enlightenment”, AFAIK, is Microsoft’s term for "para-virtualization", so
>>>> let’s regard the generic term. I think that you consider the bare-metal
>>>> results as the possible results from a paravirtual machine, which is mostly
>>>> wrong. Para-virtualization usually still requires a VM-exit and for the most
>>>> part the hypervisor/host runs similar code for MMIO/hypercall (conceptually;
>>>> the code of paravirtual and fully-virtual devices is often different, but
>>>> IIUC, this is not what Ajay measured).
>>>>
>>>> Para-virtualization could have *perhaps* helped to reduce the number of
>>>> PIO/MMIO and improve performance this way. If, for instance, all the
>>>> PIO/MMIO are done during initialization, a paravirtual interface can be used
>>>> to batch them together, and that would help. But it is more complicated to
>>>> get a performance benefit from paravirtualization if the PIO/MMIO accesses
>>>> are “spread”, for instance, done after each interrupt.
>>>
>>> What kind of lousy programming interface requires you to do a config
>>> space access after every interrupt?
>>> This is looney-tunes.
>>
>> Wild example, hence the “for instance”.
>
> Stupid example that doesn't help.
>
>>> You've used a lot of words to not answer the question that was so
>>> important that I asked it twice. What's the use case, what's the
>>> workload that would benefit from this patch?
>>
>> Well, you used a lot of words to say “it causes problems” without saying
>> which. It appeared you have misconceptions about paravirtualization that
>> I wanted to correct.
>
> Well now, that's some bullshit. I did my fucking research. I went
> back 14+ years in history to figure out what was going on back then.
> I cited commit IDs. You're just tossing off some opinions.
>
> I have no misconceptions about whatever you want to call the mechanism
> for communicating with the hypervisor at a higher level than "prod this
> byte". For example, one of the more intensive things we use config
> space for is sizing BARs. If we had a hypercall to size a BAR, that
> would eliminate:
>
> - Read current value from BAR
> - Write all-ones to BAR
> - Read new value from BAR
> - Write original value back to BAR
>
> Bingo, one hypercall instead of 4 MMIO or 8 PIO accesses.
>
> Just because I don't use your terminology, you think I have
> "misconceptions"? Fuck you, you condescending piece of shit.

Matthew, I did not mean to sound condescending and I apologize if I did. You have my *full* respect for your coding/design skills.

Out of respect for you, I am giving you a pass on your conduct this time and *this time only*. Do not use such language with me or my colleagues again. The only reason I got involved in this discussion is that I feel that my colleagues have concerns about the kernel's toxic environment.

Back to the issue at hand: I think that a new paravirtual interface is a possible solution, with some serious drawbacks. Xen did something similar, IIRC, to a certain extent.

More reasonable, I think, based on what you said before, is to check whether we are running on a hypervisor and update raw_pci_ops accordingly. There is an issue of whether hypervisor detection might take place too late, but I think this can be resolved relatively easily. The question is whether assigned devices might still be broken. Based on the information you provided, I do not know. If you can answer this question, that would be helpful.

Let’s also wait for Ajay to give some numbers about boot time with this change.
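
For reference, the four-step BAR-sizing sequence quoted above can be sketched in user-space C roughly as follows. This is only an illustration against a mock register, not the kernel's code: `struct mock_bar`, `cfg_read()`, `cfg_write()` and `size_bar()` are hypothetical names, and the sketch assumes a single 32-bit memory BAR. Each `cfg_read()`/`cfg_write()` call stands for one config-space access, so the counter shows the four accesses that a single sizing hypercall would replace. With port I/O, each access would additionally need an address write to 0xCF8 before the data access at 0xCFC, which is presumably where the count of 8 comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mock of one 32-bit memory BAR; not a kernel structure. */
struct mock_bar {
	uint32_t value;    /* current register contents */
	uint32_t size;     /* decoded region size in bytes, power of two >= 16 */
	unsigned accesses; /* number of config-space accesses so far */
};

/* One config-space read; in a guest this would be one trapped access. */
uint32_t cfg_read(struct mock_bar *bar)
{
	bar->accesses++;
	return bar->value;
}

/*
 * One config-space write. Address bits below the region size are
 * hardwired to zero and the low four flag bits are read-only.
 */
void cfg_write(struct mock_bar *bar, uint32_t val)
{
	uint32_t addr_mask = ~(bar->size - 1);

	bar->accesses++;
	bar->value = (val & addr_mask) | (bar->value & UINT32_C(0xf));
}

/* The four-step sizing sequence from the quoted list. */
uint32_t size_bar(struct mock_bar *bar)
{
	uint32_t orig, probed;

	assert(bar != NULL);

	orig = cfg_read(bar);              /* 1. read current value */
	cfg_write(bar, UINT32_C(0xffffffff)); /* 2. write all-ones */
	probed = cfg_read(bar);            /* 3. read new value */
	cfg_write(bar, orig);              /* 4. write original value back */

	/* Mask off the flag bits; the lowest writable address bit is the size. */
	return (uint32_t)(~(probed & ~UINT32_C(0xf)) + 1);
}
```

Calling `size_bar()` on a mock BAR set up with `value = 0xfe000008` and `size = 0x1000` returns the region size and leaves `accesses` at 4, with the original register value restored.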