On Tue, Jan 30, 2024 at 6:50 PM Daniel Stone <daniel@xxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On Tue, 30 Jan 2024 at 18:39, Zack Rusin <zack.rusin@xxxxxxxxxxxx> wrote:
> > In general, yes. Of course it's a little more convoluted because we'll
> > act like OpenGL runtime here (i.e. glXSwapBuffers), i.e. our driver
> > will fake page-flips because the only memory we'll have is a single
> > buffer as the actual page-flipping happens in the presentation code on
> > the host. So the guest is not aware of the actual presentation (it's
> > also why we don't have any sort of vblank signaling in vmwgfx, the
> > concept just doesn't exist for us). i.e. on para-virtualized drivers
> > the actual page-flips will be property of the presentation code that's
> > outside of the guest. It's definitely one those things that I wanted
> > to have a good solution for in a while, in particular to have a better
> > story behind vblank handling, but it's difficult because
> > "presentation" on vm's is in general difficult to define - it might be
> > some vnc connected host on the other continent. Having said that
> > that's basically a wonky VRR display so we should be able to handle
> > our presentation as VRR and give more control of updates to the guest,
> > but we haven't done it yet.
>
> Please don't.
>
> Photon time is _a_ useful metric, but only backwards-informational.
> It's nice to give userspace a good forward estimate of when pixels
> will hit retinas, but as it's not fully reliable, the main part is
> being able to let it know when it did happen so it can adjust. Given
> that it's not reliable, we can't use it as a basis for preparing
> submissions though, so we don't, even on bare-metal drivers.
>
> As you've noted though, it really falls apart on non-bare-metal cases,
> especially where latency vastly exceeds throughput, or when either is
> hugely variable. So we don't ever use it as a basis.
>
> VRR is worse though.
> The FRR model is 'you can display new content
> every $period, and here's your basis so you can calibrate phase'. The
> VRR model is 'you can display new content so rapidly it's not worth
> trying to quantise, just fire it as rapidly as possible'. That's a
> world away from 'errrr ... might be 16ms, might be 500? dunno really'.
>
> The entire model we have is that basis timing flows backwards. The
> 'hardware' gives us a deadline, KMS angles to meet that with a small
> margin, the compositor angles to meet that with a margin again, and it
> lines up client repaints to hit that window too. Everything works on
> that model, so it's not super surprising that using svga is - to quote
> one of Weston's DRM-backend people who uses ESXi - 'a juddery mess'.

That's very hurtful. Or it would be, but of course you didn't believe
them because they're working on Weston so clearly don't make good
choices in general, right? The presentation on ESXi is just as smooth
as it is by default on Ubuntu on new hardware...

> Given that the entire ecosystem is based on this model, I don't think
> there's an easy way out where svga just does something wildly
> different. The best way to fix it is to probably work on predictable
> quantisation with updates: pick 5/12/47/60Hz to quantise to based on
> your current throughput, with something similar to hotplug/LINK_STATUS
> and faked EDID to let userspace know when the period changes. If you
> have variability within the cycle, e.g. dropped frames, then just suck
> it up and keep the illusion alive to userspace that it's presenting to
> a fixed period, and if/when you calculate there's a better
> quantisation then let userspace know what it is so it can adjust.
>
> But there's really no future in just doing random presentation rates,
> because that's not the API anyone has written for.

See, my hope was that with VRR we could layer the weird remote
presentation semantics of virtualized guests on top of the same
infrastructure that would be used on real hardware. If you're saying
that it's not the way userspace will work, then yeah, that doesn't
help.

My issue, which is general for para-virtualized drivers, is that any
behavior that differs from hw drivers means that it's going to break at
some point; we see that even for basic things like the update-layout
hotplug events that have been largely standardized for many years. I'm
assuming that refresh-rate-changed will result in the same regressions,
but fwiw if I can implement FRR correctly and punt any issues that
arise due to changes in the FRR as issues in userspace then that does
make my life a lot easier, so I'm not going to object to that.

z
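
For illustration only, the "predictable quantisation with updates" idea
from the quoted text could be sketched roughly as below. This is not
vmwgfx code and not an existing API: quantise_rate, FixedRatePresenter
and the notify callback are hypothetical names, the callback merely
stands in for whatever hotplug/LINK_STATUS-style event would carry the
new period, and the settle_samples hysteresis is an assumption added so
that a few dropped frames do not change the advertised rate.

```python
from typing import Callable, Optional, Sequence

# Candidate rates taken from the example in the quoted text (5/12/47/60Hz).
CANDIDATE_RATES_HZ = (5, 12, 47, 60)


def quantise_rate(throughput_hz: float,
                  candidates: Sequence[int] = CANDIDATE_RATES_HZ) -> int:
    """Return the highest candidate rate the measured throughput sustains.

    Falls back to the lowest candidate if throughput is below all of them.
    """
    sustainable = [r for r in sorted(candidates) if r <= throughput_hz]
    return sustainable[-1] if sustainable else min(candidates)


class FixedRatePresenter:
    """Advertise a fixed period; change it only after a sustained shift."""

    def __init__(self, notify: Callable[[int], None],
                 initial_hz: int = 60, settle_samples: int = 3) -> None:
        self.rate_hz = initial_hz
        self._notify = notify          # hypothetical "period changed" event
        self._settle = settle_samples  # assumed hysteresis, not from the thread
        self._pending: Optional[int] = None
        self._pending_count = 0

    def observe(self, throughput_hz: float) -> int:
        """Feed one throughput sample; return the currently advertised rate."""
        target = quantise_rate(throughput_hz)
        if target == self.rate_hz:
            # Back at the advertised rate: forget any pending change.
            self._pending, self._pending_count = None, 0
            return self.rate_hz
        if target != self._pending:
            self._pending, self._pending_count = target, 1
        else:
            self._pending_count += 1
        if self._pending_count >= self._settle:
            # The shift has persisted, so tell userspace the new period.
            self.rate_hz = target
            self._pending, self._pending_count = None, 0
            self._notify(target)
        return self.rate_hz
```

With the defaults, a single slow sample leaves the advertised rate at
60Hz; only three consecutive samples that quantise to the same new rate
trigger the notification, so userspace keeps seeing a fixed period in
between.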