On 2019/05/28, Koenig, Christian wrote: > Am 28.05.19 um 18:10 schrieb Emil Velikov: > > On 2019/05/28, Daniel Vetter wrote: > >> On Tue, May 28, 2019 at 10:03 AM Koenig, Christian > >> <Christian.Koenig@xxxxxxx> wrote: > >>> Am 28.05.19 um 09:38 schrieb Daniel Vetter: > >>>> [SNIP] > >>>>> Might be a good idea looking into reverting it partially, so that at > >>>>> least command submission and buffer allocation is still blocked. > >>>> I thought the issue is a lot more than vainfo, it's pretty much every > >>>> hacked up compositor under the sun getting this wrong one way or > >>>> another. Thinking about this some more, I also have no idea how you'd > >>>> want to deprecate rendering on primary nodes in general. Apparently > >>>> that breaks -modesetting already, and probably lots more compositors. > >>>> And it looks like we're finally achieve the goal kms set out to 10 > >>>> years ago, and new compositors are sprouting up all the time. I guess > >>>> we could just break them all (on new hardware) and tell them to all > >>>> suck it up. But I don't think that's a great option. And just > >>>> deprecating this on amdgpu is going to be even harder, since then > >>>> everywhere else it'll keep working, and it's just amdgpu.ko that looks > >>>> broken. > >>>> > >>>> Aside: I'm not supporting Emil's idea here because it fixes any issues > >>>> Intel has - Intel doesn't care. I support it because reality sucks, > >>>> people get this render vs. primary vs. multi-gpu prime wrong all the > >>>> time (that's also why we have hardcoded display+gpu pairs in mesa for > >>>> the various soc combinations out there), and this looks like a > >>>> pragmatic solution. It'd be nice if every compositor and everything > >>>> else would perfectly support multi gpu and only use render nodes for > >>>> rendering, and only primary nodes for display. But reality is that > >>>> people hack on stuff until gears on screen and then move on to more > >>>> interesting things (to them). So I don't think we'll ever win this :-/ > >>> Yeah, but this is a classic case of working around user space issues by > >>> making kernel changes instead of fixing user space. > >>> > >>> Having privileged (output control) and unprivileged (rendering control) > >>> functionality behind the same node is a mistake we have made a long time > >>> ago and render nodes finally seemed to be a way to fix that. > >>> > >>> I mean why are compositors using the primary node in the first place? > >>> Because they want to have access to privileged resources I think and in > >>> this case it is perfectly ok to do so. > >>> > >>> Now extending unprivileged access to the primary node actually sounds > >>> like a step into the wrong direction to me. > >>> > >>> I rather think that we should go down the route of completely dropping > >>> command submission and buffer allocation through the primary node for > >>> non master clients. And then as next step at some point drop support for > >>> authentication/flink. > >>> > >>> I mean we have done this with UMS as well and I don't see much other way > >>> to move forward and get rid of those ancient interface in the long term. > >> Well kms had some really good benefits that drove quick adoption, like > >> "suspend/resume actually has a chance of working" or "comes with > >> buffer management so you can run multiple gears". > >> > >> The render node thing is a lot more niche use case (prime, better priv > >> separation), plus "it's cleaner design". And the "cleaner design" part > >> is something that empirically doesn't seem to matter :-/ Just two > >> examples: > >> - KHR_display/leases just iterated display resources on the fd needed > >> for rendering (and iirc there was even a patch to expose that for > >> render nodes too so it works with DRI3), because implementing > >> protocols is too hard. Barely managed to stop that one before it > >> happened. > >> - Various video players use the vblank ioctl on directly to schedule > >> frames, without telling the compositor. I discovered that when I > >> wanted to limite the vblank ioctl to master clients only. Again, > >> apparently too hard to use the existing extensions, or fix the bugs in > >> there, or whatever. One userspace got fixed last year, but it'll > >> probably get copypasted around forever :-/ > >> > >> So I don't think we'll ever manage to roll a clean split out, and best > >> we can do is give in and just hand userspace what it wants. As much as > >> that's misguided and unclean and all that. Maybe it'll result in a > >> least fewer stuff getting run as root to hack around this, because > >> fixing properly seems not to be on the table. > >> > >> The beauty of kms is that we've achieved the mission, everyone's > >> writing their own thing. Which is also terrible, and I don't think > >> it'll get better. > > > > With the risk of coming rude I will repeat my earlier comment: > > > > The problem is _neither_ Intel nor libva specific. > > > > > > > > That said, let's step back for a moment and consider: > > > > - the "block everything but KMS via the primary node" idea is great but > > orthogonal > > > > - the series does address issues that are vendor-agnostic > > > > - by default this series does _not_ cause any regression be that for > > new or old userspace > > > > - there are two trivial solutions, if the AMD team has concerns about > > closed-source/private stack depending on the old behaviour > > If they want I can even write the patches ;-) > > > > > > That said, the notable comments received so far are: > > - rework patch 13/13 to remove the DRM_AUTH from prime fd to/from > > handle. I'm OK but this will change the return code - from EACCES to > > ENOSYS > > > > - vmwgfx will need a check on the reference ioctl(s) - IIRC Thomas is > > planning to drop nearly all DRM_AUTH instances in their driver. > > > > > > Christian, as mentioned before - this series does _not_ add > > functionality to render nodes. It effectively paves a way towards > > removing DRM_AUTH. > > But it adds functionality to the primary node. > Behaviour is adjusted - functionality was there since day 1. > > I understand the series may feel a bit dirty. Yet I would gladly address > > any technical concerns you have. > > Well putting compatibility issues aside my concern is that this is > simply a bad design decision which we can't revert later on. > As sad above - any concerns (theoretical or actual regressions) can be trivially fixed _without_ reverting any of this. I am more than happy to step up and address any regressions in timely manner. As a reminder without this series, some of your customers are forced to run their applications as root. Thanks Emil _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx