On Mon, 17 Jul 2023 at 14:54, Simon Ser <contact@xxxxxxxxxxx> wrote:
>
> On Monday, July 17th, 2023 at 15:24, Emil Velikov <emil.l.velikov@xxxxxxxxx> wrote:
>
> > > > For going forward, here is one way we can shave this yak:
> > > >  - update libdrm to max 64 nodes
> > > >  - roll libdrm release, nag distributions to update to it // could be
> > > >    folded with the next release below
> > > >  - update libdrm to use name driven type detection
> > > >  - roll libdrm release, nag distributions to update to it
> > > >  - once ^^ release hits most distributions, merge the above proposed
> > > >    kernel patch
> > > >    - the commit message should explain the caveats and fixed libdrm version
> > > >    - we should be prepared to revert the commit, if it causes user
> > > >      space regression - fix (see below) and re-introduce the kernel patch
> > > >      1-2 releases later
> > >
> > > That sounds really scary to me. I'd really prefer to try not to break the
> > > kernel uAPI here.
> > >
> >
> > Which part in particular? Mind you I'm not particularly happy either,
> > since in essence it's like a controlled explosion.
>
> I believe there are ways to extend the uAPI to support more devices without
> breaking the uAPI. Michał Winiarski's patch for instance tried something to
> this effect.
>
> > > The kernel rule is "do not break user-space".
> >
> > Yes, in a perfect world. In practice, there have been multiple kernel
> > changes breaking user-space. Some got reverted, some remained.
> > AFAICT the above will get us out of the sticky situation we're in with
> > the least amount of explosion.
> >
> > If there is a concrete proposal, please go ahead and sorry if I've
> > missed it. I'm supposed to be off, having fun with family when I saw
> > this whole thing explode.
> >
> > Small note: literally all the users I've seen will stop on a missing
> > node (card or render) aka if the kernel creates card0...63 and then
> > card200... then (hand-wavy estimate) 80% of the apps will be broken.
>
> That's fine, because that's not a kernel regression. Supporting more than 64
> devices is a new kernel feature, and if some user-space ignores the new nodes,
> that's not a kernel regression. A regression only happens when a use-case which
> works with an older kernel is broken by a newer kernel.

Won't this approach effectively hard-code/leak even more kernel uABI
into dozens if not hundreds of userspace projects? This does not sound
like a scalable solution IMHO.

I am 100% behind the "don't break userspace" rule, alas very few
things in life are as black/white as your comments seem to suggest.

Thus I would suggest doing a bit of both, or a compromise if you will. Namely:
 - try the initial route outlined above
 - if there are (m)any fires, revert the kernel patch and opt for the
   work by Michał

This has the benefit of fixing a bunch of the uABI abuses out there,
and leaking more uABI only on an as-needed basis.

Side note: KDE folks have their own flatpak runtime and have been
quite open to backporting libdrm/other fixes.

HTH
Emil