On Wed, Dec 05, 2018 at 10:25:31AM -0700, Logan Gunthorpe wrote:
>
>
> On 2018-12-04 7:37 p.m., Jerome Glisse wrote:
> >>
> >> This came up before for apis even better defined than HMS as well as
> >> more limited scope, i.e. experimental ABI availability only for -rc
> >> kernels. Linus said this:
> >>
> >> "There are no loopholes. No "but it's been only one release". No, no,
> >> no. The whole point is that users are supposed to be able to *trust*
> >> the kernel. If we do something, we keep on doing it.
> >>
> >> And if it makes it harder to add new user-visible interfaces, then
> >> that's a *good* thing." [1]
> >>
> >> The takeaway being don't land work-in-progress ABIs in the kernel.
> >> Once an application depends on it, there are no more incompatible
> >> changes possible regardless of the warnings, experimental notices, or
> >> "staging" designation. DAX is experimental because there are cases
> >> where it currently does not work with respect to another kernel
> >> feature like xfs-reflink, RDMA. The plan is to fix those, not continue
> >> to hide behind an experimental designation, and fix them in a way that
> >> preserves the user visible behavior that has already been exposed,
> >> i.e. no regressions.
> >>
> >> [1]: https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2017-August/004742.html
> >
> > So i guess i am heading down the vXX road ... such is my life :)
>
> I recommend against it. I really haven't been convinced by any of your
> arguments for having a second topology tree. The existing topology tree
> in sysfs already better describes the links between hardware right now,
> except for the missing GPU links (and those should be addressable within
> the GPU community). Plus, maybe, some other enhancements to sockets/numa
> node descriptions if there's something missing there.
>
> Then, 'hbind' is another issue but I suspect it would be better
> implemented as an ioctl on existing GPU interfaces. I certainly can't
> see any benefit in using it myself.
>
> It's better to take an approach that would be less controversial with
> the community than to brow beat them with a patch set 20+ times until
> they take it.

So here is what I am gonna do, because I need this code now. I am gonna
split the helper code that does policy and hbind out from its sysfs
counterpart and turn it into helpers that each device driver can use.
I will move the sysfs and syscall parts into a patchset of their own
that uses exactly the same infrastructure.

This means that I am losing a feature, as userspace can not provide a
list of multiple device memories to use (which is much more common than
you might think), but at least I can provide something for the single
device case through ioctl.

I am not giving up on sysfs or the syscall, as this is needed long term,
so I am gonna improve it, port existing userspace (OpenCL, ROCm, ...) to
use it (in a branch) and demonstrate how it gets used by end
applications.

I will beat it again and again until either I convince people through
hard evidence or I get bored. I do not get bored easily :)

Cheers,
Jérôme