On Tue, Oct 25, 2022 at 7:15 AM Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
>
> On Tue, Oct 25, 2022 at 12:27:11PM +1000, Dave Airlie wrote:
>
> > The userspace for those is normally bespoke like ROCm, which uses
> > amdkfd, and amdkfd doesn't operate like most device files from what I
> > know, so I'm not sure we'd want it to operate as an accel device.
>
> I intensely dislike this direction that drivers will create their own
> char devs buried inside their device driver with no support or
> supervision.
>
> We've been here before with RDMA and it is just a complete mess.
>
> Whatever special non-drm stuff amdkfd need to do should be supported
> through the new subsystem, in a proper maintainable way.

We plan to eventually move ROCm over to the drm interfaces once we get
user mode queues working on non-compute queues, which is already in
progress. ROCm already uses the existing drm nodes and libdrm for a
number of things today (buffer sharing, media and compute command
submission in certain cases, etc.). I don't see much value in the
accel nodes for AMD products at this time. Even when we transition,
there are still a bunch of things that we'd need to think about, so
the current kfd node may stick around until we figure out a plan for
those areas. For example, the kfd node provides platform-level compute
topology information: the NUMA details for connected GPUs and CPUs,
non-GPU compute node information, cache level topologies, etc.

Alex

>
> Jason