On Mon, Aug 08, 2022 at 11:26:11PM +0300, Oded Gabbay wrote:

> So if you want a common uAPI and a common userspace library to use it,
> you need to expose the same device character files for every device,
> regardless of the driver. e.g. you need all devices to be called
> /dev/accelX and not /dev/habanaX or /dev/nvidiaX

So, this is an interesting idea. One of the things we did in RDMA that
turned out very well is to have the user side of the kernel/user API
in a single git repo for all the drivers, including the lowest layer
of the driver-specific APIs.

It gives a reasonable target for a DRM-like test of "you must have a
userspace". Ie send your userspace and userspace documentation/tests
before your kernel side can be merged.

Even if it is just a git repo collecting and curating driver-specific
libraries under the "accel" banner it could be quite a useful
activity.

But, probably this boils down to things that look like:

   device = habana_open_device()
   habana_mooo(device)

   device = nvidia_open_device()
   nvidia_baaa(device)

> That's what I mean by abstracting all this kernel API from the
> drivers. Not because it is an API that is hard to use, but because the
> drivers should *not* use it at all.
>
> I think drm did that pretty well. Their code defines objects for
> driver, device and minors, with resource manager that will take care
> of releasing the objects automatically (it is based on devres.c).

We have lots of examples of subsystems doing this - the main thing
unique about accel is that there is really no shared uAPI between the
drivers, and no 'abstraction' provided by the kernel. Maybe that is
the point..

> So actually I do want an ioctl but as you said, not for the main
> device char, but to an accompanied control device char.

There is a general problem across all these "thick" devices in the
kernel to support their RAS & configuration requirements and IMHO we
don't have a good answer at all.

We've been talking on and off here about having some kind of
subsystem/methodology specifically for this area - how to monitor,
configure, service, etc a very complicated off-CPU device. I think
there would be a lot of interest in this and maybe it shouldn't be
coupled to this accel idea.

Eg we already have some established mechanisms - I would expect any
accel device to be able to introspect and upgrade its flash FW using
the 'devlink flash' common API.

> an application only has access to the information ioctl through this
> device char (so it can't submit anything, allocate memory, etc.) and
> can only retrieve metrics which do not leak information about the
> compute application.

This is often done over a netlink socket as the "second char".

Jason