> -----Original Message-----
> From: Daniel Vetter <daniel@xxxxxxxx>
> Sent: Tuesday, November 24, 2020 7:17 AM
> To: Jason Gunthorpe <jgg@xxxxxxxx>
> Cc: Xiong, Jianxin <jianxin.xiong@xxxxxxxxx>; Leon Romanovsky <leon@xxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx;
> dri-devel@xxxxxxxxxxxxxxxxxxxxx; Doug Ledford <dledford@xxxxxxxxxx>; Vetter, Daniel <daniel.vetter@xxxxxxxxx>;
> Christian Koenig <christian.koenig@xxxxxxx>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
>
> On Mon, Nov 23, 2020 at 02:05:04PM -0400, Jason Gunthorpe wrote:
> > On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:
> >
> > > +cdef class DmaBuf:
> > > +    def __init__(self, size, unit=0):
> > > +        """
> > > +        Allocate DmaBuf object from a GPU device. This is done through the
> > > +        DRI device interface (/dev/dri/card*). Usually this requires the
>
> Please use /dev/dri/renderD* instead. That's the interface meant for
> unprivileged rendering access. card* is the legacy interface with
> backwards compat galore, don't use.
>
> Specifically if you do this on a gpu which also has display (maybe some
> testing on a local developer machine, no idea ...) then you mess with
> compositors and stuff.
>
> Also wherever you copied this from, please also educate those teams that
> using /dev/dri/card* for rendering stuff is a Bad Idea (tm)

/dev/dri/renderD* is not always available (e.g. for many iGPUs) and doesn't
support mode setting commands (including dumb_buf). The original intention
here is to have something to support the new tests added, not for general
compute.

> > > +        effective user id being root or being a member of the 'video' group.
> > > +        :param size: The size (in number of bytes) of the buffer.
> > > +        :param unit: The unit number of the GPU to allocate the buffer from.
> > > +        :return: The newly created DmaBuf object on success.
> > > +        """
> > > +        self.dmabuf_mrs = weakref.WeakSet()
> > > +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> > > +
> > > +        args = bytearray(32)
> > > +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> > > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> > > +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq', args)
>
> Yeah no, don't allocate render buffers with create_dumb. Every time this
> comes up I'm wondering whether we should just completely disable dma-buf
> operations on these. Dumb buffers are explicitly only for software
> rendering for display purposes when the gpu userspace stack isn't fully
> running yet, aka boot splash.
>
> And yes I know there's endless amounts of abuse of that stuff floating
> around, especially on arm-soc/android systems.

One alternative is to use the GEM_CREATE method, which can be done via the
renderD* device, but the command is vendor specific, so the logic is a
little bit more complex.

> > > +
> > > +        args = bytearray(12)
> > > +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> > > +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> > > +        a, b, self.fd = unpack('=iii', args)
> > > +
> > > +        args = bytearray(16)
> > > +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> > > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args)
> > > +        a, b, self.map_offset = unpack('=iiq', args)
> >
> > Wow, OK
> >
> > Is it worth using ctypes here instead? Can you at least add a comment
> > before each pack specifying the 'struct XXX' this is following?
> >
> > Does this work with normal Intel GPUs, like in a Laptop? AMD too?
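
Sure, a comment can go before each pack_into() call naming the uapi struct
it marshals. Roughly like this (untested sketch; the field layouts are from
include/uapi/drm/drm_mode.h and include/uapi/drm/drm.h, and '_' just
discards the fields the code doesn't use):

    # struct drm_mode_create_dumb {
    #     __u32 height; __u32 width; __u32 bpp; __u32 flags;
    #     __u32 handle; __u32 pitch; __u64 size;
    # };
    # height=1, width=size, bpp=8 -> a buffer of 'size' bytes
    args = bytearray(32)
    pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
    ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
    _, _, _, _, self.handle, _, self.size = unpack('=iiiiiiq', args)

    # struct drm_prime_handle {
    #     __u32 handle; __u32 flags; __s32 fd;
    # };
    args = bytearray(12)
    pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
    ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
    _, _, self.fd = unpack('=iii', args)

    # struct drm_mode_map_dumb {
    #     __u32 handle; __u32 pad; __u64 offset;
    # };
    args = bytearray(16)
    pack_into('=iiq', args, 0, self.handle, 0, 0)
    ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args)
    _, _, self.map_offset = unpack('=iiq', args)

ctypes.Structure would be more self-describing, but pack/unpack keeps the
code short; either way the struct name comments should make the intent
clear.
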
> > Christian, I would be very happy to hear from you that this entire
> > work is good for AMD as well
>
> I think the smallest generic interface for allocating gpu buffers which
> are more useful than the stuff you get from CREATE_DUMB is gbm. That's
> used by compositors to get bare metal opengl going on linux. Ofc Android
> has gralloc for the same purpose, and cros has minigbm (which isn't the
> same as gbm at all). So not so cool.

Again, would the "renderD* + GEM_CREATE" combination be an acceptable
alternative? That would be much simpler than going with gbm and would mean
fewer dependencies in setting up the testing environment.

> The other generic option is using vulkan, which works directly on bare
> metal (without a compositor or anything running), and is cross vendor.
> So cool, except not used for compute, which is generally the thing you
> want if you have an rdma card.
>
> Both gbm-egl/opengl and vulkan have extensions to hand you a dma-buf
> back, properly.
>
> Compute is the worst, because opencl is widely considered a mistake
> (maybe opencl 3 is better, but nvidia is stuck on 1.2). The actually
> used stuff is cuda (nvidia-only), rocm (amd-only) and now with intel
> also playing we have xe (intel-only).
>
> It's pretty glorious :-/
>
> Also I think we discussed this already, but for actual p2p the intel
> patches aren't in upstream yet. We have some internally, but with very
> broken locking (in the process of getting fixed up, but it's taking
> time).
>
> Cheers, Daniel
>
> > Edward should look through this, but I'm glad to see something like
> > this
> >
> > Thanks,
> > Jason
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@xxxxxxxxxxxxxxxxxxxxx
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
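
For reference, the "renderD* + GEM_CREATE" allocation path would look
roughly like this on i915 (untested sketch; struct drm_i915_gem_create and
DRM_IOCTL_I915_GEM_CREATE come from include/uapi/drm/i915_drm.h, render
nodes are numbered starting at 128, and every other vendor would need its
own equivalent branch, which is the extra complexity mentioned above):

    self.dri_fd = open('/dev/dri/renderD' + str(128 + unit), O_RDWR)

    # struct drm_i915_gem_create {
    #     __u64 size; __u32 handle; __u32 pad;
    # };
    args = bytearray(16)
    pack_into('=qii', args, 0, size, 0, 0)
    ioctl(self.dri_fd, DRM_IOCTL_I915_GEM_CREATE, args)
    self.size, self.handle, _ = unpack('=qii', args)

The DRM_IOCTL_PRIME_HANDLE_TO_FD export step would stay the same as in the
current patch; only the allocation (and the vendor's mmap path) differs.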