On Wed, Jan 19, 2022 at 7:58 AM Christian König <christian.koenig@xxxxxxx> wrote: > > Am 19.01.22 um 16:54 schrieb Daniel Vetter: > > On Tue, Jan 18, 2022 at 10:54:16AM -0800, Hridya Valsaraju wrote: > >> On Sun, Jan 16, 2022 at 11:46 PM Christian König > >> <christian.koenig@xxxxxxx> wrote: > >>> Am 15.01.22 um 02:06 schrieb Hridya Valsaraju: > >>>> The optional exporter op provides a way for processes to transfer > >>>> charge of a buffer to a different process. This is essential for the > >>>> cases where a central allocator process does allocations for various > >>>> subsystems, hands over the fd to the client who > >>>> requested the memory and drops all references to the allocated memory. > >>>> > >>>> Signed-off-by: Hridya Valsaraju <hridya@xxxxxxxxxx> > >>>> --- > >>>> include/linux/dma-buf.h | 18 ++++++++++++++++++ > >>>> 1 file changed, 18 insertions(+) > >>>> > >>>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > >>>> index 7ab50076e7a6..d5e52f81cc6f 100644 > >>>> --- a/include/linux/dma-buf.h > >>>> +++ b/include/linux/dma-buf.h > >>>> @@ -13,6 +13,7 @@ > >>>> #ifndef __DMA_BUF_H__ > >>>> #define __DMA_BUF_H__ > >>>> > >>>> +#include <linux/cgroup_gpu.h> > >>>> #include <linux/dma-buf-map.h> > >>>> #include <linux/file.h> > >>>> #include <linux/err.h> > >>>> @@ -285,6 +286,23 @@ struct dma_buf_ops { > >>>> > >>>> int (*vmap)(struct dma_buf *dmabuf, struct dma_buf_map *map); > >>>> void (*vunmap)(struct dma_buf *dmabuf, struct dma_buf_map *map); > >>>> + > >>>> + /** > >>>> + * @charge_to_cgroup: > >>>> + * > >>>> + * This is called by an exporter to charge a buffer to the specified > >>>> + * cgroup. > >>> Well that sentence makes absolutely no sense at all. > >>> > >>> The dma_buf_ops are supposed to be called by the DMA-buf subsystem on > >>> behalves of the importer and never by the exporter itself. > >>> > >>> I hope that this is just a documentation mixup. > >> Thank you for taking a look Christian! > >> > >> Yes, that was poor wording, sorry about that. It should instead say > >> that the op would be called by the process the buffer is currently > >> charged to in order to transfer the buffer's charge to a different > >> cgroup. This is helpful in the case where a process acts as an > >> allocator for multiple client processes and we would like the > >> allocated buffers to be charged to the clients who requested their > >> allocation(instead of the allocating process as is the default > >> behavior). In Android, the graphics allocator HAL process[1] does > >> most of the graphics allocations on behalf of various clients. After > >> allocation, the HAL process passes the fd to the client over binder > >> IPC and the binder driver invokes the charge_to_cgroup() DMA-BUF op to > >> uncharge the buffer from the HAL process and charge it to the client > >> process instead. > >> > >> [1]: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsource.android.com%2Fdevices%2Fgraphics%2Farch-bq-gralloc&data=04%7C01%7Cchristian.koenig%40amd.com%7C838d25da974d4ea4257508d9db63eb70%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637782044488604857%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=Qn7JeyF5Rq9tnrGw1KgNuQkpu5RbcrvPhDOa1OBJ6TU%3D&reserved=0 > > For that use-case, do we really need to have the vfunc abstraction and > > force all exporters to do something reasonable with it? > > I was about to write up a similar answer, but more from the technical side. > > Why in the world should that be done on the DMA-buf object as a > communication function between importer and exporter? > > That design makes absolutely no sense at all to me. > > Regards, > Christian. > > > > > I think just storing the cgrpus gpu memory bucket this is charged against > > and doing this in a generic way would be a lot better. > > > > That way we can also easily add other neat features in the future, like > > e.g. ttm could take care of charge-assignement automatically maybe, or we > > could print the current cgroups charge relationship in the sysfs info > > file. Or anything else really. Thank you for the comments Daniel and Christian! I made the charge/uncharge/transfer part of the exporter implementation since it provided exporters a choice on whether they wanted to enable cgroup memory accounting for the buffers they were exporting. I also see the benefits of making the charge/uncharge/transfer generic by moving it to the DMA-BUF framework like you are suggesting though. We will move to a more generic design in the next version of the RFC. Regards, Hridya > > > > I do feel that in general for gpu memory cgroups to be useful, we should > > really have memory pools as a fairly strong concept. Otherwise every > > driver/allocator/thing is going to come up with their own, and very likely > > incompatible interpretation. And we end up with a supposed generic cgroups > > interface which cannot actually be used in a driver/vendor agnostic way at > > all. > > -Daniel > > > >> Regards, > >> Hridya > >> > >> > >>> Regards, > >>> Christian. > >>> > >>>> The caller must hold a reference to @gpucg obtained via > >>>> + * gpucg_get(). The DMA-BUF will be uncharged from the cgroup it is > >>>> + * currently charged to before being charged to @gpucg. The caller must > >>>> + * belong to the cgroup the buffer is currently charged to. > >>>> + * > >>>> + * This callback is optional. > >>>> + * > >>>> + * Returns: > >>>> + * > >>>> + * 0 on success or negative error code on failure. > >>>> + */ > >>>> + int (*charge_to_cgroup)(struct dma_buf *dmabuf, struct gpucg *gpucg); > >>>> }; > >>>> > >>>> /** >