On Thu, Feb 04, 2021 at 09:16:32AM +0100, Christian König wrote:
> On 03.02.21 at 22:41, Suren Baghdasaryan wrote:
> > [SNIP]
> > > > How many semi-unrelated buffer accounting schemes does Google come up with?
> > > >
> > > > We're at three with this one.
> > > >
> > > > And also we _cannot_ require that all dma-bufs are backed by struct
> > > > page, so requiring struct page to make this work is a no-go.
> > > >
> > > > Second, we do not want to allow get_user_pages and friends to work on
> > > > dma-buf, it causes all kinds of pain. Yes, on SoCs where dma-bufs are
> > > > exclusively in system memory you can maybe get away with this, but
> > > > dma-buf is supposed to work in more places than just Android SoCs.
> > > I just realized that vm_insert_page doesn't even work for CMA, it would
> > > upset get_user_pages pretty badly - you're trying to pin a page in
> > > ZONE_MOVABLE but you can't move it because it's rather special.
> > > VM_SPECIAL is exactly meant to catch this stuff.
> > Thanks for the input, Daniel! Let me think about the cases you pointed out.
> >
> > IMHO, the issue with PSS is the difficulty of calculating this metric
> > without struct page usage. I don't think that problem becomes easier
> > if we use cgroups or any other API. I wanted to enable existing PSS
> > calculation mechanisms for the dmabufs known to be backed by struct
> > pages (since we know how the heap allocated that memory), but it sounds
> > like this would lead to problems that I did not consider.
>
> Yeah, using struct page indeed won't work. We discussed that multiple times
> now and Daniel even has a patch to mangle the struct page pointers inside
> the sg_table object to prevent abuse in that direction.
>
> On the other hand I totally agree that we need to do something on this side
> which goes beyond what cgroups provide.
>
> A few years ago I came up with patches to improve the OOM killer to include
> resources bound to the processes through file descriptors. I unfortunately
> can't find them offhand any more and I'm currently too busy to dig them up.
>
> In general I think we need to make it possible that both the in-kernel OOM
> killer as well as userspace processes and handlers have access to that kind
> of data.
>
> The fdinfo approach as suggested in the other thread sounds like the easiest
> solution to me.

Yeah, for OOM handling cgroups alone aren't enough as the interface - we need
to make sure that the OOM killer takes into account the system memory usage
(ideally zone aware, for CMA pools).

But to track that we still need that infrastructure first, I think.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch