On 10/03/17 11:39, Laurent Pinchart wrote:
> Hello,
>
> Memory leaks have been reported when allocating a cached omap_bo (with
> OMAP_BO_CACHED). Investigation showed that this can only come from the DMA
> mapping debug layer, as on ARM32 the non-coherent, non-IOMMU DMA mapping code
> doesn't allocate memory to map a page (and kmemcheck facility is only
> available on x86, so it can't be a source of memory leaks either on ARM).
>
> The DMA debug layer pre-allocates DMA debugging entries and stores them in a
> list. As the omapdrm driver maps cached buffers page by page, the list of 4096
> pre-allocated entries is starved very soon. However, raising the number of DMA
> mapping debug entries to 32 * 4096 (through the dma_debug_entries kernel
> command line argument) led to more interesting results.
>
> The number of entries being large enough to handle all the pages mapped by
> kmstest, monitoring the DMA mapping statistics through
> /sys/kernel/debug/dma-api/ showed that the number of free entries dropped
> significantly when kmstest was started and didn't rise when it was stopped.
> In particular, running kmstest without flipping resulted in a drop of 1266
> free entries, which corresponds to one 1440x900 framebuffer in XR24. This
> proved that the pages backing the framebuffer, while freed when the
> framebuffer was destroyed, were not unmapped.
>
> I've thus started investigating the driver GEM implementation. After a few
> confusing moments that resulted in the 1/7 to 6/7 cleanup patches, I wrote
> patch 7/7 that should fix the issue.

So, possibly this series could be applied once the one issue I reported is
fixed, as it looks like a nice cleanup. But I think this is still far from
making cached bos usable, and I would rather have it all in one series.

One issue is that when creating a cached buffer without DMM,
omap_gem_mmap_obj() will hit if (WARN_ON(!obj->filp)) and fail.

The other is that the cached bos are unusably slow. I hope that is because of
the constant fault, map, unmap cycle, and that it can be fixed by keeping the
buffer mapped and using explicit sync ioctls. But if the performance issue
cannot be solved, then I think we should just drop the cached bo support, as
it's unusable at the moment.

 Tomi
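
For reference, the 1266-entry figure in the quoted cover letter can be checked with a little arithmetic. The sketch below is not driver code: fb_pages() and both macros are hypothetical names used only for illustration, and it assumes 4 KiB pages and 4 bytes per pixel for XR24 (XRGB8888).

```c
/* Assumption: 4 KiB pages, as is usual on ARM32. */
#define ILLUSTRATIVE_PAGE_SIZE 4096UL

/* The raised debug entry count mentioned in the cover letter: 32 * 4096. */
#define RAISED_DMA_DEBUG_ENTRIES (32UL * 4096UL)

/*
 * Hypothetical helper: number of pages needed to back a framebuffer of
 * the given dimensions, rounding the byte size up to a whole page.
 * With page-by-page mapping, each page takes one DMA debug entry.
 */
static unsigned long fb_pages(unsigned long width, unsigned long height,
                              unsigned long bytes_per_pixel)
{
    unsigned long size = width * height * bytes_per_pixel;

    return (size + ILLUSTRATIVE_PAGE_SIZE - 1) / ILLUSTRATIVE_PAGE_SIZE;
}
```

fb_pages(1440, 900, 4) gives 5184000 bytes rounded up to 1266 pages, matching the reported drop in free entries. One such framebuffer fits within the default 4096 entries, but a few of them together do not, whereas the raised count of 131072 leaves ample room.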