Russell King - ARM Linux <linux@xxxxxxxxxxxxxxxx> writes:

> On Thu, Aug 13, 2015 at 01:44:03PM -0700, Eric Anholt wrote:
>> Struct mutex is here because this code is from the V3D series, with the
>> in-kernel BO cache ripped out (it turns out that the CMA allocator is
>> slow, and you can't just userspace cache since we have to do allocations
>> within the kernel to the tune of a couple per draw and that's too much).
>
> The CMA allocator is fast until you have pinned pages in its region,
> where it becomes _very_ slow to do allocations, sometimes getting up
> to the order of seconds.
>
> The main culprit of this are GFP_HIGHUSER_MOVABLE allocations which
> then pin the page. It doesn't take many of those to make CMA really
> inefficient.
>
> The problem is that CMA doesn't get any information back from the
> internal page migration about which pages couldn't be moved, so it
> dumbly just tries incrementing the allocation by one page (subject
> to alignment constraints) and retrying again - repeating over the
> entire CMA region. The bigger the region, the more time this takes.

Ouch.

Since I can work around the allocation cost, the main problem I have
right now is that I've got a set of small allocations for 3D that all
need to have the same high 4 bits of paddr, because someone cleverly
packed some address bits in a GPU-managed structure. Any
recommendations for ways to handle this with CMA?
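
To make the constraint concrete, here is a rough sketch of the check that a set of allocations would have to pass. It assumes 32-bit physical addresses, so "the same high 4 bits" means every buffer sits inside one 256 MiB-aligned window. The helper names (paddr_window, buffer_in_one_window, buffers_share_window) are made up for illustration and are not part of the driver or of the CMA API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-bit physical addresses, so the "high 4 bits" are
 * bits 31..28 and each value selects one 256 MiB window. */
#define WINDOW_SHIFT 28u

/* Hypothetical helper: index (0..15) of the window containing paddr. */
static inline uint32_t paddr_window(uint32_t paddr)
{
    return paddr >> WINDOW_SHIFT;
}

/* True if the buffer [paddr, paddr + size) lies in a single window. */
static bool buffer_in_one_window(uint32_t paddr, uint32_t size)
{
    if (size == 0)
        return true;
    /* Reject ranges that would wrap past the end of the 32-bit space. */
    if (size - 1 > UINT32_MAX - paddr)
        return false;
    return paddr_window(paddr) == paddr_window(paddr + (size - 1));
}

/* True if every buffer lies in the same window as the first one. */
static bool buffers_share_window(const uint32_t *paddrs,
                                 const uint32_t *sizes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!buffer_in_one_window(paddrs[i], sizes[i]))
            return false;
        if (paddr_window(paddrs[i]) != paddr_window(paddrs[0]))
            return false;
    }
    return true;
}
```

Put this way, the requirement is about which 256 MiB window the pages come from, not about the alignment of any single allocation.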