> On Wed, Jan 25, 2023 at 2:20 AM Jaewon Kim <jaewon31.kim@xxxxxxxxxxx> wrote: > > > > On Tue, Jan 17, 2023 at 10:54 PM John Stultz <jstultz@xxxxxxxxxx> wrote: > > > > > > > > > > On Tue, Jan 17, 2023 at 12:31 AM Jaewon Kim <jaewon31.kim@xxxxxxxxxxx> wrote: > > > > > > > Using order 4 pages would be helpful for many IOMMUs, but it could spend > > > > > > > quite much time in page allocation perspective. > > > > > > > > > > > > > > The order 4 allocation with __GFP_RECLAIM may spend much time in > > > > > > > reclaim and compation logic. __GFP_NORETRY also may affect. These cause > > > > > > > unpredictable delay. > > > > > > > > > > > > > > To get reasonable allocation speed from dma-buf system heap, use > > > > > > > HIGH_ORDER_GFP for order 4 to avoid reclaim. > > > > > > > > > > Thanks for sharing this! > > > > > The case where the allocation gets stuck behind reclaim under pressure > > > > > does sound undesirable, but I'd be a bit hesitant to tweak numbers > > > > > that have been used for a long while (going back to ion) without a bit > > > > > more data. > > > > > > > > > > It might be good to also better understand the tradeoff of potential > > > > > on-going impact to performance from using low order pages when the > > > > > buffer is used. Do you have any details like or tests that you could > > > > > share to help ensure this won't impact other users? > > > > > > > > > > TJ: Do you have any additional thoughts on this? > > > > > > > > > I don't have any data on how often we hit reclaim for mid order > > > > allocations. That would be interesting to know. However the 70th > > > > percentile of system-wide buffer sizes while running the camera on my > > > > phone is still only 1 page, so it looks like this change would affect > > > > a subset of use-cases. > > > > > > > > Wouldn't this change make it less likely to get an order 4 allocation > > > > (under memory pressure)? The commit message makes me think the goal of > > > > the change is to get more of them. > > > > > > Hello John Stultz > > > > > > I've been waiting for your next reply. > > Sorry, I was thinking you were gathering data on the tradeoffs. Sorry > for my confusion. > > > > With my commit, we may gather less number of order 4 pages and fill the > > > requested size with more number of order 0 pages. I think, howerver, stable > > > allocation speed is quite important so that corresponding user space > > > context can move on within a specific time. > > > > > > Not only compaction but reclaim also, I think, would be invoked more if the > > > __GFP_RECLAIM is added on order 4. I expect the reclaim could be decreased > > > if we move to order 0. > > > > > > > Additionally I'd like to say the old legacy ion system heap also used the > > __GFP_RECLAIM only for order 8, not for order 4. > > > > drivers/staging/android/ion/ion_system_heap.c > > > > static gfp_t high_order_gfp_flags = (GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN | > > __GFP_NORETRY) & ~__GFP_RECLAIM; > > static gfp_t low_order_gfp_flags = GFP_HIGHUSER | __GFP_ZERO; > > static const unsigned int orders[] = {8, 4, 0}; > > > > static int ion_system_heap_create_pools(struct ion_page_pool **pools) > > { > > int i; > > > > for (i = 0; i < NUM_ORDERS; i++) { > > struct ion_page_pool *pool; > > gfp_t gfp_flags = low_order_gfp_flags; > > > > if (orders[i] > 4) > > gfp_flags = high_order_gfp_flags; > > > This seems a bit backwards from your statement. It's only removing > __GFP_RECLAIM on order 8 (high_order_gfp_flags). Oh sorry, my fault. I also read wrongly. But as far as I know, most of AP chipset vendors have been using __GFP_RECLAIM only for order 0. I can't say in detail though. > > So apologies again, but how is that different from the existing code? > > #define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP) > #define MID_ORDER_GFP (LOW_ORDER_GFP | __GFP_NOWARN) > #define HIGH_ORDER_GFP (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \ > | __GFP_NORETRY) & ~__GFP_RECLAIM) \ > | __GFP_COMP) > static gfp_t order_flags[] = {HIGH_ORDER_GFP, MID_ORDER_GFP, LOW_ORDER_GFP}; > > Where the main reason we introduced the mid-order flags is to avoid > the warnings on order 4 allocation failures when we'll fall back to > order 0 > > The only substantial difference I see between the old ion code and > what we have now is the GFP_COMP addition, which is a bit hazy in my > memory. I unfortunately don't have a record of why it was added (don't > have access to my old mail box), so I suspect it was something brought > up in private review. Dropping that from the low order flags probably > makes sense as TJ pointed out, but this isn't what your patch is > changing. > > Your patch is changing that for mid-order allocations we'll use the > high order flags, so we'll not retry and not reclaim, so there will be > more failing and falling back to single page allocations. > This makes sense to make allocation time faster and more deterministic > (I like it!), but potentially has the tradeoff of losing the > performance benefit of using mid order page sizes. > > I suspect your change is a net win overall, as the cumulative effect > of using larger pages probably won't benefit more than the large > indeterministic allocation time, particularly under pressure. > > But because your change is different from what the old ion code did, I > want to be a little cautious. So it would be nice to see some > evaluation of not just the benefits the patch provides you but also of > what negative impact it might have. And so far you haven't provided > any details there. > > A quick example might be for the use case where mid-order allocations > are causing you trouble, you could see how the performance changes if > you force all mid-order allocations to be single page allocations (so > orders[] = {8, 0, 0};) and compare it with the current code when > there's no memory pressure (right after reboot when pages haven't been > fragmented) so the mid-order allocations will succeed. That will let > us know the potential downside if we have brief / transient pressure > at allocation time that forces small pages. > > Does that make sense? Let me try this. It make take some days. But I guess it depends on memory status as you said. If there were quite many order 4 pages, then 8 4 0 should be faster than 8 0 0. I don't know this is a right approach. In my opinion, except the specific cases like right after reboot, there are not many order 4 pages. And in determinisitic allocation time perspective, I think avoiding too long allocations is more important than making faster with already existing free order 4 pages. BR Jaewon Kim > > thanks > -john