On 29/03/22 9:30 pm, Arunpravin Paneer Selvam wrote: > > > On 29/03/22 4:54 pm, Christian König wrote: >> Am 29.03.22 um 13:19 schrieb Arunpravin Paneer Selvam: >>> [SNIP] >>>>> + pages_left = node->base.num_pages; >>>>> >>>>> i = 0; >>>>> - spin_lock(&mgr->lock); >>>>> while (pages_left) { >>>>> - uint32_t alignment = tbo->page_alignment; >>>>> + if (tbo->page_alignment) >>>>> + min_page_size = tbo->page_alignment << PAGE_SHIFT; >>>>> + else >>>>> + min_page_size = mgr->default_page_size; >>>> The handling here looks extremely awkward to me. >>>> >>>> min_page_size should be determined outside of the loop, based on default_page_size, alignment and contiguous flag. >>> I kept min_page_size determine logic inside the loop for cases 2GiB+ >>> requirements, Since now we are round up the size to the required >>> alignment, I modified the min_page_size determine logic outside of the >>> loop in v12. Please review. >> >> Ah! So do we only have the loop so that each allocation isn't bigger >> than 2GiB? If yes couldn't we instead add a max_alloc_size or something >> similar? > yes we have the loop to limit the allocation not bigger than 2GiB, I > think we cannot avoid the loop since we need to allocate the remaining > pages if any, to complete the 2GiB+ size request. In other words, first > iteration we limit the max size to 2GiB and in the second iteration we > allocate the left over pages if any. Hi Christian, Here my understanding might be incorrect, should we limit the max size = 2GiB and skip all the remaining pages for a 2GiB+ request? Thanks, Arun >> >> BTW: I strongly suggest that you rename min_page_size to min_alloc_size. >> Otherwise somebody could think that those numbers are in pages and not >> bytes. > modified in v12 >> >>>> Then why do you drop the lock and grab it again inside the loop? And what is "i" actually good for? >>> modified the lock/unlock placement in v12. >>> >>> "i" is to track when there is 2GiB+ contiguous allocation request, first >>> we allocate 2GiB (due to SG table limit) continuously and the remaining >>> pages in the next iteration, hence this request can't be a continuous. >>> To set the placement flag we make use of "i" value. In our case "i" >>> value becomes 2 and we don't set the below flag. >>> node->base.placement |= TTM_PL_FLAG_CONTIGUOUS; >>> >>> If we don't get such requests, I will remove "i". >> >> I'm not sure if that works. >> >> As far as I can see drm_buddy_alloc_blocks() can allocate multiple >> blocks at the same time, but i is only incremented when we loop. >> >> So what you should do instead is to check if node->blocks just contain >> exactly one element after the allocation but before the trim. > ok >> >>>>> + >>>>> + /* Limit maximum size to 2GB due to SG table limitations */ >>>>> + pages = min(pages_left, 2UL << (30 - PAGE_SHIFT)); >>>>> >>>>> if (pages >= pages_per_node) >>>>> - alignment = pages_per_node; >>>>> - >>>>> - r = drm_mm_insert_node_in_range(mm, &node->mm_nodes[i], pages, >>>>> - alignment, 0, place->fpfn, >>>>> - lpfn, mode); >>>>> - if (unlikely(r)) { >>>>> - if (pages > pages_per_node) { >>>>> - if (is_power_of_2(pages)) >>>>> - pages = pages / 2; >>>>> - else >>>>> - pages = rounddown_pow_of_two(pages); >>>>> - continue; >>>>> - } >>>>> - goto error_free; >>>>> + min_page_size = pages_per_node << PAGE_SHIFT; >>>>> + >>>>> + if (!is_contiguous && !IS_ALIGNED(pages, min_page_size >> PAGE_SHIFT)) >>>>> + is_contiguous = 1; >>>>> + >>>>> + if (is_contiguous) { >>>>> + pages = roundup_pow_of_two(pages); >>>>> + min_page_size = pages << PAGE_SHIFT; >>>>> + >>>>> + if (pages > lpfn) >>>>> + lpfn = pages; >>>>> } >>>>> >>>>> - vis_usage += amdgpu_vram_mgr_vis_size(adev, &node->mm_nodes[i]); >>>>> - amdgpu_vram_mgr_virt_start(&node->base, &node->mm_nodes[i]); >>>>> - pages_left -= pages; >>>>> + BUG_ON(min_page_size < mm->chunk_size); >>>>> + >>>>> + mutex_lock(&mgr->lock); >>>>> + r = drm_buddy_alloc_blocks(mm, (u64)place->fpfn << PAGE_SHIFT, >>>>> + (u64)lpfn << PAGE_SHIFT, >>>>> + (u64)pages << PAGE_SHIFT, >>>>> + min_page_size, >>>>> + &node->blocks, >>>>> + node->flags); >>>>> + mutex_unlock(&mgr->lock); >>>>> + if (unlikely(r)) >>>>> + goto error_free_blocks; >>>>> + >>>>> ++i; >>>>> >>>>> if (pages > pages_left) >>>>> - pages = pages_left; >>>>> + pages_left = 0; >>>>> + else >>>>> + pages_left -= pages; >>>>> } >>>>> - spin_unlock(&mgr->lock); >>>>> >>>>> - if (i == 1) >>>>> + /* Free unused pages for contiguous allocation */ >>>>> + if (is_contiguous) { >>>> Well that looks really odd, why is trimming not part of >>>> drm_buddy_alloc_blocks() ? >>> we didn't place trim function part of drm_buddy_alloc_blocks since we >>> thought this function can be a generic one and it can be used by any >>> other application as well. For example, now we are using it for trimming >>> the last block in case of size non-alignment with min_page_size. >> >> Good argument. Another thing I just realized is that we probably want to >> double check if we only allocated one block before the trim. > ok >> >> Thanks, >> Christian. >>