Re: [PATCH 3/6] drm/i915: use vmap in shmem_pin_map

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 22/09/2020 17:33, Christoph Hellwig wrote:
On Tue, Sep 22, 2020 at 05:13:45PM +0100, Tvrtko Ursulin wrote:
   void *shmem_pin_map(struct file *file)
   {
-	const size_t n_pte = shmem_npte(file);
-	pte_t *stack[32], **ptes, **mem;

Chris can comment how much he'd miss the 32 page stack shortcut.

I'd like to see a profile that claim that kmalloc matters in a
path that does a vmap and reads pages through the page cache.
Especially when the kmalloc saves doing another page cache lookup
on the free side.

Only reason I can come up with now is if mapping side is on a latency sensitive path, while un-mapping is lazy/delayed so can be more costly. Then fast map and extra cost on unmap may make sense.

It more applies to the other i915 patch, which implements a much more used API, but whether or not we can demonstrate any difference in the perf profiles I couldn't tell you without trying to collect some.

Is there something in vmap() preventing us from freeing the pages array
here? I can't spot anything that is holding on to the pointer. Or it was
just a sketch before you realized we could walk the vm_area?

Also, I may be totally misunderstanding something, but I think you need to
assign area->pages manually so shmem_unpin_map can access it below.

We need area->pages to hold the pages for the free side.  That being
said the patch I posted is broken because it never assigned to that.
As said it was a sketch.  This is the patch I just rebooted into on
my Laptop:

http://git.infradead.org/users/hch/misc.git/commitdiff/048522dfa26b6667adfb0371ff530dc263abe829

it needs extra prep patches from the series:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/alloc_vm_area

   	mapping_clear_unevictable(file->f_mapping);
-	__shmem_unpin_map(file, ptr, shmem_npte(file));
+	for (i = 0; i < shmem_npages(file); i++)
+		put_page(area->pages[i]);
+	kvfree(area->pages);
+	vunmap(ptr);

Is the verdict from mm experts that we can't use vfree due __free_pages vs
put_page differences?

Switched to vfree now.

Could we get from ptes to pages, so that we don't have to keep the
area->pages array allocated for the duration of the pin?

We could do vmalloc_to_page, but that is fairly expensive (not as bad
as reading from the page cache..).  Are you really worried about the
allocation?

Not so much given how we don't even use shmem_pin_map outside selftests.

If we start using it I expect it will be for tiny objects anyway. Only if they end up being pinned for the lifetime of the driver, it may be a pointless waste of memory compared to the downsides of vmalloc_to_page. But we can revisit this particular edge case optimization if the need arises.

I'll look at your other i915 patch tomorrow.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux