Re: [PATCH] drm/i915: Remove temporary allocation of dma addresses when rotating

On 14/11/2017 18:14, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2017-02-27 14:31:17)

On 27/02/2017 10:21, Chris Wilson wrote:
On Mon, Feb 27, 2017 at 10:14:12AM +0000, Tvrtko Ursulin wrote:

On 27/02/2017 10:06, Chris Wilson wrote:
On Mon, Feb 27, 2017 at 09:55:10AM +0000, Tvrtko Ursulin wrote:

On 22/02/2017 08:44, Chris Wilson wrote:
I also think that's an argument for improving the general cache rather
than arguing against using it.

Well I wasn't concerned about the cache per se, but about whether it
is completely appropriate (best choice) to use it in this particular
case.

Because as I said before, for 1920x1080x32 we are talking about a
16KiB extremely short lived temporary allocation, vs the similar
size for the sg radix cache. But the radix cache sticks around for the
lifetime of obj->mm.pages and it wouldn't otherwise be there since
AFAICS in practice no one really touches frame buffers in a way to
trigger its creation.

Those amounts of memory are not a concern, but again, is the
simplification of the code worth the conceptual downsides mentioned
above? Even if we considered 4K frame buffers, when both allocations
go to ~64KiB, would that change anything? I am not sure, probably
not for me.
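
A rough back-of-the-envelope sketch of where the ~16KiB and ~64KiB figures come from. It assumes 4KiB pages, an 8-byte dma_addr_t, 4 bytes per pixel and no stride padding; the helper names are made up for illustration and are not i915 functions.

```c
#include <stddef.h>

/* Assumptions: 4 KiB pages and an 8-byte dma_addr_t (typical x86-64). */
#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_DMA_ADDR_SIZE	8u

/* Number of pages backing a linear framebuffer, rounded up. */
static size_t fb_pages(size_t width, size_t height, size_t bytes_per_pixel)
{
	size_t bytes = width * height * bytes_per_pixel;

	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Size of a temporary array holding one DMA address per page. */
static size_t dma_array_bytes(size_t width, size_t height,
			      size_t bytes_per_pixel)
{
	return fb_pages(width, height, bytes_per_pixel) * SKETCH_DMA_ADDR_SIZE;
}
```

Under those assumptions, 1920x1080x32 is 2025 pages and 16200 bytes of addresses, and 3840x2160x32 is 8100 pages and 64800 bytes.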

So I am still unsure that we should go with this change.

Again, the complaints you have here are general concerns about caching
the mapping. Avoiding using the cache instead of improving the cache
seems the wrong approach.

Depends what kind of improvements to the cache you have in mind. If
you are thinking about size then I disagree, I think it is efficient
enough already. But if you are thinking about the lifetime
management then it is obvious from all that I have written so far
that I would agree with that. Since the core of my "complaint" is
the lifetime mismatch, and not the size.

For lifetime I am not sure what you could do. Exposing the size of
it, with maybe some other bits attached to the object, to the
shrinker I think doesn't make much sense since the sizes are so
small compared to the backing store sizes.

Perhaps you could add an explicit reset of the cache after the
rotation is done with it, but then the only remaining benefit will
be avoiding greater than zero order allocations. I say the only
one.. it would still be a good one. Just have no idea if this level
of cache usage would satisfy you!

Perhaps you could say what kind of optimisation you have in mind to
save me guessing? :)

I was thinking you would like an inactivity timer. Or we could have a
separate shrinker, as that's the principal cache management system.

I thought about the shrinker myself. Even wrote some code to more
accurately size the objects as part of the existing passes. But as I
said the contribution of anything in the object other than the backing
store is so small that, even though it would conceptually be more correct and
perhaps avoid some marginal over-shrinking, I am not sure it is worth
doing it. Assuming of course that I got the sizing of the radix tree
correct! I just hacked something up based on some debug dumping code
from radix-tree.c.

So the complication is there is no API to get the size of the radix tree
(or the scatter list table) and we would have to add something, either
internally to i915, or try and upstream it.

Or we avoid that with your timer idea and just purge all caches which
haven't been used in a while. Maybe from idle work or something.

Tempting. I like hooking into mark_idle/park more than adding a new
timer, and we already have the precedent of using that to initiate a
cache flush.
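
The "purge caches that haven't been used in a while, from the idle path" idea could look roughly like the user-space sketch below. Everything in it is hypothetical: struct sketch_obj, sketch_cache_touch() and sketch_park() are stand-ins invented for illustration, the 64-byte allocation merely represents a lookup cache, and none of this is the actual i915 mark_idle/park code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object carrying a lazily built lookup cache. */
struct sketch_obj {
	void *lookup_cache;	/* represents the page lookup radix tree */
	unsigned long last_use;	/* tick of the most recent cache use */
};

/* Build the cache on first use and record when it was last touched. */
static bool sketch_cache_touch(struct sketch_obj *obj, unsigned long now)
{
	if (!obj->lookup_cache)
		obj->lookup_cache = calloc(1, 64);
	if (!obj->lookup_cache)
		return false;

	obj->last_use = now;
	return true;
}

/*
 * Called when the device goes idle: drop every cache that has not been
 * used for at least max_idle ticks. Returns the number of caches purged.
 */
static size_t sketch_park(struct sketch_obj *objs, size_t count,
			  unsigned long now, unsigned long max_idle)
{
	size_t purged = 0;

	for (size_t i = 0; i < count; i++) {
		if (!objs[i].lookup_cache)
			continue;
		if (now - objs[i].last_use < max_idle)
			continue;

		free(objs[i].lookup_cache);
		objs[i].lookup_cache = NULL;
		purged++;
	}

	return purged;
}
```

Piggybacking on an existing idle hook avoids a new timer, at the price of only purging when the device actually idles.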

What's the impact of us keeping pages pinned when idle -- (a lot) more
work in the shrinker. Let's see where the cost-benefit lies.

But for this immediate patch, would you be happy with adding and
exporting i915_gem_object_reset_page_iter and calling it after rotation
is done with accessing the pages? Benefit would be avoidance of
drm_malloc_gfp if that bothers you most.
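
For illustration only, the "use the cache for rotation, then reset it" proposal might be modelled like this in user space. i915_gem_object_reset_page_iter does not exist at this point in the thread, so sketch_reset_page_iter() is a hypothetical stand-in, as are struct sketch_pages, the fake addresses and the simplified column-by-column walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object's pages plus a lazy lookup cache. */
struct sketch_pages {
	size_t count;
	uint64_t *addr_cache;	/* page index -> address, built on demand */
};

/* Look up one page address, building the cache on first use. */
static uint64_t sketch_lookup(struct sketch_pages *p, size_t idx)
{
	assert(idx < p->count);

	if (!p->addr_cache) {
		p->addr_cache = malloc(p->count * sizeof(*p->addr_cache));
		if (!p->addr_cache)
			return UINT64_MAX;	/* allocation failure sentinel */
		for (size_t i = 0; i < p->count; i++)
			p->addr_cache[i] = (uint64_t)i * 4096; /* fake addresses */
	}

	return p->addr_cache[idx];
}

/* Explicitly drop the cache so it does not outlive its only user. */
static void sketch_reset_page_iter(struct sketch_pages *p)
{
	free(p->addr_cache);
	p->addr_cache = NULL;
}

/*
 * Emit addresses column by column, walking each column from the bottom
 * row up, then reset the cache once rotation no longer needs it.
 */
static void sketch_rotate(struct sketch_pages *p, size_t width,
			  size_t height, uint64_t *out)
{
	size_t n = 0;

	for (size_t col = 0; col < width; col++)
		for (size_t row = height; row-- > 0;)
			out[n++] = sketch_lookup(p, row * width + col);

	sketch_reset_page_iter(p);
}
```

With the reset in place, no temporary address array is allocated and the cache's lifetime matches the rotation, which addresses the lifetime mismatch but gives up any reuse by later accesses.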

Honestly I think the page_iter cache is useful and likely to already
exist or be used shortly after a portion of the object is rotated.

How come? I thought CPU access to framebuffers is atypical nowadays.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



