>-----Original Message-----
>From: dri-devel [mailto:dri-devel-bounces@xxxxxxxxxxxxxxxxxxxxx] On Behalf
>Of Jesse Barnes
>Sent: Wednesday, July 23, 2014 5:00 PM
>To: dri-devel@xxxxxxxxxxxxxxxxxxxxx
>Subject: Re: [PATCH v2 00/25] AMDKFD kernel driver
>
>On Mon, 21 Jul 2014 19:05:46 +0200
>daniel at ffwll.ch (Daniel Vetter) wrote:
>
>> On Mon, Jul 21, 2014 at 11:58:52AM -0400, Jerome Glisse wrote:
>> > On Mon, Jul 21, 2014 at 05:25:11PM +0200, Daniel Vetter wrote:
>> > > On Mon, Jul 21, 2014 at 03:39:09PM +0200, Christian König wrote:
>> > > > On 21.07.2014 14:36, Oded Gabbay wrote:
>> > > > >On 20/07/14 20:46, Jerome Glisse wrote:
>
>[snip!!]

My BlackBerry thumb thanks you ;)

>
>> > > >
>> > > > The main questions here are if it's avoidable to pin down the
>> > > > memory and if the memory is pinned down at driver load, by
>> > > > request from userspace or by anything else.
>> > > >
>> > > > As far as I can see only the "mqd per userspace queue" might be
>> > > > a bit questionable, everything else sounds reasonable.
>> > >
>> > > Aside, i915 perspective again (i.e. how we solved this): When
>> > > scheduling away from contexts we unpin them and put them into the
>> > > lru. And in the shrinker we have a last-ditch callback to switch
>> > > to a default context (since you can't ever have no context once
>> > > you've started), which means we can evict any context object if
>> > > it's getting in the way.
>> >
>> > So Intel hardware reports through some interrupt or some channel
>> > when it is not using a context? I.e. the kernel side gets a
>> > notification when some user context is done executing?
>>
>> Yes, as long as we do the scheduling with the cpu we get interrupts
>> for context switches. The mechanic is already published in the
>> execlist patches currently floating around. We get a special context
>> switch interrupt.
>>
>> But we have this unpin logic already in the current code where we
>> switch contexts through in-line cs commands from the kernel. There we
>> obviously use the normal batch completion events.
>
>Yeah and we can continue that going forward. And of course if your hw can
>do page faulting, you don't need to pin the normal data buffers.
>
>Usually there are some special buffers that need to be pinned for longer
>periods though, anytime the context could be active. Sounds like in this case
>the userland queues, which makes some sense. But maybe for smaller
>systems the size limit could be clamped to something smaller than 128M. Or
>tie it into the rlimit somehow, just like we do for mlock() stuff.
>

Yeah, even the queues are in pageable memory; it's just a ~256-byte structure
per queue (the Memory Queue Descriptor) that describes the queue to hardware,
plus a couple of pages for each process using HSA to hold things like
doorbells. Current thinking is to limit the number of processes using HSA to
~256 and the number of queues per process to ~1024 by default in the initial
code, although my guess is that we could take the per-process queue default
limit even lower (some back-of-the-envelope numbers for this are a bit
further down).

>> > The issue with radeon hardware AFAICT is that the hardware does not
>> > report anything about the userspace context running, i.e. you do
>> > not get a notification when a context is not in use. Well, AFAICT.
>> > Maybe the hardware does provide that.
>>
>> I'm not sure whether we can do the same trick with the hw scheduler.
>> But then unpinning hw contexts will drain the pipeline anyway, so I
>> guess we can just stop feeding the hw scheduler until it runs dry. And
>> then unpin and evict.
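(Sizing aside, since the 128M number came up above: here is the
back-of-the-envelope I'm working from for the worst-case pinned footprint
under those defaults. The macro names and the 4K page size are made up
purely for illustration; only the ~256-byte MQD, ~256 processes, ~1024
queues and "couple of doorbell pages" figures come from the discussion
above, and none of this is actual driver code.)

#include <stdio.h>

/* Illustrative defaults from the discussion above, not driver constants. */
#define MQD_SIZE            256UL   /* ~256-byte descriptor per queue      */
#define MAX_HSA_PROCESSES   256UL   /* proposed default process limit      */
#define MAX_QUEUES_PER_PROC 1024UL  /* proposed default queues per process */
#define DOORBELL_PAGES      2UL     /* "a couple of pages" per process     */
#define PAGE_SZ             4096UL  /* assumed 4 KiB pages                 */

int main(void)
{
	unsigned long mqds      = MAX_HSA_PROCESSES * MAX_QUEUES_PER_PROC * MQD_SIZE;
	unsigned long doorbells = MAX_HSA_PROCESSES * DOORBELL_PAGES * PAGE_SZ;

	/* 256 * 1024 * 256 B = 64 MiB of MQDs; 256 * 2 * 4 KiB = 2 MiB of doorbells */
	printf("worst-case pinned: %lu MiB MQDs + %lu MiB doorbells\n",
	       mqds >> 20, doorbells >> 20);
	return 0;
}

So even at the full defaults the MQDs dominate at ~64 MiB worst case plus
~2 MiB of doorbells, which is why lowering the per-process queue default (or
tying it into an rlimit as Jesse suggests) pulls the worst case down pretty
directly.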
>
>Yeah we should have an idea which contexts have been fed to the scheduler,
>at least with kernel-based submission. With userspace submission we'll be in
>a tougher spot... but as you say we can always idle things and unpin
>everything under pressure. That's a really big hammer to apply though.
>
>> > Like, the VMID is a limited resource, so you have to dynamically
>> > bind them, so maybe we can only allocate a pinned buffer for each
>> > VMID, and then when binding a PASID to a VMID it also copies the
>> > pinned buffer back to the PASID's unpinned copy.
>>
>> Yeah, pasid assignment will be fun. Not sure whether Jesse's patches
>> will do this already. We _do_ already have fun with ctx id assignments
>> though since we move them around (and the hw id is the ggtt address
>> afaik). So we need to remap them already. Not sure on the details for
>> pasid mapping, iirc it's a separate field somewhere in the context
>> struct. Jesse knows the details.
>
>The PASID space is a bit bigger, 20 bits iirc. So we probably won't run out
>quickly or often. But when we do I thought we could apply the same trick
>Linux uses for ASID management on SPARC and ia64 (iirc on sparc anyway,
>maybe MIPS too): "allocate" a PASID every time you need one, but don't tie it
>to the process at all, just use it as a counter that lets you know when you
>need to do a full TLB flush, then start the allocation process over. This
>lets you minimize TLB flushing and gracefully handles oversubscription.

IIRC we have a 9-bit limit for PASID on current hardware, although that will
go up in the future. (A rough sketch of the counter-style allocation Jesse
describes is at the bottom of this mail.)

>
>My current code doesn't bother though; context creation will fail if we run
>out of PASIDs on a given device.
>
>--
>Jesse Barnes, Intel Open Source Technology Center
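P.S. For anyone who hasn't seen the ASID trick before, here is a minimal
userspace sketch of the counter/generation style allocation Jesse describes
above. It is illustrative only and not taken from the amdkfd or i915 patches;
the function names are made up, and the 9-bit width is just the
current-hardware limit I mentioned.

#include <stdint.h>
#include <stdio.h>

#define PASID_BITS 9                          /* current-hw limit mentioned above */
#define PASID_MASK ((1u << PASID_BITS) - 1)   /* low bits == what hardware sees   */

/* Hypothetical stand-in for whatever invalidates every stale PASID mapping. */
static void full_tlb_flush(void)
{
	printf("-- full TLB flush, new PASID generation --\n");
}

/*
 * One monotonically increasing counter.  The low PASID_BITS are the id
 * programmed into hardware; the high bits act as a generation number, so a
 * wrap of the low bits is exactly the point where old ids could collide and
 * a full flush is needed.
 */
static uint64_t pasid_counter;

static uint64_t pasid_alloc(void)
{
	pasid_counter++;
	if ((pasid_counter & PASID_MASK) == 0) {
		/* low bits wrapped: flush everything, then reuse the id space
		 * (skip hw id 0 so it can stay reserved) */
		full_tlb_flush();
		pasid_counter++;
	}
	return pasid_counter;
}

int main(void)
{
	/* allocate enough to wrap the 9-bit space once and show the flush */
	for (int i = 0; i < 600; i++) {
		uint64_t p = pasid_alloc();
		if (i < 2 || (i >= 509 && i <= 513))
			printf("alloc %3d: hw id %3u, generation %u\n", i,
			       (unsigned)(p & PASID_MASK),
			       (unsigned)(p >> PASID_BITS));
	}
	return 0;
}

A real implementation would presumably also stamp each process with the
generation its PASID came from, so it can tell on the next activation whether
the id is stale, but the counter-plus-flush part above is the whole trick.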