On Sat, Nov 19, 2011 at 5:37 PM, Thomas Hellstrom <thellstrom@xxxxxxxxxx> wrote:
> On 11/19/2011 10:22 PM, Jerome Glisse wrote:
>> On Sat, Nov 19, 2011 at 4:01 PM, Thomas Hellstrom <thellstrom@xxxxxxxxxx> wrote:
>>> On 11/19/2011 09:37 PM, Jerome Glisse wrote:
>>>> On Sat, Nov 19, 2011 at 2:46 PM, Thomas Hellstrom <thellstrom@xxxxxxxxxx> wrote:
>>>>> On 11/19/2011 07:11 PM, Jerome Glisse wrote:
>>>>> On Sat, Nov 19, 2011 at 12:00 PM, Thomas Hellstrom <thellstrom@xxxxxxxxxx> wrote:
>>>>> On 11/19/2011 03:53 PM, Ben Skeggs wrote:
>>>>> On Sat, 2011-11-19 at 11:07 +0100, Thomas Hellstrom wrote:
>>>>> On 11/19/2011 01:26 AM, Ben Skeggs wrote:
>>>>> On Fri, 2011-11-18 at 23:48 +0100, Thomas Hellstrom wrote:
>>>>> On 11/18/2011 06:26 PM, Ben Skeggs wrote:
>>>>> On Fri, 2011-11-18 at 15:30 +0100, Thomas Hellstrom wrote:
>>>>> On 11/18/2011 02:15 PM, Ben Skeggs wrote:
>>>>> On Fri, 2011-11-18 at 08:57 +0100, Thomas Hellstrom wrote:

>>>>> Jerome,
>>>>> I don't like this change for the following reasons

>>>>> -snip-

>>>>> One can take advantage of the move notify callback, but there are corner cases where bind/unbind might be called without move notify being called (in error paths mostly). So to make sure that each virtual address space has a consistent view of whether a buffer object is backed or not by system pages, it's better to pass the buffer object to bind/unbind.

>>>>> As I discussed previously with Jerome on IRC, I believe the move_notify hook is sufficient. I fixed a couple of issues with it back when I implemented support for this in nouveau.

>>>>> Jerome did point out two issues however, which I believe can be solved easily enough.

>>>>> The first was that the error path after move_notify() has been called doesn't reverse the move_notify() too, leaving incorrect page table entries. This, I think, could easily be solved by swapping bo->mem and mem, and re-calling move_notify() to have the driver reverse whatever it did.

>>>>> The second issue is that apparently move_notify() doesn't get called in the destroy path. Nouveau's GEM layer takes care of this for our userspace bos, and we don't use any kernel bos that are mapped into a channel's address space, so we don't hit it. This can probably be solved easily enough too I expect.

>>>>> Ben.

>>>>> OK. I understand. Surely if a move_notify is missing somewhere, that can easily be fixed.

>>>>> It might be good if we can outline how the virtual tables are set up. In my world, the following would happen:

>>>>> 1) Pre command submission:
>>>>> a) reserve bo
>>>>> b) call ttm_bo_validate to set placement. move_notify will tear down any existing GPU page tables for the bo.
>>>>> c) Set up GPU page tables.

>>>>> Nouveau doesn't do this exactly. I wanted to avoid having to check if a bo was bound to a particular address space on every command submission.
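
To make the 1a)-c) outline above concrete, here is a minimal sketch of what such a command-submission path could look like. It is not code from any of the drivers discussed; the drv_* helpers and the gpu_maps_invalid flag are hypothetical, and the TTM call signatures are the ones in use around this kernel era (they have changed since).

    #include <drm/ttm/ttm_bo_api.h>
    #include <drm/ttm/ttm_bo_driver.h>

    /* Hypothetical driver bo wrapper with a flag set by move_notify. */
    struct drv_bo {
        struct ttm_buffer_object tbo;
        bool gpu_maps_invalid;
    };

    int drv_rebuild_gpu_mappings(struct drv_bo *bo); /* driver-specific, hypothetical */

    /*
     * Per-bo work before command submission:
     *   a) reserve, b) validate (a move tears down GPU page tables via
     *   move_notify), c) rebuild the GPU page tables only if they were
     *   actually torn down.
     */
    static int drv_cs_prepare_bo(struct drv_bo *bo, struct ttm_placement *placement)
    {
        int ret;

        ret = ttm_bo_reserve(&bo->tbo, true, false, false, 0);
        if (ret)
            return ret;

        ret = ttm_bo_validate(&bo->tbo, placement, true, false, false);
        if (ret)
            goto out_unreserve;

        /* Next to free when nothing moved: just a flag test. */
        if (bo->gpu_maps_invalid) {
            ret = drv_rebuild_gpu_mappings(bo);
            if (ret)
                goto out_unreserve;
            bo->gpu_maps_invalid = false;
        }
        /* In a real CS path the bo stays reserved until the work is fenced. */
        return 0;

    out_unreserve:
        ttm_bo_unreserve(&bo->tbo);
        return ret;
    }
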
>>>>> It perhaps might be a good idea to revisit this?
>>>>> I think using move_notify to set up a new placement before the data has actually been moved is very fragile and not the intended use. That would also save you from having to handle error paths. Also, how do you handle swapouts?

>>>>> I don't see how it's fragile, there's only the one error path after that point that needs to undo it. And, what *is* the expected use then? I see it as "tell the driver the buffer is moving so it can do whatever is necessary as a result".

>>>>> Swapouts are a missed case though, indeed.

>>>>> A quick check in c) that the virtual map hasn't been torn down by a move_notify and, in that case, rebuild it would probably be next to free. The virtual GPU mapping would then be valid only after validation and while the bo is fenced or pinned.

>>>>> This alone doesn't solve the swapouts issue either unless you're also wanting us to unmap once a buffer becomes unfenced, which would lead to loads of completely unnecessary map/unmaps.

>>>>> Ben.

>>>>> I think you misunderstand me a little.
>>>>> The swapout issue should be handled by calling a move_notify() to kill the virtual GPU mappings.

>>>>> However, when moving data that has page tables pointing to it, one should (IMHO) do:
>>>>> 1) Kill the old page tables
>>>>> 2) Move the object
>>>>> 3) Set up new page tables on demand.

>>>>> This is done by TTM for CPU page tables and I think that should be done also for GPU page tables. move_notify() should be responsible for 1), the driver execbuf for 3), and 3) needs only to be done for the particular ring / fifo which is executing commands that touch the buffer.

>>>>> This leaves a clear and well-defined requirement on TTM:
>>>>> Notify the driver when it needs to kill its GPU page tables.

>>>>> Ok. This I don't really object to. I read the previous mail as saying that GPU mappings should only exist as long as the buffer is fenced, which would be a ridiculous amount of overhead.

>>>>> I agree. What I meant was that the page tables wouldn't be considered valid when the fence had signaled. However, that was actually incorrect, because they would actually be valid until the next move_notify(). The intention was never to tear down the mappings once the fence had signaled; that would have been pretty stupid.

>>>>> With the latest patch from Jerome:
>>>>> Notify the driver when it needs to rebuild its page tables. Also on error paths, but not for swapins, because no driver will probably set up GPU page tables to SYSTEM_MEMORY.

>>>>> This is what I mean by fragile, and I'd much rather see the other approach.

>>>>> Ben, I didn't fully understand why you didn't want to use that approach. Did you see a significant overhead with it?

>>>>> Now I'm more clear on what you meant, no, it wouldn't be a lot of overhead. However, move_notify() was never intended for the use you're proposing now, or the new ttm_mem_reg would never have been passed in as a parameter...
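
A matching sketch of the move_notify side of the 1)-3) scheme above: the GPU mappings are killed whenever the bo moves (including swapout to SYSTEM), and rebuilding is left to the execbuf path. Again the drv_* names are hypothetical and not taken from nouveau or radeon; struct drv_bo and the gpu_maps_invalid flag are the ones from the earlier sketch.

    #include <linux/kernel.h>   /* container_of */
    #include <drm/ttm/ttm_bo_api.h>

    void drv_kill_gpu_mappings(struct drv_bo *bo); /* driver-specific, hypothetical */

    /* Candidate for the ttm_bo_driver.move_notify hook. */
    static void drv_bo_move_notify(struct ttm_buffer_object *tbo,
                                   struct ttm_mem_reg *new_mem)
    {
        struct drv_bo *bo = container_of(tbo, struct drv_bo, tbo);

        if (bo->gpu_maps_invalid)
            return;                 /* nothing currently mapped, nothing to do */

        /*
         * Any move (vram <-> gtt, eviction, swapout to SYSTEM) makes the
         * existing GPU page-table entries stale, so kill them here; the
         * execbuf path rebuilds them on demand for the ring that needs them.
         */
        drv_kill_gpu_mappings(bo);
        bo->gpu_maps_invalid = true;
    }
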
>>>>> I suppose you're right. Still, I think this is the right way to go, since it has the following advantages:

>>>>> 1) TTM doesn't need to care about the driver re-populating its GPU page tables. Since swapin is handled from the tt layer, not the bo layer, this makes it a bit easier on us.
>>>>> 2) Transition to page-faulted GPU virtual maps is straightforward and consistent. A non-page-faulting driver sets up the maps at CS time; a pagefaulting driver can set them up directly from an irq handler without reserving, since the bo is properly fenced or pinned when the pagefault happens.
>>>>> 3) A non-page-faulting driver knows at CS time exactly which page-table entries really do need populating, and can do this more efficiently.

>>>>> So unless there are strong objections I suggest we should go this way.

>>>>> Even if this changes the semantics of the TTM <-> driver interface compared to how Nouveau currently does things, it means that Jerome's current patch is basically correct and no longer has any assumptions about SYSTEM memory never being used for virtual GPU maps.

>>>>> Thanks,
>>>>> Thomas.

>>>>> I think it's better to let the driver choose how it will handle its virtual page tables. For radeon I update in move_notify in order to minimize the number of updates. I don't want to track whether an object has been updated or not against each page table. Of course this adds possibly useless overhead to move notify, as we might update the page tables too many times (bo going from vram->gtt->system).

>>>>> I think if move notify is properly called once for each effective move, drivers can do their own dance behind the curtain.

>>>>> Cheers,
>>>>> Jerome

>>>>> Jerome,

>>>>> It's not really what the driver does that is the problem, it's what the driver expects from the driver <-> TTM interface.

>>>>> That's why I'd really like a maintainable interface design before coding patches that make the interface a set of various workarounds.

>>>>> Enforcing this will also force driver writers to think twice before implementing things in their own way, adding another load of ad hoc callbacks to TTM without a clear specification but with the only purpose of fitting a specific driver implementation. So in a way it's our responsibility to future driver writers to try to agree on a simple interface that works well, allows drivers to do an efficient implementation, and adds little maintenance burden on TTM.

>>>>> If a future driver writer looks at the Radeon code and replicates it, because he thinks it's state of the art, and then finds out his code breaks because he can't use SYSTEM memory for GPU page tables, or use his own private LRU, or the code breaks with partially populated TTMs, and finally finds it's quite inefficient too because it unnecessarily populates page tables, we've failed.

>>>>> This is really the point I'd like to make, but as a side note, I'm not asking you to track each bo against each page table. What I'm asking you is to not populate the page tables in bo_move() but in execbuf / pushbuf. The possibility to track a bo against each page table comes as a nice benefit.

>>>>> /Thomas
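
For contrast, a rough sketch of the scheme Jerome describes above, where move_notify itself walks every virtual address space the bo is mapped into and updates it, so nothing needs to be tracked per CS. The structures and drv_vm_* helpers are hypothetical, not the actual radeon implementation.

    #include <linux/kernel.h>
    #include <linux/list.h>
    #include <linux/types.h>
    #include <drm/ttm/ttm_bo_api.h>
    #include <drm/ttm/ttm_placement.h>   /* TTM_PL_SYSTEM */

    struct drv_vm;                       /* one per opener of the drm file (hypothetical) */

    /* One entry per (bo, vm) mapping; a bo may be mapped in several vms. */
    struct drv_bo_va {
        struct list_head bo_list;
        struct drv_vm *vm;
        u64 gpu_addr;                    /* virtual address of the bo in that vm */
    };

    struct drv_vm_bo {
        struct ttm_buffer_object tbo;
        struct list_head va_list;        /* all mappings of this bo */
    };

    /* Hypothetical per-vm page-table updaters. */
    void drv_vm_set_pages(struct drv_vm *vm, u64 gpu_addr, struct ttm_mem_reg *mem);
    void drv_vm_clear_pages(struct drv_vm *vm, u64 gpu_addr);

    /* Update every page table that maps this bo, directly from move_notify. */
    static void drv_vm_bo_move_notify(struct ttm_buffer_object *tbo,
                                      struct ttm_mem_reg *new_mem)
    {
        struct drv_vm_bo *bo = container_of(tbo, struct drv_vm_bo, tbo);
        struct drv_bo_va *va;

        list_for_each_entry(va, &bo->va_list, bo_list) {
            if (new_mem && new_mem->mem_type != TTM_PL_SYSTEM)
                drv_vm_set_pages(va->vm, va->gpu_addr, new_mem);
            else
                /* SYSTEM placement: point the entries at the default page. */
                drv_vm_clear_pages(va->vm, va->gpu_addr);
        }
    }
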
>>>> I don't see the issue with updating the page table in move_notify. That's what I do for radeon virtual memory; doing it in the command buffer ioctl is kind of too costly.

>>> Could you describe why it is too costly? I previously asked Ben the same thing and he said it didn't come with a substantial performance penalty. Why would it be more than checking a single bool saying whether page tables had previously been killed or not? This would really help me understand why you want to do it from move_notify.

>> So we have one page table for each opener of the drm file. This page table is in a bo, and it's validated into vram when the virtual address space is in use. There can be several active page tables at the same time (the GPU working on different workloads, all possibly in different virtual address spaces). The page table also covers vram (everything is a page, whether it's in system memory or vram). So in the end the page table can be quite big, like 1M or bigger. Each page table can have the same bo at a different virtual address (so no sharing between page tables).

>> So if we were to build/update page tables at each cs ioctl we are talking about moving around possibly several M of memory. Also, if we want to go the update-only way, that means we need to track, for each page table, what the last position of each bo we want to use was, and that sounds like a big loop.

>> With move notify we update all the page tables the bo is mapped into. The assumption is that a bo doesn't move often, and when it does it will soon be used. Also that way we don't have to track anything. We know that at all points in time all the page tables have an up-to-date entry for all the bos.

> As mentioned previously, and in the discussion with Ben, the page tables would not need to be rebuilt on each CS. They would be rebuilt only on the first CS following a move_notify that caused a page table invalidation.
>
> move_notify:
>     if (is_incompatible(new_mem_type)) {
>         bo->page_tables_invalid = true;
>         invalidate_page_tables(bo);
>     }
>
> command_submission:
>     if (bo->page_tables_invalid) {
>         set_up_page_tables(bo);
>         bo->page_tables_invalid = false;
>     }

Why is it different from updating the page tables in move notify? I don't see any bonus here; all the information we need is already available in move_notify.

>>>> Note that nouveau also updates the page table in move_notify, so I think it's the right solution. The TTM interface for move notify should just state that it will be called whenever a buffer changes placement. Then we can add a comment on system buffers, which can see their pages disappear into thin air at any time. Note that when the placement is system we mark all page tables as invalid and point them to the default entry (which is what nouveau is doing too).

>>> Well, I've previously listed a number of disadvantages with this, which I think are still valid, but I fail to see any advantages? Just proposed workarounds for the disadvantages? (FWIW vmwgfx used to have GPU page tables pointing to system memory, and I'd say it would be a pretty common use-case for drivers that don't have a mappable GART).

>> Ok, there are 2 different things for me: system placement in ttm is when the bo will never be used by the gpu (true for both radeon and nouveau). TTM_PL_TT is when a bo is bound to a gart and backed by system pages.
>> We know that the pages will be there as long as the bo is in use. PL_TT means the GPU page tables point to system memory.

> Memory types in TTM are completely orthogonal to allowed GPU usage. The GPU may access a bo if it's reserved, fenced or pinned, regardless of its placement.
>
> A TT memory type is a *single* GPU aperture that may be mapped from the aperture side by the CPU (AGP). It may also be used by a single unmappable aperture that wants to use TTM's range management and eviction (vmwgfx GMR). The driver can define more than one such memory type (psb), but a bo can only be placed in one of those at a time, so this approach is unsuitable for multiple apertures pointing to the same pages.

Radeon virtual memory has a special address space, the system address space, which is managed by ttm through a TTM_TT (exact same code as the current one). All the other address spaces are not managed by ttm, but we require a bo to be bound to a ttm_tt to be used, though we can relax that. That's the reason why I consider system placement as different.

> Hardware with multiple page tables (one per context or file descriptor) would probably want to use SYSTEM memory directly (unichrome SGDMA, old vmwgfx GMR), unless the per-context page tables can only point to pages that already reside in a bigger GPU-wide translation table. (IIRC i965 might be an example of the latter, but I'm unsure.)

Cheers,
Jerome