Hi Ben,

I'll switch the conversation to the mailing list...

In the case of prime_self_import, the problem is self-contained (it doesn't really need a previous test A): the first subtest opens the first fd, which provokes a context switch (gem_quiescent_gpu). This switch is actually completed (gem_quiescent_gpu makes sure with a gem_sync) and the old context is disposed of, but its backing object remains alive until a retire_work kicks in (which in my case usually happens in the middle of the prime_self_import/export-vs-gem_close-race subtest, thus the "-1 objects leaked"). The comment in do_switch() says it all:

	/* The backing object for the context is done after switching to the
	 * *next* context. Therefore we cannot retire the previous context until
	 * the next context has already started running. In fact, the below code
	 * is a bit suboptimal because the retiring can occur simply after the
	 * MI_SET_CONTEXT instead of when the next seqno has completed.
	 */

I'll send a fix for prime_self_import, but... maybe we should make sure that the GPU is really quiescent, rather than fixing individual tests? (retire requests via drop caches at the end of gem_quiescent_gpu?)
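Roughly what I have in mind for the end of gem_quiescent_gpu() (completely untested sketch; the helper name is made up, and the debugfs filename and the DROP_RETIRE value of 0x4 are assumptions that would need checking against i915_debugfs.c in the running kernel):

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Poke i915_gem_drop_caches with DROP_RETIRE so that retire_requests runs
 * now, instead of whenever the deferred retire_work fires, and the old
 * context backing bo is gone before the test starts counting objects. */
static void gem_retire_requests(int drm_minor)
{
	char path[128];
	int fd;

	snprintf(path, sizeof(path),
		 "/sys/kernel/debug/dri/%d/i915_gem_drop_caches", drm_minor);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return;	/* debugfs not mounted: silently skip */

	if (write(fd, "0x4", 3) != 3)	/* 0x4 == DROP_RETIRE (assumption) */
		fprintf(stderr, "drop_caches write failed: %s\n", strerror(errno));

	close(fd);
}

That way every test that opens an fd through drm_open_any() would get the retire for free.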
-- Oscar

> -----Original Message-----
> From: Ben Widawsky [mailto:benjamin.widawsky@xxxxxxxxx]
> Sent: Friday, November 01, 2013 3:38 AM
> To: Mateo Lozano, Oscar
> Cc: Chris Wilson; Daniel Vetter
> Subject: Re: new PPGTT patches pushed
>
> That wasn't clear... I fixed gem_flink_close, I didn't touch the prime test.
>
> On Thu, Oct 31, 2013 at 8:29 PM, Ben Widawsky <benjamin.widawsky@xxxxxxxxx> wrote:
> > Yep, I agree this looks like the problem here.
> >
> > What likely happens is a context from a previous run (which was created
> > for the fd) finally dies. So for example:
> >
> > Test A creates context, runs, finishes.
> > (Context is not destroyed yet since we didn't switch away)
> > Test prime_self_import runs, opens the fd, and creates the context, but
> > doesn't yet switch. The switch will kill the context from test A. This
> > is how we are minus one.
> >
> > I've pushed a fix to my intel-gpu-tools PPGTT branch which uses drop
> > caches to switch back to the global default context when needed.
> >
> > Would you like to fix prime_self_import, I hope it's the same?
> >
> > On Wed, Oct 30, 2013 at 04:31:11PM +0000, Mateo Lozano, Oscar wrote:
> >> Hi Ben,
> >>
> >> I think I homed in on the cause for the regression in prime_self_import (maybe gem_flink_race as well):
> >>
> >> When the first fd is opened with drm_open_any(), it calls gem_quiescent_gpu(), which in turn sends a nop execbuf to all the rings. During the do_switch() for the render ring, the backing object for the old context happens to be kept alive until later on. If this backing bo is freed between consecutive calls to get_object_count(), then we have a "false" leaking object report.
> >>
> >> Printk messages during the do_switch():
> >>
> >> [ 428.389210] ACHTUNG! do_switch render ring, to ctx object ffff8800a21d4a80
> >> [ 428.391717] Is object pinned now?: no
> >> [ 428.393199] Object set to gtt domain
> >> [ 428.394601] GTT offset: 020dd000, size: 00011000, table: ggtt)
> >> [ 428.394630] hw_flags |= MI_RESTORE_INHIBIT
> >> [ 428.396605] mi_set_context succesful
> >> [ 428.397284] ACHTUNG! : from ctx object ffff8800a21d5800   <--- backing bo for the old context
> >> [ 428.397918] GTT offset: 020cb000, size: 00011000, table: ggtt)   <---
> >> [ 428.397931] Done!
> >>
> >> GTT just after the first get_object_count() in the "export-vs-gem_close-race" test:
> >>
> >> ffff880240cde000: p g 68KiB 10 00 0 0 0 L3+LLC dirty (pinned x 1) (ggtt offset: 00000000, size: 00011000)
> >> ffff880240cde180: p g 4KiB 01 01 0 0 0 snooped or LLC (pinned x 1) (ggtt offset: 00011000, size: 00001000) (p mappable)
> >> ffff880240cde300: p g 128KiB 40 40 0 0 0 snooped or LLC dirty (pinned x 1) (ggtt offset: 00012000, size: 00020000) (p mappable)
> >> ffff880240cde480: p g 4KiB 01 01 0 0 0 snooped or LLC (pinned x 1) (ggtt offset: 00032000, size: 00001000) (p mappable)
> >> ffff880240cde600: p g 4KiB 01 01 0 0 0 snooped or LLC (pinned x 1) (ggtt offset: 00033000, size: 00001000) (p mappable)
> >> ffff880240cde780: p g 128KiB 40 40 0 0 0 snooped or LLC dirty (pinned x 1) (ggtt offset: 00034000, size: 00020000) (p mappable)
> >> ffff880240cde900: p g 4KiB 01 01 0 0 0 snooped or LLC (pinned x 1) (ggtt offset: 00054000, size: 00001000) (p mappable)
> >> ffff880240cdea80: p g 128KiB 40 40 0 0 0 snooped or LLC dirty (pinned x 1) (ggtt offset: 00055000, size: 00020000) (p mappable)
> >> ffff880240cdec00: p g 4KiB 01 01 0 0 0 snooped or LLC (pinned x 1) (ggtt offset: 00075000, size: 00001000) (p mappable)
> >> ffff880240cded80: p g 128KiB 40 40 0 0 0 snooped or LLC dirty (pinned x 1) (ggtt offset: 00076000, size: 00020000) (p mappable)
> >> ffff880240cdef00: p g 8100KiB 41 00 0 0 0 uncached (pinned x 2) (display) (ggtt offset: 00096000, size: 007e9000) (stolen: 00000000) (p mappable)
> >> ffff8800a21d5800: g 68KiB 10 00 272 0 0 L3+LLC dirty (ggtt offset: 020cb000, size: 00011000) (render ring)   <--- backing bo still alive and kicking
> >> ffff8800a21d4a80: p g 68KiB 41 00 0 0 0 L3+LLC (pinned x 1) (ggtt offset: 020dd000, size: 00011000)
> >> ffff8800a21d4480: g 16KiB 41 00 0 0 0 snooped or LLC (ggtt offset: 020ee000, size: 00004000) (f mappable)
> >> Total 14 objects, 9064448 bytes, 9064448 GTT size
> >>
> >> GTT just before the second get_object_count() in the "export-vs-gem_close-race" test:
> >>
> >> ffff880240cde000: p g 68KiB 10 00 0 0 0 L3+LLC dirty (pinned x 1) (ggtt offset: 00000000, size: 00011000)
> >> ffff880240cde180: p g 4KiB 01 01 0 0 0 snooped or LLC (pinned x 1) (ggtt offset: 00011000, size: 00001000) (p mappable)
> >> ffff880240cde300: p g 128KiB 40 40 0 0 0 snooped or LLC dirty (pinned x 1) (ggtt offset: 00012000, size: 00020000) (p mappable)
> >> ffff880240cde480: p g 4KiB 01 01 0 0 0 snooped or LLC (pinned x 1) (ggtt offset: 00032000, size: 00001000) (p mappable)
> >> ffff880240cde600: p g 4KiB 01 01 0 0 0 snooped or LLC (pinned x 1) (ggtt offset: 00033000, size: 00001000) (p mappable)
> >> ffff880240cde780: p g 128KiB 40 40 0 0 0 snooped or LLC dirty (pinned x 1) (ggtt offset: 00034000, size: 00020000) (p mappable)
> >> ffff880240cde900: p g 4KiB 01 01 0 0 0 snooped or LLC (pinned x 1) (ggtt offset: 00054000, size: 00001000) (p mappable)
> >> ffff880240cdea80: p g 128KiB 40 40 0 0 0 snooped or LLC dirty (pinned x 1) (ggtt offset: 00055000, size: 00020000) (p mappable)
> >> ffff880240cdec00: p g 4KiB 01 01 0 0 0 snooped or LLC (pinned x 1) (ggtt offset: 00075000, size: 00001000) (p mappable)
> >> ffff880240cded80: p g 128KiB 40 40 0 0 0 snooped or LLC dirty (pinned x 1) (ggtt offset: 00076000, size: 00020000) (p mappable)
> >> ffff880240cdef00: p g 8100KiB 41 00 0 0 0 uncached (pinned x 2) (display) (ggtt offset: 00096000, size: 007e9000) (stolen: 00000000) (p mappable)
> >> ffff8800a21d4a80: p g 68KiB 41 00 0 0 0 L3+LLC (pinned x 1) (ggtt offset: 020dd000, size: 00011000)
> >> ffff8800a21d4480: g 16KiB 41 00 0 0 0 snooped or LLC (ggtt offset: 020ee000, size: 00004000) (f mappable)
> >> Total 13 objects, 8994816 bytes, 8994816 GTT size
> >>
> >> Results of the test:
> >>
> >> leaked -1 objects
> >> Test assertion failure function test_export_close_race, file prime_self_import.c:392:
> >> Failed assertion: obj_count == 0
> >> Subtest export-vs-gem_close-race: FAIL
> >>
> >> I'm struggling to understand how this happens exactly, but I can avoid it by sending an extra nop execbuffer to the render ring right after gem_quiescent_gpu(). I'm not saying this is a fix, but rather a (meaningful?) thought experiment.
> >>
> >> It looks to me like this is not a problem with the KMD, but rather with the way the test is written. What do you think?
> >>
> >> -- Oscar
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
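For reference, the "extra nop execbuffer to the render ring" mentioned in the quoted mail boils down to roughly the following (untested sketch against the raw DRM ioctls; the function name and the lack of error handling are illustrative only, and the intel-gpu-tools ioctl wrappers could do the same more compactly):

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Submit a batch containing only MI_BATCH_BUFFER_END to the render ring,
 * forcing one more context switch (and with it the retirement of the
 * previous context's backing object) before the test starts counting
 * objects. */
static void exec_nop_render(int fd)
{
	const uint32_t batch[2] = { 0xA << 23 /* MI_BATCH_BUFFER_END */, 0 };
	struct drm_i915_gem_create create;
	struct drm_i915_gem_pwrite pwrite;
	struct drm_i915_gem_exec_object2 obj;
	struct drm_i915_gem_execbuffer2 execbuf;
	struct drm_gem_close close_bo;

	/* Batch bo holding just the end-of-batch command */
	memset(&create, 0, sizeof(create));
	create.size = 4096;
	ioctl(fd, DRM_IOCTL_I915_GEM_CREATE, &create);

	memset(&pwrite, 0, sizeof(pwrite));
	pwrite.handle = create.handle;
	pwrite.size = sizeof(batch);
	pwrite.data_ptr = (uintptr_t)batch;
	ioctl(fd, DRM_IOCTL_I915_GEM_PWRITE, &pwrite);

	/* Submit it to the render ring */
	memset(&obj, 0, sizeof(obj));
	obj.handle = create.handle;

	memset(&execbuf, 0, sizeof(execbuf));
	execbuf.buffers_ptr = (uintptr_t)&obj;
	execbuf.buffer_count = 1;
	execbuf.batch_len = sizeof(batch);
	execbuf.flags = I915_EXEC_RENDER;
	ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &execbuf);

	/* Drop our handle; the bo itself is freed once the nop batch retires. */
	memset(&close_bo, 0, sizeof(close_bo));
	close_bo.handle = create.handle;
	ioctl(fd, DRM_IOCTL_GEM_CLOSE, &close_bo);
}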