Hi Chris,
On 3 June 2014 16:12, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Jun 02, 2014 at 02:18:14PM +0100, Sam Jansen wrote:
> > Hello intel-gfx,
> > I'm working on an application using VA-API for H264 encode+decode, and
> > JPEG decode on an Atom E3815. Unfortunately we've hit what I believe is a
> > kernel bug, and the "perf top" output is pointing at i915 DRM code.
> > After some amount of time running my application, the system will become
> > unresponsive (userspace applications get very little CPU, system CPU will
> > go up to 80+%), and sometimes the system will appear out of memory for a
> > period (the OOM killer is sometimes invoked), even though there is a lot
> > of free memory on the system. I noticed this first on kernel 3.14.5, so I
> > moved to "drm-intel-nightly", built on Friday (2014-05-30), from
> > [1]http://cgit.freedesktop.org/drm-intel. The results are the same.
> > Using "perf top", I have gathered the following traces showing high system
> > CPU at the time when the system was encountering this problem:
> It's a buffer leak in the userspace va-api application. The buffers
> appear as cached memory, they are not yet accounted against the
> applications that have a reference to them. Look at
> /sys/kernel/debug/dri/0/i915_gem_objects for a breakdown of users.
Thanks for taking the time to respond. I had previously ruled out buffer leaks by using valgrind and similar tools to track down any user-space leaks. VA-API buffers have user-space metadata allocated with malloc/calloc, so if you leak these it is fairly easy to track down.
However, given the new knowledge that the memory really is associated with my app, I used divide-and-conquer to eventually track the issue down to my JPEG decoder. I found that, because one bit of state was not being updated, I was accidentally creating and destroying the surfaces and context every frame. I've fixed that, and my application no longer leaks "cached" kernel memory.
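A state bug of this general kind can be sketched as below. This is only an illustration under assumptions: the thread does not say which piece of state was stale, so the struct, the function names and the counter are all hypothetical, and the stub stands in for the real vaCreateSurfaces/vaCreateContext and matching destroy calls.

```c
#include <stdbool.h>

/* Hypothetical decoder state; none of these names come from the real application. */
struct jpeg_dec_state {
    bool have_surfaces;  /* true once surfaces and a context exist */
    int  width, height;  /* dimensions the current surfaces were created for */
    int  create_count;   /* stand-in for the number of create/destroy cycles */
};

/* Stand-in for destroying and recreating the VA surfaces and context. */
void recreate_surfaces(struct jpeg_dec_state *s, int w, int h)
{
    s->have_surfaces = true;
    s->create_count++;
    /* If these two assignments are forgotten, the check in prepare_frame()
     * fails on every frame and the surfaces are recreated each time. */
    s->width = w;
    s->height = h;
}

/* Called once per frame; recreates only when the frame size changes. */
void prepare_frame(struct jpeg_dec_state *s, int w, int h)
{
    if (!s->have_surfaces || s->width != w || s->height != h)
        recreate_surfaces(s, w, h);
}
```

With the state kept up to date, decoding many frames of the same size should go through the create path only once, which is the behaviour the fixed application has.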
I think this may still be a real bug, as it looks to me like my application was cleaning up its resources correctly. So I've reproduced my results using the "loadjpeg" test application distributed with libva, with only minimal changes: looping to decode the JPEG image many times a second, and cleaning up buffers each iteration. I have no idea whether this problem is limited to the JPEG decoder, but it seemed the simplest test app to hack. When I run this modified version of loadjpeg with a ~720p image, it leaks ~40M of cached memory per second and ~100 objects per second (as shown by i915_gem_objects).
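One way to put a number on the cached-memory growth is to sample the "Cached:" line of /proc/meminfo at a fixed interval. The helpers below are a rough sketch of that measurement, not the method used in this thread; the function names are made up for illustration, and the parser takes the file contents as a string so it can be checked without a live system.

```c
#include <stdio.h>
#include <string.h>

/* Return the value in kB for `key` (e.g. "Cached:") in meminfo-style text,
 * or -1 if the key is missing. Only matches at the start of a line, so
 * "SwapCached:" is not mistaken for "Cached:". */
long meminfo_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        if (strncmp(line, key, klen) == 0) {
            long kb;
            if (sscanf(line + klen, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');  /* advance to the next line, if any */
        if (line)
            line++;
    }
    return -1;
}

/* Growth rate in kB/s between two samples taken `seconds` apart. */
double growth_kb_per_sec(long before_kb, long after_kb, double seconds)
{
    return (after_kb - before_kb) / seconds;
}
```

To use it, read /proc/meminfo into a buffer (fopen/fread), call meminfo_kb(buf, "Cached:"), sleep for a second, repeat, and pass the two values to growth_kb_per_sec. A steady rate of roughly 40000 kB/s while the modified loadjpeg runs would correspond to the figure quoted above.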
I've attached the patch in case you are interested.
As an aside, while debugging this, I hit the attached OOPS a couple of times, while running "watch cat /sys/kernel/debug/dri/0/i915_gem_objects".
Cheers,
Sam
> --
> Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[12969.134772] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[12969.134935] IP: [<ffffffff81394951>] per_file_stats+0xc9/0x12d
[12969.135051] PGD 6b9fb067 PUD 5b5e1067 PMD 0
[12969.135141] Oops: 0000 [#1] SMP
[12969.135208] Modules linked in: lpc_ich(E) mfd_core(E) rtc_cmos(E) i2c_hid(E)
[12969.135358] CPU: 0 PID: 9578 Comm: cat Tainted: G E 3.15.0-rc7-sl-01023-g085391259 #4
[12969.135514] Hardware name: \xffffffff[...]/DE3815TYKH, BIOS TYBYT10H.86A.0019.2014.0327.1516 03/27/201
[12969.135755] task: ffff8800750ccb60 ti: ffff880059db0000 task.ti: ffff880059db0000
[12969.135887] RIP: 0010:[<ffffffff81394951>] [<ffffffff81394951>] per_file_stats+0xc9/0x12d
[12969.136039] RSP: 0018:ffff880059db1d58 EFLAGS: 00010246
[12969.136134] RAX: 0000000000000000 RBX: ffff880059db1e40 RCX: 0000000000000000
[12969.136259] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88006852c000
[12969.136385] RBP: ffff880059db1d68 R08: 000000000000000a R09: 00000000fffffff7
[12969.136511] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88006852c000
[12969.136636] R13: 00000000ffffffff R14: ffff88006852c000 R15: ffffffff81394888
[12969.136763] FS: 00007f38d16a7700(0000) GS:ffff880079200000(0000) knlGS:0000000000000000
[12969.136905] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[12969.137007] CR2: 0000000000000028 CR3: 0000000067208000 CR4: 00000000001007f0
[12969.137132] Stack:
[12969.137170] ffff880059db1d98 0000000000000167 ffff880059db1dd8 ffffffff812b0797
[12969.137316] ffff880059db1e40 0000ffff00000000 ffff88006713a940 ffff88006713b180
[12969.137461] ffff880059db1e08 ffff880059db1da8 ffff880059db1f58 ffff88005b737400
[12969.137606] Call Trace:
[12969.137659] [<ffffffff812b0797>] idr_for_each+0xac/0xd7
[12969.137758] [<ffffffff813947fc>] i915_gem_object_info+0x405/0x491
[12969.137874] [<ffffffff811a849c>] seq_read+0x161/0x317
[12969.137970] [<ffffffff8118d147>] vfs_read+0x95/0xf0
[12969.138063] [<ffffffff8118d8e0>] SyS_read+0x46/0x79
[12969.138156] [<ffffffff81676253>] tracesys+0xe1/0xe6
[12969.138245] Code: 53 20 eb 15 48 8b 92 e0 01 00 00 48 85 d2 74 3b 48 8b 3b 48 39 7a 10 74 32 48 8b 40 68 48 83 e8 68 eb a5 49 8b 44 24 08 4c 89 e7 <48> 8b 70 28 48 81 c6 90 79 00 00 e8 94 e8 00 00 84 c0 74 2b 49
[12969.138849] RIP [<ffffffff81394951>] per_file_stats+0xc9/0x12d
[12969.138959] RSP <ffff880059db1d58>
[12969.139022] CR2: 0000000000000028
[12969.178528] ---[ end trace 8403dc25eeb2b354 ]---
diff --git a/test/decode/tinyjpeg.c b/test/decode/tinyjpeg.c
index 045e79a..fac1d01 100644
--- a/test/decode/tinyjpeg.c
+++ b/test/decode/tinyjpeg.c
@@ -578,6 +578,8 @@ int tinyjpeg_decode(struct jdec_private *priv)
                               &attrib, 1,&config_id);
     CHECK_VASTATUS(va_status, "vaQueryConfigEntrypoints");
 
+    while (1) {
+
     va_status = vaCreateSurfaces(va_dpy,VA_RT_FORMAT_YUV420,
                                  priv->width,priv->height, //alignment?
                                  &surface_id, 1, NULL, 0);
@@ -732,12 +734,26 @@ int tinyjpeg_decode(struct jdec_private *priv)
         va_status = va_put_surface(va_dpy, surface_id, &src_rect, &dst_rect);
         CHECK_VASTATUS(va_status, "vaPutSurface");
     }
-    printf("press any key to exit\n");
-    getchar();
-
-    vaDestroySurfaces(va_dpy,&surface_id,1);
+    //printf("press any key to exit\n");
+    //getchar();
+
+    va_status = vaDestroyBuffer(va_dpy, pic_param_buf);
+    CHECK_VASTATUS(va_status, "vaDestroyBuffer");
+    va_status = vaDestroyBuffer(va_dpy, iqmatrix_buf);
+    CHECK_VASTATUS(va_status, "vaDestroyBuffer");
+    va_status = vaDestroyBuffer(va_dpy, huffmantable_buf);
+    CHECK_VASTATUS(va_status, "vaDestroyBuffer");
+    va_status = vaDestroyBuffer(va_dpy, slice_param_buf);
+    CHECK_VASTATUS(va_status, "vaDestroyBuffer");
+    va_status = vaDestroyBuffer(va_dpy, slice_data_buf);
+    CHECK_VASTATUS(va_status, "vaDestroyBuffer");
+
+    va_status = vaDestroySurfaces(va_dpy,&surface_id,1);
+    CHECK_VASTATUS(va_status, "vaDestroySurfaces");
+    va_status = vaDestroyContext(va_dpy,context_id);
+    CHECK_VASTATUS(va_status, "vaDestroyContext");
+    }
     vaDestroyConfig(va_dpy,config_id);
-    vaDestroyContext(va_dpy,context_id);
 
     vaTerminate(va_dpy);
     va_close_display(va_dpy);