Hi Dave! On Tue, Dec 18, 2012 at 5:38 AM, Dave Airlie <airlied@xxxxxxxxx> wrote: > So I've gotten back to playing with prime for a day, and found some > old intel/radeon tests I had failing, > > Tracked it down to a lifetime issue with the current code and can > think of two fixes, > > The problem scenario is > > 1. i915: create gem object > 2. i915: export gem object to prime > 3. radeon: import gem object > 4. close prime fd > 5. radeon: unref object > 6. i915: unref object > > So we end up at this point, with a imported buffer record for the > dma_buf on the i915 file private. > > Now if a subsequent test (without closing the drm fd) reallocates the > dma_buf with the same address, we can end up seeing that. > > So why doesn't that reference get cleaned up? > > So the reference gets added above at step 2, and when radeon unrefs > the object, i915 gets the dma-buf release callback, however at that > stage > we don't actually have the file priv to remove the pointer from, so it > dangles there. I haven't checked yet, but we seem to have similar leaks by simply self-importing a dma-buf on a different open fd of the same driver already (i.e. the wayland/dri3/rendernode use-case). > Possible fixes: > > a) take a reference on the dma_buf attached to the gem handle when we > export it, keep it until the gem handle goes away. I'm unsure if this > could create zombie objects, since the dma buf has a reference on the > gem object, but since the gem handle is separate to the object it > might work. I'm leaning towards this option, since it'll give us cleaner lifetime rules: - As long as we still have gem handle around, we keep an explicit reference to the dma_buf. We need to keep a full reference since otherwise re-exporting by doing reusing the cached dma_buf will be a royal pain and fraught with races. From a drm core/prime perspective I think we could even unify the 2 dma_buf poiners we have currently - drivers could keep track of whether a gem bo is native or imported in their own state with just one bit ... - If gem->handle_refcount drops to zero we clean up the dma-buf pointer&cache entry and drop that reference. We need to be careful here with races against other users resurrect the object with a new userspace gem handle. But essentially it's the same dance as with flink, with the only complication that we now have two ways to create a new gem handle while we're doing the teardown when the last gem handle is closed. This way dma-buf mirrors the lifetime rules of flink, where we hold a reference on the lookup object (flink name/dma-buf) for as long a gem handle exists. flink names are driver private there's no need for a real refcount, but the current rules amount to an implicit refcount with only 0,1 as values. The lookup-object themselves also hold a reference onto the real backing storage. The second reason why I think we should give the dma-buf (re-)import cache a real reference is my framebuffer lifetime rework. I've played around with tons of different schemes which all kinda looked somewhat silly, until settling on holding a reference while the object is in the lookup idr. This reference gets dropped when the real user disappears: Either at rmfb ioctl time for userspace fbs, or at driver destruction for the fbdev fb (or whenever the driver drops an internally-created fb). Ime that pattern of keeping an explicit reference just leads to less locking headache. > b) don't keep track of dma_buf, keep track of gem objects, when we get > a lookup, check inside the gem object, since currently we NULL out the > export_dma_buf when we hit the release path, apart from the fact I'm > sure the locking is foobar, Scrolled through your patch quickly, but didn't do a full review. Good thing is that I've resurrect my ivb/nvc0 test machine and beaten the igt/prime tests a bit into shape, so I should be able to test things (not like last time around, where my attempt to fix up lifetime rules a bit resulted in some unduly oopses). I'll look at this in-depth next year, pls yell in case I'll forget by then ;-) Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel