On Thu, 20 Oct 2016 16:17:25 +0200,
Ville Syrjälä wrote:
>
> On Thu, Oct 20, 2016 at 03:36:54PM +0200, Takashi Iwai wrote:
> > On Thu, 20 Oct 2016 15:28:14 +0200,
> > Ville Syrjälä wrote:
> > >
> > > On Thu, Oct 20, 2016 at 03:20:55PM +0200, Takashi Iwai wrote:
> > > > Since 4.7 kernel, we've seen the error messages like
> > > >
> > > >   kernel: [TTM] Buffer eviction failed
> > > >   kernel: qxl 0000:00:02.0: object_init failed for (4026540032, 0x00000001)
> > > >   kernel: [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO
> > > >
> > > > on QXL when switching and accessing on VT. The culprit was the generic
> > > > deferred_io code (qxl driver switched to it since 4.7). There is a
> > > > race between the dirty clip update and the call of callback.
> > > >
> > > > In drm_fb_helper_dirty(), the dirty clip is updated in the spinlock,
> > > > while it kicks off the update worker outside the spinlock. Meanwhile
> > > > the update worker clears the dirty clip in the spinlock, too. Thus,
> > > > when drm_fb_helper_dirty() is called concurrently, schedule_work() is
> > > > called after the clip is cleared in the first worker call.
> > >
> > > Why does that matter? The first worker should have done all the
> > > necessary work already, no?
> >
> > Before the first call, it clears the clip and passes the copied clip
> > to the callback. Then the second call will be with the cleared and
> > untouched clip, i.e. with x1=~0. This confuses
> > qxl_framebuffer_dirty().
> >
> > Of course, we can filter out in the callback side by checking the
> > clip. It was actually my first version. But basically it's a race
> > and should be covered better in the caller side.
>
> The race is still there AFAICS. The worker may already be executing but
> not yet in the critical section, at which point drm_fb_helper_dirty()
> will expand the dirty rectangle, and schedule another work. So the first
> worker will already see the expanded rectangle, and second worker will
> get zilch.

Hrm, right, there's a slight race window there.

> I think the only good fix is to have the worker validate the dirty
> rectangle before calling the driver.

OK, let me cook it quickly. (It was actually the second version of
the patch I wrote, and I sent the third one :)


thanks,

Takashi
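
For illustration only, here is a minimal userspace sketch of the approach suggested above, where the worker validates the dirty rectangle before calling the driver. It is not the actual patch. The names struct clip, clip_reset(), clip_expand(), clip_is_valid(), fake_driver_dirty() and worker_run() are all hypothetical stand-ins, and the locking is only indicated by a comment.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the helper's dirty clip rectangle. */
struct clip {
	unsigned int x1, y1, x2, y2;
};

/* Counts how often the (fake) driver callback was invoked. */
int driver_calls;

/* Cleared state as described in the thread: x1 = y1 = ~0, x2 = y2 = 0. */
void clip_reset(struct clip *c)
{
	c->x1 = c->y1 = ~0u;
	c->x2 = c->y2 = 0;
}

/* Grow the clip so that it covers the damaged area (x, y, w, h). */
void clip_expand(struct clip *c, unsigned int x, unsigned int y,
		 unsigned int w, unsigned int h)
{
	if (x < c->x1)
		c->x1 = x;
	if (y < c->y1)
		c->y1 = y;
	if (x + w > c->x2)
		c->x2 = x + w;
	if (y + h > c->y2)
		c->y2 = y + h;
}

/* A clip that was never touched after a reset fails this test. */
bool clip_is_valid(const struct clip *c)
{
	return c->x1 < c->x2 && c->y1 < c->y2;
}

/* Stand-in for the driver's dirty callback. */
void fake_driver_dirty(const struct clip *c)
{
	(void)c;
	driver_calls++;
}

/*
 * Worker body: take a copy of the shared clip and reset it (in the
 * kernel this part would sit inside the spinlock), then call the
 * driver only if the copy describes a real rectangle.
 * Returns true if the driver was called.
 */
bool worker_run(struct clip *shared)
{
	struct clip copy = *shared;

	clip_reset(shared);

	if (!clip_is_valid(&copy))
		return false;	/* spurious wakeup, nothing to flush */

	fake_driver_dirty(&copy);
	return true;
}
```

With this check, a second worker run that finds the clip already consumed by the first one returns without touching the driver, no matter which side of the race scheduled it.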