At Thu, 4 Dec 2014 11:21:47 +0000,
Chris Wilson wrote:
> 
> On Thu, Dec 04, 2014 at 11:53:05AM +0100, Takashi Iwai wrote:
> > At Wed, 3 Dec 2014 18:31:45 +0000,
> > Chris Wilson wrote:
> > > 
> > > On Wed, Dec 03, 2014 at 03:45:35PM +0100, Takashi Iwai wrote:
> > > > Hi,
> > > > 
> > > > while checking the reported bug about VT switch hang on openSUSE 13.2,
> > > > I also could reproduce a similar issue as reported: namely, X hangs
> > > > when repeatedly switching VT quickly.
> > > > 
> > > > For example, running the following on KDE results in the stall of X.
> > > > 
> > > >   % for i in $(seq 1 100); do chvt 1; chvt 7; done
> > > > 
> > > > Looking at the sysrq-t output, it stalls at drm_read().  And after
> > > > putting some debug prints at event handling codes, it shows like:
> > > > 
> > > >   drm_queue_vblank_event event_space=4064
> > > >   send_vblank_event event_space=4064
> > > >   drm_poll ENTER event_space=4064
> > > >   drm_poll mask=0x41 event_space=4064
> > > >   drm_poll ENTER event_space=4064
> > > >   drm_poll mask=0x41 event_space=4064
> > > >   drm_read ENTER event_space=4064
> > > >   drm_read total=32 event_space=4096
> > > >   drm_poll ENTER event_space=4096
> > > >   drm_poll mask=0x0 event_space=4096
> > > >   drm_read ENTER event_space=4096
> > > >   drm_read ENTER event_space=4096
> > > >   drm_read ENTER event_space=4096
> > > > 
> > > > So, after a vblank event, two poll calls succeeded, followed by one
> > > > drm_read().  After that, there were one poll call without event,
> > > > followed by three(!) drm_read() calls.  The last three drm_read()
> > > > never exited, thus X stalled.  So, this looks like a race or a
> > > > refcount issue somewhere.
> > > 
> > > The key question is how did you get 3 calls to drm_read that each didn't
> > > return? The only place where we call drm_read without first doing a poll
> > > is in the WakeupHandler with the drm fd flagged for reads. This is
> > > broken in ZaphodHeads as the drm fd is not O_NONBLOCK without
> > > 
> > > commit bd008e5b2953186fc0c6633a885ade95e7043800
> > > Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > > Date:   Tue Oct 7 14:13:51 2014 +0100
> > > 
> > >     drm: Implement O_NONBLOCK support on /dev/dri/cardN
> > > 
> > > I assume that isn't the case as I expect you would have mentioned using
> > > ZaphodHeads.
> > 
> > I took a look back at drm_read() code again, and I found that the
> > function doesn't care about O_NONBLOCK at all.  (And there is a memory
> > leak, too.)
> > 
> > So I added the support for O_NONBLOCK, and the problem seems
> > resolved.
> > 
> > Although this is no right "fix" (the caller side should be fixed), it
> > would be good to have in anyway.  I'm going to send patches for review
> > to dri-devel ML, as it's no i915 specific.
> 
> I disagree. drm has claimed to support O_NONBLOCK since its inception,
> but the implementation was buggy.

The nonblock read is obviously buggy.  If the current implementation
is intentional, then the nonblock flag is somehow misused...

> However, I don't think there is a case
> in non-ZaphodHeads where we use read() without first select/poll
> reporting that there is something to use (and the problem with
> ZaphodHeads is that we have two screens that share the same drm fd
> without clearing the select read flags... hmm)

In my case, I'm using a single screen, so this can't be.

And, my rough guess is that this isn't about the lack of poll but
rather some race between poll/read or two reads.  That explains why my
patch worked.

In anyway I'd need to trap X stall and diagnose, but I have to leave
my machine now.  Will check it tomorrow.

Meanwhile, it's interesting to see whether this covers Maarten's case,
too...


thanks,

Takashi
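
For reference, the poll-then-read pattern discussed in this thread can be
sketched in user space roughly as below.  This is only an illustration
under assumptions, not the X server's or the driver's actual code: the
helper names set_nonblocking() and read_events() are hypothetical, and
any readable fd (a pipe, for instance) stands in for the drm fd.  The
point is that with O_NONBLOCK set, a read() that loses a race against
another reader fails with EAGAIN instead of sleeping forever.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode.
 * Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: wait up to timeout_ms for fd to become readable,
 * then read once.  Returns the number of bytes read, 0 if nothing was
 * available (timeout, or EAGAIN after losing a race; EOF also yields 0
 * in this sketch), or -1 on any other error. */
static ssize_t read_events(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ready;
	ssize_t n;

	assert(buf != NULL);

	/* Retry poll() if a signal interrupts it. */
	do {
		ready = poll(&pfd, 1, timeout_ms);
	} while (ready < 0 && errno == EINTR);

	if (ready < 0)
		return -1;
	if (ready == 0 || !(pfd.revents & POLLIN))
		return 0;

	/* Retry read() if a signal interrupts it. */
	do {
		n = read(fd, buf, len);
	} while (n < 0 && errno == EINTR);

	/* Someone else consumed the data between poll() and read(). */
	if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
		return 0;
	return n;
}
```

A caller would run set_nonblocking() once after opening the fd and then
call read_events() from its event loop.  If the fd is left in blocking
mode, the read() after a stale poll result can sleep indefinitely, which
is at least consistent with the drm_read() stalls in the trace above.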