2013/12/11 Daniel Vetter <daniel@xxxxxxxx>: > On Wed, Dec 11, 2013 at 6:44 PM, Paulo Zanoni <przanoni@xxxxxxxxx> wrote: >> >> After we re-enable the power well, every time we read/write the VGA >> address 0x3d5 we get an "unclaimed register" interrupt (ERR_INT) and >> print error messages. If we write anything to the VGA MSR register (it >> doesn't really matter which value you write to bit 0), any >> reads/writes to 0x3d5 _don't_ trigger the "unclaimed register" errors >> anymore (even if MSR bit 0 is zero). So what happens with the current >> code is that when we unbind i915 and bind vgacon, we call >> console_unlock(). Function console_unlock() is responsible for >> printing any messages that were supposed to be print when the console >> was locked, so it calls the TTY layer, which calls the console layer, >> which calls vgacon to print the messages. At this point, vgacon >> eventually calls vgacon_set_cursor_size(), which touches 0x3d5, which >> triggers unclaimed register interrupts. The problem is that when we >> get these interrupts, we print the error messages, so we add more work >> to console_unlock(), which will try to print it again, and then call >> vgacon again, trigger a new interrupt, which will put more stuff to >> the buffer, and then we'll be stuck at console_unlock() forever. >> >> If you patch intel_uncore.c to not print anything when we detect >> unclaimed registers, we won't get into the console_unlock() infinite >> loop and the driver unbind will work just fine. We will still be >> getting interrupts every time vgacon touches those registers, but we >> will survive. This is a valid experiment, but IMHO it's not the real >> fix: if we don't print any error messages we will still keep getting >> the interrupts, and if we disable ERR_INT we won't get the interrupt >> anymore, but we will also stop getting all the other error interrupts. >> >> I talked about this problem with the HW engineer and his >> recommendation is "So don't do any VGA I/O or memory access while the >> power well is disabled, and make to re-program MSR after enabling the >> power well and before using VGA I/O or memory accesses.". > > Impressive piece of debugging. And a great story ;-) > >> Based on this, I could write an even-smaller patch that just touches >> the MSR register to avoid the problem, also providing a nice comment >> and an improved commit message. Or I could also write another >> implementation that actually saves the value of MSR before we disable >> the power well, then restores the correct value after we re-enable it: >> but that wouldn't help vgacon too much. IMHO one of these new patches >> would be the appropriate way to solve the problem, but it seems that >> Daniel/Ville didn't like the previous patches, so I'm not sure they >> will be willing to accept the next version either. Since so far I >> haven't seen any alternative implementation/suggestion that actually >> fixes the problem, and since module_reload is not working on Haswell >> for months (possibly hiding other module_reload bugs), my vote is >> obviously to accept the next patch because at least it fixes the >> 6-months-old regression. But if I don't get an "ack" on this I won't >> spend my time working on this. What do you think? > > I'd go with re-disabling vga unconditionally through the msr. We don't > bother with restoring the vga plane anyway at module unload, so vga is > hosed no matter what. And it sounds like this is the more minimal fix. > So ack for the minimal fix to just poke the MSR at the right times. Just to be clear: by "at the right times" you mean "when re-enabling the power well", right? > >> Please also notice that this only fixes the case where the power well >> is already enabled when we unbind. The fix for when the power well is >> _disabled_ when we unbind depends on this, and IMHO we should discuss >> it later. > > I guess I need to resurrect my "kill vgacon" patch and figure out why > it doesn't work. See comment #45 of the bug report for an explanation on how it fails. I'm not sure if it's possible to do what you want. > Or could we just force-enable the power well when we > start our module unload sequence instead? As I mentioned before, driver unbind (a requirement for module unload) happens before we call any of the unload functions, so we never even get to run i915_driver_unload. > I'd prefer the most minimal > fix, and even better if we don't have to touch the vgacon hell at all Me too :) > ... > > Cheers, Daniel > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch -- Paulo Zanoni _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx