On Wed, Jun 05, 2013 at 02:21:50PM -0300, Paulo Zanoni wrote: > From: Paulo Zanoni <paulo.r.zanoni at intel.com> > > Hi > > So this series is a second version of the patch I sent on April 16th. Daniel > asked to write some patches to properly initialize the interrupts we're touching > on the PC8+ patches, so the first 5 patches should do this. > > The 6th patch is the big one enabling PC8+ and even though I still didn't > address all of Daniel's concerns from his first review, the patch already works: > we get into PC10 and when we get back to PC7 things are still working. Ok, so real review now, not just function name bikeshedding ;-) * I'd really like to see a testcase, but not per se to exercise pc8+ but to check that our prepare/exit code and especially the interrupt disabling/enabling doesn't break anything. So the overall structure of the test should probably be something like 1. Put the console into raw mode (so that fbcon doesn't interfere), Imre's recent infrastructure patches should make this really simple. 2. Disable all outputs with setcrtc(fb=NULL). 3. Wait a bit until we're sure that our code has prepared for pc8+, i.e. all the interrupts are shut off, lcpll is disabled ... 4. Run a subtest. 5. Repeat. For subtests we need quite a pile of them I think: - Test getconnectors ioctl to make sure both our ->detec and ->get_modes callbacks have the proper aux_display runtime pm get/put calls. On a quick read through your patch the get/put (currently still call forbid/allow) in the ->get_modes hooks seem to be missing. - Direct i2c access from userspace. This happens both from random tools probing the i2c bus, but also for real uses-cases, like the ddccontrol tool or e.g. the Chromebook Pixie which uses gmbus as the hw i2c engine for the touchpad. - Submit a patch (this might just work when waking the gt automatically wakes up everything) _and_ do a blocking wait for it (this will break if interrupts are dead). It's probably good to have some correctness checks for these, e.g. compare the edid property while outputs are still on (so pc8+ is impossible) with what happens when they're off. Execbuf could be tested with a simple fill blt. Imo it's also important that we have a wait before each new subtest, to make sure we really teste everything. E.g. a subtest only checks EDID for one output, then waits again. Otherwise it could be that the working refcounting in e.g. HDMI could shadow broken runtime pm refcounting for dp aux if it's done right afterwards. * spin_lock_irqsave isn't that irq-save ;-) It only blocks interrupt on the local cpu. See Rusty's unreliable guide to kernel locking in Documentation/DocBook/kernel-locking. If you don't understand what spin_lock_irqsave is useful for, how our current locking works or why the one in your patch is broken I think we should discuss this afterwards. Reading my irq locking review patches and comparing it to the locking guid could also be interesting. I haven't yet done the full review, but probably the safest route is to fully disable our interrupt handler with disable_irq while we do the pc8+ prepare sequence. On the exit sequence we probably don't need it. In any case we can reenable interrupts again with enable_irq. Maybe there's a possible race there (I need to review the core irq code first), but now I think having a tiny race there where we could miss hpd events is less evil than adding complex locking just for pc8+ entry. Even more so with the next point, which aims to unify pc8+ entry/exit with our existing suspend/resume code. And s/r runs in a non-reentrant enviroment, so if we can keep those basic assumptions that's imo worthwile. * Unifying irq setup/teardown code. I'm not really happy with the setup/teardown code, and I think we should take a good look at trying to unify it with the existing driver load/unload/suspend/resume code. Massive register save/restore code tends to get out of sync with other code changes way too easily. Currently our irq enable sequence is: 1. call ->irq_preinstall hook 2. enable irq handler 3. call ->irq_postinstall hook and somewhen later 4. enable hotplug interrupt handling with our own ->hpd_irq_setup hook. The disabling sequence is: 1. disable irq handler 2. call ->irq_uninstall hook Now for pc8+ we want to disable/enable almost everything _but_ the hpd interrupts. I think with some code wrestling we should be able to do that with the existing functions: - Split the irq_unistall into two parts, irq_unistall which disables all interrupt sources _but_ hotplug, and a hpd_irq_teardown which disables hotplug, too. For that we'd need an i915_irq_unistall functions which first calls drm_irq_uninstall and the new ->hpd_irq_teardown hook. That would also allow us to unify the hotplug reenable timer teardown a bit. - Rework ->irq_postinstall and ->irq_uninstall to not frob hpd settings. Since we control hpd interrupts all through SDEIMR it should be fairly simple to disable everything but hotplug interrupts, even more so with the new helpers I've added in my irq locking review series. - Important is that the new ->irq_uninstall hook won't disable the master interrupt sources, that'd would all be moved into the ->hpd_irq_teardown code. With the pc8+ prepare sequence would look like 1. disable irq handler 2. call ->irq_unistall hook directly -> hpd interrupts are still working 3. re-enable irq handler And the exit sequence would be 1. disable irq handler 2. call ->irq_postinstall 3. re-enable irq handler * As you might have guessed I see the biggest risk in us screwing up the interrupt handling and especially someone sneaking in while we're in pc8+ mode and changing the irq registers. Safe for the hotplug bits which might need masking due to a hotplug storm I expect all such changes to be bugs. The upshot of the above proposal now is that we can make sure that for all the pc8+ relevant interrupt bits calling ->irq_uninstall leaves behind the same state as ->irq_preinstall. So we could sprinkle massive amounts of WARNs into ->irq_postinstall to check that all the interupt registers are still in the correct state and no one changed them. * Finally (and this part is really just an idea I'll throw out here) our driver setup/teardown sequences are getting pretty complicated, and different parts (load, s/r, pc8+, ...) will use them ever so slightly. I think adding an atomic_t dev_priv->driver_stage_ready bit mask would be useful. Every time a driver stage is loaded/setup/resumed it sets the bit, if we unload/suspend/teardown it gets cleared. And every driver setup stage can check at the beginning whether all the required pieces are present. This should help us in documenting the sometimes tricky ordering constraints in general. For this case specifically I'm thinking of 2 driver stages: - general_irq_ready - hpd_irq_ready Since atomic_read is as cheap as any unordered load we could sprinkle a few of these checks over the code, e.g. in the gmbus or dp aux irq-based wait code. Similar e.g. for gem in the __wait_seqno function. This should help a lot in debugging bugs due to missing refcounting. For priorities I think the best approach is to start with the testcase. I think currently you're code is missing quite a few runtime pm get/put calls, so I'm pretty sure it'll blow up. And if it doesn't my understanding of how pc8+ works is seriously wrong. And once we have a solid testcase it should be much easier to move the code piece around a bit so that we have a sane pc8+ enter/exit sequence. Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch