2013/12/4 Daniel Vetter <daniel@xxxxxxxx>: > On Wed, Dec 4, 2013 at 2:44 PM, Paulo Zanoni <przanoni@xxxxxxxxx> wrote: >> 2013/12/4 Daniel Vetter <daniel@xxxxxxxx>: >>> On Thu, Nov 21, 2013 at 01:47:18PM -0200, Paulo Zanoni wrote: >>>> From: Paulo Zanoni <paulo.r.zanoni@xxxxxxxxx> >>>> >>>> Currently, PC8 is enabled at modeset_global_resources, which is called >>>> after intel_modeset_update_state. Due to this, there's a small race >>>> condition on the case where we start enabling PC8, then do a modeset >>>> while PC8 is still being enabled. The racing condition triggers a WARN >>>> because intel_modeset_update_state will mark the CRTC as enabled, then >>>> the thread that's still enabling PC8 might look at the data structure >>>> and think that PC8 is being enabled while a pipe is enabled. Despite >>>> the WARN, this is not really a bug since we'll wait for the >>>> PC8-enabling thread to finish when we call modeset_global_resources. >>>> >>>> So this patch makes sure we get/put PC8 before we update >>>> drm_crtc->enabled, because if a get() call triggers a PC8 disable, >>>> we'll call cancel_delayed_work_sync(), which will wait for the thread >>>> that's enabling PC8, then, after this, we'll disable PC8. >>>> >>>> The side-effect benefit of this patch is that we have a nice place to >>>> track enabled/disabled CRTCs, so we may want to move some code from >>>> modeset_global_resources to intel_crtc_set_state in the future. >>>> >>>> The problem fixed by this patch can be reproduced by the >>>> modeset-lpsp-stress-no-wait subtest from the pc8 test of >>>> intel-gpu-tools. >>>> >>>> v2: - No need for pc8.lock since we already have >>>> cancel_delayed_work_sync(). >>>> >>>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@xxxxxxxxx> >>> >>> Now that Imre's big rework has landed this looks a bit strange. We now >>> have the display power wells (properly refcounted) and the hsw pc8 stuff >>> here. Which imo doesn't make much sense since to have a working display >>> power well we can't go into pc8. >> >> This patch has nothing to do with power wells. We're checking >> enabled/disabled CRTCs here, not power wells. The problem this patch >> solves also happens on LPSP mode, where the power well is disabled. >> I'm confused, please clarify. > > With Imre's power well rework we now have two pieces that manage the > power state of the display hw: modeset_update_power_wells and > modeset_update_power_wells. The later is redundant since you can't > ever enable a crtc power well without disabling pc8. > > Note that I'm talking about the generic display power wells Imre > added, which means we also track the power domain usage for crtc A. > > My idea is to completely ditch the call to hsw_update_package_c8 and > shovel the respective pc8 get/put calls into the right power domains > in the haswell display power domain implementation in intel_pm.c. That > means that we need to adjust the always-on power well code a bit for > hsw. > > In a way power domains should nest, and code should always only care > about the innermost power domain. If it needs to grab explicit > references for power domains outside of that the structure isn't quite > right. Thanks for the clarification! Just documenting what we discussed on IRC yesterday: I agree this is the way to go, but I think this should all be follow-up patches. We'll also probably need to add some new power domains: for example which one do we grab when someone submits a execbuf? POWER_DOMAIN_GT? > >>> So the right thing to do is to only grab the topmost power well/domain >>> reference. Those in turn then need to grab the lower-level domains. I.e. >>> on hsw the crtc get/put should also do a pc8 get/put and then that in turn >>> should do a D3 get/put (if we keep them split up). >>> >> >> Same as above. >> >> >>> With that change it'd also make tons of sense to move all the hsw pc8 >>> stuff into intel_pm.c together with the other power well stuff. Imo the >>> approach with use_count in Imre's code is easier to understand. >>> >>> Now for the actual issue you're trying to fix here: That's just a race in >>> the check in assert_can_disable_lcpll - you access crtc->base.enabled >>> without taking the right locks. >> >> Which ones are the "right locks"? > > crtc->mutex. It'll deadlock though due to the synchronous work > cancelling. If you go with lockless you'd need some barriers since the > implicit barriers in schedule_work and cancel_work aren't good enough > any more. Ok, but assert_can_disable_lcpll is completely unrelated with this specific patch we're replying. We should fix this with a follow-up patch. Also, why aren't the implicit barriers good enough anymore? > >>> And if the work runs concurrently to >>> re-enabling the display power then it'll get confused. >> >> What exactly do you mean when you say "re-enabling the display power"? >> Power wells? > > power domains in general, whether this is what bspec calls "display > ower well" or pc8 mode or something else. Specifically here the race > seems to happen with the pc8 domain though. > >>> The other issue >>> with this check is that crtc->base.enabled is the wrong thing to check: It >>> only tracks sw state, e.g. it's still set when we're in dpms off mode. The >>> right thing to check would be crtc->active, and for that one we have the >>> correct ordering between get/put calls and updating the field already. >> >> No, we only set crtc->active to true after we call >> ironlake_crtc_enable, and that's too late for this specific bug. Also, >> we don't enable PC8 nor disable power wells while in DPMS off, there's >> a lot of work to do if we want to enable that, and this patch here is >> just a bug fix. > > I mean the WARN check in assert_can_disable_lcpll, not the checks > you're touching in your patch. For actually deciding whether we can > allow pc8 or not we should imo piggy-pack on top of Imre's rework of > the display power domain handling. > > Maybe we should go over the code and replace all mentions of > power_well with power_domain in the generic code to make this clearer. > Imo Imre's work isn't just about what Bspec calls "power wells" but > about managing display power domains in general. Which includes pc8. > >>> Plan B would be to just ditch this check. >> >> The "crtc->base.enabled == enabled" check? That's not possible, it >> would result in unbalanced get/put calls. > > I mean the check in assert_can_disable_lcpll for crtc->base.enabled, > not the ones you're touching in your patch here. Sorry for the unclear > "this" reference. Since we don't support disabling LCPLL when the CRTC is in DPMS state, I still think that check is not wrong (not considering locking/concurrency issues). But if we want to only check what's strictly necessary, yeah we could change to crtc->active or even directly poke at the register. Anyway, given all this discussion, I don't think I should do any changes on this specific 04/19 patch since it's completely unrelated with what we're discussing. Please correct me if I'm wrong. Thanks, Paulo > -Daniel > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch -- Paulo Zanoni _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx