On Tue, Aug 22, 2017 at 01:14:19PM +0000, Szwichtenberg, Radoslaw wrote: > On Tue, 2017-08-22 at 13:33 +0100, Chris Wilson wrote: > > Quoting Szwichtenberg, Radoslaw (2017-08-22 12:56:00) > > > On Tue, 2017-08-22 at 01:31 +0300, Arkadiusz Hiler wrote: > > > > On Mon, Aug 21, 2017 at 09:39:24PM +0200, Daniel Vetter wrote: > > > > > On Mon, Aug 21, 2017 at 11:21:49AM +0100, Chris Wilson wrote: > > > > > > Quoting Chris Wilson (2017-08-21 10:53:36) > > > > > > > Quoting Arkadiusz Hiler (2017-08-21 10:42:25) > > > > > > > > On Mon, Aug 21, 2017 at 08:05:58AM +0000, Dec, Katarzyna wrote: > > > > > > > > > I understand we do not want to check registers in IGT tests. > > > > > > > > > What > > > > > > > > > about reading interrupt masks from debugfs > > > > > > > > > (i915_frequency_info). > > > > > > > > > > > > > > > > Hey Kasia > > > > > > > > > > > > > > > > It would be pretty much the same thing, but instead of us reading > > > > > > > > the > > > > > > > > PMINTRMASK directly we would ask the kernel to do that on our > > > > > > > > behalf. > > > > > > > > > > > > > > > > That would just hide register read, not get rid of it. > > > > > > > > > > > > > > > > > > > > > > > > I think you are missing the point. The idea is that we do not want > > > > > > > > to > > > > > > > > test details of in-kernel implementation, not ban the register > > > > > > > > reads > > > > > > > > completely. > > > > > > > > > > > > > > > > Reading register directly, especially just to make sure that the > > > > > > > > kernel > > > > > > > > set something correctly is a good indicator that we are trying to > > > > > > > > do > > > > > > > > just that - test the internal details. > > > > > > > > > > > > > > > > > Would that be better approach? You guys suggested to get > > > > > > > > > interested > > > > > > > > > in > > > > > > > > > kselftests for having such checks, but I am afraid that it could > > > > > > > > > be > > > > > > > > > too much job and we have too few hands to work. > > > > > > > > > > > > > > > > How much of an effort would the kselftest be, since it seems that > > > > > > > > you > > > > > > > > did some > > > > > > > > investigation already? > > > > > > > > > > > > > > It doesn't even require a whole selftest, just something like > > > > > > > > > > > > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c > > > > > > > b/drivers/gpu/drm/i915/intel_pm.c > > > > > > > index 448e71af4772..e83b67fe0354 100644 > > > > > > > --- a/drivers/gpu/drm/i915/intel_pm.c > > > > > > > +++ b/drivers/gpu/drm/i915/intel_pm.c > > > > > > > @@ -7733,7 +7733,8 @@ void intel_suspend_gt_powersave(struct > > > > > > > drm_i915_private *dev_priv) > > > > > > > if (cancel_delayed_work_sync(&dev_priv- > > > > > > > >rps.autoenable_work)) > > > > > > > intel_runtime_pm_put(dev_priv); > > > > > > > > > > > > > > - /* gen6_rps_idle() will be called later to disable > > > > > > > interrupts */ > > > > > > > + WARN_ON(I915_READ(GEN6_PMINTRMSK) != > > > > > > > + gen6_sanitize_rps_pm_mask(dev_priv, ~0)); > > > > > > > } > > > > > > > > > > > > Wrong spot. We actually need a call from > > > > > > intel_runtime_pm_disable_interrupts. > > > > > > > > > > Yeah, for consistency checks which are very closely tied to the > > > > > implementation we tend to sprinkle WARN_ON all over the place. In some > > > > > cases those checks are too expensive for production, then we add a > > > > > compile-time-option to hide them (e.g. GEM_BUG_ON). > > > > > > > > > > I chatted with Radek, and if I understand things correctly, the main > > > > > value > > > > > you derive from these is making sure a frankenstein port to an older > > > > > kernel doesn't miss or miss-apply any critical patches. In-kernel > > > > > consistency checks unfortunately don't really help with that, but we > > > > > heavily rely on these for validation. > > > > > > > > Having that stated on the mailing list from the beginning (e.g. in the > > > > commit message or as one of the first replies) would help directing the > > > > whole discussion on the right track and make us understand your needs > > > > better. > > > > > > > > I agree with Daniel's earlier statement that we should be very > > > > (over)verbose about the changes we are making and purpose they are > > > > serving. > > > > > > > > > There's also examples of this (and again, they're very important) > > > > > outside > > > > > of i915, like kasan, lockdep (and maybe we'll even get kmemleak somehow > > > > > integrated into CI/igt eventually). > > > > > > > > > > So still no idea what would be the best suggestion here for your team. > > > > > > > > Kasia and Radek, can you elaborate a little more on the "frankenstein > > > > port" and your use cases for such tests? > > > > > > > > How is that comparable to backports to stable/LTS kernel branches? > > > > > > > > > > This test proposed by Kasia not only was used to find various regressions > > > (including performance ones) that were later fixed on upstream (and example > > > would be patch from Sagar: https://patchwork.kernel.org/patch/9527335/). > > > > It is a terrible test for that. If you're goal is to validate performance, > > do that. For example, the presumption there is that RPS continues to > > respond after a sequence of events (delivering the expected > > performance). You can either use the gpu clocks as reported by the > > kernel, but you forgo that trust by doing a known workload and > > count cycles. The result should be that you get a test that matches > > typical patterns and corner cases of userspace, and that you are > > delivering the expected *userspace* behaviour. All in a way that makes > > no assumptions how the kernel/hw responds, just that it will. > > -Chris > > I do agree with your point - I just wanted give a example of where this test > turned out helpfull (not saying this test was aimed to catch such behaviour even > though it was helpfull in rootcausing the provlem). I hope Sagar will be able to > provide better examples. > > The most important part of this discussion probably is a question whether IGT > should be backwards compatible and help testing previous stable kernel releases. > I know that saying that some user of IGTs uses it in certain ways to test older > kernels (we do) is not enough to justify upstreaming these. We do maintain some > older kernels (stable ones) with some changes forklifted from upstream/newer > releases and this test does help finding out if all required changes were > cherrypicked correctly or not. One thing we could do on that front is run igt patches against older-ish kernels. Well, since full igt runs are already at the limit of machine time we have, that would probably be something that each release engineering team that cares about a specific kernel/hw combo would need to do. But patchwork supports that, you can add arbitrary amounts of CI farms to it that post results. If we only rely on review at least I don't think igt is going to be backward compatible enough for serious backporting efforts. There's still going to be issues, simply because we sometimes need to change the uabi (in a way that only breaks igt, not any real apps ofc). -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx