On Fri, Sep 11, 2015 at 12:36:24AM +0530, Animesh Manna wrote: > > > On 9/10/2015 8:15 PM, Daniel Vetter wrote: > >On Thu, Sep 10, 2015 at 01:58:54AM +0530, Animesh Manna wrote: > >> > >>On 9/2/2015 2:24 PM, Daniel Vetter wrote: > >>>On Wed, Aug 26, 2015 at 07:40:54PM +0530, Animesh Manna wrote: > >>>>On 8/26/2015 6:40 PM, Daniel Vetter wrote: > >>>>>On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote: > >>>>>>Dmc will restore the csr program except DC9, cold boot, > >>>>>>warm reset, PCI function level reset, and hibernate/suspend. > >>>>>> > >>>>>>intel_csr_load_program() function is used to load the firmware > >>>>>>data from kernel memory to csr address space. > >>>>>> > >>>>>>All values of csr address space will be zero if it got reset and > >>>>>>the first byte of csr program is always a non-zero if firmware > >>>>>>is loaded successfuly. Based on hardware status will load the > >>>>>>firmware. > >>>>>> > >>>>>>Without this condition check if we overwrite the firmware data the > >>>>>>counters exposed for dc5/dc6 (help for debugging) will be nullified. > >>>>Bacause of the above reason mentioned just above we need to block firmware loading again. > >>>>So only WARN_ON will not help. > >>>> > >>>> > >>>>>>v1: Initial version. > >>>>>> > >>>>>>v2: Based on review comments from Daniel, > >>>>>>- Added a check to know hardware status and load the firmware if not loaded. > >>>>>> > >>>>>>Cc: Daniel Vetter <daniel.vetter@xxxxxxxxx> > >>>>>>Cc: Damien Lespiau <damien.lespiau@xxxxxxxxx> > >>>>>>Cc: Imre Deak <imre.deak@xxxxxxxxx> > >>>>>>Cc: Sunil Kamath <sunil.kamath@xxxxxxxxx> > >>>>>>Signed-off-by: Animesh Manna <animesh.manna@xxxxxxxxx> > >>>>>>Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@xxxxxxxxx> > >>>>>>--- > >>>>>> drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++ > >>>>>> 1 file changed, 9 insertions(+) > >>>>>> > >>>>>>diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c > >>>>>>index ba1ae03..682cc26 100644 > >>>>>>--- a/drivers/gpu/drm/i915/intel_csr.c > >>>>>>+++ b/drivers/gpu/drm/i915/intel_csr.c > >>>>>>@@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev) > >>>>>> return; > >>>>>> } > >>>>>>+ /* > >>>>>>+ * Dmc will restore the csr the program except DC9, cold boot, > >>>>>>+ * warm reset, PCI function level reset, and hibernate/suspend. > >>>>>>+ * This condition will help to check if csr address space is reset/ > >>>>>>+ * not loaded. > >>>>>>+ */ > >>>>>Atm we call this from driver load and resume, which doesn seem to cover > >>>>>all the cases you mention in the comment. Should this be a WARN_ON > >>>>>instead? Or do we have troubles in our init sequence where we load too > >>>>>many times? > >>>>Yes, the above statement taken from bspec to describe about the special cases dmc will not restore the firmware. > >>>>Agree, In our cases cold boot and hibernate/suspend mainly we need to load the firmware again, so in my > >>>>second sentence I wanted to comment mainly regarding this condition check added for suspend-hibernate(reset) > >>>>and cold boot(not loaded). > >>>> > >>>>Anyways the same api later can be used to load the firmware from anywhere, so my intention to check firmware loaded or not. > >>>>If already loaded then not to overwrite the csr address space to maintain the dc5/dc6 counter value. > >>>> > >>>>Can the below comment more clear to you. > >>>> > >>>> /* > >>>> * Dmc will restore the csr the program except DC9, cold boot, > >>>> * warm reset, PCI function level reset, and hibernate/suspend. > >>>> * If firmware is restored by dmc then no need to load again which > >>>> * will keep the dc5/dc6 counter exposed by firmware. > >>>> */ > >>>> > >>>>No issue in init sequence. > >>>That seems to still cover all the callers of the function afaics - we do > >>>pci resets over suspend resume unconditionally. So I still don't > >>>understand where exactly we try to load the dmc firmware in i915.ko when > >>>it's already loaded. > >>During resume intel_csr_load_program() will be called from > >>intel_runtime_resume(). > >> > >>intel_runtime_resume()-> skl_resume_prepare()-> intel_csr_load_program() > >> > >>During Pc10 entry testing I can see dmc is restoring back the firmware always, > >>but as you mentioned pci-reset can happen unconditionally, but still then > >>also during resume intel_runtime_resume() will be called and based on > >>register read of csr-base-address firmware loading will happen. > >But in your comment you're saying it won't get restored in case of dc9 and > >suspend. So that seems to mismatch what you're saying here (and what the > >commit message says) and what the code does. And this function here is > >called for resume after suspend/hibernate only. > > pc10 entry explanation I told is for skylake. dc9 in skylake is not possible. > I think you are confusing between dc6 and dc9. Pc10 can be achieved by > entering into dc6 (not dc9) for skylake. dc9 is the lowest possible state > for broxton which is not present for skylake. I have no idea at all about different pc levels on skl. What I'm talking about is system suspend/resume and driver load, which are the places this function gets called. At least afaics. > Here intel_csr_load_program() will be used for both skylake and broxton, and instruction > execution flow will be different in case of suspend/resume which I think is confusing > you. That seems like really important information. What's different on bxt? These are the kind of details you should explain in the commit message ... > I am ready explain you in detail. It will be good if we discuss specific use-case scenario > and itz software design for specific platform. Another point - as dmc related code for > broxton is not merged better first we close design for skylake. Now, I have added dc9 > description in comment thinking of future. If you want I can remove for now and later > can add in bxt patch series for enabling dmc. Will wait for your reply. This question here isn't about the overall design and how to handle power wells in skl/bxt. That's a separate discussion and tracked somewhere else. I'm really just confused about when exactly we need to reload to firmware, and why we need a runtime check for that. Normally we should know when to reload the firmware and just either reload or not, without checking hw state. And I don't like checking for hw state since at least in the past that kind of code ended up being fragile - it's an illusion that it does the right thing no matter what, since often there's other tricky ordering constraints. And if you have automatic duct-tape like then no one will ever spot those other, harder to spot issues, until an expensive customer escalation happens. So what I want to know here is: - When exactly do we need to reload dmc firmware. - What exactly is the reason why we can't make that decision statically in the code (by calling csr_load at the right spots). Thanks, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx