On Fri, 2023-11-17 at 11:50 -0500, Rodrigo Vivi wrote:
> On Fri, Nov 17, 2023 at 11:26:44AM +0200, Ville Syrjälä wrote:
> > On Fri, Nov 17, 2023 at 10:41:43AM +0200, Ville Syrjälä wrote:
> > > On Fri, Nov 17, 2023 at 08:05:21AM +0000, Coelho, Luciano wrote:
> > > > Thanks for your comments, Ville!
> > > >
> > > > On Fri, 2023-11-17 at 09:19 +0200, Ville Syrjälä wrote:
> > > > > On Thu, Nov 16, 2023 at 01:27:00PM +0200, Luca Coelho wrote:
> > > > > > Since we're abstracting the display code from the underlying driver
> > > > > > (i.e. i915 vs xe), we can't use the uncore's spinlock to protect
> > > > > > critical sections of our code.
> > > > > >
> > > > > > After further inspection, it seems that the spinlock is not needed at
> > > > > > all and this can be handled by disabling preemption and interrupts
> > > > > > instead.
> > > > >
> > > > > uncore.lock has multiple purposes:
> > > > > 1. serialize all register accesses to the same cacheline as on
> > > > >    certain platforms that can hang the machine
> > > >
> > > > Okay, do you remember which platforms?
> > >
> > > HSW is the one I remember for sure being affected.
> > > Although I don't recall if I ever managed to hang it
> > > using display registers specifically. intel_gpu_top
> > > certainly was very good at reproducing the problem.
> > >
> > > > I couldn't find any reference to
> > > > this reason.
> > >
> > > If all else fails git log is your friend.
> >
> > It seems to be documented in intel_uncore.h. Though that one
> > mentions IVB instead of HSW for some reason. I don't recall
> > seeing it on IVB myself, but I suppose it might have been an
> > issue there as well. How long the problem remained after HSW
> > I have no idea.
>
> Paulo very recently told me that he could easily reproduce the issue
> on IVB, simply by running 2 glxgears at the same time.

Just a minor correction: I didn't give the degree of confidence in my
answer that the sentence above suggests :). It's all "as far as I
remember". This is all from like 10 years ago and I can't remember what
I had for lunch yesterday. Maybe it was some other similar bug that I
could reproduce with glxgears. Also, the way we used registers was
different back then, maybe today glxgears is not enough to do it
anymore. And I think it required vblank_mode=0.

>
> >
> > >
> > > > Also, the only place where we take the uncore.lock
> > > > is in this vblank code I changed, where the only explanation I found
> > > > was about timing, specifically when using RT-kernels and in very old
> > > > and slow platforms... (this was added 10 years ago).
> > > >
> > > >
> > > > > 2. protect the forcewake/etc. state
> > > > >
> > > > > 1 is relevant here, 2 is not.
> > > >
> > > > Okay, good that we have only one known problem. :)
>
> and good it is an old one! :)
>
> > > >
> > > > --
> > > > Cheers,
> > > > Luca.
> > >
> > > --
> > > Ville Syrjälä
> > > Intel
> >
> > --
> > Ville Syrjälä
> > Intel