On Tue, 13 Jul 2021 09:48:01 -0400
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:

> On 7/12/21 7:38 PM, Halil Pasic wrote:
> > On Wed, 7 Jul 2021 11:41:56 -0400
> > Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:
> >
> >> It was pointed out during an unrelated patch review that locks should not
> >> be open coded - i.e., writing the algorithm of a standard lock in a
> >> function instead of using a lock from the standard library. The setting and
> >> testing of the kvm_busy flag and sleeping on a wait_event is the same thing
> >> a lock does. Whatever potential deadlock was found and reported via the
> >> lockdep splat was not magically removed by going to a wait_queue; it just
> >> removed the lockdep annotations that would identify the issue early
> > Did you change your opinion since we last talked about it? This reads to
> > me like we are deadlocky without this patch, because of the last
> > sentence.
>
> The words are a direct paraphrase of Jason G's responses to my
> query regarding what he meant by open coding locks. I
> am choosing to take his word on the subject and remove the
> open coded locks.
>
> Having said that, we do not have a deadlock problem without
> this patch. If you recall, the lockdep splat occurred ONLY when
> running a Secure Execution guest in a CI environment. Since
> AP is not yet supported for SE guests, there is no danger of
> a lockdep splat occurring in a customer environment. Given
> Jason's objections to the original solution (i.e., kvm_busy flag
> and wait queue), I decided to replace the so-called open
> coded locks.

I'm in favor of doing that. But if ("s390/vfio-ap: fix circular lockdep
when setting/clearing crypto masks") ain't buggy, then this patch does
not qualify for stable.

For a complete set of rules consult:
https://github.com/torvalds/linux/blob/master/Documentation/process/stable-kernel-rules.rst

Here are the most relevant points:
 * It must fix a real bug that bothers people (not a, "This could be a
   problem..." type thing).
 * It must fix a problem that causes a build error (but not for things
   marked CONFIG_BROKEN), an oops, a hang, data corruption, a real
   security issue, or some "oh, that's not good" issue. In short,
   something critical.
 * No "theoretical race condition" issues, unless an explanation of how
   the race can be exploited is also provided.

Jason may give it another try to convince us that 0cc00c8d4050 only
silenced lockdep, but vfio_ap remained prone to deadlocks. To the best
of my knowledge, using a condition variable and a mutex is one of the
well-known ways to implement an rwlock.

In my opinion, you should drop the Fixes tag, drop the cc stable, and
provide a patch description that corresponds to *your* understanding of
the situation. Neither the Fixes tag nor the stable process is (IMHO)
meant for these types of (style) issues. And if you don't think the
alleged problem is real, don't make the description of your patch say
it is real.

Regards,
Halil

>
> >
> > Regards,
> > Halil
>
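
For readers unfamiliar with the remark above that a condition variable
plus a mutex is a well-known way to build an rwlock, here is a minimal
userspace sketch using POSIX threads. It is an illustration only and is
not the vfio_ap code: the cv_rwlock type and the cv_* function names
are hypothetical, and the in-kernel variant discussed in this thread
used a kvm_busy flag with a wait queue rather than pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical illustration type; not taken from vfio_ap. */
struct cv_rwlock {
	pthread_mutex_t mtx;	/* protects the fields below */
	pthread_cond_t cond;	/* signalled whenever the lock may be free */
	int readers;		/* number of active readers */
	bool writer;		/* true while a writer holds the lock */
};

static void cv_rwlock_init(struct cv_rwlock *l)
{
	pthread_mutex_init(&l->mtx, NULL);
	pthread_cond_init(&l->cond, NULL);
	l->readers = 0;
	l->writer = false;
}

/* Readers only have to wait for an active writer. */
static void cv_read_lock(struct cv_rwlock *l)
{
	pthread_mutex_lock(&l->mtx);
	while (l->writer)
		pthread_cond_wait(&l->cond, &l->mtx);
	l->readers++;
	pthread_mutex_unlock(&l->mtx);
}

static void cv_read_unlock(struct cv_rwlock *l)
{
	pthread_mutex_lock(&l->mtx);
	assert(l->readers > 0);
	if (--l->readers == 0)
		pthread_cond_broadcast(&l->cond);	/* wake waiting writers */
	pthread_mutex_unlock(&l->mtx);
}

/* A writer waits until there is no writer and no reader. */
static void cv_write_lock(struct cv_rwlock *l)
{
	pthread_mutex_lock(&l->mtx);
	while (l->writer || l->readers > 0)
		pthread_cond_wait(&l->cond, &l->mtx);
	l->writer = true;
	pthread_mutex_unlock(&l->mtx);
}

static void cv_write_unlock(struct cv_rwlock *l)
{
	pthread_mutex_lock(&l->mtx);
	assert(l->writer);
	l->writer = false;
	pthread_cond_broadcast(&l->cond);	/* wake readers and writers */
	pthread_mutex_unlock(&l->mtx);
}
```

The sketch makes no attempt at fairness, so a steady stream of readers
can starve a writer. It also shows why the style objection quoted above
exists: the flag-and-wait pattern behaves like a lock, but a lock
validator has no annotations to work with.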