On Tue, Aug 16, 2022 at 09:13:26PM +0000, Dexuan Cui wrote:
> > From: Bjorn Helgaas <helgaas@xxxxxxxxxx>
> > Sent: Tuesday, August 16, 2022 8:51 AM
> > To: Dexuan Cui <decui@xxxxxxxxxxxxx>
> >
> > This has only observations with no explanations, and I don't see how
> > it will be useful to future readers of the git history.
> Please see the below.
>
> > I assume you bisected the problem to b4b77778ecc5?
> Yes.
>
> > Can you just revert that? A revert requires no more explanation than
> > "this broke something."
>
> It's better to not revert b4b77778ecc5, which is required by Jeff's
> Multi-MSI device, which doesn't seem to be affected by the interrupt
> issue I described.

You must debug it, there are no two ways about it. We can't apply fixes
on a hunch, more so given that I am not convinced at all this patch is
fixing anything; it is just papering over an underlying bug that is
still to be pinpointed.

> > I guess this is a fine distinction, but I really don't like random
> > code changes that "seem to avoid a problem but we don't know how."
> > A revert at least has the advantage that we can cover our eyes and
> > pretend the commit never happened. This patch feels like future
> > readers will have to try to understand the code even though we
> > clearly don't understand why it makes a difference.
>
> I just replied to Lorenzo's email with more details. FYI, this is the link
> to my reply:
> https://lwn.net/ml/linux-kernel/SA1PR21MB1335D08F987BBAE08EADF010BF6B9@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> I just felt the commit message might be too long if I had put all the
> details there. :-) Can we add a Links: tag?

Commit logs must describe the issue you are fixing, thoroughly and
concisely. Opening with "Jeffrey's 4 recent patches" is a very bad start
for a commit log; it means nothing. Please try to read your log as
someone who needs to understand the commit years down the line.

Now, back to this patch: we are at -rc1. Unless Bjorn is willing to do
so, I am not inclined to apply this patch till the next merge window
(and actually I am not inclined to merge it at all). This gives you
folks time to debug it (and it must be debugged). The fact that it works
for one multi-MSI device does not mean that the bug isn't still there -
I am worried that the issue is with b4b77778ecc5 and its interaction
with core MSI/IOMMU.

Thanks,
Lorenzo