On Thu, May 27, 2021 at 8:42 PM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
> On Thu, May 27, 2021 at 01:51:13PM +0200, Rodrigo Campos wrote:
> >
> > Kees, as I mentioned in the linked thread, this issue is present in
> > 5.9+ kernels. Should we add the cc to stable for this patch? Or should
> > we cc to stable the one linked, that just fixes the issue without
> > semantic changes to userspace?
>
> It sounds like the problem is with Go, using addfd, on 5.9-5.13 kernels,
> yes?

Yes.

> Would the semantic change be a problem there? (i.e. it sounds like
> the semantic change was fine for the 5.14+ kernels, so I'm assuming it's
> fine for earlier ones too.)

No, I don't think it will cause any problem.

> > Just to be clear, the other patch that fixes the problem without
> > userspace visible changes is this:
> > https://lore.kernel.org/lkml/20210413160151.3301-1-rodrigo@xxxxxxxxxx/
>
> I'd prefer to use the now-in-next fix if we can. Is it possible to build
> a test case that triggers the race so we can have some certainty that
> any fix in -stable covers it appropriately?

I've verified that Sargun's patch also solves the problem in mainline.
I have now also verified that it applies cleanly and fixes the issue
for linux-stable/5.10.y and linux-stable/5.12.y too (without the patch
I see the problem, with the patch I don't see it). 5.11 is already EOL,
so I didn't try it (it will probably work as well).

The test case that I have is quite a complicated one, though. I'm using
the PR we opened to runc to add support for seccomp notify[1] and a
seccomp agent slightly modified from the example in the PR, with some
cgo to use addfd. I need to run it for several thousand iterations, as
the kernel needs to be interrupted at a specific line and some kernel
locks need to be acquired in a specific order for this to trigger.

If you think it is important, I can try to clean up the code and share
it, but the issue is basically what I explained here:
https://lore.kernel.org/lkml/20210413160151.3301-2-rodrigo@xxxxxxxxxx/

Can we cc this patch to stable, then? :)

Best,
Rodrigo

[1]: https://github.com/opencontainers/runc/pull/2682