On Sat, 2023-12-09 00:24:42 +0000, Casey Schaufler wrote:
>
> On 12/8/2023 3:32 PM, Paul Moore wrote:
> > On Fri, Dec 8, 2023 at 6:21 PM Casey Schaufler <casey@xxxxxxxxxxxxxxxx> wrote:
> >> On 12/8/2023 2:43 PM, Paul Moore wrote:
> >>> On Thu, Dec 7, 2023 at 9:14 PM Munehisa Kamata <kamatam@xxxxxxxxxx> wrote:
> >>>> On Tue, 2023-12-05 14:21:51 -0800, Paul Moore wrote:
> >>> ..
> >>>
> >>>>> I think my thoughts are neatly summarized by Andrew's "yuk!" comment
> >>>>> at the top. However, before we go too much further on this, can we
> >>>>> get clarification that Casey was able to reproduce this on a stock
> >>>>> upstream kernel? Last I read in the other thread Casey wasn't seeing
> >>>>> this problem on Linux v6.5.
> >>>>>
> >>>>> However, for the moment I'm going to assume this is a real problem, is
> >>>>> there some reason why the existing pid_revalidate() code is not being
> >>>>> called in the bind mount case? From what I can see in the original
> >>>>> problem report, the path walk seems to work okay when the file is
> >>>>> accessed directly from /proc, but fails when done on the bind mount.
> >>>>> Is there some problem with revalidating dentrys on bind mounts?
> >>>> Hi Paul,
> >>>>
> >>>> https://lkml.kernel.org/linux-fsdevel/20090608201745.GO8633@xxxxxxxxxxxxxxxxxx/
> >>>>
> >>>> After reading this thread, I have doubt about solving this in VFS.
> >>>> Honestly, however, I'm not sure if it's entirely relevant today.
> >>> Have you tried simply mounting proc a second time instead of using a bind mount?
> >>>
> >>> % mount -t proc non /new/location/for/proc
> >>>
> >>> I ask because from your description it appears that proc does the
> >>> right thing with respect to revalidation, it only becomes an issue
> >>> when accessing proc through a bind mount. Or did I misunderstand the
> >>> problem?
> >> It's not hard to make the problem go away by performing some simple
> >> action. I was unable to reproduce the problem initially because I
> >> checked the Smack label on the bind mounted proc entry before doing
> >> the cat of it. The problem shows up if nothing happens to update the
> >> inode.
> > A good point.
> >
> > I'm kinda thinking we just leave things as-is, especially since the
> > proposed fix isn't something anyone is really excited about.
>
> "We have to compromise the performance of our sandboxing tool because of
> a kernel bug that's known and for which a fix is available."
>
> If this were just a curiosity that wasn't affecting real development I
> might agree. But we've got a real world problem, and I don't see ignoring
> it as a good approach. I can't see maintainers of other LSMs thinking so
> if this were interfering with their users.

We use bind mounts to keep the information exposed to the sandboxed task
to a minimum. We also create a separate PID namespace for each sandbox,
but even then we still want the bind mounts, to hide system-wide and
pid 1 information from the task.

So, yeah, I see this as a real problem for our use case, and I would like
opinions on a possibly better fix.

Thanks,
Munehisa