On Sat, Dec 9, 2023 at 4:17 PM Munehisa Kamata <kamatam@xxxxxxxxxx> wrote:
> On Sat, 2023-12-09 10:10:32 -0800, Paul Moore wrote:
> > On Fri, Dec 8, 2023 at 8:11 PM Munehisa Kamata <kamatam@xxxxxxxxxx> wrote:
> > > On Sat, 2023-12-09 00:24:42 +0000, Casey Schaufler wrote:
> > > > On 12/8/2023 3:32 PM, Paul Moore wrote:
> > > > > On Fri, Dec 8, 2023 at 6:21 PM Casey Schaufler <casey@xxxxxxxxxxxxxxxx> wrote:
> > > > >> On 12/8/2023 2:43 PM, Paul Moore wrote:
> > > > >>> On Thu, Dec 7, 2023 at 9:14 PM Munehisa Kamata <kamatam@xxxxxxxxxx> wrote:
> > > > >>>> On Tue, 2023-12-05 14:21:51 -0800, Paul Moore wrote:
> > > > >>> ..
> > > > >>>
> > > > >>>>> I think my thoughts are neatly summarized by Andrew's "yuk!" comment
> > > > >>>>> at the top. However, before we go too much further on this, can we
> > > > >>>>> get clarification that Casey was able to reproduce this on a stock
> > > > >>>>> upstream kernel? Last I read in the other thread Casey wasn't seeing
> > > > >>>>> this problem on Linux v6.5.
> > > > >>>>>
> > > > >>>>> However, for the moment I'm going to assume this is a real problem, is
> > > > >>>>> there some reason why the existing pid_revalidate() code is not being
> > > > >>>>> called in the bind mount case? From what I can see in the original
> > > > >>>>> problem report, the path walk seems to work okay when the file is
> > > > >>>>> accessed directly from /proc, but fails when done on the bind mount.
> > > > >>>>> Is there some problem with revalidating dentrys on bind mounts?
> > > > >>>> Hi Paul,
> > > > >>>>
> > > > >>>> https://lkml.kernel.org/linux-fsdevel/20090608201745.GO8633@xxxxxxxxxxxxxxxxxx/
> > > > >>>>
> > > > >>>> After reading this thread, I have doubt about solving this in VFS.
> > > > >>>> Honestly, however, I'm not sure if it's entirely relevant today.
> > > > >>> Have you tried simply mounting proc a second time instead of using a bind mount?
> > > > >>>
> > > > >>> % mount -t proc non /new/location/for/proc
> > > > >>>
> > > > >>> I ask because from your description it appears that proc does the
> > > > >>> right thing with respect to revalidation, it only becomes an issue
> > > > >>> when accessing proc through a bind mount. Or did I misunderstand the
> > > > >>> problem?
> > > > >> It's not hard to make the problem go away by performing some simple
> > > > >> action. I was unable to reproduce the problem initially because I
> > > > >> checked the Smack label on the bind mounted proc entry before doing
> > > > >> the cat of it. The problem shows up if nothing happens to update the
> > > > >> inode.
> > > > > A good point.
> > > > >
> > > > > I'm kinda thinking we just leave things as-is, especially since the
> > > > > proposed fix isn't something anyone is really excited about.
> > > >
> > > > "We have to compromise the performance of our sandboxing tool because of
> > > > a kernel bug that's known and for which a fix is available."
> > > >
> > > > If this were just a curiosity that wasn't affecting real development I
> > > > might agree. But we've got a real world problem, and I don't see ignoring
> > > > it as a good approach. I can't see maintainers of other LSMs thinking so
> > > > if this were interfering with their users.
> > >
> > > We do bind mount to make information exposed to the sandboxed task as little
> > > as possible. We also create a separate PID namespace for each sandbox, but
> > > still want to bind mount even with it to hide system-wide and pid 1
> > > information from the task.
> > >
> > > So, yeah, I see this as a real problem for our use case and want to seek an
> > > opinion about a possibly better fix.
> >
> > First, can you confirm that this doesn't happen if you do a second
> > proc mount instead of a bind mount of the original /proc as requested
> > previously?
>
> Mounting the entire /proc was considered and this doesn't happen with it.
> Although we still prefer to do bind mount for the reasons above and then
> seek a solution.

Ah, I had forgotten that you aren't bind mounting all of /proc, only a
PID specific directory.  I guess I'm not surprised this is behaving a
little odd in some corner cases and I'm even less inclined to support
a hack patch to handle this case; if we're going to fully support
this, the patch will need to be pretty clean.

-- 
paul-moore.com
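
For readers following along, the two setups contrasted in this thread can be
sketched roughly as below. This is only a dry-run illustration, not the
reporter's actual sandbox code: the function names, the PID argument, and the
/sandbox/proc-task path are hypothetical, and the script prints the mount
commands rather than running them, since mounting requires root.

```shell
#!/bin/sh
# Dry-run sketch: prints the mount commands instead of executing them.
# Function names and the /sandbox path are hypothetical, for illustration only.

# Setup described by the reporter (as understood from the thread):
# bind mount only one PID-specific directory out of the existing /proc.
bind_pid_dir_cmd() {
    pid=$1
    target=$2
    printf 'mount --bind /proc/%s %s\n' "$pid" "$target"
}

# Alternative suggested in the thread: mount a second proc instance.
# The device argument ("proc" here) is only a label for procfs.
second_proc_mount_cmd() {
    target=$1
    printf 'mount -t proc proc %s\n' "$target"
}

# Print both variants so they can be compared side by side.
bind_pid_dir_cmd "$$" /sandbox/proc-task
second_proc_mount_cmd /new/location/for/proc
```

Per the reports above, the problem was only seen with the first form; the
second form exposes the whole of /proc, which is what the reporter wants to
avoid.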