On 2024-05-21, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> On Tue, May 21, 2024 at 5:27 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> >
> > On Tue, 2024-05-21 at 16:11 +0200, Christian Brauner wrote:
> > > On Tue, May 21, 2024 at 03:46:06PM +0200, Christian Brauner wrote:
> > > > On Mon, May 20, 2024 at 05:35:49PM -0400, Aleksa Sarai wrote:
> > > > > Now that we have stabilised the unique 64-bit mount ID interface in
> > > > > statx, we can now provide a race-free way for name_to_handle_at(2) to
> > > > > provide a file handle and corresponding mount without needing to worry
> > > > > about racing with /proc/mountinfo parsing.
> > > > >
> > > > > As with AT_HANDLE_FID, AT_HANDLE_UNIQUE_MNT_ID reuses a statx AT_* bit
> > > > > that doesn't make sense for name_to_handle_at(2).
> > > > >
> > > > > Signed-off-by: Aleksa Sarai <cyphar@xxxxxxxxxx>
> > > > > ---
> > > >
> > > > So I think overall this is probably fine (famous last words). If it's
> > > > just about being able to retrieve the new mount id without having to
> > > > take the hit of another statx system call it's indeed a bit much to
> > > > add a revised system call for this. Although I did say earlier that I
> > > > wouldn't rule that out.
> > > >
> > > > But if we'd do that then it'll be a long discussion on the form of the new
> > > > system call and the information it exposes.
> > > >
> > > > For example, I lack the grey hair needed to understand why
> > > > name_to_handle_at() returns a mount id at all. The pitch in commit
> > > > 990d6c2d7aee ("vfs: Add name to file handle conversion support") is that
> > > > the (old) mount id can be used to "lookup file system specific
> > > > information [...] in /proc/<pid>/mountinfo".
> > > >
> > > > Granted, that's doable but it'll mean a lot of careful checking to avoid
> > > > races for mount id recycling because they're not even allocated
> > > > cyclically. With lots of containers it becomes even more of an issue. So
> > > > it's doubtful whether exposing the mount id through name_to_handle_at()
> > > > would be something that we'd still do.
> > > >
> > > > So really, if this is just about a use-case where you want to spare the
> > > > additional system call for statx() and you need the mnt_id then
> > > > overloading is probably ok.
> > > >
> > > > But it remains an unpleasant thing to look at.
> > >
> > > And I'd like an ok from Jeff and Amir if we're going to try this. :)
> >
> > I don't have strong feelings about it other than "it looks sort of
> > ugly", so I'm OK with doing this.
> >
> > I suspect we will eventually need name_to_handle_at2, or something
> > similar, as it seems like we're starting to grow some new use-cases for
> > filehandles, and hitting the limits of the old syscall. I don't have a
> > good feel for what that should look like though, so I'm happy to put
> > that off for a while.
>
> I'm ok with it, but we cannot possibly allow it without any bikeshedding...
>
> Please call it AT_HANDLE_MNT_ID_UNIQUE to align with
> STATX_MNT_ID_UNIQUE
>
> and as I wrote, I do not like overloading the AT_*_SYNC flags
> and as there is no other obvious candidate to overload, so
> I think that it is best to at least declare in a comment that
>
> /* 0x00ff flags are reserved for per-syscall flags */
>
> and use one of those bits for AT_HANDLE_MNT_ID_UNIQUE.

I can switch the flag to use 0x80, but given there are already
exceptions to that rule, it seems unlikely that this is going to be a
strong guarantee going forward. I will add a comment though.

Note that this will mean that we are planning to only have 15 remaining
generic AT_* flags.

> It does not matter whether we decide to unify the AT_ flags
> namespace with RENAME_ flags namespace or not.
>
> The fact that there is a syscall named renameat2() with a flags
> argument, means that someone is bound to pass in an AT_ flags
> in this syscall sooner or later, so the least we can do is try to
> delay the day that this will not result in EINVAL.

While there is a risk this could happen, in theory a user could also
incorrectly pass AT_* to open(). While ergonomics is important, I think
that most users generally read the docs when figuring out how to use
flags for syscalls (mainly because we don't have a unified flag
namespace for all syscalls) so I don't think this is a huge problem.

(But I'm sure I was part of making this problem worse with RESOLVE_*
flags.)

> Thanks,
> Amir.
>
> P.S.: As I mentioned to Jeff in LSFMM, I have a patch in my tree
> to add AT_HANDLE_CONNECTABLE which I have not yet
> decided if it is upstream worthy.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>