On Mon, Mar 4, 2024 at 12:33 PM Jan Kara <jack@xxxxxxx> wrote:
>
> On Tue 27-02-24 21:42:37, Amir Goldstein wrote:
> > On Mon, Feb 19, 2024 at 1:01 PM Jan Kara <jack@xxxxxxx> wrote:
> > >
> > > On Thu 15-02-24 17:40:07, Amir Goldstein wrote:
> > > > > > Last time we discussed this the conclusion was an API of a group-less
> > > > > > default mask, for example:
> > > > > >
> > > > > > 1. fanotify_mark(FAN_GROUP_DEFAULT,
> > > > > >                  FAN_MARK_ADD | FAN_MARK_MOUNT,
> > > > > >                  FAN_PRE_ACCESS, AT_FDCWD, path);
> > > > > > 2. this returns -EPERM for access until some group handles FAN_PRE_ACCESS
> > > > > > 3. then HSM is started and subscribes to FAN_PRE_ACCESS
> > > > > > 4. and then the mount is moved or bind mounted into a path exported to users
> > > > >
> > > > > Yes, this was the process I was talking about.
> > > > >
> > > > > > It is a simple solution that should be easy to implement.
> > > > > > But it does not involve "register the HSM app with the filesystem",
> > > > > > unless you mean that a process that opens an HSM group
> > > > > > (FAN_REPORT_FID|FAN_CLASS_PRE_CONTENT) should automatically
> > > > > > be given FMODE_NONOTIFY files?
> > > > >
> > > > > Two ideas: What you describe above seems like what the new mount API was
> > > > > intended for? What if we introduced something like an "hsm" mount option
> > > > > which would basically enable calling into pre-content event handlers
> > > >
> > > > I like that.
> > > > I forgot that with my suggestion we'd need a path to setup
> > > > the default mask.
> > > >
> > > > > (for sb without this flag handlers wouldn't be called and you cannot place
> > > > > pre-content marks on such sb).
> > > >
> > > > IMO, that limitation (i.e. inside brackets) is too restrictive.
> > > > In many cases, the user running HSM may not have control over the
> > > > mount of the filesystem (inside containers?).
> > > > It is true that HSM without anti-crash protection is less reliable,
> > > > but I think that it is still useful enough that users will want the
> > > > option to run it (?).
> > > >
> > > > Think of my HttpDirFS demo - it's just a simple lazy mirroring
> > > > of a website. Even with low reliability I think it is useful (?).
> > >
> > > Yeah, ok, makes sense. But for such "unprivileged" usecases we don't have
> > > a deadlock-free way to fill in the file contents because that requires a
> > > special mountpoint?
> >
> > True, unless we also keep the FMODE_NONOTIFY event->fd
> > for the simple cases. I'll need to think about this some more.
>
> Well, but even creating new fds with FMODE_NONOTIFY or setting up fanotify
> group with HSM events need to be somehow privileged operation (currently
> it requires CAP_SYS_ADMIN). So the more I think about it the less obvious
> the "unprivileged" usecase seems to be.
>

ok. Let's put this one on ice for now.

> > > > > These handlers would return EACCES unless
> > > > > there's somebody handling events and returning something else.
> > > > >
> > > > > You could then do:
> > > > >
> > > > > fan_fd = fanotify_init()
> > > > > ffd = fsopen()
> > > > > fsconfig(ffd, FSCONFIG_SET_STRING, "source", device, 0)
> > > > > fsconfig(ffd, FSCONFIG_SET_FLAG, "hsm", NULL, 0)
> > > > > rootfd = fsconfig(ffd, FSCONFIG_CMD_CREATE, NULL, NULL, 0)
> > > > > fanotify_mark(fan_fd, FAN_MARK_ADD, ... , rootfd, NULL)
> > > > > <now you can move the superblock into the mount hierarchy>
> > > >
> > > > Not too bad.
> > > > I think that "hsm_deny_mask=" mount options would give more flexibility,
> > > > but I could be convinced otherwise.
> > > >
> > > > It's probably not a great idea to be running two different HSMs on the same
> > > > fs anyway, but if user has an old HSM version installed that handles only
> > > > pre-content events, I don't think that we want this old version if it happens
> > > > to be run by mistake, to allow for unsupervised create,rename,delete if the
> > > > admin wanted to atomically mount a fs that SHOULD be supervised by a
> > > > v2 HSM that knows how to handle pre-path events.
> > > >
> > > > IOW, an "HSM bit" on sb is too broad IMO.
> > >
> > > OK. So "hsm_deny_mask=" would essentially express events that we require HSM
> > > to handle, the rest would just be accepted by default. That makes sense.
> >
> > Yes.
> >
> > > The only thing I kind of dislike is that this ties fanotify API with mount
> > > API. So perhaps hsm_deny_mask should be specified as a string? Like
> > > preaccess,premodify,prelookup,... and transformed into a bitmask only
> > > inside the kernel? It gives us more maneuvering space for the future.
> > >
> >
> > Urgh. I see what you are saying, but this seems so ugly to me.
> > I have a strong feeling that we are trying to reinvent something
> > and that we are going to reinvent it badly.
> > I need to look for precedents, maybe in other OS.
> > I believe that in Windows, there is an API to register as a
> > Cloud Engine Provider, so there is probably a way to have multiple HSMs
> > working on different sections of the filesystem in some reliable
> > crash safe manner.
>
> OK, let's see what others have come up with :)

From my very basic Google research (did not ask Chat GPT yet ;))
I think that MacOS FSEvents do not have blocking permission events at all,
so there is no built-in HSM API.

The Windows Cloud Sync Engine API:
https://learn.microsoft.com/en-us/windows/win32/cfapi/build-a-cloud-file-sync-engine
does allow registering different "Storage namespace providers".
AFAICT, the persistence of "Place holder" files is based on NTFS
"Reparse points", a long-standing native concept that allows
registering a persistent hook on a file, to be handled by a specific
Windows driver.

So for example, a Dropbox place holder file is a file with a "reparse
point" that has some label to direct the read/write calls to the
Windows Cloud Sync Engine driver, and a sub-label to direct the handling
of the upcall by the Dropbox CloudSync Engine service.

I do not want to deal with "persistent fanotify marks" at this time,
so maybe something like:

fsconfig(ffd, FSCONFIG_SET_STRING, "hsmid", "dropbox", 0)
fsconfig(ffd, FSCONFIG_SET_STRING, "hsmver", "1", 0)

Add support ioctls in fanotify_ioctl():
- FANOTIFY_IOC_HSMID
- FANOTIFY_IOC_HSMVER

And require that a group with matching hsmid and recent hsmver has
a live filesystem mark on the sb.

If this is an acceptable API for a single crash-safe HSM provider,
then the question becomes:
How would we extend this to multiple crash-safe HSM providers
in the future?

Or maybe we do not need to support multiple HSM groups per sb?
Maybe in the future a generic service could be implemented to
delegate different HSM modules, e.g.:

fsconfig(ffd, FSCONFIG_SET_STRING, "hsmid", "cloudsync", 0)
fsconfig(ffd, FSCONFIG_SET_STRING, "hsmver", "1", 0)

And a generic "cloudsync" service could be in charge of registration
of "cloudsync" engines and dispatching the pre-content event to the
appropriate module based on path (i.e. under the dropbox folder).

Once this gets past NACKs from fs developers I'd like to pull in
some distro people to the discussion and maybe bring this up as
a topic discussion for LSFMM if we feel that there is something
to discuss.

Thoughts?

Amir.
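
For illustration only, the "matching hsmid and recent hsmver" requirement
could be modelled in user space roughly as below. Nothing here is an
existing kernel interface: struct hsm_sb_config, struct hsm_group,
hsm_group_satisfies() and hsm_sb_supervised() are hypothetical names,
and "recent hsmver" is assumed to mean "group version >= the version
configured on the sb", which the proposal does not spell out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: what the "hsmid"/"hsmver" mount options would record on the sb. */
struct hsm_sb_config {
	const char *hsmid;
	unsigned int hsmver;
};

/*
 * Hypothetical: what a fanotify group would declare through the proposed
 * FANOTIFY_IOC_HSMID / FANOTIFY_IOC_HSMVER ioctls, plus whether it
 * currently holds a live filesystem mark on this sb.
 */
struct hsm_group {
	const char *hsmid;
	unsigned int hsmver;
	bool has_live_sb_mark;
};

/* One group qualifies if it has a live mark, the same id, and a new enough version. */
static bool hsm_group_satisfies(const struct hsm_sb_config *sb,
				const struct hsm_group *grp)
{
	if (!grp->has_live_sb_mark)
		return false;
	if (strcmp(sb->hsmid, grp->hsmid) != 0)
		return false;
	/* Assumption: "recent" means at least the version the admin asked for. */
	return grp->hsmver >= sb->hsmver;
}

/*
 * The sb counts as supervised only while at least one qualifying group
 * exists; otherwise pre-content events would be denied by default.
 */
static bool hsm_sb_supervised(const struct hsm_sb_config *sb,
			      const struct hsm_group *groups, size_t ngroups)
{
	for (size_t i = 0; i < ngroups; i++) {
		if (hsm_group_satisfies(sb, &groups[i]))
			return true;
	}
	return false;
}
```

With a sb configured as hsmid "dropbox", hsmver 2, an old v1 "dropbox"
group started by mistake would not count as supervising the sb, which
is the same concern as with the plain "HSM bit".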