Re: [RFC][PATCH] fanotify: allow to set errno in FAN_DENY permission response

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Feb 12, 2024 at 2:01 PM Jan Kara <jack@xxxxxxx> wrote:
>
> On Thu 08-02-24 21:21:13, Amir Goldstein wrote:
> > On Thu, Feb 8, 2024 at 8:31 PM Jan Kara <jack@xxxxxxx> wrote:
> > > On Thu 08-02-24 16:04:29, Amir Goldstein wrote:
> > > > > > On Mon 29-01-24 20:30:34, Amir Goldstein wrote:
> > > > > > > On Mon, Dec 18, 2023 at 5:53 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > > > > > > > In the HttpDirFS HSM demo, I used FAN_OPEN_PERM on a mount mark
> > > > > > > > to deny open of file during the short time that it's content is being
> > > > > > > > punched out.
> > > > > > > > It is quite complicated to explain, but I only used it for denying access,
> > > > > > > > not to fill content and not to write anything to filesystem.
> > > > > > > > It's worth noting that returning EBUSY in that case would be more meaningful
> > > > > > > > to users.
> > > > > > > >
> > > > > > > > That's one case in favor of allowing FAN_DENY_ERRNO for FAN_OPEN_PERM,
> > > > > > > > but mainly I do not have a proof that people will not need it.
> > > > > > > >
> > > > > > > > OTOH, I am a bit concerned that this will encourage developer to use
> > > > > > > > FAN_OPEN_PERM as a trigger to filling file content and then we are back to
> > > > > > > > deadlock risk zone.
> > > > > > > >
> > > > > > > > Not sure which way to go.
> > > > > > > >
> > > > > > > > Anyway, I think we agree that there is no reason to merge FAN_DENY_ERRNO
> > > > > > > > before FAN_PRE_* events, so we can continue this discussion later when
> > > > > > > > I post FAN_PRE_* patches - not for this cycle.
> > > > > > >
> > > > > > > I started to prepare the pre-content events patches for posting and got back
> > > > > > > to this one as well.
> > > > > > >
> > > > > > > Since we had this discussion I have learned of another use case that
> > > > > > > requires filling file content in FAN_OPEN_PERM hook, FAN_OPEN_EXEC_PERM
> > > > > > > to be exact.
> > > > > > >
> > > > > > > The reason is that unless an executable content is filled at execve() time,
> > > > > > > there is no other opportunity to fill its content without getting -ETXTBSY.
> > > > > >
> > > > > > Yes, I've been scratching my head over this usecase for a few days. I was
> > > > > > thinking whether we could somehow fill in executable (and executed) files on
> > > > > > access but it all seemed too hacky so I agree that we probably have to fill
> > > > > > them in on open.
> > > > > >
> > > > >
> > > > > Normally, I think there will not be a really huge executable(?)
> > > > > If there were huge executables, they would have likely been broken down
> > > > > into smaller loadable libraries which should allow more granular
> > > > > content filling,
> > > > > but I guess there will always be worst case exceptions.
> > > > >
> > > > > > > So to keep things more flexible, I decided to add -ETXTBSY to the
> > > > > > > allowed errors with FAN_DENY_ERRNO() and to decided to allow
> > > > > > > FAN_DENY_ERRNO() with all permission events.
> > > > > > >
> > > > > > > To keep FAN_DENY_ERRNO() a bit more focused on HSM, I have
> > > > > > > added a limitation that FAN_DENY_ERRNO() is allowed only for
> > > > > > > FAN_CLASS_PRE_CONTENT groups.
> > > > > >
> > > > > > I have no problem with adding -ETXTBSY to the set of allowed errors. That
> > > > > > makes sense. Adding FAN_DENY_ERRNO() to all permission events in
> > > > > > FAN_CLASS_PRE_CONTENT groups - OK,
> > > > >
> > > > > done that.
> > > > >
> > > > > I am still not very happy about FAN_OPEN_PERM being part of HSM
> > > > > event family when I know that O_TRUCT and O_CREAT call this hook
> > > > > with sb writers held.
> > > > >
> > > > > The irony, is that there is no chance that O_TRUNC will require filling
> > > > > content, same if the file is actually being created by O_CREAT, so the
> > > > > cases where sb writers is actually needed and the case where content
> > > > > filling is needed do not overlap, but I cannot figure out how to get those
> > > > > cases out of the HSM risk zone. Ideas?
> > > > >
> > > >
> > > > Jan,
> > > >
> > > > I wanted to run an idea by you.
> > > >
> > > > I like your idea to start a clean slate with
> > > > FAN_CLASS_PRE_CONTENT | FAN_REPORT_FID
> > > > and it would be nice if we could restrict this HSM to use
> > > > pre-content events, which is why I was not happy about allowing
> > > > FAN_DENY_ERRNO() for the legacy FAN_OPEN*_PERM events,
> > > > especially with the known deadlocks.
> > > >
> > > > Since we already know that we need to generate
> > > > FAN_PRE_ACCESS(offset,length) for read-only mmap() and
> > > > FAN_PRE_MODIFY(offset,length) for writable mmap(),
> > > > we could treat uselib() and execve() the same way and generate
> > > > FAN_PRE_ACCESS(0,i_size) as if the file was mmaped.
> > >
> > > BTW uselib() is deprecated and there is a patch queued to not generate
> > > OPEN_EXEC events for it because it was causing problems (not the generation
> > > of events itself but the FMODE_EXEC bit being set in uselib). So I don't
> > > think we need to instrument uselib().
> > >
> >
> > Great. The fewer the better :)
> >
> > BTW, for mmap, I was thinking of adding fsnotify_file_perm() next to
> > call sites of security_mmap_file(), but I see that:
> > 1. shmat() has security_mmap_file() - is it relevant?
>
> Well, this is effectively mmap(2) of a tmpfs file. So I don't think this is
> particularly useful for HSM purposes but we should probably have it for
> consistency?

OK.

Actually, I think there is a use case for tmpfs + HSM. HSM is about
lazy populating of a local filesystem from a remote slower tier.
The fast local tier could be persistent, but could just as well be tmpfs,
for a temporary lazy local mirror.

This use case reminds overlayfs a bit, where the local tier is the upper fs,
but in case of HSM, the "copy up" happens also on read.

I do want to limit HSM to "fully featured local fs", so we do not end up
with crazy fs doing HSM. The very minimum should be decodable
file handle support and unique fsid, but we could also think of more strict
requirements(?). Anyway, tmpfs should meet all the "fully featured local fs"
requirement.

>
> > 2. remap_file_pages() calls do_mmap() without security_mmap_file() -
> >     do we need to cover it?
>
> Hmm, AFAIU remap_file_pages() just allows you to mess with an existing
> mapping so it isn't very interesting from HSM POV?

Nope.

>
> > > > I've already pushed a POC to fan_pre_content branch [1].
> > > > Only sanity tested that nothing else is broken.
> > > > I still need to add the mmap hooks and test the new events.
> > > >
> > > > With this, HSM will have appropriate hooks to fill executable
> > > > and library on first access and also fill arbitrary files on open
> > > > including the knowledge if the file was opened for write.
> > > >
> > > > Thoughts?
> > >
> > > Yeah, I guess this looks sensible.
> >
> > Cool, so let's see, what is left to do for the plan of
> > FAN_CLASS_PRE_CONTENT | FAN_REPORT_FID?
> >
> > 1. event->fd is O_PATH mount_fd for open_by_handle_at()
> > 2. open_by_handle_at() inherits FMODE_NONOTIFY from mount_fd
> > 3. either implement the FAN_CLOSE_FD response flag (easy?) and/or
> >     implement FAN_REPORT_EVENT_ID and new header format
> >
> > Anything else?
> > Are you ok with 1 and 2?
>
> I'm not sure about 1) and 2) so I'm mostly thinking out loud now. AFAIU you
> want to provide mount_fd only because of FMODE_NONOTIFY inheritance so 2)
> is the key question. But if you provide mount_fd with FMODE_NONOTIFY and
> have FMODE_NONOTIFY inheritance, then what's the difference to just allow
> opens with FMODE_NONOTIFY from the start? I don't think restricting
> FMODE_NONOTIFY to inheritance gives any additional strong security
> guarantees?
>

You are right.
Just need to think of the right API.
It is just very convenient to be getting the FMODE_NONOTIFY from
fanotify, but let's think about it.

> I understand so far we didn't expose FMODE_NONOTIFY so that people cannot
> bypass fanotify permission events. But now we need a sensible way to fill
> in the filesystem without deadlocking on our own watches. Maybe exposing
> FMODE_NONOTIFY as an open flag is too attractive for misuse so could be
> somehow tie it to the HSM app? We were already discussing that we need to
> somehow register the HSM app with the filesystem to avoid issues when the
> app crashes or so. So maybe we could tie this registration to the ability
> of bypassing generation of notification?
>

Last time we discussed this the conclusion was an API of a group-less
default mask, for example:

1. fanotify_mark(FAN_GROUP_DEFAULT,
                           FAN_MARK_ADD | FAN_MARK_MOUNT,
                           FAN_PRE_ACCESS, AT_FDCWD, path);
2. this returns -EPERM for access until some group handles FAN_PRE_ACCESS
3. then HSM is started and subscribes to FAN_PRE_ACCESS
4. and then the mount is moved or bind mounted into a path exported to users

It is a simple solution that should be easy to implement.
But it does not involve "register the HSM app with the filesystem",
unless you mean that a process that opens an HSM group
(FAN_REPORT_FID|FAN_CLASS_PRE_CONTENT) should automatically
be given FMODE_NONOTIFY files?

There is one more crazy idea that I was pondering -
what if we used the fanotify_fd as mount_fd arg to open_by_handle_at()?
The framing is that it is not the filesystem, but fanotify which actually
encoded the fsid+fid, so HSM could be asking fanotify to decode them.
Technically, the group could keep a unique map from fsid -> sb, then
fanotify group could decode an fanotify_event_info_fid buffer to a specific
inode on a specific fs.
Naturally, those decoded files would be FMODE_NONOTIFY.

Too crazy?

I am very much open to a simpler suggestion.

> > Do you have a preference for 3?
>
> Yeah. FAN_CLOSE_FD seems as a hack to me (and you're likely to get it wrong
> until you've spent quite some time debugging with exec sometimes fails with
> ETXTBUSY). So I prefer FAN_REPORT_EVENT_ID and use event id as an
> identifier when replying to HSM events.

OK.

At the end of this week I am going on a one week vacation,
so it is unlikely that pre-content events will be ready for 6.9.
Perhaps I will be able to get some prep work ready for the
merge window, we shall see.

Thanks,
Amir.





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux