On Mon, Dec 18, 2023 at 4:35 PM Jan Kara <jack@xxxxxxx> wrote: > > On Fri 15-12-23 18:50:39, Amir Goldstein wrote: > > On Fri, Dec 15, 2023 at 5:31 PM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote: > > > > > > On Wed, Dec 13, 2023 at 09:09:30PM +0200, Amir Goldstein wrote: > > > > On Wed, Dec 13, 2023 at 7:28 PM Jan Kara <jack@xxxxxxx> wrote: > > > > > > > > > > On Fri 08-12-23 10:01:35, Amir Goldstein wrote: > > > > > > With FAN_DENY response, user trying to perform the filesystem operation > > > > > > gets an error with errno set to EPERM. > > > > > > > > > > > > It is useful for hierarchical storage management (HSM) service to be able > > > > > > to deny access for reasons more diverse than EPERM, for example EAGAIN, > > > > > > if HSM could retry the operation later. > > > > > > > > > > > > Allow userspace to response to permission events with the response value > > > > > > FAN_DENY_ERRNO(errno), instead of FAN_DENY to return a custom error. > > > > > > > > > > > > The change in fanotify_response is backward compatible, because errno is > > > > > > written in the high 8 bits of the 32bit response field and old kernels > > > > > > reject respose value with high bits set. > > > > > > > > > > > > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> > > > > > > > > > > So a couple of comments that spring to my mind when I'm looking into this > > > > > now (partly maybe due to my weak memory ;): > > > > > > > > > > 1) Do we still need the EAGAIN return? I think we have mostly dealt with > > > > > freezing deadlocks in another way, didn't we? > > > > > > > > I was thinking about EAGAIN on account of the HSM not being able to > > > > download the file ATM. > > > > > > > > There are a bunch of error codes that are typical for network filesystems, e.g. > > > > ETIMEDOUT, ENOTCONN, ECONNRESET which could be relevant to > > > > HSM failures. > > > > > > > > > > > > > > 2) If answer to 1) is yes, then there is a second question - do we expect > > > > > the errors to propagate back to the unsuspecting application doing say > > > > > read(2) syscall? Because I don't think that will fly well with a big > > > > > majority of applications which basically treat *any* error from read(2) as > > > > > fatal. This is also related to your question about standard permission > > > > > events. Consumers of these error numbers are going to be random > > > > > applications and I see a potential for rather big confusion arising there > > > > > (like read(1) returning EINVAL or EBADF and now you wonder why the hell > > > > > until you go debug the kernel and find out the error is coming out of > > > > > fanotify handler). And the usecase is not quite clear to me for ordinary > > > > > fanotify permission events (while I have no doubts about creativity of > > > > > implementors of fanotify handlers ;)). > > > > > > > > > > > > > That's a good question. > > > > I prefer to delegate your question to the prospect users of the feature. > > > > > > > > Josef, which errors did your use case need this feature for? > > > > > > > > > 3) Given the potential for confusion, maybe we should stay conservative and > > > > > only allow additional EAGAIN error instead of arbitrary errno if we need it? > > > > > > > > > > > > > I know I was planning to use this for EDQUOT error (from FAN_PRE_MODIFY), > > > > but I certainly wouldn't mind restricting the set of custom errors. > > > > I think it makes sense. The hard part is to agree on this set of errors. > > > > > > > > > > I'm all for flexibility here. > > > > > > We're going to have 2 classes of applications interacting with HSM backed > > > storage, normal applications and applications that know they're backed by HSM. > > > The normal applications are just going to crash if they get an error on read(2), > > > it doesn't matter what errno it is. The second class would have different > > > things they'd want to do in the face of different errors, and that's what this > > > patchset is targeting. We can limit it to a few errno's if that makes you feel > > > better, but having more than just one would be helpful. > > > > Ok. In another email I got from your colleagues, they listed: > > EIO, EAGAIN, ENOSPC as possible errors to return. > > I added EDQUOT for our in house use case. > > OK, so do I get it right that you also have applications that are aware > that they are operation on top of HSM managed filesystem and thus they can > do meaningful things with the reported errors? > Some applications are HSM aware. Some just report the errors that they get which are meaningful to users. EIO is the standard response for HSM failure to fill content. EDQUOT/ENOSPC is a good example of special functionality. HSM "swaps out" file content to a slow remote tier, but the slow remote tier may have a space/quota limit that is known to HSM. By tracking the total of st_size under some HSM managed folder, including the st_size of files whose content is punched out, HSM can enforce this limit not in the conventional meaning of local disk blocks usage. This is when returning EDQUOT/ENOSPC for FAN_PRE_MODIFY makes sense to most users/applications, except for ones that try to create large sparse files... > > Those are all valid errors for write(2) and some are valid for read(2). > > ENOSPC/EDQUOT make a lot of sense for HSM for read(2), but could > > be surprising to applications not aware of HSM. > > I think it wouldn't be that bad if we let HSM decide which of those errors > > to return for FAN_PRE_ACCESS as opposed to FAN_PRE_MODIFY. > > Yeah, I don't think we need to be super-restrictive here, I'd just prefer > to avoid the "whatever number you decide to return" kind of interface > because I can see potential for confusion and abuse there. I think all four > errors above are perfectly fine for both FAN_PRE_ACCESS and FAN_PRE_MODIFY > if there are consumers that are able to use them. > > > But given that we do want to limit the chance of abusing this feature, > > perhaps it would be best to limit the error codes to known error codes > > for write(2) IO failures (i.e. not EBADF, not EFAULT) and allow returning > > FAN_DENY_ERRNO only for the new FAN_PRE_{ACCESS,MODIFY} > > HSM events. > > > > IOW, FAN_{OPEN,ACCESS}_PERM - no FAN_DENY_ERRNO for you! > > > > Does that sound good to you? > > It sounds OK to me. I'm open to allowing FAN_DENY_ERRNO for FAN_OPEN_PERM > if there's a usecase because at least conceptually it makes a good sense > and chances for confusion are low there. People are used to dealing with > errors on open(2). > I wrote about one case I have below. > > Furthermore, we can start with allowing a very limited set of errors > > and extend it in the future, on case by case basis. > > > > The way that this could be manageable is if we provide userspace > > a way to test for supported return codes. > > > > There is already a simple method that we used for testing FAN_INFO > > records type support - > > After fan_fd = fanotify_init(), userspace can write a "test" fanotify_response > > to fan_fd with fanotify_response.fd=FAN_NOFD. > > > > When setting fanotify_response.fd=FAN_DENY, this would return ENOENT, > > but with fanotify_response.fd=FAN_DENY_ERRNO(EIO), upstream would > > return EINVAL. > > > > This opens the possibility of allowing, say, EIO, EAGAIN in the first release > > and ENOSPC, EDQUOT in the following release. > > If we forsee that ENOSPC and EDQUOT will be needed, then we can just enable > it from start and not complicate our lives more than necessary. > Sure, I was just giving an example how the list could be extended case by case in the future. > > The advantage in this method is that it is very simple and already working > > correctly for old kernels. > > > > The downside is that this simple method does not allow checking for > > allowed errors per specific event type, so if we decide that we do want > > to allow returning FAN_DENY_ERRNO for FAN_OPEN_PERM later on, this method > > could not be used by userspace to test for this finer grained support. > > True, in that case the HSM manager would have to try responding with > FAN_DENY_ERRNO() and if it fails, it will have to fallback to responding > with FAN_DENY. Not too bad I'd say. > Yeah that works too. > > In another thread, I mention the fact that FAN_OPEN_PERM still has a > > potential freeze deadlock when called from open(O_TRUNC|O_CREATE), > > so we can consider the fact that FAN_DENY_ERRNO is not allowed with > > FAN_OPEN_PERM as a negative incentive for people to consider using > > FAN_OPEN_PERM as a trigger for HSM. > > AFAIU from the past discussions, there's no good use of FAN_OPEN_PERM > event for HSM. If that's the case, I'm for not allowing FAN_DENY_ERRNO for > FAN_OPEN_PERM. In the HttpDirFS HSM demo, I used FAN_OPEN_PERM on a mount mark to deny open of file during the short time that it's content is being punched out [1]. It is quite complicated to explain, but I only used it for denying access, not to fill content and not to write anything to filesystem. It's worth noting that returning EBUSY in that case would be more meaningful to users. That's one case in favor of allowing FAN_DENY_ERRNO for FAN_OPEN_PERM, but mainly I do not have a proof that people will not need it. OTOH, I am a bit concerned that this will encourage developer to use FAN_OPEN_PERM as a trigger to filling file content and then we are back to deadlock risk zone. Not sure which way to go. Anyway, I think we agree that there is no reason to merge FAN_DENY_ERRNO before FAN_PRE_* events, so we can continue this discussion later when I post FAN_PRE_* patches - not for this cycle. Thanks, Amir. [1] https://github.com/amir73il/fsnotify-utils/wiki/Hierarchical-Storage-Management-API#invalidating-local-cache