On Mon, Dec 18, 2023 at 5:53 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > On Mon, Dec 18, 2023 at 4:35 PM Jan Kara <jack@xxxxxxx> wrote: > > > > On Fri 15-12-23 18:50:39, Amir Goldstein wrote: > > > On Fri, Dec 15, 2023 at 5:31 PM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote: > > > > > > > > On Wed, Dec 13, 2023 at 09:09:30PM +0200, Amir Goldstein wrote: > > > > > On Wed, Dec 13, 2023 at 7:28 PM Jan Kara <jack@xxxxxxx> wrote: > > > > > > > > > > > > On Fri 08-12-23 10:01:35, Amir Goldstein wrote: > > > > > > > With FAN_DENY response, user trying to perform the filesystem operation > > > > > > > gets an error with errno set to EPERM. > > > > > > > > > > > > > > It is useful for hierarchical storage management (HSM) service to be able > > > > > > > to deny access for reasons more diverse than EPERM, for example EAGAIN, > > > > > > > if HSM could retry the operation later. > > > > > > > > > > > > > > Allow userspace to response to permission events with the response value > > > > > > > FAN_DENY_ERRNO(errno), instead of FAN_DENY to return a custom error. > > > > > > > > > > > > > > The change in fanotify_response is backward compatible, because errno is > > > > > > > written in the high 8 bits of the 32bit response field and old kernels > > > > > > > reject respose value with high bits set. > > > > > > > > > > > > > > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> > > > > > > > > > > > > So a couple of comments that spring to my mind when I'm looking into this > > > > > > now (partly maybe due to my weak memory ;): > > > > > > > > > > > > 1) Do we still need the EAGAIN return? I think we have mostly dealt with > > > > > > freezing deadlocks in another way, didn't we? > > > > > > > > > > I was thinking about EAGAIN on account of the HSM not being able to > > > > > download the file ATM. > > > > > > > > > > There are a bunch of error codes that are typical for network filesystems, e.g. > > > > > ETIMEDOUT, ENOTCONN, ECONNRESET which could be relevant to > > > > > HSM failures. > > > > > > > > > > > > > > > > > 2) If answer to 1) is yes, then there is a second question - do we expect > > > > > > the errors to propagate back to the unsuspecting application doing say > > > > > > read(2) syscall? Because I don't think that will fly well with a big > > > > > > majority of applications which basically treat *any* error from read(2) as > > > > > > fatal. This is also related to your question about standard permission > > > > > > events. Consumers of these error numbers are going to be random > > > > > > applications and I see a potential for rather big confusion arising there > > > > > > (like read(1) returning EINVAL or EBADF and now you wonder why the hell > > > > > > until you go debug the kernel and find out the error is coming out of > > > > > > fanotify handler). And the usecase is not quite clear to me for ordinary > > > > > > fanotify permission events (while I have no doubts about creativity of > > > > > > implementors of fanotify handlers ;)). > > > > > > > > > > > > > > > > That's a good question. > > > > > I prefer to delegate your question to the prospect users of the feature. > > > > > > > > > > Josef, which errors did your use case need this feature for? > > > > > > > > > > > 3) Given the potential for confusion, maybe we should stay conservative and > > > > > > only allow additional EAGAIN error instead of arbitrary errno if we need it? > > > > > > > > > > > > > > > > I know I was planning to use this for EDQUOT error (from FAN_PRE_MODIFY), > > > > > but I certainly wouldn't mind restricting the set of custom errors. > > > > > I think it makes sense. The hard part is to agree on this set of errors. > > > > > > > > > > > > > I'm all for flexibility here. > > > > > > > > We're going to have 2 classes of applications interacting with HSM backed > > > > storage, normal applications and applications that know they're backed by HSM. > > > > The normal applications are just going to crash if they get an error on read(2), > > > > it doesn't matter what errno it is. The second class would have different > > > > things they'd want to do in the face of different errors, and that's what this > > > > patchset is targeting. We can limit it to a few errno's if that makes you feel > > > > better, but having more than just one would be helpful. > > > > > > Ok. In another email I got from your colleagues, they listed: > > > EIO, EAGAIN, ENOSPC as possible errors to return. > > > I added EDQUOT for our in house use case. > > > > OK, so do I get it right that you also have applications that are aware > > that they are operation on top of HSM managed filesystem and thus they can > > do meaningful things with the reported errors? > > > > Some applications are HSM aware. > Some just report the errors that they get which are meaningful to users. > EIO is the standard response for HSM failure to fill content. > > EDQUOT/ENOSPC is a good example of special functionality. > HSM "swaps out" file content to a slow remote tier, but the slow remote > tier may have a space/quota limit that is known to HSM. > > By tracking the total of st_size under some HSM managed folder, including > the st_size of files whose content is punched out, HSM can enforce this limit > not in the conventional meaning of local disk blocks usage. > > This is when returning EDQUOT/ENOSPC for FAN_PRE_MODIFY makes > sense to most users/applications, except for ones that try to create > large sparse > files... > > > > > Those are all valid errors for write(2) and some are valid for read(2). > > > ENOSPC/EDQUOT make a lot of sense for HSM for read(2), but could > > > be surprising to applications not aware of HSM. > > > I think it wouldn't be that bad if we let HSM decide which of those errors > > > to return for FAN_PRE_ACCESS as opposed to FAN_PRE_MODIFY. > > > > Yeah, I don't think we need to be super-restrictive here, I'd just prefer > > to avoid the "whatever number you decide to return" kind of interface > > because I can see potential for confusion and abuse there. I think all four > > errors above are perfectly fine for both FAN_PRE_ACCESS and FAN_PRE_MODIFY > > if there are consumers that are able to use them. > > > > > But given that we do want to limit the chance of abusing this feature, > > > perhaps it would be best to limit the error codes to known error codes > > > for write(2) IO failures (i.e. not EBADF, not EFAULT) and allow returning > > > FAN_DENY_ERRNO only for the new FAN_PRE_{ACCESS,MODIFY} > > > HSM events. > > > > > > IOW, FAN_{OPEN,ACCESS}_PERM - no FAN_DENY_ERRNO for you! > > > > > > Does that sound good to you? > > > > It sounds OK to me. I'm open to allowing FAN_DENY_ERRNO for FAN_OPEN_PERM > > if there's a usecase because at least conceptually it makes a good sense > > and chances for confusion are low there. People are used to dealing with > > errors on open(2). > > > > I wrote about one case I have below. > > > > Furthermore, we can start with allowing a very limited set of errors > > > and extend it in the future, on case by case basis. > > > > > > The way that this could be manageable is if we provide userspace > > > a way to test for supported return codes. > > > > > > There is already a simple method that we used for testing FAN_INFO > > > records type support - > > > After fan_fd = fanotify_init(), userspace can write a "test" fanotify_response > > > to fan_fd with fanotify_response.fd=FAN_NOFD. > > > > > > When setting fanotify_response.fd=FAN_DENY, this would return ENOENT, > > > but with fanotify_response.fd=FAN_DENY_ERRNO(EIO), upstream would > > > return EINVAL. > > > > > > This opens the possibility of allowing, say, EIO, EAGAIN in the first release > > > and ENOSPC, EDQUOT in the following release. > > > > If we forsee that ENOSPC and EDQUOT will be needed, then we can just enable > > it from start and not complicate our lives more than necessary. > > > > Sure, I was just giving an example how the list could be extended case by case > in the future. > > > > The advantage in this method is that it is very simple and already working > > > correctly for old kernels. > > > > > > The downside is that this simple method does not allow checking for > > > allowed errors per specific event type, so if we decide that we do want > > > to allow returning FAN_DENY_ERRNO for FAN_OPEN_PERM later on, this method > > > could not be used by userspace to test for this finer grained support. > > > > True, in that case the HSM manager would have to try responding with > > FAN_DENY_ERRNO() and if it fails, it will have to fallback to responding > > with FAN_DENY. Not too bad I'd say. > > > > Yeah that works too. > > > > In another thread, I mention the fact that FAN_OPEN_PERM still has a > > > potential freeze deadlock when called from open(O_TRUNC|O_CREATE), > > > so we can consider the fact that FAN_DENY_ERRNO is not allowed with > > > FAN_OPEN_PERM as a negative incentive for people to consider using > > > FAN_OPEN_PERM as a trigger for HSM. > > > > AFAIU from the past discussions, there's no good use of FAN_OPEN_PERM > > event for HSM. If that's the case, I'm for not allowing FAN_DENY_ERRNO for > > FAN_OPEN_PERM. > > In the HttpDirFS HSM demo, I used FAN_OPEN_PERM on a mount mark > to deny open of file during the short time that it's content is being > punched out [1]. > It is quite complicated to explain, but I only used it for denying access, > not to fill content and not to write anything to filesystem. > It's worth noting that returning EBUSY in that case would be more meaningful > to users. > > That's one case in favor of allowing FAN_DENY_ERRNO for FAN_OPEN_PERM, > but mainly I do not have a proof that people will not need it. > > OTOH, I am a bit concerned that this will encourage developer to use > FAN_OPEN_PERM as a trigger to filling file content and then we are back to > deadlock risk zone. > > Not sure which way to go. > > Anyway, I think we agree that there is no reason to merge FAN_DENY_ERRNO > before FAN_PRE_* events, so we can continue this discussion later when > I post FAN_PRE_* patches - not for this cycle. > Hi Jan, I started to prepare the pre-content events patches for posting and got back to this one as well. Since we had this discussion I have learned of another use case that requires filling file content in FAN_OPEN_PERM hook, FAN_OPEN_EXEC_PERM to be exact. The reason is that unless an executable content is filled at execve() time, there is no other opportunity to fill its content without getting -ETXTBSY. So to keep things more flexible, I decided to add -ETXTBSY to the allowed errors with FAN_DENY_ERRNO() and to decided to allow FAN_DENY_ERRNO() with all permission events. To keep FAN_DENY_ERRNO() a bit more focused on HSM, I have added a limitation that FAN_DENY_ERRNO() is allowed only for FAN_CLASS_PRE_CONTENT groups. Thanks, Amir.