On 2022-05-09 10:54, Jan Kara wrote: > On Fri 06-05-22 14:46:49, Richard Guy Briggs wrote: > > On 2022-05-05 16:44, Jan Kara wrote: > > > On Tue 03-05-22 21:33:35, Richard Guy Briggs wrote: > > > > On 2022-05-02 20:16, Paul Moore wrote: > > > > > On Thu, Apr 28, 2022 at 8:45 PM Richard Guy Briggs <rgb@xxxxxxxxxx> wrote: > > > > > > This patch adds 2 structure members to the response returned from user > > > > > > space on a permission event. The first field is 16 bits for the context > > > > > > type. The context type will describe what the meaning is of the second > > > > > > field. The default is none. The patch defines one additional context > > > > > > type which means that the second field is a 32-bit rule number. This > > > > > > will allow for the creation of other context types in the future if > > > > > > other users of the API identify different needs. The second field size > > > > > > is defined by the context type and can be used to pass along the data > > > > > > described by the context. > > > > > > > > > > > > To support this, there is a macro for user space to check that the data > > > > > > being sent is valid. Of course, without this check, anything that > > > > > > overflows the bit field will trigger an EINVAL based on the use of > > > > > > FAN_INVALID_RESPONSE_MASK in process_access_response(). > > > > > > > > > > > > ... > > > > > > > > > static ssize_t fanotify_write(struct file *file, const char __user *buf, size_t count, loff_t *pos) > > > > > > { > > > > > > - struct fanotify_response response = { .fd = -1, .response = -1 }; > > > > > > + struct fanotify_response response; > > > > > > struct fsnotify_group *group; > > > > > > int ret; > > > > > > + size_t size = min(count, sizeof(struct fanotify_response)); > > > > > > > > > > > > if (!IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS)) > > > > > > return -EINVAL; > > > > > > > > > > > > group = file->private_data; > > > > > > > > > > > > - if (count < sizeof(response)) > > > > > > + if (count < offsetof(struct fanotify_response, extra_info_buf)) > > > > > > return -EINVAL; > > > > > > > > > > Is this why you decided to shrink the fanotify_response:response field > > > > > from 32-bits to 16-bits? I hope not. I would suggest both keeping > > > > > the existing response field as 32-bits and explicitly checking for > > > > > writes that are either the existing/compat length as well as the > > > > > newer, longer length. > > > > > > > > No. I shrank it at Jan's suggestion. I think I agree with you that > > > > the response field should be kept at u32 as it is defined in userspace > > > > and purge the doubt about what would happen with a new userspace with > > > > an old kernel. > > > > > > Hum, for the life of me I cannot find my response you mention here. Can you > > > send a link so that I can refresh my memory? It has been a long time... > > > > https://listman.redhat.com/archives/linux-audit/2020-October/017066.html > > https://listman.redhat.com/archives/linux-audit/2020-October/017067.html > > Thanks! > > > > > > > + > > > > > > +#define FANOTIFY_RESPONSE_EXTRA_LEN_MAX \ > > > > > > + (sizeof(union { \ > > > > > > + struct fanotify_response_audit_rule r; \ > > > > > > + /* add other extra info structures here */ \ > > > > > > + })) > > > > > > + > > > > > > struct fanotify_response { > > > > > > __s32 fd; > > > > > > - __u32 response; > > > > > > + __u16 response; > > > > > > + __u16 extra_info_type; > > > > > > + char extra_info_buf[FANOTIFY_RESPONSE_EXTRA_LEN_MAX]; > > > > > > }; > > > > > > > > > > Since both the kernel and userspace are going to need to agree on the > > > > > content and formatting of the fanotify_response:extra_info_buf field, > > > > > why is it hidden behind a char array? You might as well get rid of > > > > > that abstraction and put the union directly in the fanotify_response > > > > > struct. It is possible you could also get rid of the > > > > > fanotify_response_audit_rule struct this way too and just access the > > > > > rule scalar directly. > > > > > > > > This does make sense and my only concern would be a variable-length > > > > type. There isn't any reason to hide it. If userspace chooses to use > > > > the old interface and omit the type field then it defaults to NONE. > > > > > > > > If future types with variable data are defined, the first field could be > > > > a u32 that unions with the rule number that won't change the struct > > > > size. > > > > > > Struct fanotify_response size must not change, it is part of the kernel > > > ABI. In particular your above change would break userspace code that is > > > currently working just fine (e.g. allocating 8 bytes and expecting struct > > > fanotify_response fits there, or just writing sizeof(struct > > > fanotify_response) as a response while initializing only first 8 bytes). > > > > Many kernel ABIs have been expanded without breaking them. > > > > Is it reasonable for a userspace program to use a kernel structure > > without also using its size for allocation and initialization? > > Well, I'm not sure whether to call it reasonable but it certainly happens > and we are generally obliged to keep backwards compatibility (the golden > "don't break userspace" rule). There's a lot of stupid things userspace could do that would render it impossible to ever change or improve anything, including fixing security bugs, with that absolutist approach. > > > How I'd suggest doing it now (and I'd like to refresh my memory from my > > > past emails you mention because in the past I might have thought something > > > else ;)) is that you add another flag to 'response' field similar to > > > FAN_AUDIT - like FAN_EXTRA_INFO. If that is present, it means extra info is > > > to be expected after struct fanotify_response. > > > > That's an interesting possibility... I'm trying to predict if that > > would be a problem for an old kernel... In process_access_response() it > > would fallthrough to the default case of -EINVAL but do so safely. > > Yes, old kernel would just refuse such response so new userspace can adapt > to that. Fair enough. > > > The extra info would always start with a header like: > > > > > > struct fanotify_response_info_header { > > > __u8 info_type; > > > __u8 pad; > > > __u16 len; /* This is including the header itself */ > > > } > > > > > > And after such header there would be the 'blob' of data 'len - header size' > > > long. We use this same scheme when passing fanotify events to userspace > > > and it has proven to be lightweight and extensible. It covers the situation > > > when in the future audit would decide it wants other data (just change data > > > type), it would also cover the situation when some other subsystem wants > > > its information passed as well - there can be more structures like this > > > attached at the end, we can process the response up to the length of the > > > write. > > > > This reminds me of the RFC2367 (PF_KEYv2, why is that still right there > > at the tip of my fingers?) > > > > > Now these are just possible future extensions making sure we can extend the > > > ABI without too much pain. In the current implementation I'd just return > > > EINVAL whenever more than FANOTIFY_RESPONSE_MAX_LEN (16 bytes) is written > > > and do very strict checks on what gets passed in. It is also trivially > > > backwards compatible (old userspace on new kernel works just fine). > > > > Old userspace on new kernel would work fine with this idea, the v2 patch > > posted, or with leaving the response field of struct fanotify_response > > as __u32. > > So I've realized the idea to reduce 'response' field to __u16 will not work > for big endian architectures (that was a thinko of my old suggestion from > 2020). Old userspace binaries will break with that. So we have to keep > 'response' to __u32 and just add a flag like I'm suggesting now or > something like that. Ok, this is the nagging suspicion I had. This decides it in favour of not changing that u32 response. > > > If you want to achieve compatibility of running new userspace on old kernel > > > (I guess that's desirable), we have group flags for that - like we > > > introduced FAN_ENABLE_AUDIT to allow for FAN_AUDIT flag in response we now > > > need to add a flag like FAN_EXTENDED_PERMISSION_INFO telling the kernel it > > > should expect an allow more info returning for permission events. At the > > > same time this is the way for userspace to be able to tell whether the > > > kernel supports this. I know this sounds tedious but that's the cost of > > > extending the ABI in the compatible way. We've made various API mistakes in > > > the past having to add weird workarounds to fanotify and we don't want to > > > repeat those mistakes :). > > > > > > One open question I have is what should the kernel do with 'info_type' in > > > response it does not understand (in the future when there are possibly more > > > different info types). It could just skip it because this should be just > > > additional info for introspection (the only mandatory part is in > > > fanotify_response, however it could surprise userspace that passed info is > > > just getting ignored. To solve this we would have to somewhere report > > > supported info types (maybe in fanotify fdinfo in proc). I guess we'll > > > cross that bridge when we get to it. > > > > Well, one possibility is to return -EINVAL to signal the kernel is out > > of date. > > Yes, I think we've settled with Amir on returning -EINVAL in case unknown > info is spotted. Good. > Honza > -- > Jan Kara <jack@xxxxxxxx> > SUSE Labs, CR - RGB -- Richard Guy Briggs <rgb@xxxxxxxxxx> Sr. S/W Engineer, Kernel Security, Base Operating Systems Remote, Ottawa, Red Hat Canada IRC: rgb, SunRaycer Voice: +1.647.777.2635, Internal: (81) 32635