Re: file handle in statx (was: Re: How to cope with subvolumes and snapshots on muti-user systems?)

Andreas Dilger <adilger@xxxxxxxxx> · Wed, 13 Dec 2023 15:45:41 -0700

On Dec 12, 2023, at 5:02 PM, NeilBrown <neilb@xxxxxxx> wrote:
> 
> On Wed, 13 Dec 2023, Kent Overstreet wrote:
>> On Wed, Dec 13, 2023 at 09:57:22AM +1100, NeilBrown wrote:
>>> On Wed, 13 Dec 2023, Kent Overstreet wrote:
>>>> On Mon, Dec 11, 2023 at 11:40:16PM +0000, David Howells wrote:
>>>>> Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote:
>>>>> 
>>>>>> I was chatting a bit with David Howells on IRC about this, and floated
>>>>>> adding the file handle to statx. It looks like there's enough space
>>>>>> reserved to make this feasible - probably going with a fixed maximum
>>>>>> size of 128-256 bits.
>>>>> 
>>>>> We can always save the last bit to indicate extension space/extension record,
>>>>> so we're not that strapped for space.
>>>> 
>>>> So we'll need that if we want to round trip NFSv4 filehandles, they
>>>> won't fit in existing struct statx (nfsv4 specs 128 bytes, statx has 96
>>>> bytes reserved).
>>>> 
>>>> Obvious question (Neal): do/will real world implementations ever come
>>>> close to making use of this, or was this a "future proofing gone wild"
>>>> thing?
>>> 
>>> I have no useful data.  I have seen lots of filehandles but I don't pay
>>> much attention to their length.  Certainly some are longer than 32 bytes.
>>> 
>>>> 
>>>> Say we do decide we want to spec it that large: _can_ we extend struct
>>>> statx? I'm wondering if the userspace side was thought through, I'm
>>>> sure glibc people will have something to say.
>>> 
>>> The man page says:
>>> 
>>>     Therefore, do not simply set mask to UINT_MAX (all bits set), as
>>>     one or more bits may, in the future, be used to specify an
>>>     extension to the buffer.
>>> 
>>> I suspect the glibc people read that.
>> 
>> The trouble is that C has no notion of which types are safe to pass
>> across a dynamic library boundary, so if we increase the size of struct
>> statx and someone's doing that things will break in nasty ways.
>> 
> 
> Maybe we don't increase the size of struct statx.
> Maybe we declare
> 
>   struct statx2 {
>     struct statx base;
>     __u8 stx_handle[128];
>   }
> 
> and pass then when we request STX_HANDLE.

This would be extremely fragile, in the sense that "struct statx2" breaks
if "struct statx" adds any fields.

Not getting into the question of whether the FH _should_ be added to statx
or not, but I wanted to chime in on the discussion about statx being able
to add new fields.

The answer is definitely yes.  Callers are expected to set STATX_* flags
for any fields that they are interested in, so if the caller were to set
STATX_FILE_HANDLE in the request, then it is expected they understand this
flag and have allocated a large enough struct statx to hold stx_file_handle
in the reply.

Apps that don't need/want stx_file_handle should not set STATX_FILE_HANDLE
and then the kernel won't spend time to fill this in, the same as other
fields that may have overhead, like stx_size.  That should avoid concerns
from Dave that this adds overhead to the hot path.  Most apps won't need or
want this field, but apps that *do* want it would be better served getting
it in a single syscall instead of two (and avoid potential TOCTOU races).

It should be possible for userspace and the kernel to increase the size of
struct statx independently, and not have any issues.  If userspace requests
a field via STATX_* flags that the kernel doesn't understand, then it will
be masked out by the kernel, and any extended fields in the struct will not
be referenced.  Likewise, if the kernel understands more fields than what
userspace requests, it shouldn't spend time to fill in those fields, since
userspace will ignores them anyway, so it is just wasted cycles.

Not going into whether statx _should_ handle variable-sized fields, but
it _could_ do so, though it would make struct handling a bit more complex.

Adding "__u32 stx_file_handle_size" and "__u32 stx_file_handle_offset"
(relative to the start of the struct) to encode where the variable sized
handle data would be packed at the end after fixed-size fields (aligned
on a __u64 boundary to avoid issues if there are multiple such fields).
Userspace would indicate the maximum size of file handle it expects via
(stx_file_handle_offset + stx_file_handle_size).  The kernel would be at
liberty to pack "struct file_handle" at that offset, or after the end of
whatever fields are actually used, and userspace would use this to access
the handle.

Not pretty, but probably better than reserving a huge fixed size field
that is not needed for most case (is NFS intending a server on every
sub-atomic particle in the universe?).  This would allow apps to access
fields after stx_file_handle_size/offset normally, and only compute a
simple offset to access the variable sized blob as needed.  The presence
of STATX_FILE_HANDLE would indicate if the size/offset fields were valid
upon return (since an old kernel would not zero the fields).

Cheers, Andreas

Attachment:
signature.asc

Description: Message signed with OpenPGP