Re: [fuse-devel] Proxmox + NFS w/ exported FUSE = EIO

On 2/19/24 20:05, Antonio SJ Musumeci wrote:
> On Monday, February 19th, 2024 at 5:36 AM, Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote:
> 
>>
>>
>>
>>
>> On 2/18/24 01:48, Antonio SJ Musumeci wrote:
>>
>>> On 2/7/24 01:04, Amir Goldstein wrote:
>>>
>>>> On Wed, Feb 7, 2024 at 5:05 AM Antonio SJ Musumeci trapexit@xxxxxxxxxx wrote:
>>>>
>>>>> On 2/6/24 00:53, Amir Goldstein wrote:
>>>>> only for a specific inode object for which you have an open fd.
>>>>> Certainly not at the sb/mount level.
>>>>
>>>> Thanks,
>>>> Amir.
>>>
>>> Thanks again Amir.
>>>
>>> I've narrowed down the situation but I'm still struggling to pinpoint
>>> the specifics. Unfortunately I'm currently unable to replicate it using
>>> any of the passthrough examples. Perhaps it's some feature I'm enabling
>>> (or not). My next step is to look at exactly what differences there are
>>> in the INIT reply.
>>>
>>> I'm seeing a FUSE_LOOKUP request coming in for ".." of nodeid 1.
>>>
>>> I have my FUSE fs set up about as simply as I can: single threaded, attr
>>> and entry/neg-entry caching off, direct-io on, EXPORT_SUPPORT enabled.
>>> The mountpoint is exported via NFS. I mount the NFS export on the same
>>> host, and I mount it on another host as well.
>>>
>>> On the local machine I loop reading a large file using dd
>>> (if=/mnt/nfs/file, of=/dev/null). After it finishes I echo 3 >
>>> drop_caches. That alone will run forever. If, on the second machine, I
>>> start issuing `ls -lh /mnt/nfs` repeatedly, after a moment it will
>>> trigger the issue.
>>>
>>> `ls` will successfully statx /mnt/nfs, and the following openat and
>>> getdents also return successfully. As it iterates over the output of
>>> getdents, statx for directories fails with EIO while files succeed as
>>> normal. In my FUSE server, for each EIO failure I see a lookup for
>>> ".." on nodeid 1. Afterwards all lookups on /mnt/nfs fail. The only
>>> request that seems to work is statfs.
>>>
>>> This was happening some time ago without me being able to reproduce it,
>>> so I put in a check for that case and return -ENOENT. However, looking
>>> over the libfuse HLAPI, it looks like fuse_lib_lookup doesn't handle
>>> this situation. Perhaps a segv waiting to happen?
>>>
>>> If I remove EXPORT_SUPPORT I'm no longer triggering the issue (which I
>>> guess makes sense).
>>>
>>> Any ideas on how/why ".." for the root node is coming in? Is that valid?
>>> It only happens when using NFS? I know there is talk of adding the
>>> ability to refuse export, but what is the consequence of disabling
>>> EXPORT_SUPPORT? Is there a performance or capability difference? If it
>>> is a valid request, what should I be returning?
>>
>>
>> If you don't set EXPORT_SUPPORT, the kernel-side export functions just
>> return -ESTALE, which is then presumably handled by the NFS client. I
>> don't think it can handle that in all situations, though. With
>> EXPORT_SUPPORT, an uncached inode is looked up by sending a LOOKUP for
>> the name "." with the node-id set in the request; the parent is looked
>> up the same way, but with the name "..".
>>
>> A simple case where this would already fail without NFS, but with the
>> same API:
>>
>> name_to_handle_at()
>> umount fuse
>> mount fuse
>> open_by_handle_at
>>
>>
>> I will see if I can come up with a simple patch that just passes these
>> through to the fuse server.
>>
>>
>> static const struct export_operations fuse_export_operations = {
>>         .fh_to_dentry   = fuse_fh_to_dentry,
>>         .fh_to_parent   = fuse_fh_to_parent,
>>         .encode_fh      = fuse_encode_fh,
>>         .get_parent     = fuse_get_parent,
>> };
>>
>>
>>
>>
>> Cheers,
>> Bernd
> 
> Thank you, but I'm not sure I'm able to piece together the answers to my questions from that.
> 
> Perhaps my ignorance of the kernel side is showing, but how can the root node have a parent? And if it can, does that mean that the HLAPI has a possible bug in lookup?
> 
> I handle "." and ".." just fine for non-root nodes. But this is `lookup(nodeid=1,name="..");`.
> 
> Given the relative directory structure:
> 
> * /dir1/
> * /dir2/
> * /dir3/
> * /file1
> * /file2
> 
> This is what I see from the kernel:
> 
> lookup(nodeid=3, name=.);
> lookup(nodeid=3, name=..);
> lookup(nodeid=1, name=dir2);
> lookup(nodeid=1, name=..);
> forget(nodeid=3);
> forget(nodeid=1);
> 
> lookup(nodeid=4, name=.);
> lookup(nodeid=4, name=..);
> lookup(nodeid=1, name=dir3);
> lookup(nodeid=1, name=..);
> forget(nodeid=4);
> 
> lookup(nodeid=5, name=.);
> lookup(nodeid=5, name=..);
> lookup(nodeid=1, name=dir1);
> lookup(nodeid=1, name=..);
> forget(nodeid=5);
> forget(nodeid=1);
> 
> 
> It isn't clear to me what the proper response is for lookup(nodeid=1, name=..). Make something up? From userspace, if you stat "/.." you get the details for "/". If I respond to that lookup request with the details of the root node, it errors out.

I might be wrong, but from my understanding of the code, "." here means
"I don't have the name, please look up the entry by ID"; the entry can be
any valid directory entry. And ".." means "I don't have the name, please
look up the parent by ID"; the parent has to be a directory. So for ".."
and ID=FUSE_ROOT_ID it should return your file system's root dir.
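
To make that concrete, below is a rough sketch against the low-level
libfuse API (not the HLAPI path you are using) of how a lookup handler
could treat those cases. my_parent_ino(), my_lookup_child() and
my_getattr_by_ino() are made-up placeholders for whatever the real
filesystem provides:

/* Rough sketch only, to illustrate the "." / ".." handling described
 * above. The my_* helpers are hypothetical. */
#define FUSE_USE_VERSION 34
#include <fuse_lowlevel.h>
#include <errno.h>
#include <string.h>
#include <sys/stat.h>

extern fuse_ino_t my_parent_ino(fuse_ino_t ino);
extern fuse_ino_t my_lookup_child(fuse_ino_t parent, const char *name);
extern int my_getattr_by_ino(fuse_ino_t ino, struct stat *st);

static void my_lookup(fuse_req_t req, fuse_ino_t parent, const char *name)
{
        struct fuse_entry_param e;

        memset(&e, 0, sizeof(e));

        if (strcmp(name, ".") == 0) {
                /* "No name, just give me this node id back." */
                e.ino = parent;
        } else if (strcmp(name, "..") == 0) {
                /* "No name, give me the parent of this node id."
                 * The root is its own parent, so ".." of FUSE_ROOT_ID
                 * is the root again. */
                e.ino = (parent == FUSE_ROOT_ID) ? FUSE_ROOT_ID
                                                 : my_parent_ino(parent);
        } else {
                /* Normal lookup by name. */
                e.ino = my_lookup_child(parent, name);
        }

        if (e.ino == 0 || my_getattr_by_ino(e.ino, &e.attr) != 0) {
                fuse_reply_err(req, ENOENT);
                return;
        }
        e.attr_timeout = 0;
        e.entry_timeout = 0;
        fuse_reply_entry(req, &e);
}

With attr_timeout/entry_timeout at 0 nothing gets cached on the kernel
side, which matches the attr/entry caching being switched off in your
setup.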
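
And the name_to_handle_at() / umount / mount / open_by_handle_at()
case from earlier in the thread, without NFS in the picture, would look
roughly like this as a standalone test (the paths are placeholders, and
open_by_handle_at() needs CAP_DAC_READ_SEARCH):

/* Rough sketch of the handle-based reproduction without NFS. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
        struct file_handle *fh;
        int mount_id, mount_fd, fd;

        fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
        fh->handle_bytes = MAX_HANDLE_SZ;

        /* Encode a handle while the inode is still in the kernel's cache. */
        if (name_to_handle_at(AT_FDCWD, "/mnt/fuse/file", fh, &mount_id, 0) == -1) {
                perror("name_to_handle_at");
                return 1;
        }

        /* ... unmount and remount the FUSE filesystem here, so the
         * encoded inode is no longer cached ... */

        /* Decoding the handle goes through fh_to_dentry(): without
         * EXPORT_SUPPORT the kernel answers -ESTALE itself; with it, the
         * server sees a LOOKUP of "." (and ".." via get_parent) against
         * the node id stored in the handle. */
        mount_fd = open("/mnt/fuse", O_RDONLY | O_DIRECTORY);
        fd = open_by_handle_at(mount_fd, fh, O_RDONLY);
        if (fd == -1)
                perror("open_by_handle_at");

        free(fh);
        return fd != -1 ? 0 : 1;
}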
