Re: Report ESTALE as ENOENT

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Adding in Eric and Steve.

Ric


On Oct 18, 2017 9:35 AM, "Raghavendra G" <raghavendra@xxxxxxxxxxx> wrote:
+Brian Foster

On Wed, Oct 11, 2017 at 4:11 PM, Raghavendra G <raghavendra@xxxxxxxxxxx> wrote:
We ran into a regression [2][3]. Hence reviving this thread.

[2] https://bugzilla.redhat.com/show_bug.cgi?id=1500269
[3] https://review.gluster.org/18463

On Thu, Mar 31, 2016 at 1:22 AM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
On Mon, Mar 28, 2016 at 04:21:00PM -0400, Vijay Bellur wrote:
> On 03/28/2016 09:34 AM, FNU Raghavendra Manjunath wrote:
> >
> >I can understand the concern. But I think instead of generally
> >converting all the ESTALE errors ENOENT, probably we should try to
> >analyze the errors that are generated by lower layers (like posix).
> >
> >Even fuse kernel module some times returns ESTALE. (Well, I can see it
> >returning ESTALE errors in some cases in the code. Someone please
> >correct me if thats not the case).  And aso I am not sure if converting
> >all the ESTALE errors to ENOENT is ok or not.
>
> ESTALE in fuse is returned only for export_operations. fuse
> implements this for providing support to export fuse mounts as nfs
> exports. A cursory reading of the source seems to indicate that fuse
> returns ESTALE only in cases where filehandle resolution fails.
>
> >
> >For fd based operations, I am not sure if ENOENT can be sent or not (as
> >though the file is unlinked, it can be accessed if there were open fds
> >on it before unlinking the file).
>
> ESTALE should be fine for fd based operations. It would be analogous
> to a filehandle resolution failing and should not be a common
> occurrence.
>
> >
> >I feel, we have to look into some parts to check if they generating
> >ESTALE is a proper error or not. Also, if there is any bug in below
> >layers fixing which can avoid ESTALE errors, then I feel that would be
> >the better option.
> >
>
> I would prefer to:
>
> 1. Return ENOENT for all system calls that operate on a path.
>
> 2. ESTALE might be ok for file descriptor based operations.

Note that operations which operate on paths can fail with ESTALE when
they attempt to look up a component within a directory that no longer
exists.

But, "man 2 rmdir"  or "man 2 unlink" doesn't list ESTALE as a valid error. Also rm doesn't seem to handle ESTALE too [3]



Maybe non-creating open("./foo") returning ENOENT would be reasonable in
this case since that's what you'd get in the local filesystem case, but
creat("./foo") returning ENOENT, for example, isn't something
applications will be written to handle.

The Linux VFS will retry ESTALE on path-based systemcalls *one* time, to
reduce the chance of ESTALE in those cases. 

I should've anticipated bug [2] due to this comment. My mistake. Bug [2] is indeed due to kernel not retrying open on receiving an ENOENT error. Glusterfs sent ENOENT because file's inode-number/nodeid changed but same path exists. The correct error would've been ESTALE, but due to our conversion of ESTALE to ENOENT, the latter was sent back to kernel.

We've an application which does very frequent renames (around 10-15 per second). So, single retry by kernel of an open failed with ESTALE is not helping us.

@Bruce/Brian,

Do you know why VFS chose an approach of retrying instead of a stricter synchronization mechanism using locking? For eg., rename and open could've been synchronized using a lock.

For eg., a rough psuedocode for open and rename could've been (please ignore ordering src,dst locks in rename)

sys_open ()
{
       LOCK (dentry->lock);
       {
            lookup path;
            open (inode)
       }
       UNLOCK (dentry->lock;
}

sys_rename ()
{
         LOCK (dst-dentry->lock);
         {
                LOCK (src-dentry->lock);
                {
                     rename (src, dst);
                }
                UNLOCK (src-dentry->lock);
        }
        UNLOCK (dst-dentry->lock);
}

@Bruce,

With the current retry model, any suggestions on how to handle applications that do frequent renames?


Looking through kernel VFS code, only open *seems* to retry (do_filep_open). I couldn't find similar logic to other path based syscalls like rmdir, unlink, stat, chmod etc

The bugzilla entry that
tracked those patches might be interesting:

        https://bugzilla.redhat.com/show_bug.cgi?id=678544

> NFS recommends that applications add special code for handling
> ESTALE [1]. Unfortunately changing application code is not easy and
> hence it does not come as a surprise that coreutils also does not
> accommodate ESTALE.

We also need to consider whether the application's handling of the
ENOENT case could be incorrect for the ESTALE case, with consequences
possibly as bad as or worse than consequences of seeing an unexpected
error.

My first intuition is that translating ESTALE to ENOENT is less safe
than not doing so, because:

        - once an ESTALE-unaware application his the ESTALE case, we
          risk a bug regardless of which we return, but if we return
          ESTALE at least the problem should be more obvious to the
          person debugging.
        - for ESTALE-aware applications, the ESTALE/ENOENT distinction
          is useful.

Another place to not convert is for  those cases where kernel retries the operation on seeing an ESTALE.

I guess we need to think through each operation and we cannot ESTALE to ENOENT always.


But I haven't really thought through examples.

--b.
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel



--
Raghavendra G


regards,
--
Raghavendra G

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Index of Archives]     [Gluster Users]     [Ceph Users]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux