Linux-cifs readdir behaviour when dir modified

Shyam Prasad N <nspmangalore@xxxxxxxxx> · Tue, 20 Oct 2020 11:14:11 +0530

Hi,

I spent some time debugging this issue today:
https://gitlab.alpinelinux.org/alpine/aports/-/issues/10960

A summary of the issue:
With Alpine Linux containers (which use the musl implementation of
libc), the "rm -Rf" command can fail on a cifs mount, depending on the
dir size. The Linux cifs client filesystem behaviour is compared
against ext4 behaviour.

What I saw while debugging (some of which is already covered in the
bug and related bugs):
1. The musl libc sends down small buffers as a part of its readdir
implementation. These buffers are very small compared to those of its
glibc counterpart.
2. cifs.ko reads the whole directory from the server, but is only
able to fill the readdir results in small portions due to the bufsize
sent by the libc.
3. The caller unlinks the dirents that have been returned so far.
4. The libc continues the readdir from the previous offset, expecting
to continue the listing.
5. cifs.ko now sees that the directory has changed, and fetches the
directory contents from the server again. However, the reply to the
user is populated starting at the previous offset, which now indexes
into the new, shorter listing. The entries that have shifted to the
beginning of that listing are never returned.
6. As a result, the final rmdir (which assumes that the directory is
now empty) fails.
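
The userspace pattern in steps 3-6 can be sketched roughly as below.
This is only an illustration, not code from musl or busybox:
remove_flat_dir() and its rescan parameter are hypothetical names, and
it assumes a flat directory with no subdirectories. With rescan == 0 it
does the single pass described above, so the final rmdir() fails if
the filesystem skipped entries. With rescan != 0 it rewinds and
repeats until a pass removes nothing.

```c
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (assumption, not from musl/busybox): unlink every
 * entry of a flat directory while iterating it, then rmdir() it.
 *   rescan == 0: one readdir() pass, as in steps 3-6 above.
 *   rescan != 0: rewinddir() and repeat until a pass removes nothing.
 * Returns 0 on success, -1 on error with errno set by the failing call.
 */
int remove_flat_dir(const char *path, int rescan)
{
    DIR *d = opendir(path);
    if (d == NULL)
        return -1;

    int removed;
    do {
        struct dirent *e;
        removed = 0;
        while ((e = readdir(d)) != NULL) {
            if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
                continue;
            if (unlinkat(dirfd(d), e->d_name, 0) == 0) {
                removed++;
            } else if (errno != ENOENT) {
                /* ENOENT is tolerated: a rescan may repeat a stale name. */
                int saved = errno;
                closedir(d);
                errno = saved;
                return -1;
            }
        }
        if (removed > 0 && rescan)
            rewinddir(d);   /* restart the listing from the beginning */
    } while (removed > 0 && rescan);

    closedir(d);
    /* Fails (typically with ENOTEMPTY) if any entry was skipped. */
    return rmdir(path);
}
```

The rescan variant is a defensive loop an application could use. It
does not change what the filesystem returns.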

Out of curiosity, I checked the ext4 code to understand the handling
of this use case, and I see this comment:
        /* If the dir block has changed since the last call to
         * readdir(2), then we might be pointing to an invalid
         * dirent right now.  Scan from the start of the block
         * to make sure. */
... and the corresponding code.

Now the question is whether cifs.ko is doing anything wrong.
@Steve French pointed me to this readdir documentation:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir_r.html

If a file is removed from or added to the directory after the most
recent call to opendir() or rewinddir(), whether a subsequent call to
readdir() returns an entry for that file is unspecified.

So I guess POSIX leaves the behaviour unspecified in this case.

We could go the ext4 way and reset the offset to 0 when we detect that
the directory has been modified. That would handle the "rm -Rf" use
case as expected by the user here. However, we could end up repeating
dirents over successive readdir calls.
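
To make the trade-off concrete, here is a toy userspace model. It is
not cifs.ko code: the structures, the version counter used to detect
changes, and the function names are all assumptions made for
illustration. The offset is modelled as an index into the current
listing of live entries, and a reader either keeps its offset or
resets it to 0 when it notices a change.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 64

/* Toy directory: alive[i] says whether entry i still exists. */
struct toy_dir { bool alive[MAX_ENTRIES]; size_t n; unsigned version; };

/* Toy open-directory state; pos indexes the listing of live entries. */
struct toy_reader { size_t pos; unsigned seen_version; bool reset_on_change; };

static void toy_init(struct toy_dir *d, size_t n)
{
    d->n = n;
    d->version = 0;
    for (size_t i = 0; i < n; i++)
        d->alive[i] = true;
}

static void toy_remove(struct toy_dir *d, size_t i)
{
    d->alive[i] = false;
    d->version++;           /* any removal marks the directory changed */
}

/* Return up to batch entry ids, starting at the reader's offset. */
static size_t toy_readdir(const struct toy_dir *d, struct toy_reader *r,
                          size_t *out, size_t batch)
{
    if (r->seen_version != d->version) {
        r->seen_version = d->version;
        if (r->reset_on_change)
            r->pos = 0;     /* restart the listing from the beginning */
    }
    size_t idx = 0, got = 0;
    for (size_t i = 0; i < d->n && got < batch; i++) {
        if (!d->alive[i])
            continue;
        if (idx++ >= r->pos)
            out[got++] = i;
    }
    r->pos += got;
    return got;
}

/* "rm -Rf" pattern: unlink each batch as returned; count leftovers. */
size_t toy_rm_leftover(size_t n, size_t batch, bool reset_on_change)
{
    struct toy_dir d;
    struct toy_reader r = { 0, 0, reset_on_change };
    size_t out[MAX_ENTRIES], got, left = 0;

    toy_init(&d, n);
    while ((got = toy_readdir(&d, &r, out, batch)) > 0)
        for (size_t k = 0; k < got; k++)
            toy_remove(&d, out[k]);
    for (size_t i = 0; i < n; i++)
        left += d.alive[i];
    return left;
}

/* Plain listing while someone else removes the last entry after the
 * first batch; count how many entries are returned more than once. */
size_t toy_list_duplicates(size_t n, size_t batch, bool reset_on_change)
{
    struct toy_dir d;
    struct toy_reader r = { 0, 0, reset_on_change };
    size_t out[MAX_ENTRIES], got, dups = 0;
    bool seen[MAX_ENTRIES] = { false }, first = true;

    toy_init(&d, n);
    while ((got = toy_readdir(&d, &r, out, batch)) > 0) {
        for (size_t k = 0; k < got; k++) {
            dups += seen[out[k]];
            seen[out[k]] = true;
        }
        if (first)
            toy_remove(&d, n - 1);
        first = false;
    }
    return dups;
}
```

In this model, with 10 entries and a batch of 4, keeping the offset
leaves 4 entries behind in the "rm -Rf" pattern, while resetting to 0
removes all of them. For a plain listing with one concurrent removal,
resetting to 0 returns the first 4 entries twice.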

Posting the question to the larger group, to see if it's worth the
effort to make this change. The change here seems quite simple, which
is to reset file->f_pos to 0 when we detect that the dir has changed.
But since it'll result in a change in behaviour, I wanted to check
first.

-- 
-Shyam