On 3/17/2021 8:18 AM, Steve French wrote:
On Wed, Mar 17, 2021 at 6:25 AM Tom Talpey <tom@xxxxxxxxxx> wrote:
On 3/17/2021 2:10 AM, Steve French wrote:
Was examining why xfstest 614 failed (smb3.1.1 mount to Windows), and
noticed a couple of problems:
1) we don't requery the allocation size (number of blocks) in all
cases we should. This small fix should address that
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -2390,15 +2390,16 @@ int cifs_getattr(struct user_namespace
*mnt_userns, const struct path *path,
struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
struct inode *inode = d_inode(dentry);
int rc;
/*
* We need to be sure that all dirty pages are written and the server
* has actual ctime, mtime and file length.
*/
- if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
+ if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE |
STATX_BLOCKS)) &&
Seems obviously enough correct.
and 2) we don't set the allocation size on the case where a cached
file on this client is written, and to make it worse if we queried
Also obviously the cache needs to be kept in sync, but is it accurate to
set the allocation size before writing? That's the server's field, so
shouldn't it be written, then queried?
(post query attributes flag) at SMB3 close for compounded operations -
the Windows server (not sure about others) apparently doesn't update
the allocation size until the next open/queryinfo so we still end up
with an allocation size of 0 for a 64K file which breaks the test.
What the test is doing is quite simple:
xfs_io -f -c "truncate 64K" -c "mmap -w 0 64K" -c "mwrite -S 0xab 0
64K" -c "munmap" foo1 ; stat -c %b foo1
And it fails - due to seeing a number of blocks 0 rather than the
correct value (128). With actimeo=0 we do a subsequent open/query
operation which would cause it to pass since the second open/query
does show the correct allocation size.
Any ideas?
What actually goes on the wire diring the test? It looks like the
munmap step should be msync'ing - does cifs.ko not write the data?
Oddly enough this works to Azure but not Windows. What we see
on the wire is simple enough:
1) create/getinfo foo1 --> file not found
2) create foo1
3) oplock break of root's cached handle
4) close root handle
6) open of root/getinfo("FileFsFullSizeInformation")/Close
6) Flush foo1
7) ioctl set sparse foo1
8) setinfo FILE_ENDOFFILE_INFO foo1
9) create/setinfo(FILE_BASIC_INFO)/close of foo1
10) read 64K foo1 (all zeros)
11) write 64K foo1
12) close foo1 (post query attributes of close show size 64K,
allocation size 0 ---> should be allocation size 64K)
But it works to Azure ...
Wait, let me find my Spock ears.
Faaascinating.
Obviously the backend is doing some kind of deferred allocation.
Assuming this is NTFS, have you tried an alternative? I guess I'd
also suggest asking Neal, if you can reach him easily.
If it is deferring the block decision, I'd be cautious about
implementing a workaround like launching a second query attributes.
There's little guarantee it will be always correct.
If you can establish that the client is doing the right thing,
personally I'd implement some sort of waiver. Maybe make it dependent
on the server and its FS type, even.
I looked at MS-FSA and it explicitly states the AllocationSize is
updated as part of processing every write. Section 2.1.5.3:
If ((ByteOffset+ ByteCount) > Open.Stream.AllocationSize), the object
store MUST increase Open.Stream.AllocationSize to
BlockAlign(ByteOffset+ ByteCount, Open.File.Volume.ClusterSize). If
there is not enough disk space, the operation MUST be failed with
STATUS_DISK_FULL.
So this might be worth a message to dochelp.
Tom.