Re: [PATCH] cifs: fix allocation size on newly created files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Yes - the Linux terminology is confusing.  Quoting from Ubuntu docs
e.g. about stat output

           "IO Block" in stat's output is the preferred number of
           bytes that the file system uses for reading and writing files...
           "Blocks", on the other hand, counts how many 512-bytes blocks
           are allocated on disk for the file.

So for NFS and SMB3 mounts they return 1MB for "IO Block" size.

statfs on the other hand shows what the server thinks the block size
is (often 4K) but
that is a different field.

And of course number of "blocks" in stat output is meant to return
allocation size
(in 512 byte units for historical reasons, even if most file systems
don't use 512
byte blocks anymore)


On Fri, Mar 19, 2021 at 12:52 PM Tom Talpey <tom@xxxxxxxxxx> wrote:
>
> But it's not the block size here, it's the cluster size. Block
> size is the per-io hunk, allocation size is the number of blocks
> lined up to receive it.
>
> Perhaps the safest number is the file size itself, unrounded.
>
> On 3/19/2021 1:46 PM, Steve French wrote:
> > e.g. stat reports much larger than 512 byte block size over SMB3
> >
> > # stat /mnt2/foo
> >    File: /mnt2/foo
> >    Size: 65536      Blocks: 128        IO Block: 1048576 regular file
> > Device: 34h/52d Inode: 88946092640651991  Links: 1
> >
> > and local file systems do the same ie "blocks" is unrelated to block size
> > the fs reports.  Here is an example to XFS locally
> >
> > # stat Makefile
> >    File: Makefile
> >    Size: 66247      Blocks: 136        IO Block: 4096   regular file
> > Device: 10302h/66306d Inode: 1076242180  Links: 1
> >
> > On Fri, Mar 19, 2021 at 12:42 PM Steve French <smfrench@xxxxxxxxx> wrote:
> >>
> >> We report the block size properly (typically much larger) - but the
> >> kernel API returns allocation size in 512 byte units no matter what the
> >> block size is.   Number of blocks returned for the kernel API
> >>       inode->i_blocks
> >> is unrelated to the block size (simply allocation_size/512 rounded up by 1).
> >>
> >> On Fri, Mar 19, 2021 at 12:38 PM Tom Talpey <tom@xxxxxxxxxx> wrote:
> >>>
> >>> On 3/19/2021 1:25 AM, Steve French wrote:
> >>>> Applications that create and extend and write to a file do not
> >>>> expect to see 0 allocation size.  When file is extended,
> >>>> set its allocation size to a plausible value until we have a
> >>>> chance to query the server for it.  When the file is cached
> >>>> this will prevent showing an impossible number of allocated
> >>>> blocks (like 0).  This fixes e.g. xfstests 614 which does
> >>>>
> >>>>       1) create a file and set its size to 64K
> >>>>       2) mmap write 64K to the file
> >>>>       3) stat -c %b for the file (to query the number of allocated blocks)
> >>>>
> >>>> It was failing because we returned 0 blocks.  Even though we would
> >>>> return the correct cached file size, we returned an impossible
> >>>> allocation size.
> >>>>
> >>>> Signed-off-by: Steve French <stfrench@xxxxxxxxxxxxx>
> >>>> CC: <stable@xxxxxxxxxxxxxxx>
> >>>> ---
> >>>>    fs/cifs/inode.c | 12 ++++++++++--
> >>>>    1 file changed, 10 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
> >>>> index 7c61bc9573c0..17a2c87b811c 100644
> >>>> --- a/fs/cifs/inode.c
> >>>> +++ b/fs/cifs/inode.c
> >>>> @@ -2395,7 +2395,7 @@ int cifs_getattr(struct user_namespace
> >>>> *mnt_userns, const struct path *path,
> >>>>     * We need to be sure that all dirty pages are written and the server
> >>>>     * has actual ctime, mtime and file length.
> >>>>     */
> >>>> - if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
> >>>> + if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE |
> >>>> STATX_BLOCKS)) &&
> >>>>         !CIFS_CACHE_READ(CIFS_I(inode)) &&
> >>>>         inode->i_mapping && inode->i_mapping->nrpages != 0) {
> >>>>     rc = filemap_fdatawait(inode->i_mapping);
> >>>> @@ -2585,6 +2585,14 @@ cifs_set_file_size(struct inode *inode, struct
> >>>> iattr *attrs,
> >>>>     if (rc == 0) {
> >>>>     cifsInode->server_eof = attrs->ia_size;
> >>>>     cifs_setsize(inode, attrs->ia_size);
> >>>> + /*
> >>>> + * i_blocks is not related to (i_size / i_blksize),
> >>>> + * but instead 512 byte (2**9) size is required for
> >>>> + * calculating num blocks. Until we can query the
> >>>> + * server for actual allocation size, this is best estimate
> >>>> + * we have for the blocks allocated for this file
> >>>> + */
> >>>> + inode->i_blocks = (512 - 1 + attrs->ia_size) >> 9;
> >>>
> >>> I don't think 512 is a very robust choice, no server uses anything
> >>> so small any more. MS-FSA requires the allocation quantum to be the
> >>> volume cluster size. Is that value available locally?
> >>>
> >>> Tom.
> >>>
> >>>>     /*
> >>>>     * The man page of truncate says if the size changed,
> >>>> @@ -2912,7 +2920,7 @@ cifs_setattr_nounix(struct dentry *direntry,
> >>>> struct iattr *attrs)
> >>>>     sys_utimes in which case we ought to fail the call back to
> >>>>     the user when the server rejects the call */
> >>>>     if ((rc) && (attrs->ia_valid &
> >>>> - (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> >>>> +     (ATTR_MODE | ATTR_GID | ATTR_UID | ATTR_SIZE)))
> >>>>     rc = 0;
> >>>>     }
> >>>>
> >>
> >>
> >>
> >> --
> >> Thanks,
> >>
> >> Steve
> >
> >
> >



--
Thanks,

Steve



[Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux