Re: [PATCH 1/5] fs: add SEEK_HOLE and SEEK_DATA flags


 



On 2011-07-18, at 11:21 AM, Josef Bacik <josef@xxxxxxxxxx> wrote:

> This just gets us ready to support the SEEK_HOLE and SEEK_DATA flags.  Turns out
> using fiemap in things like cp causes more problems than it solves, so let's try
> and give userspace an interface that doesn't suck.  We need to match Solaris
> here, and the definitions are:
> 
> *o* If /whence/ is SEEK_HOLE, the offset of the start of the
> next hole greater than or equal to the supplied offset
> is returned. The definition of a hole is provided near
> the end of the DESCRIPTION.
> 
> *o* If /whence/ is SEEK_DATA, the file pointer is set to the
> start of the next non-hole file region greater than or
> equal to the supplied offset.
> 
> So in the generic case the entire file is data and there is a virtual hole at
> the end.  That means we will just return i_size for SEEK_HOLE and will return
> the same offset for SEEK_DATA.  This is how Solaris does it so we have to do it
> the same way.
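
To make the generic case above concrete, here is a small userspace model of
it. This is only a sketch, not the kernel code: model_generic_seek() and the
MODEL_SEEK_* names are hypothetical, the numeric values are taken from the
fs.h hunk in this patch, and the model assumes a non-negative offset.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical names; the values mirror the defines this patch adds to fs.h. */
enum { MODEL_SEEK_DATA = 3, MODEL_SEEK_HOLE = 4 };

/*
 * Userspace model of the generic fallback: the whole file is data and a
 * single virtual hole starts at i_size.  Returns the resulting offset, or
 * a negative errno value.
 */
int64_t model_generic_seek(int64_t offset, int64_t i_size, int whence)
{
    /* Assumption of this sketch: callers pass non-negative values. */
    assert(offset >= 0);
    assert(i_size >= 0);

    if (whence != MODEL_SEEK_DATA && whence != MODEL_SEEK_HOLE)
        return -EINVAL;

    /* At or past EOF there is neither more data nor another hole. */
    if (offset >= i_size)
        return -ENXIO;

    /* Data: the offset itself.  Hole: the virtual hole at i_size. */
    return whence == MODEL_SEEK_DATA ? offset : i_size;
}
```

For a 100-byte file, an offset of 42 gives 42 for the data case and 100 for
the hole case, and any offset of 100 or more gives -ENXIO.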
> 
> Thanks,
> 
> Signed-off-by: Josef Bacik <josef@xxxxxxxxxx>
> ---
> Documentation/filesystems/porting |    9 +++++++
> fs/read_write.c                   |   44 ++++++++++++++++++++++++++++++++++--
> include/linux/fs.h                |    4 ++-
> 3 files changed, 53 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
> index 6e29954..e24e7e9 100644
> --- a/Documentation/filesystems/porting
> +++ b/Documentation/filesystems/porting
> @@ -407,3 +407,12 @@ a file off.
> a matter of switching from calling get_sb_... to mount_... and changing the
> function type.  If you were doing it manually, just switch from setting ->mnt_root
> to some pointer to returning that pointer.  On errors return ERR_PTR(...).
> +
> +[mandatory]
> +    If you implement your own ->llseek() you must handle SEEK_HOLE and
> +SEEK_DATA.  You can handle this by returning -EINVAL, but it would be nicer to
> +support it in some way.  The generic handler assumes that the entire file is
> +data and there is a virtual hole at the end of the file.  So if the provided
> +offset is less than i_size and SEEK_DATA is specified, return the same offset.
> +If the above is true for the offset and you are given SEEK_HOLE, return the end
> +of the file.  If the offset is i_size or greater return -ENXIO in either case.

Rather than documenting (only) the way to have a "compliant but not useful" SEEK_{HOLE,DATA} implementation, it makes sense to also document the desired implementation here (what is in the commit comment).
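
As a rough illustration of why the real semantics matter to userspace, here is
a minimal sketch of the kind of extent walk a cp-like tool might do with this
interface. It is an assumption-laden example rather than anything from the
patch: list_data_extents() is a hypothetical name, and the fallback #defines
assume the values from the fs.h hunk in case the C library headers do not
provide them.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* glibc exposes SEEK_DATA/SEEK_HOLE under this */
#endif
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback values, assumed to match the fs.h hunk in this patch. */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/*
 * Print each data extent of fd as "[start, end)".  Returns the number of
 * extents found, or -1 on an unexpected error (errno is left set).
 * Note that this moves the file position of fd.
 */
int list_data_extents(int fd)
{
    off_t pos = 0;
    int count = 0;

    assert(fd >= 0);

    for (;;) {
        /* Find the next data at or after pos. */
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0) {
            /* ENXIO: no data at or after pos, so the walk is done. */
            return errno == ENXIO ? count : -1;
        }

        /* The virtual hole at EOF means data is always followed by a hole. */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole < 0)
            return -1;

        printf("[%lld, %lld)\n", (long long)data, (long long)hole);
        count++;
        pos = hole;
    }
}
```

With only the generic handler, a non-empty file comes back as one extent
covering [0, i_size), which is correct but tells the caller nothing about
sparseness. A filesystem that tracks holes would return one extent per
allocated region instead.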

> diff --git a/fs/read_write.c b/fs/read_write.c
> index 5520f8a..5907b49 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -64,6 +64,23 @@ generic_file_llseek_unlocked(struct file *file, loff_t offset, int origin)
>            return file->f_pos;
>        offset += file->f_pos;
>        break;
> +    case SEEK_DATA:
> +        /*
> +         * In the generic case the entire file is data, so as long as
> +         * offset isn't at the end of the file then the offset is data.
> +         */
> +        if (offset >= inode->i_size)
> +            return -ENXIO;
> +        break;
> +    case SEEK_HOLE:
> +        /*
> +         * There is a virtual hole at the end of the file, so as long as
> +         * offset isn't i_size or larger, return i_size.
> +         */
> +        if (offset >= inode->i_size)
> +            return -ENXIO;
> +        offset = inode->i_size;
> +        break;
>    }
> 
>    if (offset < 0 && !unsigned_offsets(file))
> @@ -128,12 +145,13 @@ EXPORT_SYMBOL(no_llseek);
> 
> loff_t default_llseek(struct file *file, loff_t offset, int origin)
> {
> +    struct inode *inode = file->f_path.dentry->d_inode;
>    loff_t retval;
> 
> -    mutex_lock(&file->f_dentry->d_inode->i_mutex);
> +    mutex_lock(&inode->i_mutex);
>    switch (origin) {
>        case SEEK_END:
> -            offset += i_size_read(file->f_path.dentry->d_inode);
> +            offset += i_size_read(inode);
>            break;
>        case SEEK_CUR:
>            if (offset == 0) {
> @@ -141,6 +159,26 @@ loff_t default_llseek(struct file *file, loff_t offset, int origin)
>                goto out;
>            }
>            offset += file->f_pos;
> +            break;
> +        case SEEK_DATA:
> +            /*
> +             * In the generic case the entire file is data, so as
> +             * long as offset isn't at the end of the file then the
> +             * offset is data.
> +             */
> +            if (offset >= inode->i_size)
> +                return -ENXIO;
> +            break;
> +        case SEEK_HOLE:
> +            /*
> +             * There is a virtual hole at the end of the file, so
> +             * as long as offset isn't i_size or larger, return
> +             * i_size.
> +             */
> +            if (offset >= inode->i_size)
> +                return -ENXIO;
> +            offset = inode->i_size;
> +            break;
>    }
>    retval = -EINVAL;
>    if (offset >= 0 || unsigned_offsets(file)) {
> @@ -151,7 +189,7 @@ loff_t default_llseek(struct file *file, loff_t offset, int origin)
>        retval = offset;
>    }
> out:
> -    mutex_unlock(&file->f_dentry->d_inode->i_mutex);
> +    mutex_unlock(&inode->i_mutex);
>    return retval;
> }
> EXPORT_SYMBOL(default_llseek);
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index b5b9792..c9156f3 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -32,7 +32,9 @@
> #define SEEK_SET    0    /* seek relative to beginning of file */
> #define SEEK_CUR    1    /* seek relative to current file position */
> #define SEEK_END    2    /* seek relative to end of file */
> -#define SEEK_MAX    SEEK_END
> +#define SEEK_DATA    3    /* seek to the next data */
> +#define SEEK_HOLE    4    /* seek to the next hole */
> +#define SEEK_MAX    SEEK_HOLE
> 
> struct fstrim_range {
>    __u64 start;
> -- 
> 1.7.5.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

