Re: [RFC][PATCH] Implement SEEK_HOLE/SEEK_DATA

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 28, 2007 at 01:56:49PM -0800, Brad Boyer wrote:
> 
> I have a few comments. Some of them are inline with the code.
> 
> The one generic thing is that the Sun documentation specifically
> says to check pathconf(_PC_MIN_HOLE_SIZE) on the filesystem to see
> if it supports this request. Have you looked into adding this to
> the Linux version of pathconf? I haven't tried to look at the latest
> glibc, but 2.3.2 doesn't appear to have it. If we do have it, this
> request should really go to the kernel to ask the individual
> filesystem.  However, it looks like pathconf is entirely implemented
> in glibc.  Should we add a system call or some other way to pass
> pathconf requests into the kernel? I know pathconf currently returns
> some inaccurate results in some cases due to this limitation. There
> are some existing ones like _PC_LINK_MAX that glibc tries to guess
> the correct result based on statfs and can get wrong in any
> non-standard setup.
>

Since it appears pathconf doesn't actually interface with the kernel I think
that its not really worth chasing down.  I don't think every fs maintainer is
really going to want to chase down a glibc person in order to change what
pathconf returns for their particular fs.
 
> 
> On Wed, Nov 28, 2007 at 03:02:07PM -0500, Josef Bacik wrote:
> > diff --git a/fs/read_write.c b/fs/read_write.c
> > index ea1f94c..cf61e1e 100644
> > --- a/fs/read_write.c
> > +++ b/fs/read_write.c
> > @@ -31,10 +31,32 @@ const struct file_operations generic_ro_fops = {
> >  
> >  EXPORT_SYMBOL(generic_ro_fops);
> >  
> > +static loff_t generic_seek_hole_data(struct file *file, loff_t offset,
> > +				     int origin)
> > +{
> > +	loff_t retval = -ENXIO;
> > +	struct inode *inode = file->f_mapping->host;
> > +	sector_t block, found_block;
> > +	sector_t last_block = (i_size_read(inode) - 1) >> inode->i_blkbits;
> > +	int want = (origin == SEEK_HOLE) ? 0 : 1;
> > +
> > +	for (block = offset >> inode->i_blkbits; block <= last_block; block++) {
> > +		found_block = bmap(inode, block);
> > +
> > +		if (!!found_block == want) {
> > +			retval = block << inode->i_blkbits;
> > +			break;
> > +		}
> > +	}
> > +
> > +	return retval;
> > +}
> > +
> 
> It looks like this function can return an offset that is before the
> one passed in as an argument due to losing the lower bits. I think
> it needs to do somthing more like this in the loop initializer:
> 
>     block = (offset + (1 << inode->i_blkbits) - 1) >> inode->i_blkbits;
> 
> By adding 1 block if it wasn't already on an even block, this assures
> us that it won't go backwards from the original offset while allowing
> someone to get a block that really does start exactly on the offset.
> 
> >  loff_t generic_file_llseek(struct file *file, loff_t offset, int origin)
> >  {
> >  	long long retval;
> >  	struct inode *inode = file->f_mapping->host;
> > +	loff_t (*fn)(struct file *, loff_t, int);
> >  
> >  	mutex_lock(&inode->i_mutex);
> >  	switch (origin) {
> > @@ -43,15 +65,24 @@ loff_t generic_file_llseek(struct file *file, loff_t offset, int origin)
> >  			break;
> >  		case SEEK_CUR:
> >  			offset += file->f_pos;
> > +			break;
> > +		case SEEK_HOLE:
> > +		case SEEK_DATA:
> > +			fn = generic_seek_hole_data;
> > +			if (file->f_op->seek_hole_data)
> > +				fn = file->f_op->seek_hole_data;
> > +			offset = fn(file, offset, origin);
> >  	}
> >  	retval = -EINVAL;
> >  	if (offset>=0 && offset<=inode->i_sb->s_maxbytes) {
> > -		if (offset != file->f_pos) {
> > +		if (offset != file->f_pos && origin != SEEK_HOLE) {
> 
> Why is SEEK_HOLE special in this case? The description of SEEK_HOLE
> and SEEK_DATA in the Sun document would imply to me that they should
> be handled the same.
> 

This was my interpretation of the man page, however Joern looked at what solaris
does and it seems that SEEK_HOLE does update f_pos, so I will change this.

> >  			file->f_pos = offset;
> >  			file->f_version = 0;
> >  		}
> >  		retval = offset;
> > -	}
> > +	} else if (offset < 0)
> > +		retval = offset;
> > +
> >  	mutex_unlock(&inode->i_mutex);
> >  	return retval;

Thanks much for your comments,

Josef
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux