Re: [PATCH v3] xfs_io: [v3] add the lseek() SEEK_DATA/SEEK_HOLE support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 25, 2012 at 09:14:44AM -0500, Mark Tinguely wrote:
> Add the lseek SEEK_DATA/SEEK_HOLE support into xfs_io.
> The result from the lseek() call will be printed to the output.
> For example:
> 
> xfs_io> lseek -h 609k
> Type	Offset
> hole	630784
> 
> v1 -> v2 Add "-a" and "-r" options.
> 	 Simplify the output.
> v2 -> v3 Refactor for configure.in -> configure.ac change.
> 	 SEEK_DATA with -1 offset behaves badly on older Linux.
> 	 Display error message as "ERR <errno>".
....
> +
> +#include <linux/fs.h>

I missed this first time around - why is this include necessary?

> +#define	LSEEK_DFLAG	1 << 0
> +#define	LSEEK_HFLAG	1 << 1
> +#define	LSEEK_RFLAG	1 << 2

Need parenthesis there, i.e. "(1 << 0)", so we don't get weird bugs
due to operator precendence causing unintended behaviour, but....

> +static int
> +lseek_f(
> +	int		argc,
> +	char		**argv)
> +{
> +	off64_t		offset, roff;
> +	size_t          fsblocksize, fssectsize;
> +	int		cseg;
> +	int		flag;
> +	int		i, c;
> +
> +	flag = 0;
> +	init_cvtnum(&fsblocksize, &fssectsize);
> +
> +	while ((c = getopt(argc, argv, "adhr")) != EOF) {
> +		switch (c) {
> +		case 'a':
> +			flag |= LSEEK_HFLAG;
> +			/* fall through to pick up the DFLAG */

I'd just set the flags directly - much more obvious, less likely to
go wrong in the future...

> +	offset = cvtnum(fsblocksize, fssectsize, argv[optind]);
> +
> +	/* recursively display all information, decide where to start. */
> +	roff = lseek64(file->fd, offset, SEEK_DATA);
> +	if (roff == offset)
> +		cseg = LSEEK_DFLAG;
> +	else
> +		cseg = LSEEK_HFLAG;

I'm not really sure what these variables mean. "roff" is returned
offset, "cseg" is current segment? Perhaps a comment or a better
names is in order because right now I'm guessing at what a segment
is as it is not referenced in any of the comments...

> +	
> +	printf("Type	offset\n");
> +
> +	/* loop to handle the data / hole entries in assending order.
> +	 * Only display the EOF when the initial offset was greater
> +	 * the end of file.
> +	 */
> +	for (i = 0; flag & LSEEK_RFLAG || i < 2; i++) {

I'm very surprised gcc doesn't warn here about lack of parenthesis.

As it is, I think the code would be much clearer if you separated
the non-recursive case from the recursive case given it took me 20
minutes to work out whether this loop worked correctly for both
cases (e.g. why does it need 2 loops instead of 1 for the non
recursive case, and does the double loop behaviour give the right
results?). Something like:

	if (!(flags & RECURSIVE))
		return seek_single(offset, flags);

	return seek_recursive(offset, flags);

makes the simple case is easy to verify correct (it's just a
single seek), and the recursive case doesn't get complicated by the
requirements of the single seek case. That results in code that is
much easier to maintain and modify in future....

> +		if (cseg == LSEEK_DFLAG) {
> +			if (flag & LSEEK_DFLAG) {
> +				roff = lseek64(file->fd, offset, SEEK_DATA);
> +				if (roff == -1) {
> +					if (i < 2) {
> +						if (errno == ENXIO)
> +							printf("data	EOF\n");
> +						else
> +							printf("data	ERR %d\n",
> +								errno);
> +					}

Why would you only print a seek error for the first seek? I
would have thought that the recursive case also needs differentiate
between an error and EOF detection.

> +				} else
> +					printf("data	%lld\n", roff);
> +			}
> +			/* set the offset and type for the next iteration */
> +			cseg = LSEEK_HFLAG;
> +			roff = lseek64(file->fd, offset, SEEK_HOLE);
> +			if (roff != -1)
> +				offset = roff;
> +			continue;

This extra seek only needs to be done if LSEEK_HFLAG is not set
to move the offset to the point where the next SEEK_DATA will find
the next data extent....

> +		}
> +
> +		/* cseg == LSEEK_HFLAG */
> +		if (flag & LSEEK_HFLAG) {
> +			roff = lseek64(file->fd, offset, SEEK_HOLE);

But if LSEEK_HFLAG is set, we do the same seek anyway on the next
loop iteration. The logic currently is:

	loop {
		if (in data) {
			if (seek data) {
				move to next data
				print
			}
			move to next hole
			update offset
			continue
		}
		/* in hole */
		if (seek hole) {
			move to next hole
			print
		}
		move to next data
		update offset
	}

Effectively, the loop is only dealing with a data extent or a hole
extent in each loop. But given that holes and data always alternate,
the same canbe acheived in a single loop:

	loop {
		move to next data
		if (data)
			print
		update offset
		move to next hole
		if (hole)
			print
		update offset
	}

And the initial loop start (i.e. offset starts in a hole or data)
can be handled with a simple "skip data" variable that is cleared on
the first pass through the loop. This gets rid of needing to track
the current segment type and gets rid of the redundant seeks to the
same offset when dumping both data and holes....

And the printing can become a simple functions that takes a "data"
or "hole" string and can be common for all callers (single and
recursive)...

> +lseek_init(void)
> +{
> +	lseek_cmd.name = "lseek";
> +	lseek_cmd.altname = "seek";

I'd just call them both "seek" - the "l" part of the lseek() call is
an implementation detail.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs


[Index of Archives]     [Linux XFS Devel]     [Linux Filesystem Development]     [Filesystem Testing]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux