Re: [PATCH] Provide a dirname() function when NO_LIBGEN_H=YesPlease

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Junio,

On Wed, 30 Sep 2015, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@xxxxxx> writes:
> 
> > diff --git a/compat/basename.c b/compat/basename.c
> > index d8f8a3c..10dba38 100644
> > --- a/compat/basename.c
> > +++ b/compat/basename.c
> > @@ -13,3 +13,29 @@ char *gitbasename (char *path)
> >  	}
> >  	return (char *)base;
> >  }
> > +
> > +char *gitdirname(char *path)
> > +{
> > +	char *p = path, *slash, c;
> > +
> > +	/* Skip over the disk name in MSDOS pathnames. */
> > +	if (has_dos_drive_prefix(p))
> > +		p += 2;
> 
> Not a new problem, but many callers of has_dos_drive_prefix()
> hardcodes that "2" in various forms.  I wonder if this is something
> we should relieve callers of by tweaking the semantics of it, e.g.
> by returning 2 (or howmanyever bytes should be skipped) from the
> function, changing it to skip_dos_drive_prefix(&p), etc.

In the upcoming v2, this is addressed.

> > +	/* POSIX.1-2001 says dirname("/") should return "/" */
> > +	slash = is_dir_sep(*p) ? ++p : NULL;
> > +	while ((c = *(p++)))
> 
> I am confused by this.  What is the invariant on 'p' at the
> beginning of the body of this while loop in each iteration?

The idea was that 'p' looks at whatever is the next character. And 'slash'
records the location of the latest slash we have seen (to be overwritten
with a '\0' to terminate the dirname).

Since '/' should not be shortened to the empty string, I special-cased
absolute paths to actually shift the 'slash' to one byte later. A little
dirty, but it worked.

Except that Ramsay's tests pointed out that I did not even look at what
the specs say, and I even fixed basename() in the meantime (and I think
dirname() looks more readable, but feel free to contradict me there).

> > +		if (is_dir_sep(c)) {
> > +			char *tentative = p - 1;
> > +
> > +			/* POSIX.1-2001 says to ignore trailing slashes */
> > +			while (is_dir_sep(*p))
> > +				p++;
> > +			if (*p)
> > +				slash = tentative;
> > +		}
> 
> I would have expected the function to scan from the end/right/tail.

Why scan twice (once to find the end, then to find the slashes)?

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]