Re: [PATCH v2 1/2] abspath: add a function to resolve paths with missing components

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



"brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx> writes:

> We'd like to canonicalize paths such that we can preserve any number of
> trailing components that may be missing.

Sorry, but at least to me, the above gives no clue what kind of
operation is desired to be done on paths.  How would one preserve
what does not exist (i.e. are missing)?

Do you mean some leading components in a path point at existing
directories and after some point a component names a directory
that does not exist, so everything after that does not yet exist
until you "mkdir -p" them?

I guess my confusion comes primarily from the fuzziness of the verb
"canonicalize" in the sentence.  We want to handle a/b/../c/d and
there are various combinations of missng and existing directories,
e.g. a/b may not exist or a/b may but a/c may not, etc.  Is that
what is going on?  Makes me wonder if it makes sense to canonicalize
a/b/../c/d into a/c/d when a/b does not exist in the first place,
though.

> Let's add a function to do
> that that calls strbuf_realpath to find the canonical path for the
> portion we do have and then append the missing part.  We adjust
> strip_last_component to return us the component it has stripped and use
> that to help us accumulate the missing part.

OK, so if we have a/b/c/d and know a/b/c/d does not exist on the
filesystem, we start by splitting it to a/b/c and d, see if a/b/c
exists, and if not, do the same recursively to a/b/c to split it
into a/b and c, and prefix the latter to 'd' that we split earlier
(i.e. now we have a/b and c/d), until we have an existing directory
on the first half?

> Note that it is intentional that we invoke strbuf_realpath here,
> repeatedly if necessary, because on Windows that function is replaced
> with a version that uses the proper system semantics for
> canonicalization.  Trying to adjust strbuf_realpath to perform this kind
> of canonicalization with an additional option would fail to work
> properly on Windows.  The present approach is equivalent to
> strbuf_realpath for cases where the path exists, and the only other
> cases where we will use this function the additional overhead of
> multiple invocations is not significant.
>
> Signed-off-by: brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx>
> ---
>  abspath.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++----
>  cache.h   |  1 +
>  2 files changed, 47 insertions(+), 4 deletions(-)
>
> diff --git a/abspath.c b/abspath.c
> index 6f15a418bb..092bb33b64 100644
> --- a/abspath.c
> +++ b/abspath.c
> @@ -11,8 +11,12 @@ int is_directory(const char *path)
>  	return (!stat(path, &st) && S_ISDIR(st.st_mode));
>  }
>  
> -/* removes the last path component from 'path' except if 'path' is root */
> -static void strip_last_component(struct strbuf *path)
> +/*
> + * Removes the last path component from 'path' except if 'path' is root.
> + *
> + * If last is not NULL, the last path component is copied to last.
> + */
> +static void strip_last_component(struct strbuf *path, struct strbuf *last)
>  {
>  	size_t offset = offset_1st_component(path->buf);
>  	size_t len = path->len;
> @@ -20,6 +24,10 @@ static void strip_last_component(struct strbuf *path)
>  	/* Find start of the last component */
>  	while (offset < len && !is_dir_sep(path->buf[len - 1]))
>  		len--;
> +
> +	if (last)
> +		strbuf_addstr(last, path->buf + len);
> +
>  	/* Skip sequences of multiple path-separators */
>  	while (offset < len && is_dir_sep(path->buf[len - 1]))
>  		len--;
> @@ -118,7 +126,7 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
>  			continue; /* '.' component */
>  		} else if (next.len == 2 && !strcmp(next.buf, "..")) {
>  			/* '..' component; strip the last path component */
> -			strip_last_component(resolved);
> +			strip_last_component(resolved, NULL);
>  			continue;
>  		}
>  
> @@ -169,7 +177,7 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
>  				 * strip off the last component since it will
>  				 * be replaced with the contents of the symlink
>  				 */
> -				strip_last_component(resolved);
> +				strip_last_component(resolved, NULL);
>  			}
>  
>  			/*
> @@ -202,6 +210,40 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
>  	return retval;
>  }
>  
> +/*
> + * Like strbuf_realpath, but trailing components which do not exist are copied
> + * through.
> + */
> +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path)
> +{
> +	struct strbuf remaining = STRBUF_INIT;
> +	struct strbuf trailing = STRBUF_INIT;
> +	struct strbuf component = STRBUF_INIT;
> +
> +	strbuf_addstr(&remaining, path);
> +
> +	while (remaining.len) {
> +		if (strbuf_realpath(resolved, remaining.buf, 0)) {
> +			strbuf_addbuf(resolved, &trailing);
> +
> +			strbuf_release(&component);
> +			strbuf_release(&remaining);
> +			strbuf_release(&trailing);
> +
> +			return resolved->buf;
> +		}
> +		strip_last_component(&remaining, &component);
> +		strbuf_insertstr(&trailing, 0, "/");
> +		strbuf_insertstr(&trailing, 1, component.buf);

I may be utterly confused, but is this where

    - we started with a/b/c/d, pushed 'd' into trailing and decided
      to redo with a/b/c

    - now we split the a/b/c into a/b and c, and adjusting what is
      in trailing from 'd' to 'c/d'

happens place?  It's a bit sad that we need to repeatedly use
insertstr to prepend in front, instead of appending.

> +		strbuf_reset(&component);
> +	}
> +
> +	strbuf_release(&component);
> +	strbuf_release(&remaining);
> +	strbuf_release(&trailing);
> +	return NULL;
> +}
> +
>  char *real_pathdup(const char *path, int die_on_error)
>  {
>  	struct strbuf realpath = STRBUF_INIT;
> diff --git a/cache.h b/cache.h
> index c0072d43b1..e1e17e108e 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1320,6 +1320,7 @@ static inline int is_absolute_path(const char *path)
>  int is_directory(const char *);
>  char *strbuf_realpath(struct strbuf *resolved, const char *path,
>  		      int die_on_error);
> +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path);
>  char *real_pathdup(const char *path, int die_on_error);
>  const char *absolute_path(const char *path);
>  char *absolute_pathdup(const char *path);



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux