Hi brian,

On Sat, 10 Oct 2020, brian m. carlson wrote:

> On 2020-10-09 at 21:10:04, Junio C Hamano wrote:
> > "brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx> writes:
> >
> > > We'd like to canonicalize paths such that we can preserve any number of
> > > trailing components that may be missing.
> >
> > Sorry, but at least to me, the above gives no clue what kind of
> > operation is desired to be done on paths. How would one preserve
> > what does not exist (i.e. are missing)?
> >
> > Do you mean some leading components in a path point at existing
> > directories and after some point a component names a directory
> > that does not exist, so everything after that does not yet exist
> > until you "mkdir -p" them?
> >
> > I guess my confusion comes primarily from the fuzziness of the verb
> > "canonicalize" in the sentence. We want to handle a/b/../c/d and
> > there are various combinations of missing and existing directories,
> > e.g. a/b may not exist or a/b may but a/c may not, etc. Is that
> > what is going on? Makes me wonder if it makes sense to canonicalize
> > a/b/../c/d into a/c/d when a/b does not exist in the first place,
> > though.
>
> The behavior that I'm proposing is the realpath -m behavior. If the
> path we're canonicalizing doesn't exist, we find the closest parent that
> does exist, canonicalize it (à la realpath(3)), and then append the
> components that don't exist to the canonicalized portion.

FWIW I was immediately able to think of a handful of scenarios where this
functionality would come in handy, but I am probably not a typical
example for the median reader. So maybe a concrete example or two why
this could be handy could be shown in the cover letter?

Thanks,
Dscho

>
> > > Let's add a function to do
> > > that that calls strbuf_realpath to find the canonical path for the
> > > portion we do have and then append the missing part. We adjust
> > > strip_last_component to return us the component it has stripped and use
> > > that to help us accumulate the missing part.
> >
> > OK, so if we have a/b/c/d and know a/b/c/d does not exist on the
> > filesystem, we start by splitting it to a/b/c and d, see if a/b/c
> > exists, and if not, do the same recursively to a/b/c to split it
> > into a/b and c, and prefix the latter to 'd' that we split earlier
> > (i.e. now we have a/b and c/d), until we have an existing directory
> > on the first half?
>
> Correct.
>
> > > +/*
> > > + * Like strbuf_realpath, but trailing components which do not exist are copied
> > > + * through.
> > > + */
> > > +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path)
> > > +{
> > > +	struct strbuf remaining = STRBUF_INIT;
> > > +	struct strbuf trailing = STRBUF_INIT;
> > > +	struct strbuf component = STRBUF_INIT;
> > > +
> > > +	strbuf_addstr(&remaining, path);
> > > +
> > > +	while (remaining.len) {
> > > +		if (strbuf_realpath(resolved, remaining.buf, 0)) {
> > > +			strbuf_addbuf(resolved, &trailing);
> > > +
> > > +			strbuf_release(&component);
> > > +			strbuf_release(&remaining);
> > > +			strbuf_release(&trailing);
> > > +
> > > +			return resolved->buf;
> > > +		}
> > > +		strip_last_component(&remaining, &component);
> > > +		strbuf_insertstr(&trailing, 0, "/");
> > > +		strbuf_insertstr(&trailing, 1, component.buf);
> >
> > I may be utterly confused, but is this where
> >
> >  - we started with a/b/c/d, pushed 'd' into trailing and decided
> >    to redo with a/b/c
> >
> >  - now we split the a/b/c into a/b and c, and adjusting what is
> >    in trailing from 'd' to 'c/d'
> >
> > takes place? It's a bit sad that we need to repeatedly use
> > insertstr to prepend in front, instead of appending.
>
> Yes, that's true. It really isn't avoidable, though, with the functions
> the way that they are. We can't use the original path and keep track of
> the offset because it may contain multiple path separators and we don't
> want to include those in the path.
> --
> brian m. carlson: Houston, Texas, US
>
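
For readers following the thread, the strip-and-retry loop discussed above
can be sketched outside of git roughly as follows. This is only a minimal
illustration under stated assumptions: it uses POSIX realpath(3) in place
of strbuf_realpath() and fixed-size buffers in place of struct strbuf, and
realpath_missing(), strip_last() and prepend_component() are hypothetical
names, not functions from the patch. Unlike "realpath -m", a ".." that
sits in the missing part is copied through verbatim here.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* ask for the realpath(3) declaration */
#endif
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/*
 * Hypothetical stand-in for strip_last_component(): cut the last component
 * off `path` in place and copy it to `component`, which must be at least as
 * large as `path`. Repeated separators ("a//b/") are skipped, not copied.
 */
static void strip_last(char *path, char *component)
{
	size_t end = strlen(path), start;

	while (end > 0 && path[end - 1] == '/')		/* trailing slashes */
		end--;
	start = end;
	while (start > 0 && path[start - 1] != '/')	/* find component start */
		start--;
	memcpy(component, path + start, end - start);
	component[end - start] = '\0';
	while (start > 1 && path[start - 1] == '/')	/* keep a lone "/" for root */
		start--;
	path[start] = '\0';
}

/* Turn "c/d"-style state into "/b/c/d": prepend "/" plus `component`. */
static int prepend_component(char *trailing, size_t size, const char *component)
{
	size_t clen = strlen(component), tlen = strlen(trailing);

	if (clen + tlen + 2 > size)
		return -1;
	memmove(trailing + clen + 1, trailing, tlen + 1);
	trailing[0] = '/';
	memcpy(trailing + 1, component, clen);
	return 0;
}

/*
 * Resolve the longest prefix of `path` that realpath(3) accepts, then append
 * the components that had to be stripped. Returns 0 on success, -1 if the
 * result does not fit or not even "/" (or ".") can be resolved.
 */
static int realpath_missing(const char *path, char *out, size_t outlen)
{
	char remaining[PATH_MAX], resolved[PATH_MAX];
	char component[PATH_MAX], trailing[PATH_MAX] = "";

	assert(path && out);
	if (strlen(path) >= sizeof(remaining))
		return -1;
	strcpy(remaining, path);

	for (;;) {
		if (!*remaining)	/* relative path used up: resolve the cwd */
			strcpy(remaining, ".");
		if (realpath(remaining, resolved)) {
			/* avoid "//x" when the existing prefix is the root */
			const char *base =
				(*trailing && !strcmp(resolved, "/")) ? "" : resolved;
			size_t blen = strlen(base), tlen = strlen(trailing);

			if (blen + tlen + 1 > outlen)
				return -1;
			memcpy(out, base, blen);
			memcpy(out + blen, trailing, tlen + 1);
			return 0;
		}
		if (!strcmp(remaining, "/") || !strcmp(remaining, "."))
			return -1;	/* nothing left to strip */
		strip_last(remaining, component);
		if (prepend_component(trailing, sizeof(trailing), component))
			return -1;
	}
}
```

Assuming /no-such-dir does not exist, calling
realpath_missing("/no-such-dir/x//y/", buf, sizeof(buf)) would leave
"/no-such-dir/x/y" in buf. The doubled separator is dropped because the
trailing part is rebuilt component by component, which is the reason given
above for not simply keeping an offset into the original path.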