Re: fast-import fails in read-only tree

Jeff King <peff@xxxxxxxx> · Sat, 30 Jan 2016 00:13:40 -0500

On Fri, Jan 29, 2016 at 09:28:44AM -0500, Stefan Monnier wrote:

> > The primary goal of fast-import is to write that packfile.  It kind of
> > sounds like you are using the wrong tool for the job.
> 
> Yes, I realize that.  But in some cases it's the best tool available.
> `fast-import' is very close to being a "generic access API" which can be
> used instead of something like libgit.  I think it'd be good to push it
> yet a bit closer.

I'm not sure I agree. Git tries to make its innards available via
flexible plumbing commands. If we're not succeeding, I think that should
be fixed, rather than trying to shoe-horn an unrelated command to do the
job, even if it would be less code.

> > Can you elaborate on what you are sending to fast-import (preferably
> > with a concrete example)?
> 
> I'm sending a stream of "progress <foo>; cat-blob <foo>", basically.
> 
> The concrete example is in [BuGit](https://gitlab.com/monnier/bugit),
> see for example https://gitlab.com/monnier/bugit/commit/3678dcb8830a9c79c6f3404d75d63e6dd07bfe4c

You can use custom cat-file formatting to output your "name" strings in
the same stream as the blob contents. IOW, something like:

  git cat-file -p refs/heads/bugit-master:numbers |
  awk '{print $3 " " $4 }' |
  git cat-file --batch="%(rest)" |
  while read number; do
    read id; # assuming blob contents are single-line
    read _junk; # assumes blob ended in its own newline
    $fun "$id" "$number"
  done

That's from a fairly cursory reading of that bugit patch, though, so I
might be missing some requirement.
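
To make the %(rest) behavior concrete, here is a minimal self-contained sketch. It assumes a throwaway repository and a made-up one-line blob; the label "42" and the contents "id-of-bug" are purely illustrative and are not taken from bugit:

```shell
# Throwaway repository; every name here is made up for the demonstration.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1

# Store a one-line blob (assumption: blobs are single-line, as above).
sha=$(echo "id-of-bug" | git hash-object -w --stdin)

# Each input line is "<object> <rest>". %(rest) echoes back whatever followed
# the object name, so the label travels alongside the blob contents.
result=$(
  echo "$sha 42" |
  git cat-file --batch="%(rest)" | {
    read -r number   # the label we passed in
    read -r id       # the blob's single line
    read -r _junk    # the newline cat-file emits after each object
    echo "number=$number id=$id"
  }
)
echo "$result"
```

The three reads mirror the loop above: one line for the label, one for the blob, one for the separator that cat-file emits after the object.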

> Yes, I switched to using "cat-file --batch" instead, but it's less
> convenient (I can't intersperse ad-hoc info in the output, the way I can
> with "progress" in fast-import) and there are cases where the list of
> files I need to extract cannot be determined without first looking at
> some of those extracted files (I currently have been able to avoid
> this in BuGit, luckily).

I think the example above should handle the "intersperse" thing.

If you're really going to do a lot of interactive back-and-forth access
of objects, though, I think you want to set up pipes to cat-file. It's a
little tedious to allocate fifos, but something like:

  mkfifo in out
  (exec git cat-file --batch <in >out) &
  exec 8>in
  exec 9<out
  echo "$sha" >&8
  read -r oid type size <&9 ;# header line is "<sha> <type> <size>"
  read -r content <&9 ;# or read $size bytes, or read until newline
  echo "$content" >&8 ;# imagine content is another sha to look up
  ...read from &9, etc..

The fifos and numbered descriptors are annoying, but that's shell for
you. I suspect using "fast-import" wouldn't be much different.
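
For what it's worth, a fuller version of that back-and-forth might look like the sketch below. It assumes a throwaway repository holding two made-up blobs, where the first blob's contents are the id of the second; the lookup helper is hypothetical and does not handle a "missing" reply:

```shell
# Throwaway repository and fifos; every name below is illustrative.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo" || exit 1

# Blob "second" holds text; blob "first" holds the id of "second".
second=$(echo "final answer" | git hash-object -w --stdin)
first=$(echo "$second" | git hash-object -w --stdin)

mkfifo "$work/in" "$work/out"
git cat-file --batch <"$work/in" >"$work/out" &
exec 8>"$work/in" 9<"$work/out"

# Hypothetical helper: request one object and print its contents.
lookup() {
  echo "$1" >&8
  read -r _oid _type size <&9            # header: "<sha> <type> <size>"
  dd bs=1 count="$size" <&9 2>/dev/null  # exactly $size bytes of content
  read -r _junk <&9                      # newline that follows the object
}

next=$(lookup "$first")    # contents of the first blob: another object id
answer=$(lookup "$next")   # follow it
echo "$answer"

exec 8>&- 9<&-             # closing the write end lets cat-file exit
wait
```

Reading the content with dd one byte at a time is slow, but it never consumes more than $size bytes, so the next header is still waiting on the pipe.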

One feature I do think would be useful (and which I almost implemented
when I added --batch-check=<format>) is a formatter for the object
content, with a pretty modifier. I.e., it would be nice to do:

  echo $some_tree |
  git cat-file --batch-check="%(objectsize:pretty) %(contents:pretty)"

to work as the rough equivalent of "git cat-file -p" (but here you could
feed multiple trees and get multiple answers).

-Peff