Hi René & Taylor,

On Wed, 26 Jan 2022, Taylor Blau wrote:

> On Wed, Jan 26, 2022 at 10:34:04AM +0100, René Scharfe wrote:
> > On 26.01.22 at 09:41, Johannes Schindelin via GitGitGadget wrote:
> > > Note: originally, Scalar was implemented in C# using the .NET API, where
> > > we had the luxury of a comprehensive standard library that includes
> > > basic functionality such as writing a `.zip` file. In the C version, we
> > > lack such a commodity. Rather than introducing a dependency on, say,
> > > libzip, we slightly abuse Git's `archive` command: Instead of writing
> > > the `.zip` file directly, we stage the file contents in a Git index of a
> > > temporary, bare repository, only to let `git archive` have at it, and
> > > finally removing the temporary repository.
> >
> > git archive allows you to include untracked files in an archive with its
> > option --add-file. You can see an example in Git's Makefile; search for
> > GIT_ARCHIVE_EXTRA_FILES. It still requires a tree argument, but the
> > empty tree object should suffice if you don't want to include any
> > tracked files. It doesn't currently support streaming, though, i.e.
> > files are fully read into memory, so it's impractical for huge ones.

That's a good point. I did not want to invent any `fast-import`-like
streaming protocol just for the sake of supporting the "funny" use case of
`scalar diagnose`, so I invented a new option
`--add-file-with-content=<path>:<content>` (with the obvious limitation
that the `<path>` cannot contain any colon; if that is desired, users will
still need to write out untracked files).

> Using `--add-file` would likely be preferable to setting up a temporary
> repository just to invoke `git archive` in it. Johannes would be the
> expert to ask whether or not big files are going to be a problem here
> (based on a cursory scan of the new functions in scalar.c, I don't
> expect this to be the case).

Indeed, it is unlikely that any large files are included.

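To make the colon limitation of `--add-file-with-content=<path>:<content>`
concrete, here is a minimal sketch in C of how such an argument could be
split. The helper name `split_path_content()` is made up for illustration
and is not taken from the actual patch; splitting at the first colon is an
assumption based on the limitation described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not Git's actual parser): split an argument of the
 * form "<path>:<content>" at the FIRST colon.
 *
 * On success, stores the length of the path in *path_len and returns a
 * pointer to the content, which lives inside `arg`. Returns NULL if there
 * is no colon or if the path would be empty.
 */
static const char *split_path_content(const char *arg, size_t *path_len)
{
	const char *colon;

	assert(arg && path_len);

	colon = strchr(arg, ':');
	if (!colon || colon == arg)
		return NULL;

	*path_len = (size_t)(colon - arg);
	return colon + 1; /* everything after the first colon is content */
}
```

Under that assumption, an argument such as `a:b:c` yields the path `a` and
the content `b:c`, which is why a path containing a colon cannot be
expressed this way.
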
> The new stage_directory() function _could_ add `--add-file` arguments in
> a loop around readdir(), but it might also be nice to add a new
> `--add-directory` function to `git archive` which would do the "heavy"
> lifting for us.

I went one step further and used `write_archive()` to do the heavy
lifting. That way, we truly avoid spawning any separate process, let alone
creating any throw-away repository.

Ciao,
Dscho