On Mon, Nov 29 2021, Han-Wen Nienhuys wrote: > On Mon, Nov 29, 2021 at 11:14 AM Ævar Arnfjörð Bjarmason > <avarab@xxxxxxxxx> wrote: >> > This helps for reftable support, and will help if we want to reconsider >> > under which conditions reflogs get created/updated. >> >> Having looked at this in a bit more detail than last time >> (https://lore.kernel.org/git/211123.864k83w3y4.gmgdl@xxxxxxxxxxxxxxxxxxx/) >> I applaud the goals, but to be blunt the specific approach just seems a >> bit backwards to me. >> >> As noted in that message I have patches to tweak the "verbose" mode to >> be backend-independent, which as we see from your series is one thing in >> the files backend that consumes the "message" and assumes things about >> newlines. > > In v2, I went with Jun's suggestion, and left the newlines alone, just > trimming them in refs/debug.c . I think that makes most of your mail > irrelevant? To whatever immediate problem you're trying to solve? Probably. I'm mainly pointing out that we can make some of these APIs nicer & a bit more abstract from the details of the file backend. Or, as is the case with "ident" v.s. "committer_name/committer_email" in these reflog callbacks no existing caller actually cares about that format, so if/when we get to that reftable integration (which seems to have the two as seperate fields) splitting those up in refs.[ch] probably makes sense. >> Perhaps reftable is capable of just handing the underlying code pointers >> into the mmap()'d file, so we could even skip all (re)allocations? Or if >> not, that certainly seems like a sensible thing to anticipate in a >> backend-independent interface. We could do that in the file backend if >> we were a bit smarter and used mmap() instead of the >> fopen()/read-in-a-loop pattern. > > It sounds like premature optimization. Reading reflogs is not usually > a performance sensitive operation. Also, if you hand out mmap'd > pointers, how would the reftable storage code know when it is safe to > close and munmap the file? The same way we do the fopen/fclose lifetime now, i.e. you could rely on them for the iteration, which seems to be a common pattern in code that needs this. I was aiming more for the lack of premature optimization there, i.e. instead of byte-twiddling things by injecting \0s we could just have char */size_t offsets (as noted elsewhere), which would also nicely allow binary storage formats, if those formats happen to have an embedded string. Is that true of reftable?