On Tue, Jun 23, 2020 at 04:30:23PM -0400, Eric Sunshine wrote:

> > I'm not sure what you'd write, then. You can't mention "mybranch"
> > anymore if it was anonymized. Are you suggesting to make the example:
> >
> >   git rev-list -- foo.c
> >
> > by itself?
>
> Sorry, I meant to provide an example like this:
>
>     For example, if you have a bug which reproduces with `git rev-list
>     sensitive -- secret.c`, you can run:
>
>         $ git fast-export --anonymize --all \
>             --seed-anonymized=sensitive:foo \
>             --seed-anonymized=secret.c:bar.c \
>             >stream
>
>     After importing the stream, you can then run `git rev-list foo --
>     bar.c` in the anonymized repository.

Thanks, that makes sense. I took this as-is for my reroll (modulo the
change of option name discussed elsewhere).

> Hmm, perhaps your original attempt can be extended slightly to state
> it more explicitly?
>
>     Note that paths and refnames are split into tokens at slash
>     boundaries. The command above would anonymize `subdir/foo.c` as
>     something like `path123/secret.c`; you could then search for
>     `secret.c` in the anonymized repository to determine the final
>     pathname.
>
>     To make referencing the final pathname simpler, you can seed
>     anonymization for each path component; so, if you also anonymize
>     `subdir` to `publicdir`, then the final pathname would be
>     `publicdir/secret.c`.

Thanks, I took this modulo some fixups to match the example above, and
to avoid the use of the word "seed" based on our other discussion.

> This makes me wonder if --seed-anonymized should do its own
> tokenization so that --seed-anonymized=subdir/foo:public/bar is
> automatically understood as anonymizing "subdir" to "public" _and_
> "foo" to "bar". But that potentially gets weird if you say:
>
>     --seed-anonymized=a/b:q/p --seed-anonymized=a/c:y/z
>
> in which case you've given conflicting replacements for "a". (I
> suppose it could issue a warning message in that case.)

Right, I think you get into weird corner cases.
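As a rough illustration of that corner case, here is a Python sketch of
the hypothetical automatic tokenization. It is not anything in the patch
or in git itself; `split_seed` and `tokenize_seeds` are made-up names,
and the "first replacement wins" policy is only an assumption for the
sake of the example. Splitting both sides at slashes makes the second
option disagree with the first about "a":

```python
def split_seed(arg):
    """Split a hypothetical 'from:to' argument at its first colon."""
    src, sep, dst = arg.partition(":")
    if not sep:
        raise ValueError(f"expected from:to, got {arg!r}")
    return src, dst


def tokenize_seeds(args):
    """Break each from/to pair into per-component mappings.

    Returns (mapping, conflicts). Each conflict is a
    (token, kept, rejected) triple for a token that was given two
    different replacements; the first one seen is kept (an assumption).
    """
    mapping = {}
    conflicts = []
    for arg in args:
        src, dst = split_seed(arg)
        src_parts = src.split("/")
        dst_parts = dst.split("/")
        if len(src_parts) != len(dst_parts):
            raise ValueError(f"component count mismatch in {arg!r}")
        for old, new in zip(src_parts, dst_parts):
            kept = mapping.setdefault(old, new)
            if kept != new:
                conflicts.append((old, kept, new))
    return mapping, conflicts


mapping, conflicts = tokenize_seeds(["a/b:q/p", "a/c:y/z"])
print(mapping)    # {'a': 'q', 'b': 'p', 'c': 'z'}
print(conflicts)  # [('a', 'q', 'y')]
```

Whether to warn, die, or let the last option win in that situation is
exactly the kind of policy question that leaving the option untokenized
avoids.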
Another issue is that not all items are tokenized (e.g., if your author
name was foo/bar, you'd want that replaced as a whole). Probably you
could add both the broken-down and full inputs. Yet another issue is
that you can't add a token with a ":" due to the syntax.

This is an infrequently-enough-used feature that I think it's worth
keeping things simple, even if they're a little less convenient to
invoke.

> Lack of a warning or error could be kind of bad if the person doesn't
> check the fast-export file before sending it out and only discovers
> later that:
>
>     git fast-export --seed-anonymized=foo:bar
>
> didn't perform _any_ anonymization at all.

Good point. I'd hope people would glance at the output before sending it
out, but given that it's a potential safety issue, it probably is worth
detecting this case. I'll add it to my re-roll.

-Peff