On Tue, May 27, 2014 at 6:37 PM, Jeff King <peff@xxxxxxxx> wrote: > On Tue, May 27, 2014 at 04:17:34PM +0200, Erik Faye-Lund wrote: > >> The combination of "git clean" and fat fingers can some times cause >> data-loss, which can be frustrating. >> >> So let's add a flag that imports the files to be deleted into the >> object-database, in a way similar to what git-stash does. Maintain >> a reflog of the previously backed up clean-runs. >> >> Signed-off-by: Erik Faye-Lund <kusmabite@xxxxxxxxx> >> --- >> I've had a similar patch laying around for quite a while, but since >> f538a91 ("git-clean: Display more accurate delete messages"), this >> patch is a lot less nasty than before. So here you go, perhaps >> someone else has fat fingers and hate to lose work? > > I've definitely considered doing something like this before (and for > "git reset --hard"). My biggest concern would be poor performance in > some cases. But since it's optional, and one can presumably override it > with --no-backup for a specific large cleanup, it would not hurt anybody > who does not want to play with it. I also made it opt-in rather than opt-out by default. Perhaps it shouldn't be, though - dunno. >> + /* load HEAD into the index */ >> + >> + tree = parse_tree_indirect(sha1); >> + if (!tree) >> + die(_("Failed to unpack tree object %s"), sha1); >> + >> + parse_tree(tree); >> + init_tree_desc(&t, tree->buffer, tree->size); >> + >> + memset(&opts, 0, sizeof(opts)); >> + opts.head_idx = -1; >> + opts.src_index = &the_index; >> + opts.dst_index = &the_index; >> + opts.index_only = 1; >> + >> + if (unpack_trees(1, &t, &opts)) { >> + /* We've already reported the error, finish dying */ >> + exit(128); >> + } > > This bit of the code surprised me. I guess you are trying to re-create > the index state of the HEAD so that the commit you build on top of it > contains _only_ the untracked files as changes, and not whatever > intermediate index state you had. That makes some sense to me, as clean > is never touching the index state. TBH, I didn't really think this stuff through, I basically just hacked on this until I got it to save away superfluous files when the index matched HEAD. So this part is more accidental than designed. I'm not very familiar with the index-maniuplation code in Git either. I *think* the right thing to do would be to save the tree of HEAD *plus* those deleted files in this case. That way it only records the destruction itself. This does *not* seem to be what currently happens. If I have a change staged, that staged change also gets included in the commit. That's not ideal, I think. > Though taking a step back for a moment, I'd like to think about doing > something similar for "reset --hard", which is the other "destructive" > operation in git[1]. In that case, I think saving the index state is also > useful, because it is being reset, too (and while those blobs are > recoverable, the exact state is sometimes useful). I agree. I guess that should essentially do a "full" git-stash. > If we were to do that, should it be a separate ref? Or should there be a > single backup ref for such "oops, undo that" operations? If the latter, > what should that ref look like? I think it would look something like > refs/stash, with the index and the working tree state stored as separate > commits (even though in the "clean" case, the index state is not likely > to be that interesting, it is cheap to store and makes the recovery > tools uniform to use). Hmm, perhaps. I do like the concept of a "git undo" of sorts, but perhaps that'll raise the expectations even further? Or maybe raising them a good thing, so people add missing features? :) > And if we are going to store it like that, should we just be using "git > stash save --keep-index --include-untracked"? I think we would just need > to teach it a "--no-reset" option to leave the tracked files as-is. Hm, interesting. But it does leave me with a bit of a bad feeling; git stash is currently a shell-script, and I want us to move *away* from depending on those rather than towards... Or perhaps I just convinced myself to port git-stash to C? I guess the full script won't be needed, only the heavy lifting... > I realize that I went a few steps down the "if..." chain there to get to > "should we just be using stash?". But it would also make the code here > dirt-simple. Simplicity is good, and it's always good to hear some different ideas on how to reach a goal. > [1] Actually, "reset --hard" may be more of an education issue, as > simply running "git stash" has the same effect, but keeps a backup. > It's just that "reset --hard" is advertised as the solution to many > problems. > >> + if (!active_cache_tree) >> + active_cache_tree = cache_tree(); >> + >> + if (!cache_tree_fully_valid(active_cache_tree)) { >> + if (cache_tree_update(active_cache_tree, >> + (const struct cache_entry * const *)active_cache, >> + active_nr, 0) < 0) >> + die("failed to update cache"); >> + } > > I'd have thought you could use write_cache_as_tree, which backs "git > write-tree", but there is currently no way to convince it not to write > out the new cache. This is little enough code that it may not be worth > refactoring write_cache_as_tree to handle it. > I think not having to maintain multiple copies might make it worth factoring it out. Sure, it's not *that* complicated, but it *is* pretty well-contained. >> + /* create a reflog, if there isn't one */ >> + git_snpath(logfile, sizeof(logfile), "logs/%s", ref); >> + if (stat(logfile, &st)) { >> + FILE *fp = fopen(logfile, "w"); >> + if (!fp) >> + warning(_("Can not do reflog for '%s'\n"), ref); >> + else >> + fclose(fp); >> + } > > Kind of gross that we have to do this ourselves (and somewhat contrary > to efforts elsewhere to make the ref code more abstract), but I see that > "git stash" does the same thing. Hmm, it seems like log_ref_setup might be able to do the work for us, if it wold take a flag to force O_CREAT. Perhaps that's worth the effort? > Should you fopen with "a" here, to avoid a race condition where another > process creates it between the stat() and the fopen(), in which case we > would truncate what they wrote? You could even just get rid of the > stat(), then, like: > > if ((fp = fopen(logfile, "a"))) > fclose(fp); > else > warning(_("Can not do reflog for '%s'"), ref); > > Also note that your warning has an extra "\n" in it. Good catch, thanks. Both details :) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html