[TOPIC 14/17] Aspects of merge-ort: cool, or crimes against humanity?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



1. Elijah: ORT stands for Ostensibly Recursive’s Twin. As a merge strategy, just like you can call ‘git merge -s recursive’ you can call ‘git merge -s ort’. Git’s option parsing doesn’t require the space after the ‘-s’.

2. Major question is about performance & possible layering violations. Merge recursive calls unpacks trees to walks trees, then needs to get all file and directory names and so walks the trees on the right again, and then the trees on the left again. Then diff needs to walk sets of trees twice, and then insert_stage_data() does a bunch more narrow tree walks for rename detection. Lots of tree walking. Replaced that with two tree walks.

3. Using traverse_trees() instead of unpack_trees(), and avoid the index entirely (not even touching or creating cache_entry’s), and building up information as I need. I’m not calling diffcore_std(), but instead directly calling diffcore_rename(). Is this horrifying? Or is it justified by the performance gains?

4. Peff: both, some of it sounds like an improvement, but maybe there were hidden benefits previously.

5. Elijah: I write to a tree before I do anything.

6. Peff: I like that. Seems like a clean up to me. We have written libgit2-like code for merging server-side

7. Elijah: I’ve been adding tests for the past few years, more to add, feel good about it.

8. Jonathan N: If you are using a lower-layer thing, I would not say you’re not doing anything you shouldn’t. But if you docs say you should not to use diffcore_rename(), you can update the docs to say that it’s fine to use it.

9. Elijah: three places directly write tree objects. All have different data structures they are writing from. Should I pull them out? But then my data structure was also different, so I’d have a fourth.

10. Peff: not worried because trees are simple. Worried about policy logic. Can’t write a tree entry with a double slash. Want this to be enforced everywhere, but no idea how hard that would be to implement. Not about lines of code, but consistency of policy. Fearful that only one place does it.

11. Elijah: I know merge-ort checks this, but it’s not nearby, so it could change.

12. Peff: as bad as it is to round trip through the index, it may bypass quality checks, which you will need to manually implement.

13. Elijah: usability side, with the tree I’ve created, I could have .git/AUTOMERGE. I have an old tree, a new tree, and a checkout can get me there. Fixed a whole bunch of bugs for sparsity and submodules.

14. Elijah: If we use this to on-the-fly remerge as part of git-log in order to compare the merge commit to what the automatic merging would have done, where/how should we write objects as we go?

15. Jonathan N: can end up with proliferation of packs, would be nice to have similar to fast import and have in memory store. Dream not to have loose files written ever.

16. Peff: I like your dream. But fast import packs are bad. We assume that packs are good, and thus need to use GC aggressively. This increases pollution of that problem. I know about objects, but not written to disc, risk that you can write objects that are broken, but git doesn’t know because git thinks it has the object but it’s only in memory. Log is conceptually a read operation, but this would create the need for writes.

17. Elijah: you could write into a temporary directory. Worried about `gc --auto` in the middle of my operation. If I write to a temp pack I could potentially avoid it.

18. Elijah: large files. Rename detection might not work efficiently OR correctly for sufficiently large files (binary or not). Limited bucket size means that completely different files treated as renames when both are over 8MB. Should big files just not be compared?

19. Peff: maybe we should fix the hash…

20. Elijah: present situation is broken, maybe we can cheat in the short term, and avoid fixing?

21. Peff: seems more correct for now, but we’d need to document

22. Elijah: checkout --overwrite-ignore flag. Should merge have the same flag.

23. Jonathan N: gitignore original use case was build outputs which can be regenerate. But then some people want to ignore `.hg` which is much more precious.

24. Peff: we can plumb it through later to other commands

25. Brian: CI doesn’t really care. Moving between branches it would complain. For checkout and merge it makes sense to support just destroying.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux