On Tue, Sep 27, 2022 at 06:52:02PM -0700, Victoria Dye wrote: > Junio C Hamano wrote: > > * vd/fix-unaligned-read-index-v4 (2022-09-23) 1 commit > > - read-cache: avoid misaligned reads in index v4 > > > > The codepath that reads from the index v4 had unaligned memory > > accesses, which has been corrected. > > > > Expecting a reroll. > > cf. <Yy4nkEnhuzt2iH+R@xxxxxxxxxxxxxxxxxxxxxxx> > > cf. <bb3a2470-7ff5-e4a6-040a-96e0e3833978@xxxxxxxxx> > > source: <pull.1366.git.1663962236069.gitgitgadget@xxxxxxxxx> > > > > How drastic an update were you expecting for this re-roll? To keep the fix > minimal (that is, focused on 'create_from_disk()'), I was planning to just > add some comments explaining the implementation (in response to [1]). If the > goal is to get this merged quickly, I'd want to avoid a larger refactor > (suggested in [2] & [3]), since doing so would either make the > implementations of "read from disk" ('create_from_disk()') and "write to > disk" ('copy_cache_entry_to_ondisk()') different/difficult to compare, or > would involve a more invasive refactor to update both functions [4]. The first "cf" there is my initial request for a v2, but I since retracted that. I have no objection to adding more comments, but I am happy enough without them (like Junio, it may be that I'm overly familiar with how I expect our get_be() functions to handle alignment). I think even if we want to go further in the near term, it is still best to build it on top of your basic fix. And we can take that fix and make further refactoring (if any) its own topic. I don't think even that basic fix needs to happen before the release, though. While it is a slight regression that SANITIZE=undefined does not pass in the 2.38 release candidates: - the bug really was there all along. It's just that a new test triggers it. - I don't think it's a sign of actual problems; we are forming an unaligned pointer via a funny cast, which is technically UB, but I don't think any real-world platforms actually care, since we dereference it only via get_be16(). So it's mostly just a minor annoyance for running the tests; we're probably better not to change any code, even trivially, this late in the release cycle. I was going to point to my branch with commits that are slightly less rough than what I posted, for somebody to work on it later if interested. But after applying a minimal amount of polish, I think they are pretty reasonable, so I'll actually share them here. IMHO it would be OK to proceed with them even if we don't redo the writing side. But I can see the argument that they should come together. I'm also OK if we just drop it. It was a fun puzzle to make it all work, and I do like the result. But arguably it's just churn, and in particular I haven't measured the re-ordering patch to see if it produces a performance difference. I doubt it does, but it's important enough that we should get a clear answer. So here are my refactoring patches for the read side, built on top of your earlier patch. They resolve the "yuck" comment by storing and copying individual fields (which is much less verbose than you'd think because of struct assignment). I left the pointer type as signed; there are still come casts, but the same ones that were necessary before my series. [1/3]: pack-bitmap: make read_be32() public [2/3]: read-cache: read on-disk entries in byte order [3/3]: read-cache: use read_be32() in create_from_disk() compat/bswap.h | 22 ++++++++++++++++++ pack-bitmap.c | 12 ---------- read-cache.c | 60 +++++++++++++++++++++++++------------------------- 3 files changed, 52 insertions(+), 42 deletions(-) -Peff