On Fri, Jan 19, 2024 at 08:05:59PM -0500, Jeff King wrote: > On Thu, Jan 18, 2024 at 02:41:56PM +0100, Patrick Steinhardt wrote: > > > Refactor the code to stop using `stat_validity_check()`. Instead, we > > manually stat(3P) the file descriptors to make relevant information > > available. On Windows and MSYS2 the result will have both `st_dev` and > > `st_ino` set to 0, which allows us to address the first issue by not > > using the stat-based cache in that case. It also allows us to make sure > > that we always compare `st_dev` and `st_ino`, addressing the second > > issue. > > I didn't think too hard about the details, but does this mean that > every user of stat_validity_check() has the same issue? The other big > one is packed-refs (for which the code was originally written). Should > this fix go into that API? In theory, the issue is the same for the `packed-refs` file. But in practice it's much less likely to be an issue: - The file gets rewritten a lot less frequently than the "tables.list" file, making the race less likely in the first place. It can only happen when git-pack-refs(1) races with concurrent readers, whereas it can happen for any two concurrent process with reftables. - Due to entries in the `packed-refs` being of variable length it's less likely that the size will be exactly the same after the file has been rewritten. For reftables we have constant-length entries in the "tables.list", so it's likely to happen there. - It is very unlikely that we have the same issue with inode reuse with the packed-refs file. The only reason why we have it with the reftable backend is that it is very likely that we end up writing to "tables.list" twice, once for the normal update and once for auto compaction. So overall, I doubt that it's all that critical in practice for the packed-refs backend. It _is_ possible to happen, but chances are significantly lower. I cannot recall a single report of this issue, which underpins how unlikely it seems to be for the files backend. Also, applying the same fix for the packed-refs would essentially mean that the caching mechanism is now ineffective on Windows systems where we do not have proper `st_dev` and `st_ino` values available. I think this is a no-go in the context of packed-refs because we don't have a second-level caching mechanism like we do in the reftable backend. It's not great that we have to reread the "tables.list" file on Windows systems for now, but at least it's better than having to reread the complete "packed-refs" file like we'd have to do with the files backend. Patrick
Attachment:
signature.asc
Description: PGP signature