On Wed, Jan 08, 2025 at 07:39:37AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@xxxxxx> writes:
>
> >> It may still make sense to drop the first hunk, and consider how to
> >> proceed when you further want to reduce the unnecessary dependencies
> >> for external users of the reftable library, though. Are there
> >> correctness implications if git_rand() in format_name() yields non
> >> random results (like, always using "rnd = 0" instead of calling
> >> git_rand())? I seriously hope not. And if there is no correctness
> >> implications, perhaps we can replace it with rand() or even constant
> >> "0"?
> >
> > No, there aren't any implications on correctness in that case. Sure, the
> > randomized delays not being randomized can lead to more contention. But
> > even when the randomized suffix for tables is deterministic we wouldn't
> > have an issue as the files are still distinguished by their update
> > indices.
>
> OK, so they both can be turned into a simple rand() that is expected
> to work more reliably especially on more exotic systems (meaning:
> the ability the system providers can test their rand() is much
> better than our ability to test our git_rand() there)? It would
> help us solve the immediate issue reported, while removing one git
> specific function from the reftable library?

Hm. The problem is when Git dies in the middle of a transaction:

  1. We write the temporary table.

  2. We compute the not-so-random suffix.

  3. We write the temporary "tables.list" file.

  4. We move the temporary table into place using the not-so-random
     suffix.

  5. Git dies before updating "tables.list".

Now we have the temporary table moved into place, but "tables.list"
hasn't been updated yet. When the next Git process comes along and wants
to update the table it would result in an error if it computed the same
suffix. The reftable library knows to clean up such stale tables when
not referenced by the "tables.list" file, but it doesn't do so on every
write.
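
The crash scenario above can be sketched as a small simulation. This is
only an illustration under stated assumptions, not the reftable
implementation: table_name(), write_table() and the
crash_before_list_update flag are hypothetical, the file name layout is
assumed, and the refusal to overwrite an existing table is an assumed
stand-in for whatever error the real code would report.

```python
import os
import tempfile


def table_name(min_idx, max_idx, suffix):
    # Assumed layout for illustration: update indices plus a suffix.
    return f"0x{min_idx:012x}-0x{max_idx:012x}-{suffix:08x}.ref"


def write_table(directory, min_idx, max_idx, suffix,
                crash_before_list_update=False):
    """Simulate the five steps of a table write (hypothetical helper)."""
    list_path = os.path.join(directory, "tables.list")
    lock_path = list_path + ".lock"

    # 1. Write the temporary table.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    os.close(fd)

    # 2. Compute the (possibly not-so-random) suffix and final name.
    name = table_name(min_idx, max_idx, suffix)
    final_path = os.path.join(directory, name)

    # 3. Write the temporary "tables.list" file.
    existing = ""
    if os.path.exists(list_path):
        with open(list_path) as f:
            existing = f.read()
    with open(lock_path, "w") as f:
        f.write(existing + name + "\n")

    # 4. Move the temporary table into place. Assumption: a name
    #    collision with a leftover table is treated as an error.
    if os.path.exists(final_path):
        os.unlink(tmp_path)
        os.unlink(lock_path)
        raise FileExistsError(name)
    os.rename(tmp_path, final_path)

    # 5. The process dies before "tables.list" is updated.
    if crash_before_list_update:
        os.unlink(lock_path)  # assumed: the lock does not outlive the process
        return name

    os.replace(lock_path, list_path)
    return name
```

With a constant suffix, a writer that dies after step 4 leaves a table
behind that "tables.list" does not reference. The next writer sees the
same update indices, computes the same name, and fails. A differing
suffix avoids the collision.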
So this would likely still cause issues in practice. I already thought
about this scenario when writing my mail, but didn't really think about
it as "correctness". But I guess it is.

Also, based on the feedback from Randall it's not only the reftable
backend that has issues. It's a more general problem on ia64, where many
tests are failing. So even if we fixed this one case, it's likely that
other cases would still die when running low on entropy.

I dunno. It feels like a platform issue, not like a Git issue, when the
RNG cannot provide us a couple of integers. The OpenSSL backend seems
unfit for use to me as none of the other backends have the same issue.

Patrick