On Fri, Sep 6, 2024 at 2:44 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Two and two half comments on [check_collision].
>
>  * We compare 4k at a time here, while copy.c copies 8k at a time,
>    and bulk-checkin.c uses 16k at a time.  Outside the scope of this
>    topic, we probably should pick one number and stick to it, unless
>    we have measured to pick perfect number for each case (and I know
>    I picked 8k for copy.c and 16k for bulk-checkin.c both out of
>    thin air).

In Ye Olden Days, 4k would be fine, and going back 40+ years even
512 would be fine. For *reading*, modern systems often prefer at
least 128K or even 1M; anything under 8K is Right Out. For *writing*
it tends to be less important due to caches. Still:

>  * I would have expected at least we would fstat() them to declare
>    difference immediately after we find their sizes differ, for
>    example.  As we assume that calling into this function should be
>    rare, we prefer not to pay in complexity for performance here?

Another benefit of calling `stat` is that you get `st_blksize`,
which is the system's recommended I/O block size. I almost commented
about this earlier but the "should be rare" thing held me back. :-)

(I have no comments on the rest of the comments.)

Chris
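
P.S. A minimal sketch of the two fstat() points above (declare a
difference as soon as the sizes differ, and take the read size from
`st_blksize`). This is not the actual check_collision code: the names
`files_differ`, `read_fully` and `FALLBACK_BUFSIZE` are hypothetical,
it assumes regular files on a POSIX system, and the 8k fallback is
just one of the numbers mentioned in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fallback if st_blksize is not usable. */
#define FALLBACK_BUFSIZE 8192

/* Read up to len bytes, retrying short reads; returns bytes read or -1. */
static ssize_t read_fully(int fd, char *buf, size_t len)
{
	size_t got = 0;

	while (got < len) {
		ssize_t n = read(fd, buf + got, len - got);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			break; /* EOF */
		got += (size_t)n;
	}
	return (ssize_t)got;
}

/* Returns 0 if the contents match, 1 if they differ, -1 on error. */
int files_differ(const char *path_a, const char *path_b)
{
	struct stat st_a, st_b;
	char *buf_a = NULL, *buf_b = NULL;
	size_t bufsize;
	int ret = -1;
	int fd_a = open(path_a, O_RDONLY);
	int fd_b = open(path_b, O_RDONLY);

	if (fd_a < 0 || fd_b < 0)
		goto out;
	if (fstat(fd_a, &st_a) < 0 || fstat(fd_b, &st_b) < 0)
		goto out;

	/* Cheap early exit: regular files of different size must differ. */
	if (st_a.st_size != st_b.st_size) {
		ret = 1;
		goto out;
	}

	/* Use the system's recommended I/O block size when it gives one. */
	bufsize = st_a.st_blksize > 0 ? (size_t)st_a.st_blksize
				      : FALLBACK_BUFSIZE;
	assert(bufsize > 0);
	buf_a = malloc(bufsize);
	buf_b = malloc(bufsize);
	if (!buf_a || !buf_b)
		goto out;

	for (;;) {
		ssize_t n_a = read_fully(fd_a, buf_a, bufsize);
		ssize_t n_b = read_fully(fd_b, buf_b, bufsize);

		if (n_a < 0 || n_b < 0)
			goto out; /* read error, ret is still -1 */
		if (n_a != n_b || memcmp(buf_a, buf_b, (size_t)n_a)) {
			ret = 1;
			goto out;
		}
		if (n_a == 0) {
			ret = 0; /* both hit EOF with no mismatch */
			goto out;
		}
	}
out:
	free(buf_a);
	free(buf_b);
	if (fd_a >= 0)
		close(fd_a);
	if (fd_b >= 0)
		close(fd_b);
	return ret;
}
```

Whether `st_blksize` is actually a good number on a given filesystem
is something one would still want to measure; the sketch only shows
that the fstat() call pays for itself twice.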