On 9/19/24 8:40 AM, 刘钟博 wrote:
In my work, I found that the maintenance prefetch task often fails, which makes the fetch command take a long time to execute in a monorepo. After investigating, I found that the maintenance.lock file is sometimes not deleted correctly, for various reasons, so subsequent maintenance tasks cannot be triggered.
This is unfortunately a common occurrence. It seems to be related to the Git background processes being killed in a way that does not allow the standard lock cleanup mechanism to kick in. At least, I haven't been able to find a reason why Git would fail with something like a segfault, which would also leave .lock files behind.
So would it make sense to add a mechanism that lets maintenance recover when maintenance.lock is not deleted? For example, add pid information to maintenance.lock, or add a lock timeout mechanism.
I can say from experience, having previously had a lock timeout, that it can cause problems where maintenance processes start running on the same repo concurrently. In the past, this happened because a maintenance process was blocked on a credential manager prompt while still holding the lock.

I vaguely remembered fixing that issue with credential prompts, but then realized the change was only made to microsoft/git [1]. That same change reverted the removal of "stale" .lock files. I should finally put this together as an upstream patch series.

[1] https://github.com/microsoft/git/pull/598
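
To make the difference between the two proposals concrete, here is a minimal sketch of the pid idea, assuming a hypothetical lock format where the file contains only the owner's pid. This is not what Git does today and not what the microsoft/git change does; the function names are made up for illustration, and the check is POSIX-only.

```python
import os
from pathlib import Path


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this (positive) pid appears to exist."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def lock_is_stale(lock_path: Path) -> bool:
    """Hypothetical check: a lock is stale only if its recorded owner is gone."""
    try:
        text = lock_path.read_text().strip()
    except FileNotFoundError:
        return False  # no lock at all, so nothing to clean up
    try:
        pid = int(text)
    except ValueError:
        return False  # no usable pid recorded: be conservative and keep it
    if pid <= 0:
        return False  # never probe pid 0 or negative values with os.kill
    return not pid_is_alive(pid)
```

Unlike a pure timeout, a check like this would not treat the lock as stale while the owner is still alive, for example while it is blocked on a credential prompt. It still has gaps: pids can be reused, and a pid says nothing about a process on another machine sharing the same repository.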
I'm not sure if I missed any information, but if this is feasible, I would be happy to contribute such a patch.
I will CC you on my submission for the credential changes, as a way to help introduce you to the code in this area.

Thanks,
-Stolee