Re: [PATCH 0/3] Another approach to large transactions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Apr 20, 2015 at 4:07 PM, Stefan Beller <sbeller@xxxxxxxxxx> wrote:
> On Mon, Apr 20, 2015 at 3:51 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>> Stefan Beller <sbeller@xxxxxxxxxx> writes:
>>
>>> The problem comes from guessing the number of fds we're allowed to use.
>>> At first I thought it was a fundamental issue with the code being broken, but
>>> it turns out we just need a larger offset as we apparently have 9 files open
>>> already, before the transaction even starts.
>>> I did not expect the number to be that high, which is why I came up with the
>>> arbitrary number of 8 (3 for stdin/out/err, maybe packed refs and reflog so I
>>> guessed, 8 would do fine).
>>>
>>> I am not sure if the 9 is a constant or if it scales to some unknown
>>> property yet.
>>> So to make the series work, all we need is:
>>>
>>> - int remaining_fds = get_max_fd_limit() - 8;
>>> + int remaining_fds = get_max_fd_limit() - 9;
>>>
>>> I am going to try to understand where the 9 comes from and resend the patches.
>>
>> I have a suspicion that the above is an indication that the approach
>> is fundamentally not sound.  9 may be OK in your test repository,
>> but that may fail in a repository with different resource usage
>> patterns.
>
> You put my concerns in a better wording.
>
>>
>> On the core management side, xmalloc() and friends retry upon
>> failure, after attempting to free the resource.  I wonder if your
>> codepath can do something similar to that, perhaps?
>
> But then we'd need to think about which fds can be 'garbage collected'.
> The lock files certainly can be closed and reopened. The first 3 fd not so.
> So we'd need to maintain a data structure of file descriptors good/bad
> for this reclaiming.
>
>>
>> On the other hand, it may be that this "let's keep it open as long
>> as possible, as creat-close-open-write-close is more expensive" may
>> not be worth the complexity.  I wonder if it might not be a bad idea
>> to start with a simpler rule, e.g. "use creat-write-close for ref
>> updates outside transactions, and creat-close-open-write-close for
>> inside transactions, as that is likely to be multi-ref updates" or
>> something stupid and simple like that?
>
> I thought about any ref about goes through transaction nowadays.
> Having the current patches the first n locks are creat-write-close,
> while the remaining locks have the  creat-close-open-write-close
> pattern, so it slows only the large transactions.
>
> My plan is to strace all open calls and check if the aforementioned
> 9 open files are just a constant.

When running the test locally, i.e. not in the test suite, but typing
the commands
myself into the shell, Git is fine with having just 5 file descriptors left.
The additional 4 required fds come from beign run inside the test suite.

When strace-ing git, I cannot see any possible other fds which would require
having some left over space required. So I'd propose we'd just take a reasonable
number not too small for various test setups like 32 and then go with the
proposed patches.

I'll just resend the patches to have a new basis for discussion.

>
>>
>> Michael?
>>
>>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]