Re: [PATCH] util: make it more robust to calculate timeout value

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 18.05.2015 03:11, Wang Yufei wrote:
> On 2015/5/15 19:16, Daniel P. Berrange wrote:
> 
>> On Fri, May 15, 2015 at 01:09:09PM +0200, Michal Privoznik wrote:
>>> On 15.05.2015 08:26, zhang bo wrote:
>>>> When we change system clock to years ago, a certain CPU may use up 100% cputime.
>>>> The reason is that in function virEventPollCalculateTimeout(), we assign the 
>>>> unsigned long long result to an INT variable, 
>>>>         *timeout = then - now; // timeout is INT, and then/now are long long
>>>>         if (*timeout < 0)
>>>>             *timeout = 0;
>>>> there's a chance that variable @then minus variable @now may be a very large number 
>>>> that overflows INT value expression, then *timeout will be negative and be assigned to 0.
>>>> Next the 'poll' in function virEventPollRunOnce() will get into an 'endless' while loop there.
>>>> thus, the cpu that virEventPollRunOnce() thread runs on will go up to 100%. 
>>>>
>>>> Although as we discussed before in https://www.redhat.com/archives/libvir-list/2015-May/msg00400.html
>>>> it should be prohibited to set-time while other applications are running, but it does 
>>>> seems to have no harm to make the codes more robust.
>>>>
>>>> Signed-off-by: Wang Yufei <james.wangyufei@xxxxxxxxxx>
>>>> Signed-off-by: Zhang Bo <oscar.zhangbo@xxxxxxxxxx>
>>>> ---
>>>>  src/util/vireventpoll.c | 5 +++--
>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/src/util/vireventpoll.c b/src/util/vireventpoll.c
>>>> index ffda206..5f5a149 100644
>>>> --- a/src/util/vireventpoll.c
>>>> +++ b/src/util/vireventpoll.c
>>>> @@ -357,9 +357,10 @@ static int virEventPollCalculateTimeout(int *timeout)
>>>>              return -1;
>>>>  
>>>>          EVENT_DEBUG("Schedule timeout then=%llu now=%llu", then, now);
>>>> -        *timeout = then - now;
>>>> -        if (*timeout < 0)
>>>> +        if (then < now)
>>>>              *timeout = 0;
>>>> +        else
>>>> +            *timeout = (then - now) & 0x7FFFFFFF;
>>>
>>> You're trying to make this an unsigned value. What's wrong with plain
>>> typecast?
>>>
>>>>      } else {
>>>>          *timeout = -1;
>>>>      }
>>>>
>>>
>>> I must say this is ugly. If the system clock is changed, all the
>>> timeouts should fire, shouldn't they? Otherwise important events can be
>>> missed.
>>
>> As I said in previous thread, I think that this is really just papering
>> over one specific issue, and you are still going to have a multitude of
>> problems with every app on the system when you change the system clock
>> in this kind of way. I'm just not convinced we should be trying to hack
>> around it in this one case, as it is just giving us a false illusion
>> that things are going to continue to work, when in reality they'll just
>> break somewhere else instead. eg the pthread_cond_wait() timeouts.
>>
> 
> 
> You're right, this patch can not fix system clock changed problem. I'm trying
> to fix the bug that assigning an unsigned long long value to an int variable, and
> it can fix cpu up to 100% bug. What I do is decreasing the bad effect to the whole
> OS. At least we can do something right.

That's why I told it's ugly. Libvirt it meant to run on many platforms,
even there where an integer is not 4 bytes long. Therefore we use plain
typecast when needed instead of masking sign bit. For instance, on
platforms where int is 2bytes, this patch will not work at all.

Michal

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list




[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]