Re: busy loops in kernel

Asim <linkasim@xxxxxxxxx> · Sat, 21 Jan 2012 14:32:08 -0800

Yeah - these are not the strongest parts of the analyses. Lately, as
we have added more busses, interprocedural analyses and supported
newer kernels, we've seen more problems. In the first example, CIL(our
underlying analysis language) cannot distinguish between for and while
loops. To distinguish for loops and to eliminate certain class of
timers, we try to detect for loops and counters by searching for
integers that increment/decrement by 1. Here, it increments by 25, and
it is not caught as a for loop. Then, the intermediate code generated
for this is :

#line 185
    tmp___7 = readl((void const volatile   *)(mem + i));
#line 185
    if (tmp___7 == 305419896U) {
#line 185
      tmp___8 = readl((void const volatile   *)((mem + i) + 4));
#line 185
      if (tmp___8 == 878082066U) {
#line 185
        tmp___9 = readl((void const volatile   *)((mem + i) + 8));
#line 185
        if (tmp___9 == 1450709556U) {
#line 185
          tmp___10 = readl((void const volatile   *)((mem + i) + 12));
#line 185
          if (tmp___10 == 2014458966U) {
#line 189
            return ((struct ivtv_mailbox  volatile  *)((mem + i) + 16));
          }

Here it detects, that the exit condition is dependent on a read from
device, and marks this loop as a suspect. Apart from random
increments, we also face problems with casts used while detecting loop
exit conditions/for loops. Second one is a bug in how i2c_transfer
function is handled (i2c bus was added sometime this year). I've fixed
it and will update results shorty...

As I mentioned in my previous email, I would perhaps give priority to
the simpler examples that are plenty..

Thanks,
Asim

On Sat, Jan 21, 2012 at 12:50 PM, Julia Lawall <julia.lawall@xxxxxxx> wrote:
> Do you have any idea about the two for loops near the end of the message
> below?  It seems strange that they would be included, unless I am completely
> overlooking something.
>
> julia
>
>
> On Sat, 21 Jan 2012, Asim wrote:
>
>> Yeah - most of the busy waits that are correct are short and fairly
>> obvious while loops and there are plenty of them. There are some
>> complex ones too.
>>
>> However, we do have false positive rate of 5- 8% (as measured on
>> 2.6.18 kernel) - so in some (~100) cases for example the while loops
>> where variables may be re-used it may report the false positive. As
>> far as kernel timing is concerned - we do have mechanisms to account
>> for these based on our experience with the timers used in 2.6 kernel.
>> I will add these timing mechanisms and refresh the results - and
>> continue to update as I encounter more diverse mechanisms to account
>> for timing. But I strongly believe this a good subset to look at to
>> catch these problems.
>>
>> Thanks,
>> Asim
>>
>> On Sat, Jan 21, 2012 at 11:04 AM, Julia Lawall <julia.lawall@xxxxxxx>
>> wrote:
>>>
>>> I looked at this one:
>>>
>>> ticks 2  Infinite Loop:228  Infinite Loop:600    CC [M]
>>> drivers/i2c/busses/i2c-xiic.o
>>>
>>> Line 200 looks like what you described in your previous message.  I don't
>>> understand the problem in line 600, though:
>>>
>>>        while ((fifo_space >= 2) && (first || (i2c->nmsgs > 1))) {
>>>                if (!first) {
>>>                        i2c->nmsgs--;
>>>                        i2c->tx_msg++;
>>>                        i2c->tx_pos = 0;
>>>                } else
>>>                        first = 0;
>>>
>>>                if (i2c->tx_msg->flags & I2C_M_RD) {
>>>                        /* we dont date putting several reads in the FIFO
>>> */
>>>                        xiic_start_recv(i2c);
>>>                        return;
>>>                } else {
>>>                        xiic_start_send(i2c);
>>>                        if (xiic_tx_space(i2c) != 0) {
>>>                                /* the message could not be completely
>>> sent
>>> */
>>>                                break;
>>>                        }
>>>                }
>>>
>>>                fifo_space = xiic_tx_fifo_space(i2c);
>>>        }
>>>
>>> It looks like on every iteration execpt the first one, i2c->nmsgs is
>>> decremented, so it will not long be greater than one.
>>>
>>> The next two rely on kernel timing mechansims:
>>>
>>> ticks 1  Infinite Loop:284    CC [M]  drivers/i2c/busses/i2c-eg20t.o
>>>   while (ktime_lt(ktime_get(), ns_val))
>>> ticks 1  Infinite Loop:84    CC [M]  drivers/i2c/busses/i2c-pca-isa.o
>>>   ret = time_before(jiffies, timeout); ... while (ret)
>>>
>>> Are these timing mechanisms unsafe?
>>>
>>> In this case:
>>>
>>> ticks 1  Infinite Loop:338    CC [M]  drivers/infiniband/hw/amso1100/c2.o
>>>
>>> it looks like it is actually the outer loop, starting on line 336, that
>>> is
>>> unsafe.
>>>
>>> For this one:
>>>
>>> ticks 1  Infinite Loop:47    CC [M]
>>>  drivers/infiniband/hw/amso1100/c2_intr.o
>>>
>>> I am not completely sure to understand the code.  c2dev->hints_read is
>>> nonlocal and is never explicitly reset to 0.  Perhaps this code is only
>>> executed once per instance of cdev.  In any case hints_read is
>>> incremented
>>> on each iteration.  But perhaps the value of
>>> be16_to_cpu(*c2dev->hint_count)
>>> can also change at each access?
>>>
>>> This is clearly the sort of loop you described:
>>>
>>> ticks 1  Infinite Loop:735    CC [M]
>>>  drivers/infiniband/hw/amso1100/c2_qp.o
>>>
>>> Skipping ahead a bit, I don't at all see the problem in the following
>>> code:
>>>
>>> ticks 1  Infinite Loop:184    CC [M]
>>>  drivers/media/video/ivtv/ivtv-firmware.o
>>>
>>>        for (i = 0; i < size; i += 0x100) {
>>>                if (readl(mem + i)      == 0x12345678 &&
>>>                    readl(mem + i + 4)  == 0x34567812 &&
>>>                    readl(mem + i + 8)  == 0x56781234 &&
>>>                    readl(mem + i + 12) == 0x78123456) {
>>>                        return (volatile struct ivtv_mailbox __iomem
>>> *)(mem +
>>> i + 16);
>>>                }
>>>        }
>>>
>>> I don't see the problem here either:
>>>
>>> ticks 1  Infinite Loop:93    CC [M]
>>>  drivers/media/video/pvrusb2/pvrusb2-eeprom.o
>>>
>>>        for (tcnt = 0; tcnt < EEPROM_SIZE; tcnt += pcnt) {
>>>                pcnt = 16;
>>>                if (pcnt + tcnt > EEPROM_SIZE) pcnt = EEPROM_SIZE-tcnt;
>>>                ...
>>>        }
>>>
>>> The next report is on similar code:
>>>
>>> ticks 1  Infinite Loop:3536    CC [M]
>>>  drivers/media/video/pvrusb2/pvrusb2-hdw.o
>>>
>>> These two indeed might be dangerous:
>>>
>>> ticks 2  Infinite Loop:481  Infinite Loop:676    CC [M]
>>>  drivers/media/video/saa7134/saa7134-tvaudio.o
>>>
>>> julia
--
To unsubscribe from this list: send the line "unsubscribe kernel-janitors" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html