i2c-ocores timeout bug

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi, I have hit upon a bug in this driver in the 2.6.32 which caused
memory corrupt and crash in my kernel. It appears to be still present
in 3.0-rc3 so as you are listed as the current  kernel i2c maintainers
which covers i2c-ocores I thought you could comment on it and the
proposed fix I enclose below.

Basically the i2c-ocores xfer function, kicks off an interrupt driven
array of i2c_msg transfers and waits 1second for it to complete and
just
returns -ETIMEDOUT if it times out. The interrupt driven transfer is
*not* stopped on a time out and continues referencing the i2c_msg
structure. In my
case an i2c read the interrupt handler keeps adding bytes to the
structure. Unfortunately after the time out the structure was released
and it's length field was changed to a large number and the interrupt
driver kept happily writing bytes way past the end of the original
buffer until the kernel crashed or locked up!

Below is a fix that works for me in 2.6.32 which removes the i2c_msg
buffer from the interrupt handler and changes the handler to check for
it's removal.

diff --git a/drivers/i2c/busses/i2c-ocores.c b/drivers/i2c/busses/i2c-ocores.c
index 8dee246..21e57a0 100644
--- a/drivers/i2c/busses/i2c-ocores.c
+++ b/drivers/i2c/busses/i2c-ocores.c
@@ -137,6 +137,12 @@ static void ocores_process(struct ocores_i2c *i2c)
                wake_up(&i2c->wait);
                return;
        }
+       if (msg == 0) {
+       /* Caller has with drawn the request, buffer no longer available */
+           i2c->state = STATE_ERROR;
+           oc_setreg(i2c, OCI2C_CMD, OCI2C_CMD_STOP);
+           return;
+       }

        /* error? */
        if (stat & OCI2C_STAT_ARBLOST) {
@@ -223,8 +229,13 @@ static int ocores_xfer(struct i2c_adapter *adap,
struct i2c_msg *msgs, int num)
        if (wait_event_timeout(i2c->wait, (i2c->state == STATE_ERROR) ||
                               (i2c->state == STATE_DONE), HZ))
                return (i2c->state == STATE_DONE) ? num : -EIO;
-       else
+       else {
+               printk(KERN_NOTICE
+               "ocores_xfer() i2c transfer to %d-%#x timed out\n",
+                       i2c_adapter_id(adap), msgs->addr);
+               i2c->msg = 0; /* remove the caller's request which
will be re-used */
                return -ETIMEDOUT;
+       }
 }

 static void ocores_init(struct ocores_i2c *i2c)

Andrew

Some more details of the failure -

Here's the user space command that causes the lock up:

   i2cdump -y 4 0x50 i

Note: With out the i, it performs one byte transfers and is fine. It
seems my i2c-ocores i2c transfers go *very* slowly and in order to get
the above command to not time out I had to bump up the i2c-ocores time
out value to 3s. I suspect that's not necessary for normal situation
which is why this bug may not have been noticed before. Also most
other transactions are usually one byte device probes which don't get
a response. In my case the interrupts appear to be slow - taking some
15 bytes over one second and there is plenty more data if the
interrupt driver wants to keep reading.

Here's the transfer function and you can see how it just waits for
1sec and then fails the request with out stopping the interrupt driver
from continuing to reference the request data:

 static int ocores_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
{
	struct ocores_i2c *i2c = i2c_get_adapdata(adap);

	i2c->msg = msgs;
	i2c->pos = 0;
	i2c->nmsgs = num;
	i2c->state = STATE_START;

	oc_setreg(i2c, OCI2C_DATA,
			(i2c->msg->addr << 1) |
			((i2c->msg->flags & I2C_M_RD) ? 1:0));

	oc_setreg(i2c, OCI2C_CMD, OCI2C_CMD_START);

	if (wait_event_timeout(i2c->wait, (i2c->state == STATE_ERROR) ||
			       (i2c->state == STATE_DONE), HZ))
		return (i2c->state == STATE_DONE) ? num : -EIO;
	else
		return -ETIMEDOUT;
}

My kernel would crash with various faults including:  "Thread overran
stack, or stack corrupted".
I added in some debugging into the i2c-ocores module and traced things
down to apparently a corruption of the len field of the i2c transfer
i2c_msg structure:

The above i2cdump command generates an i2c_msg read transfer for 32
bytes of data. At the 16th byte of read data the length field is
over-written with 49844 :

Read finished 1 xfers left
ocores_process() state=1 status=0x41
ocores_process() Read addr=0x50 len=32
ocores_process() state=3 status=0x41
Read [0] = 0x6 len=32
ocores_process() state=3 status=0x41
Read [1] = 0x40 len=32
ocores_process() state=3 status=0x41
Read [2] = 0x55 len=32
ocores_process() state=3 status=0x41
Read [3] = 0x0 len=32
ocores_process() state=3 status=0x41
Read [4] = 0xfb len=32
ocores_process() state=3 status=0x41
Read [5] = 0x0 len=32
ocores_process() state=3 status=0x41
Read [6] = 0x50 len=32
ocores_process() state=3 status=0x41
Read [7] = 0x0 len=32
ocores_process() state=3 status=0x41
Read [8] = 0x0 len=32
ocores_process() state=3 status=0x41
Read [9] = 0x0 len=32
ocores_process() state=3 status=0x41
Read [10] = 0xcf len=32
ocores_process() state=3 status=0x41
Read [11] = 0xde len=32
ocores_process() state=3 status=0x41
Read [12] = 0xe6 len=32
ocores_process() state=3 status=0x41
Read [13] = 0xdc len=32
ocores_process() state=3 status=0x41
Read [14] = 0xea len=32
ocores_process() state=3 status=0x41
Read [15] = 0xff len=49844
ocores_process() state=3 status=0x41
Read [16] = 0x96 len=49844
ocores_process() state=3 status=0x41
....

        It happily continues past the end of the buffer at 32 ...

Read [28] = 0x18 len=49844
ocores_process() state=3 status=0x41
Read [29] = 0xa5 len=49844
ocores_process() state=3 status=0x41
Read [30] = 0x31 len=49844
ocores_process() state=3 status=0x41
Read [31] = 0x27 len=49844
ocores_process() state=3 status=0x41
Read [32] = 0x1f len=49844
ocores_process() state=3 status=0x41
Read [33] = 0x7 len=49844
ocores_process() state=3 status=0x41
Read [34] = 0x3d len=49844
ocores_process() state=3 status=0x41
Read [35] = 0xe6 len=49844
ocores_process() state=3 status=0x41
Read [36] = 0x0 len=49844
ocores_process() state=3 status=0x41
Read [37] = 0x64 len=49844
.....

   It then continues till it stomps on something and gets some sort of fault....

Read [220] = 0x8 len=58644
ocores_process() state=3 status=0x41
Read [221] = 0x60 len=58644
ocores_process() state=3 status=0x41
Read [222] = 0x43 len=58644
ocores_process() state=3 status=0x41
Read [223] = 0x49 len=58644
ocores_process() state=3 status=0x41
Read [224] = 0x0 len=58644
PANIC: double fault, gdt at c1400000 [255 bytes]
double fault, tss at c1404300
eip = c032b5a3, esp = c03fa000
eax = c03fa020, ebx = df16e4e0, ecx = 0000007b, edx = 00000009
esi = de545580, edi = c0119427
Read [28] = 0x18 len=49844
--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux GPIO]     [Linux SPI]     [Linux Hardward Monitoring]     [LM Sensors]     [Linux USB Devel]     [Linux Media]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux