Hi Andrew, On Wed, 15 Jun 2011 08:33:45 +1000, Andrew Worsley wrote: > Hi, I have hit upon a bug in this driver in the 2.6.32 which caused > memory corrupt and crash in my kernel. It appears to be still present > in 3.0-rc3 so as you are listed as the current kernel i2c maintainers > which covers i2c-ocores I thought you could comment on it and the > proposed fix I enclose below. The same MAINTAINERS file also says that Peter Korsgaard is maintaining the i2c-ocores driver, so he would be an even better recipient for your request. Cc'd. Peter, can you please review the patch? Thanks. > > Basically the i2c-ocores xfer function, kicks off an interrupt driven > array of i2c_msg transfers and waits 1second for it to complete and > just > returns -ETIMEDOUT if it times out. The interrupt driven transfer is > *not* stopped on a time out and continues referencing the i2c_msg > structure. In my > case an i2c read the interrupt handler keeps adding bytes to the > structure. Unfortunately after the time out the structure was released > and it's length field was changed to a large number and the interrupt > driver kept happily writing bytes way past the end of the original > buffer until the kernel crashed or locked up! > > Below is a fix that works for me in 2.6.32 which removes the i2c_msg > buffer from the interrupt handler and changes the handler to check for > it's removal. > > diff --git a/drivers/i2c/busses/i2c-ocores.c b/drivers/i2c/busses/i2c-ocores.c > index 8dee246..21e57a0 100644 > --- a/drivers/i2c/busses/i2c-ocores.c > +++ b/drivers/i2c/busses/i2c-ocores.c > @@ -137,6 +137,12 @@ static void ocores_process(struct ocores_i2c *i2c) > wake_up(&i2c->wait); > return; > } > + if (msg == 0) { > + /* Caller has with drawn the request, buffer no longer available */ > + i2c->state = STATE_ERROR; > + oc_setreg(i2c, OCI2C_CMD, OCI2C_CMD_STOP); > + return; > + } > > /* error? */ > if (stat & OCI2C_STAT_ARBLOST) { > @@ -223,8 +229,13 @@ static int ocores_xfer(struct i2c_adapter *adap, > struct i2c_msg *msgs, int num) > if (wait_event_timeout(i2c->wait, (i2c->state == STATE_ERROR) || > (i2c->state == STATE_DONE), HZ)) > return (i2c->state == STATE_DONE) ? num : -EIO; > - else > + else { > + printk(KERN_NOTICE > + "ocores_xfer() i2c transfer to %d-%#x timed out\n", > + i2c_adapter_id(adap), msgs->addr); > + i2c->msg = 0; /* remove the caller's request which > will be re-used */ > return -ETIMEDOUT; > + } > } > > static void ocores_init(struct ocores_i2c *i2c) > > Andrew > > Some more details of the failure - > > Here's the user space command that causes the lock up: > > i2cdump -y 4 0x50 i > > Note: With out the i, it performs one byte transfers and is fine. It > seems my i2c-ocores i2c transfers go *very* slowly and in order to get > the above command to not time out I had to bump up the i2c-ocores time > out value to 3s. I suspect that's not necessary for normal situation > which is why this bug may not have been noticed before. Also most > other transactions are usually one byte device probes which don't get > a response. In my case the interrupts appear to be slow - taking some > 15 bytes over one second and there is plenty more data if the > interrupt driver wants to keep reading. > > Here's the transfer function and you can see how it just waits for > 1sec and then fails the request with out stopping the interrupt driver > from continuing to reference the request data: > > static int ocores_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num) > { > struct ocores_i2c *i2c = i2c_get_adapdata(adap); > > i2c->msg = msgs; > i2c->pos = 0; > i2c->nmsgs = num; > i2c->state = STATE_START; > > oc_setreg(i2c, OCI2C_DATA, > (i2c->msg->addr << 1) | > ((i2c->msg->flags & I2C_M_RD) ? 1:0)); > > oc_setreg(i2c, OCI2C_CMD, OCI2C_CMD_START); > > if (wait_event_timeout(i2c->wait, (i2c->state == STATE_ERROR) || > (i2c->state == STATE_DONE), HZ)) > return (i2c->state == STATE_DONE) ? num : -EIO; > else > return -ETIMEDOUT; > } > > My kernel would crash with various faults including: "Thread overran > stack, or stack corrupted". > I added in some debugging into the i2c-ocores module and traced things > down to apparently a corruption of the len field of the i2c transfer > i2c_msg structure: > > The above i2cdump command generates an i2c_msg read transfer for 32 > bytes of data. At the 16th byte of read data the length field is > over-written with 49844 : > > Read finished 1 xfers left > ocores_process() state=1 status=0x41 > ocores_process() Read addr=0x50 len=32 > ocores_process() state=3 status=0x41 > Read [0] = 0x6 len=32 > ocores_process() state=3 status=0x41 > Read [1] = 0x40 len=32 > ocores_process() state=3 status=0x41 > Read [2] = 0x55 len=32 > ocores_process() state=3 status=0x41 > Read [3] = 0x0 len=32 > ocores_process() state=3 status=0x41 > Read [4] = 0xfb len=32 > ocores_process() state=3 status=0x41 > Read [5] = 0x0 len=32 > ocores_process() state=3 status=0x41 > Read [6] = 0x50 len=32 > ocores_process() state=3 status=0x41 > Read [7] = 0x0 len=32 > ocores_process() state=3 status=0x41 > Read [8] = 0x0 len=32 > ocores_process() state=3 status=0x41 > Read [9] = 0x0 len=32 > ocores_process() state=3 status=0x41 > Read [10] = 0xcf len=32 > ocores_process() state=3 status=0x41 > Read [11] = 0xde len=32 > ocores_process() state=3 status=0x41 > Read [12] = 0xe6 len=32 > ocores_process() state=3 status=0x41 > Read [13] = 0xdc len=32 > ocores_process() state=3 status=0x41 > Read [14] = 0xea len=32 > ocores_process() state=3 status=0x41 > Read [15] = 0xff len=49844 > ocores_process() state=3 status=0x41 > Read [16] = 0x96 len=49844 > ocores_process() state=3 status=0x41 > .... > > It happily continues past the end of the buffer at 32 ... > > Read [28] = 0x18 len=49844 > ocores_process() state=3 status=0x41 > Read [29] = 0xa5 len=49844 > ocores_process() state=3 status=0x41 > Read [30] = 0x31 len=49844 > ocores_process() state=3 status=0x41 > Read [31] = 0x27 len=49844 > ocores_process() state=3 status=0x41 > Read [32] = 0x1f len=49844 > ocores_process() state=3 status=0x41 > Read [33] = 0x7 len=49844 > ocores_process() state=3 status=0x41 > Read [34] = 0x3d len=49844 > ocores_process() state=3 status=0x41 > Read [35] = 0xe6 len=49844 > ocores_process() state=3 status=0x41 > Read [36] = 0x0 len=49844 > ocores_process() state=3 status=0x41 > Read [37] = 0x64 len=49844 > ..... > > It then continues till it stomps on something and gets some sort of fault.... > > Read [220] = 0x8 len=58644 > ocores_process() state=3 status=0x41 > Read [221] = 0x60 len=58644 > ocores_process() state=3 status=0x41 > Read [222] = 0x43 len=58644 > ocores_process() state=3 status=0x41 > Read [223] = 0x49 len=58644 > ocores_process() state=3 status=0x41 > Read [224] = 0x0 len=58644 > PANIC: double fault, gdt at c1400000 [255 bytes] > double fault, tss at c1404300 > eip = c032b5a3, esp = c03fa000 > eax = c03fa020, ebx = df16e4e0, ecx = 0000007b, edx = 00000009 > esi = de545580, edi = c0119427 > Read [28] = 0x18 len=49844 -- Jean Delvare -- To unsubscribe from this list: send the line "unsubscribe linux-i2c" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html