On looking at this issue a bit further (please be forgiving, I know
very little about the kernel/SCSI implementation), it looks more and
more similar to another kernel-panic issue posted some 6 months ago
(http://www.spinics.net/lists/linux-scsi/msg44077.html).
The kernel-panic, which occurs at boot time in udev/ata_id.c when
issuing an SG_IO ioctl carrying the sg3 SCSI ATA Pass-through Identify
command, appears to arise from DMA'ing into an incorrectly aligned user
data buffer pointed to by sg_io_hdr.dxferp.
My guess is that in the past, use of sg3 would not involve DMA by
default, but now, with libata ATA Pass-Through commands, it does (I may
also be totally wrong about that, just a thought). I recall documentation
somewhere which emphasized that if direct I/O (DMA) is to be used in sg,
one should page-align the SCSI response data buffer. With sg using
indirect I/O this wouldn't be necessary, of course, but perhaps now with
libata, it is. Just guessing here.
If I patch ata_id.c to use a page-aligned *sg_io_hdr.dxferp, the
kernel-panic goes away.
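For illustration, a change of that kind might look like the following.
This is a minimal sketch, not the actual patch: alloc_page_aligned() and
init_data_in_hdr() are hypothetical helper names, the 512-byte
IDENTIFY_LEN and the 10-second timeout are assumptions, and the CDB
contents are left to the caller.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign() */
#endif

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <scsi/sg.h>

#define IDENTIFY_LEN 512   /* assumed size of the IDENTIFY response */

/*
 * Hypothetical helper: return a zeroed buffer whose address is a
 * multiple of the system page size, or NULL on failure.
 * The caller releases it with free().
 */
void *alloc_page_aligned(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;

    assert(len > 0);
    if (page <= 0)
        return NULL;
    if (posix_memalign(&buf, (size_t)page, len) != 0)
        return NULL;
    memset(buf, 0, len);
    return buf;
}

/*
 * Hypothetical helper: prepare an sg_io_hdr for a data-in command
 * whose response lands in the (page-aligned) buffer 'buf'.
 */
void init_data_in_hdr(struct sg_io_hdr *hdr, unsigned char *cdb,
                      unsigned char cdb_len, void *buf, unsigned int len)
{
    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id = 'S';
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;
    hdr->cmdp = cdb;
    hdr->cmd_len = cdb_len;
    hdr->dxferp = buf;          /* the pointer the kernel maps for DMA */
    hdr->dxfer_len = len;
    hdr->timeout = 10000;       /* milliseconds; an assumed value */
}
```

The caller would then pass the prepared header to ioctl(fd, SG_IO, &hdr)
as before; only the origin of the data buffer changes.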
The same kernel-panic (same traceback) also occurs in sg__sat_identify.c
(sg3-utils-1.30) once the system is running, when issuing, e.g.,
    sg__sat_identify -vv -p /dev/dvd
Patching sg__sat_identify.c in the same way eliminates the kernel-panic
here as well. The code in sg__sat_identify.c and ata_id.c implementing
the sg3 SCSI ATA Pass-through Identify command is more or less the same.
Here's the relevant (vanilla linux-2.6.37) kernel code path, beginning in
drivers/scsi/sg.c (@NNN = at line number):
sg_ioctl @772 in drivers/scsi/sg.c
sg_new_write @801 in drivers/scsi/sg.c
sg_common_write @708 in drivers/scsi/sg.c
sg_start_req @735 in drivers/scsi/sg.c
blk_rq_map_user @1707 in drivers/scsi/sg.c
__blk_rq_map_user @142 in block/blk-map.c
blk_rq_aligned @57 in block/blk-map.c
queue_dma_alignment @1060 in include/linux/blkdev.h
include/linux/blkdev.h
1057 static inline int blk_rq_aligned(struct request_queue *q, unsigned long addr,
1058                                  unsigned int len)
1059 {
1060         unsigned int alignment = queue_dma_alignment(q) | q->dma_pad_mask;
1061         return !(addr & alignment) && !(len & alignment);
1062 }

1052 static inline int queue_dma_alignment(struct request_queue *q)
1053 {
1054         return q ? q->dma_alignment : 511;
1055 }
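
The check above can be reproduced in userspace to see which buffers
pass for a given mask. The following is a minimal sketch only:
buf_is_aligned() is a hypothetical name, and the mask values are passed
in by hand rather than read from a real request queue.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model of the blk_rq_aligned() test quoted above.
 * Both masks are "alignment - 1" values, e.g. 0x03 or 511.
 * Returns 1 if both the address and the length are aligned, else 0.
 */
int buf_is_aligned(uintptr_t addr, unsigned int len,
                   unsigned int dma_alignment, unsigned int dma_pad_mask)
{
    unsigned int mask = dma_alignment | dma_pad_mask;

    /* a valid mask has the form 2^n - 1 */
    assert((mask & (mask + 1)) == 0);

    return !(addr & mask) && !(len & mask);
}
```

With a mask of 0x03, an address such as 0x1002 fails the test, while
0x1004 passes; with the fallback mask of 511, 0x1004 fails and 0x1200
passes. Note that the length is tested against the same mask as the
address.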
In drivers/scsi/scsi_lib.c (__scsi_alloc_queue @1609) we get a default
alignment mask of 0x03, i.e. 4-byte alignment (possibly overridden later
by the kernel or the underlying driver, I assume), which is then returned
by queue_dma_alignment @1054 in include/linux/blkdev.h.
So, for the current drive(s) at hand (as noted in
http://www.spinics.net/lists/linux-scsi/msg44077.html), if the driver
(and/or kernel) does not override the default 0x03 alignment mask, and
such an alignment is inappropriate, we may have problems when DMA-ing
into the user data buffer pointed to by sg_io_hdr.dxferp.
In conclusion, then, it would seem either that some CD drives/controllers
incorrectly specify (or don't specify) their DMA buffer alignment
requirements, or that the kernel uses inappropriate alignments in some
cases on its own.
I'm not sure where or how to fix this in the kernel, aside from using a
default page alignment in drivers/scsi/scsi_lib.c (line 1648) rather
than the current 0x03 mask; this, however, is probably way too 'blunt'
and may impact other drivers and/or system performance:
drivers/scsi/scsi_lib.c
1609 struct request_queue *__scsi_alloc_queue(struct Scsi_Host *shost,
1610                                          request_fn_proc *request_fn)
1611 {
.
.
1643         /*
1644          * set a reasonable default alignment on word boundaries: the
1645          * host and device may alter it using
1646          * blk_queue_update_dma_alignment() later.
1647          */
1648         blk_queue_dma_alignment(q, 0x03);   <--- Use PAGESIZE-1, rather than 0x03 ?
.
.
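
To get a feel for what such a change would do, here is a rough
userspace model. It rests on an assumption that should be checked
against block/blk-map.c: that a buffer passing blk_rq_aligned() is
mapped for direct DMA, and anything else is copied through a kernel
buffer. The names map_path and choose_map_path() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum map_path {
    MAP_DIRECT,   /* user pages mapped for DMA (assumed behaviour) */
    MAP_BOUNCE    /* data copied via a kernel buffer (assumed behaviour) */
};

/*
 * Hypothetical model of the decision made after the alignment test.
 * 'mask' is the queue's alignment mask, e.g. 0x03 or PAGE_SIZE - 1.
 */
enum map_path choose_map_path(uintptr_t addr, unsigned int len,
                              unsigned int mask)
{
    /* a valid mask has the form 2^n - 1 */
    assert((mask & (mask + 1)) == 0);

    if (!(addr & mask) && !(len & mask))
        return MAP_DIRECT;
    return MAP_BOUNCE;
}
```

In this model, a 512-byte buffer at 0x7ffc1234 takes the direct path
with mask 0x03 and the bounce path with mask 4095. Because the length
is tested too, even a page-aligned 512-byte buffer would take the
bounce path with mask 4095, which hints at how widely a page-sized
default could change behaviour.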
Any thoughts?

Thanks much for everyone's time,
John

On 01/11/2011 08:25 AM, Tejun Heo wrote:
> Hello,
>
> On Mon, Jan 10, 2011 at 09:10:12AM +0100, Hannes Reinecke wrote:
>>> First, sorry for not posting something about this sooner - I'd
>>> pinged Kay on IRC about it, and I *promise* I had planned to
>>> forward it to the scsi/ati guys, but work has been hell this
>>> week. Anyway, here's the initial report we got about it, along
>>> with a lot of debugging by other folks (including the OP, who
>>> I think is 'resonance' in that thread):
>>> http://www.linuxquestions.org/questions/slackware-14/current-randomly-timed-kernel-oops-on-bootup-of-two-test-boxen-852843/
>>
>> It's all Tejun's fault.
>
> Gees, Hannes. That's very kind of you. :-P
>
>> kernel crashing in ata_sff_data_xfer / ioread32 ...
>> Looks like we're trying a read to a page which wasn't
>> mapped/allocated properly.
>>
>> And yes, it definitely should be fixed in the kernel first.
>
> Yeah, definitely. It isn't clear from the thread.
>
> * Is it a regression?
>
> * Can this be triggered by simply running ata_id or does it need any
>   other condition to trigger?
>
> I don't recall any related change in the area, at least in libata, so
> it's a bit surprising. If it's a regression, I think it's more likely
> to be something between userland and libata. The user buffer mapping
> code for sg commands is quite scary after all.
>
> Thanks.