Re: Early-boot kernel panics from udev-165/extras/ata_id/ata_id.c

John Stanley <jpsinthemix@xxxxxxxxxxx> · Sun, 16 Jan 2011 23:03:06 -0500

On looking at this issue a bit further (please be forgiving: I know 
very little about the kernel/SCSI implementation), it looks more and 
more like another kernel-panic issue posted some 6 months ago 
(http://www.spinics.net/lists/linux-scsi/msg44077.html).

The kernel panic, which occurs at boot time in udev/ata_id.c when 
issuing an ioctl SG_IO sg3 SCSI ATA Pass-through Identify command, 
appears to arise from DMA'ing into an incorrectly aligned user data 
buffer pointed to by sg_io_hdr.dxferp.

My guess is that in the past, use of sg3 would not involve DMA by 
default, but now, with libata ATA Pass-Through commands, it does (I 
may also be totally wrong about that, just a thought). I recall 
documentation somewhere which emphasized that if direct I/O (DMA) is to 
be used in sg, one should page-align the SCSI response data buffer. With 
sg using indirect I/O this wouldn't be necessary, of course, but perhaps 
now, with libata, it is. Just guessing here.

If I patch ata_id.c to use a page-aligned *sg_io_hdr.dxferp, the 
kernel-panic goes away.
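
For illustration, a page-aligned response buffer can be obtained from 
userspace roughly as follows. This is only a minimal sketch of the idea, 
not the actual patch to ata_id.c; the helper names alloc_page_aligned() 
and is_page_aligned() are made up for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign() under strict -std= modes */
#endif
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: return a zeroed buffer of at least len bytes whose
 * start address is a multiple of the system page size, or NULL on failure.
 * The caller releases it with free().
 */
static void *alloc_page_aligned(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;

    if (page <= 0)
        return NULL;
    if (posix_memalign(&buf, (size_t)page, len) != 0)
        return NULL;
    memset(buf, 0, len);
    return buf;
}

/* Hypothetical helper: non-zero if p starts on a page boundary. */
static int is_page_aligned(const void *p)
{
    long page = sysconf(_SC_PAGESIZE);

    return page > 0 && ((uintptr_t)p & (uintptr_t)(page - 1)) == 0;
}
```

The pointer returned by alloc_page_aligned(512) would then be assigned to 
sg_io_hdr.dxferp in place of an arbitrarily aligned stack or heap buffer.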

The same kernel panic (same traceback) also occurs in sg_sat_identify.c 
(sg3-utils-1.30) once the system is running, when issuing, e.g.,

  sg_sat_identify -vv -p  /dev/dvd

Patching sg_sat_identify.c in the same way eliminates the kernel panic 
here as well. The code in sg_sat_identify.c and ata_id.c implementing 
the sg3 SCSI ATA Pass-through Identify command is more or less the same.
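
To make the request concrete, here is a rough sketch of how such an SG_IO 
header for an ATA PASS-THROUGH (12) IDENTIFY PACKET DEVICE command can be 
filled in. It is written from memory of the SAT CDB layout and is not 
copied from either program; the SKETCH_* constants, the 5 second timeout 
and the helper name fill_identify_packet_hdr() are assumptions made for 
this example. It only prepares the header and does not issue the ioctl.

```c
#include <string.h>
#include <scsi/sg.h>

#define SKETCH_ATA_PT12_OPCODE    0xa1  /* SCSI ATA PASS-THROUGH (12) */
#define SKETCH_ATA_ID_PACKET_CMD  0xa1  /* ATA IDENTIFY PACKET DEVICE */
#define SKETCH_PROTO_PIO_DATA_IN  4     /* SAT protocol field value */
#define SKETCH_IDENTIFY_LEN       512   /* one sector of identify data */

/*
 * Hypothetical helper: prepare hdr and cdb for an identify request whose
 * response is written to data. The alignment of data is exactly what the
 * kernel later checks against the queue's DMA alignment mask.
 */
static void fill_identify_packet_hdr(struct sg_io_hdr *hdr,
                                     unsigned char cdb[12],
                                     unsigned char *sense,
                                     unsigned char sense_len,
                                     void *data)
{
    memset(cdb, 0, 12);
    cdb[0] = SKETCH_ATA_PT12_OPCODE;
    cdb[1] = SKETCH_PROTO_PIO_DATA_IN << 1;  /* PROTOCOL field */
    cdb[2] = 0x0e;  /* T_DIR=1 (from device), BYT_BLOK=1, T_LENGTH=2 */
    cdb[4] = 1;     /* sector count: one 512-byte block */
    cdb[9] = SKETCH_ATA_ID_PACKET_CMD;

    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id = 'S';
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;
    hdr->cmd_len = 12;
    hdr->cmdp = cdb;
    hdr->mx_sb_len = sense_len;
    hdr->sbp = sense;
    hdr->dxfer_len = SKETCH_IDENTIFY_LEN;
    hdr->dxferp = data;   /* user buffer the kernel maps or copies into */
    hdr->timeout = 5000;  /* milliseconds, arbitrary for this sketch */
}
```

A caller would follow this with ioctl(fd, SG_IO, &hdr) on the opened 
device node; that step is left out here because it needs real hardware.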

Here's the relevant (vanilla linux-2.6.37) kernel code path, beginning in
drivers/scsi/sg.c (@NNN = at line number):

  sg_ioctl            @772  in drivers/scsi/sg.c
  sg_new_write        @801  in drivers/scsi/sg.c
  sg_common_write     @708  in drivers/scsi/sg.c
  sg_start_req        @735  in drivers/scsi/sg.c
  blk_rq_map_user     @1707 in drivers/scsi/sg.c
  __blk_rq_map_user   @142  in block/blk-map.c
  blk_rq_aligned      @57   in block/blk-map.c
  queue_dma_alignment @1060 in include/linux/blkdev.h

include/linux/blkdev.h
1057 static inline int blk_rq_aligned(struct request_queue *q, unsigned long addr,
1058                                  unsigned int len)
1059 {
1060         unsigned int alignment = queue_dma_alignment(q) | q->dma_pad_mask;
1061         return !(addr & alignment) && !(len & alignment);
1062 }

1052 static inline int queue_dma_alignment(struct request_queue *q)
1053 {
1054         return q ? q->dma_alignment : 511;
1055 }
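
The effect of the mask is easy to try out in userspace. The following is 
a small model of the check quoted above, not kernel code; the function 
name model_blk_rq_aligned() and the sample addresses are made up for 
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace model of blk_rq_aligned(): a buffer passes only if neither
 * its start address nor its length has any bit set that is also set in
 * (dma_alignment | dma_pad_mask).
 */
static int model_blk_rq_aligned(uintptr_t addr, size_t len,
                                unsigned int dma_alignment,
                                unsigned int dma_pad_mask)
{
    unsigned int mask = dma_alignment | dma_pad_mask;

    return !(addr & mask) && !(len & mask);
}
```

With a mask of 0x03, a 512-byte buffer at 0x1000 passes and one at 0x1002 
fails; with a mask of 511, a buffer at 0x1004 fails as well, because the 
address must then be a multiple of 512.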

In drivers/scsi/scsi_lib.c (request_queue @1609) we get a default 
alignment mask of 0x03, i.e. 4-byte alignment (possibly overridden later 
by the kernel or underlying driver, I assume), which is then used in 
queue_dma_alignment @1054 in include/linux/blkdev.h

So, for the drive(s) at hand (as noted in 
http://www.spinics.net/lists/linux-scsi/msg44077.html), if the driver 
(and/or kernel) does not override the default 0x03 (4-byte) alignment 
mask, and such an alignment is inappropriate, we may have problems when 
DMA-ing into the user data buffer pointed to by sg_io_hdr.dxferp.

In conclusion, then, it would seem that some CD drives/controllers either 
incorrectly specify (or don't specify) DMA buffer alignment 
requirements, or that the kernel uses inappropriate alignments in some 
cases on its own.

I'm not sure where or how to fix this in the kernel, aside from using a 
default page alignment in drivers/scsi/scsi_lib.c (rather than the 0x03 
mask) at line 1648; this, however, is probably way too 'blunt' and may 
impact other drivers and/or system performance ...:

drivers/scsi/scsi_lib.c
1609 struct request_queue *__scsi_alloc_queue(struct Scsi_Host *shost,
1610                                          request_fn_proc *request_fn)
1611 {
.
.
1643         /*
1644          * set a reasonable default alignment on word boundaries: the
1645          * host and device may alter it using
1646          * blk_queue_update_dma_alignment() later.
1647          */
1648         blk_queue_dma_alignment(q, 0x03); <--- Use PAGESIZE-1, rather than 0x03 ?
.
.

Any thoughts ?

thanks much for everyone's time,
John

On 01/11/2011 08:25 AM, Tejun Heo wrote:
Hello,

On Mon, Jan 10, 2011 at 09:10:12AM +0100, Hannes Reinecke wrote:
First, sorry for not posting something about this sooner - I'd
pinged Kay on IRC about it, and I *promise* I had planned to
forward it to the scsi/ati guys, but work has been hell this
week.  Anyway, here's the initial report we got about it, along
with a lot of debugging by other folks (including the OP, who
I think is 'resonance' in that thread):
http://www.linuxquestions.org/questions/slackware-14/current-randomly-timed-kernel-oops-on-bootup-of-two-test-boxen-852843/

It's all Tejun's fault.
Gees, Hannes.  That's very kind of you. :-P

kernel crashing in ata_sff_data_xfer / ioread32 ...
Looks like we're trying a read to a page which wasn't
mapped/allocated properly.

And yes, it definitely should be fixed in the kernel first.
Yeah, definitely.  It isn't clear from the thread.

* Is it a regression?

* Can this be triggered by simply running ata_id or does it need any
   other condition to trigger?

I don't recall any related change in the area, at least in libata, so
it's a bit surprising.  If it's a regression, I think it's more likely
to be something between userland and libata.  The user buffer mapping
code for sg commands is quite scary after all.

Thanks.
