On 08/22/2013 11:34 PM, Michael Schmitz wrote:
> Hello Tuomas,
>
> No idea which functions you refer to regarding PIO (reading from FIFO). Anyway, the DMA will read from the same FIFO that the CPU would, if I understand DMA correctly. There really should be no difference.
>
> esp.scsi.c:1058 versus esp.scsi.c:1120
>
> I suppose this is because reconnect_with_tag() cannot rely on the tag information already being in the FIFO when it is called from esp_reconnect(). esp_reconnect() is called from the reselection interrupt so the fifo data is already present.
>
> In: http://pdf.datasheetcatalog.com/datasheet/AdvancedMicroDevices/mXxxsy.pdf page 42: Enable Selection/Reselection Command (Command Code 44H/C4H) explains why the bytes might end up in FIFO; The chip DMA got disabled at some point. Also, you need to explicitly issue a specific (don't know which, yet) command to let the DMA access the FIFO.
>
> The chip DMA should be activated by the ESP_CMD_DMA bit that is set in all uses of send_dma_cmd. I can't see where the DMA would have been disabled.
>
> The DMA transfer function can be changed to fetch less than seven bytes from the FIFO by PIO instead of setting up DMA transfer (just poll the FIFO after sending the command to the ESP, instead of first setting up the DMA then sending the command),
>
> I'd rather find out why and where the DMA is disabled before we resort to PIO. It is also possible that the chip interrupt is cleared prematurely by reading the interrupt register, in which case the interrupt is not seen by The Loop.
>
> That's the attitude - my suggestion was purely pragmatic, in order to overcome that particular roadblock and see whether there's further issues. But fixing this properly would be much preferred. David Miller is still maintainer of the ESP code - I can't think of anyone better suited to answer ESP specific questions really.

This might have nothing to do with the current issue, but during the course of debugging this two byte dma write, I stumbled across this;
[ 2434.370000] ESP: reconnect tag, IRQ(0:10:97), kernel BUG at mm/slab.c:3011! [1]
I had modified the code in esp_scsi.c, inside the reconnect_with_tag() function, to kzalloc the two bytes and single-map them in the "from device" direction, in order to check whether the issue was with the DMA writing directly to the command block. The DMA command was directed to write the two bytes to the kzalloc'd area, after which they were copied to the real command block, and the kzalloc'd area was unmapped and freed. This would work a couple of times when the driver attempted to get the two bytes, but eventually the above-mentioned kernel bug would kick in. I had made sure to unmap and free the two bytes before the function could exit.
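The bounce-buffer sequence described above can be sketched roughly as follows. This is a user-space simulation, not the actual change to reconnect_with_tag(): fake_dma_write(), fetch_tag_via_bounce() and the tag byte values are hypothetical, and calloc()/free() stand in for kzalloc()/kfree(). In the kernel, the mapping step would presumably be dma_map_single() with DMA_FROM_DEVICE and a matching dma_unmap_single() before the CPU reads the buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TAG_LEN 2

/* Hypothetical stand-in for the DMA engine: writes a two-byte tag message. */
static void fake_dma_write(uint8_t *dst)
{
    dst[0] = 0x20; /* illustrative tag message byte */
    dst[1] = 0x05; /* illustrative tag number */
}

/*
 * Fetch the two tag bytes through a temporary buffer, then copy them into
 * the real command block. Returns 0 on success, -1 on allocation failure
 * or if the sentinel values were never overwritten.
 */
static int fetch_tag_via_bounce(uint8_t *command_block)
{
    assert(command_block != NULL);

    uint8_t *bounce = calloc(TAG_LEN, 1); /* kernel: kzalloc() */
    if (!bounce)
        return -1;

    /* Sentinels, so a transfer that never happened is detectable. */
    bounce[0] = 0xff;
    bounce[1] = 0xfe;

    /* kernel: map the buffer "from device" here, then start the DMA command */
    fake_dma_write(bounce);
    /* kernel: unmap here, before the CPU looks at the buffer */

    int transferred = !(bounce[0] == 0xff && bounce[1] == 0xfe);
    if (transferred)
        memcpy(command_block, bounce, TAG_LEN);

    free(bounce); /* kernel: kfree(), exactly once, after the unmap */
    return transferred ? 0 : -1;
}
```

The sentinel check only distinguishes "nothing was written" from "something was written"; it says nothing about why the slab BUG triggers.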
Another oddity [2] was that I initialized the kzalloc'd bytes to 0xff and 0xfe, and although the code that copies the two bytes to the real command block comes after the IRQ2 timeout loop (which continuously calls zorro_esp_irq_pending()), these values were already sitting in the command block before the loop had exited and the copy had been made. I could not prevent this from happening even when I disabled optimizations and/or made the copy through a temporary volatile u8. The last two values from the printk are command_block[0] and command_block[1].
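For comparison, the PIO fallback suggested in the quoted discussion (polling the FIFO for a short read instead of setting up a DMA transfer) might look roughly like the sketch below. It is a user-space simulation under stated assumptions: struct fake_esp, fifo_bytes(), fifo_pop() and pio_read_fifo() are hypothetical names, and on real hardware the byte count and data would come from the chip's FIFO flags and FIFO data registers rather than from a struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the chip: a small FIFO plus a byte counter. */
struct fake_esp {
    uint8_t fifo[16];
    unsigned head;  /* index of the next byte to pop */
    unsigned count; /* bytes currently held in the FIFO */
};

/* Number of bytes waiting in the simulated FIFO. */
static unsigned fifo_bytes(const struct fake_esp *esp)
{
    return esp->count;
}

/* Pop one byte from the simulated FIFO. */
static uint8_t fifo_pop(struct fake_esp *esp)
{
    assert(esp->count > 0);
    uint8_t b = esp->fifo[esp->head];
    esp->head = (esp->head + 1) % 16;
    esp->count--;
    return b;
}

/*
 * Poll the FIFO until `want` bytes have been read or `max_polls` polls
 * have elapsed. Returns the number of bytes actually read.
 */
static size_t pio_read_fifo(struct fake_esp *esp, uint8_t *dst,
                            size_t want, unsigned max_polls)
{
    size_t got = 0;

    while (got < want && max_polls--) {
        /* Drain whatever is available on this poll. */
        while (got < want && fifo_bytes(esp) > 0)
            dst[got++] = fifo_pop(esp);
    }
    return got;
}
```

The bounded poll count mirrors the idea of a timeout loop: a short read returns fewer bytes than requested instead of spinning forever.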
[1] zesp050.cap:836 (printk message in zorro_esp_irq_pending() was disabled)
[2] zesp049.cap:988

-Tuomas
Attachment:
zesp049.cap.gz
Description: GNU Zip compressed data
Attachment:
zesp050.cap.gz
Description: GNU Zip compressed data