Bart, On 2/22/18 12:53, Bart Van Assche wrote: > On Thu, 2018-02-22 at 02:23 +0000, Damien Le Moal wrote: >> On Wed, 2018-02-21 at 09:23 -0800, Bart Van Assche wrote: >>> [ ... ] >> This does not compile. > > This patch depends on another patch that is not yet in Martin's tree. See also > https://marc.info/?l=linux-scsi&m=151675130615597. I should have mentioned this > in the patch description. OK. Got it. >> Testing this, the rcu hang is now gone. > > Thanks for the testing :-) > >> However, the behavior of the error recovery is still different from what I >> see in 4.15 and 4.14. For my test case, an unaligned write to a sequential >> zone on a ZAC drive connected to an AHCI port, the report zone issued during >> the disk revalidation after the write error fails with a timeout, which causes >> capacity change to 0, port reset and recovery again. Eventually, everything >> comes back up OK, but it takes some time. >> >> I am investigating to make sure I am not hitting a device FW bug to confirm if >> this is a kernel problem. > > This patch was tested with the SRP protocol. I'm not an ATA expert but I hope > that someone who is more familiar with ATA than I can chime in. Well, I would expect the retry of the unaligned write to fail immediately while the report zones is still on-going. These are both NCQ commands, so that would result in the queue to be aborted and eh to see the failed write retry and the aborted report zones, which should be restarted right away. But report zones timeout after 30sec... Example with CONFIG_DEBUG_OBJECTS_RCU_HEAD turned on (check time stamps): [ 43.978457] ata6.00: exception Emask 0x0 SAct 0x20 SErr 0x0 action 0x0 [ 43.985239] ata6.00: irq_stat 0x40000008 [ 43.989272] ata6.00: failed command: WRITE FPDMA QUEUED [ 43.994657] ata6.00: cmd 61/08:28:08:00:60/00:00:10:00:00/40 tag 5 ncq dma 4096 out [ 43.994657] res 43/04:08:08:00:60/00:00:10:00:00/00 Emask 0x400 (unknown error) <F> [ 44.011157] ata6.00: status: { DRDY SENSE ERR } [ 44.015825] ata6.00: error: { ABRT } [ 44.268565] ata6.00: configured for UDMA/133 [ 44.273208] ata6: EH complete [ 44.303440] ata6.00: exception Emask 0x0 SAct 0x10020 SErr 0x0 action 0x0 [ 44.310743] ata6.00: irq_stat 0x40000008 [ 44.314989] ata6.00: failed command: WRITE FPDMA QUEUED [ 44.320634] ata6.00: cmd 61/08:28:08:00:60/00:00:10:00:00/40 tag 5 ncq dma 4096 out [ 44.320634] res 43/04:08:08:00:60/00:00:10:00:00/00 Emask 0x400 (unknown error) <F> [ 44.337787] ata6.00: status: { DRDY SENSE ERR } [ 44.342576] ata6.00: error: { ABRT } [ 44.374416] ata6.00: configured for UDMA/133 [ 44.379094] ata6: EH complete [ 74.638316] ata6.00: exception Emask 0x1 SAct 0x110 SErr 0x0 action 0x6 frozen [ 74.645999] ata6.00: irq_stat 0x40000008 [ 74.650001] ata6.00: failed command: RECEIVE FPDMA QUEUED [ 74.655544] ata6.00: cmd 65/00:20:00:00:00/01:02:00:00:00/40 tag 4 ncq dma 131072 in [ 74.655544] res 40/00:00:00:00:38/00:00:3f:06:00/40 Emask 0x5 (timeout) [ 74.671001] ata6.00: status: { DRDY } [ 74.674768] ata6.00: failed command: WRITE FPDMA QUEUED [ 74.680140] ata6.00: cmd 61/08:40:08:00:60/00:00:10:00:00/40 tag 8 ncq dma 4096 out [ 74.680140] res 40/00:00:08:00:60/00:00:10:00:00/40 Emask 0x1 (device error) [ 74.695966] ata6.00: status: { DRDY } [ 74.699730] ata6: hard resetting link [ 75.013606] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 75.067939] ata6.00: configured for UDMA/133 [ 75.072465] ata6.00: device reported invalid CHS sector 0 [ 75.078167] ata6: EH complete [ 75.078188] sd 5:0:0:0: [sdd] REPORT ZONES lba 0 failed with 0/8 [ 75.087601] sd 5:0:0:0: [sdd] 0 512-byte logical blocks: (0 B/0 B) [ 75.093903] sd 5:0:0:0: [sdd] 4096-byte physical blocks [ 75.099276] sdd: detected capacity change from 14000519643136 to 0 [ 106.189951] ata6.00: exception Emask 0x1 SAct 0x400010 SErr 0x0 action 0x6 frozen [ 106.197683] ata6.00: irq_stat 0x40000008 [ 106.201730] ata6.00: failed command: WRITE FPDMA QUEUED [ 106.207096] ata6.00: cmd 61/08:20:08:00:60/00:00:10:00:00/40 tag 4 ncq dma 4096 out [ 106.207096] res 40/00:00:08:00:60/00:00:10:00:00/40 Emask 0x1 (device error) [ 106.222930] ata6.00: status: { DRDY } [ 106.226690] ata6.00: failed command: RECEIVE FPDMA QUEUED [ 106.232235] ata6.00: cmd 65/00:b0:00:00:00/01:02:00:00:00/40 tag 22 ncq dma 131072 in [ 106.232235] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x5 (timeout) [ 106.247834] ata6.00: status: { DRDY } [ 106.251641] ata6: hard resetting link [ 106.565274] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 106.600020] ata6.00: configured for UDMA/133 [ 106.604501] ata6: EH complete [ 106.604526] sd 5:0:0:0: [sdd] REPORT ZONES lba 0 failed with 0/8 [ 106.607669] print_req_error: I/O error, dev sdd, sector 274726920 [ 107.037705] sdd: detected capacity change from 0 to 14000519643136 [ 110.117135] sd 5:0:0:0: [sdd] 27344764928 512-byte logical blocks: (14.0 TB/12.7 TiB) [ 110.125441] sd 5:0:0:0: [sdd] 4096-byte physical blocks [ 110.130782] sd 5:0:0:0: [sdd] 52156 zones of 524288 logical blocks As can be seen at the end of the log, the disk comes back up and the write failure reported back to the application (dd in this case). Re-running dd again result in the same behavior once more. But surprisingly, running it a third time, I get directly a scsi error (no ata error message). No report zones failure. [ 386.748116] sd 5:0:0:0: [sdd] tag#26 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [ 386.759463] sd 5:0:0:0: [sdd] tag#26 Sense Key : Illegal Request [current] [ 386.767276] sd 5:0:0:0: [sdd] tag#26 Add. Sense: Unaligned write command [ 386.774805] sd 5:0:0:0: [sdd] tag#26 CDB: Write(16) 8a 00 00 00 00 00 10 60 00 08 00 00 00 08 00 00 [ 386.784700] print_req_error: I/O error, dev sdd, sector 274726920 Which look like the result of a previous command since nothing seems to have been sent to libata... (the dd call is repeated exactly every time with the same target sector that is after the zone write pointer). Finally, running a fourth time, dd simply hangs with nothing happening, and dd cannot be killed. Reboot to recover. Now, the same test with CONFIG_DEBUG_OBJECTS_RCU_HEAD turned off, I get this: [ 463.680615] ata6.00: exception Emask 0x0 SAct 0x40000 SErr 0x0 action 0x0 [ 463.687884] ata6.00: irq_stat 0x40000008 [ 463.691904] ata6.00: failed command: WRITE FPDMA QUEUED [ 463.697284] ata6.00: cmd 61/08:90:08:00:60/00:00:10:00:00/40 tag 18 ncq dma 4096 out [ 463.697284] res 43/04:08:08:00:60/00:00:10:00:00/00 Emask 0x400 (unknown error) <F> [ 463.713798] ata6.00: status: { DRDY SENSE ERR } [ 463.718462] ata6.00: error: { ABRT } [ 463.912968] ata6.00: configured for UDMA/133 [ 463.917706] ata6: EH complete [ 463.944550] ata6.00: exception Emask 0x0 SAct 0x800010 SErr 0x0 action 0x0 [ 463.951962] ata6.00: irq_stat 0x40000008 [ 463.956293] ata6.00: failed command: WRITE FPDMA QUEUED [ 463.961963] ata6.00: cmd 61/08:b8:08:00:60/00:00:10:00:00/40 tag 23 ncq dma 4096 out [ 463.961963] res 43/04:08:08:00:60/00:00:10:00:00/00 Emask 0x400 (unknown error) <F> [ 463.979399] ata6.00: status: { DRDY SENSE ERR } [ 463.984336] ata6.00: error: { ABRT } [ 464.016416] ata6.00: configured for UDMA/133 [ 464.020894] ata6: EH complete [ 464.050581] ata6.00: exception Emask 0x0 SAct 0x3000 SErr 0x0 action 0x0 [ 464.057602] ata6.00: irq_stat 0x40000008 [ 464.061761] ata6.00: failed command: WRITE FPDMA QUEUED [ 464.067318] ata6.00: cmd 61/08:60:08:00:60/00:00:10:00:00/40 tag 12 ncq dma 4096 out [ 464.067318] res 43/04:08:08:00:60/00:00:10:00:00/00 Emask 0x400 (unknown error) <F> [ 464.084371] ata6.00: status: { DRDY SENSE ERR } [ 464.089324] ata6.00: error: { ABRT } [ 464.121189] ata6.00: configured for UDMA/133 [ 464.125640] ata6: EH complete [ 464.363596] ata6.00: exception Emask 0x0 SAct 0x800400 SErr 0x0 action 0x0 [ 464.371012] ata6.00: irq_stat 0x40000008 [ 464.375257] ata6.00: failed command: WRITE FPDMA QUEUED [ 464.380937] ata6.00: cmd 61/08:50:08:00:60/00:00:10:00:00/40 tag 10 ncq dma 4096 out [ 464.380937] res 43/04:08:08:00:60/00:00:10:00:00/00 Emask 0x400 (unknown error) <F> [ 464.398337] ata6.00: status: { DRDY SENSE ERR } [ 464.403186] ata6.00: error: { ABRT } [ 464.495690] ata6.00: configured for UDMA/133 [ 464.500367] ata6: EH complete [ 464.739616] ata6.00: exception Emask 0x0 SAct 0x4800 SErr 0x0 action 0x0 [ 464.746735] ata6.00: irq_stat 0x40000008 [ 464.750729] ata6.00: failed command: WRITE FPDMA QUEUED [ 464.756095] ata6.00: cmd 61/08:70:08:00:60/00:00:10:00:00/40 tag 14 ncq dma 4096 out [ 464.756095] res 43/04:08:08:00:60/00:00:10:00:00/00 Emask 0x400 (unknown error) <F> [ 464.772637] ata6.00: status: { DRDY SENSE ERR } [ 464.777294] ata6.00: error: { ABRT } [ 464.809308] ata6.00: configured for UDMA/133 [ 464.813726] ata6: EH complete [ 464.843549] ata6.00: exception Emask 0x0 SAct 0xc000 SErr 0x0 action 0x0 [ 464.850525] ata6.00: irq_stat 0x40000008 [ 464.854622] ata6.00: failed command: WRITE FPDMA QUEUED [ 464.860050] ata6.00: cmd 61/08:70:08:00:60/00:00:10:00:00/40 tag 14 ncq dma 4096 out [ 464.860050] res 43/04:08:08:00:60/00:00:10:00:00/00 Emask 0x400 (unknown error) <F> [ 464.876819] ata6.00: status: { DRDY SENSE ERR } [ 464.881488] ata6.00: error: { ABRT } [ 464.913349] ata6.00: configured for UDMA/133 [ 464.917912] sd 5:0:0:0: [sdd] tag#14 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [ 464.926472] sd 5:0:0:0: [sdd] tag#14 Sense Key : Illegal Request [current] [ 464.933569] sd 5:0:0:0: [sdd] tag#14 Add. Sense: Unaligned write command [ 464.940472] sd 5:0:0:0: [sdd] tag#14 CDB: Write(16) 8a 00 00 00 00 00 10 60 00 08 00 00 00 08 00 00 [ 464.949740] print_req_error: I/O error, dev sdd, sector 274726920 [ 464.956040] ata6: EH complete quick error recovery and return to dd. But running a second time and dd hangs. Nothing happening. debugfs information gives me this: ./hctx0/cpu7/completed:13 2 ./hctx0/cpu7/merged:0 ./hctx0/cpu7/dispatched:13 2 ./hctx0/cpu6/completed:0 0 ./hctx0/cpu6/merged:0 ./hctx0/cpu6/dispatched:0 0 ./hctx0/cpu5/completed:5 3 ./hctx0/cpu5/merged:0 ./hctx0/cpu5/dispatched:5 3 ./hctx0/cpu4/completed:47 4 ./hctx0/cpu4/merged:0 ./hctx0/cpu4/dispatched:48 4 ./hctx0/cpu3/completed:6 85 ./hctx0/cpu3/merged:0 ./hctx0/cpu3/dispatched:6 85 ./hctx0/cpu2/completed:42 8 ./hctx0/cpu2/merged:0 ./hctx0/cpu2/dispatched:42 8 ./hctx0/cpu1/completed:3 92 ./hctx0/cpu1/merged:0 ./hctx0/cpu1/dispatched:3 92 ./hctx0/cpu0/completed:7 4 ./hctx0/cpu0/merged:0 ./hctx0/cpu0/dispatched:7 4 ./hctx0/active:0 ./hctx0/run:307 ./hctx0/queued:322 ./hctx0/dispatched: 0 12 ./hctx0/dispatched: 1 331 ./hctx0/dispatched: 2 0 ./hctx0/dispatched: 4 0 ./hctx0/dispatched: 8 0 ./hctx0/dispatched: 16 0 ./hctx0/dispatched: 32+ 0 ./hctx0/sched_tags_bitmap:00000000: 0000 0000 0000 0800 ./hctx0/sched_tags:nr_tags=62 ./hctx0/sched_tags:nr_reserved_tags=0 ./hctx0/sched_tags:active_queues=0 ./hctx0/sched_tags:bitmap_tags: ./hctx0/sched_tags:depth=62 ./hctx0/sched_tags:busy=1 ./hctx0/sched_tags:bits_per_word=8 ./hctx0/sched_tags:map_nr=8 ./hctx0/sched_tags:alloc_hint={11, 34, 50, 31, 52, 8, 0, 15} ./hctx0/sched_tags:wake_batch=7 ./hctx0/sched_tags:wake_index=0 ./hctx0/sched_tags:ws={ ./hctx0/sched_tags: {.wait_cnt=7, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=7, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=7, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=7, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=7, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=7, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=7, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=7, .wait=inactive}, ./hctx0/sched_tags:} ./hctx0/sched_tags:round_robin=1 ./hctx0/tags_bitmap:00000000: 0000 0000 ./hctx0/tags:nr_tags=31 ./hctx0/tags:nr_reserved_tags=0 ./hctx0/tags:active_queues=0 ./hctx0/tags:bitmap_tags: ./hctx0/tags:depth=31 ./hctx0/tags:busy=0 ./hctx0/tags:bits_per_word=4 ./hctx0/tags:map_nr=8 ./hctx0/tags:alloc_hint={15, 5, 0, 10, 21, 8, 0, 15} ./hctx0/tags:wake_batch=3 ./hctx0/tags:wake_index=0 ./hctx0/tags:ws={ ./hctx0/tags: {.wait_cnt=3, .wait=inactive}, ./hctx0/tags: {.wait_cnt=3, .wait=inactive}, ./hctx0/tags: {.wait_cnt=3, .wait=inactive}, ./hctx0/tags: {.wait_cnt=3, .wait=inactive}, ./hctx0/tags: {.wait_cnt=3, .wait=inactive}, ./hctx0/tags: {.wait_cnt=3, .wait=inactive}, ./hctx0/tags: {.wait_cnt=3, .wait=inactive}, ./hctx0/tags: {.wait_cnt=3, .wait=inactive}, ./hctx0/tags:} ./hctx0/tags:round_robin=1 ./hctx0/ctx_map:00000000: 00 ./hctx0/flags:alloc_policy=RR SHOULD_MERGE|SG_MERGE ./sched/starved:0 ./sched/batching:1 ./sched/write_fifo_list:000000003d5af16e {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=IO_STAT|HASHED, complete=0, .tag=-1, .internal_tag=51, .cmd=Mode Sense(10) 5a 00 08 00 00 00 00 00 08 00, .retries=0, .result = 0x0, .flags=TAGGED, .timeout=0.000, allocated 282.418 s ago} ./state:SAME_COMP|IO_STAT|INIT_DONE|POLL|WC|FLUSH_NQ|STATS|REGISTERED|SCSI_PASSTHROUGH Clearly, the write command is seating in the scheduler queue but nothing is happening. I checked if this would be a bug with mq-deadline zone locking, but it does not appear to be so (the target zone is unlocked). Since turning on/off CONFIG_DEBUG_OBJECTS_RCU_HEAD changes the behavior, I am inclined to think of a timing/race issue somewhere. Running with scsi_debug turned on (and a raiser high log level) also suppresses the problem, so another indicator of a timing issue. I do not fully understand what is going on yet. Still looking. Hannes, Tejun, help ! Best regards. -- Damien Le Moal, Western Digital