Hello,

I have a strange problem with writing to a Fibre Channel LTO-4 tape drive.

First my environment:

From /proc/scsi/scsi:

Host: scsi8 Channel: 00 Id: 00 Lun: 00
  Vendor: IBM      Model: ULTRIUM-TD4      Rev: 74H4
  Type:   Sequential-Access                ANSI SCSI revision: 03
Host: scsi8 Channel: 00 Id: 00 Lun: 01
  Vendor: ADIC     Model: Scalar i500      Rev: 400G
  Type:   Medium Changer                   ANSI SCSI revision: 03

From /var/log/dmesg:

ACPI: PCI Interrupt 0000:09:00.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:09:00.0 to 64
scsi8 : on PCI bus 09 device 00 irq 16
st 7:0:0:0: st2: try direct i/o: yes (alignment 512 B)
st 7:0:0:0: Attached scsi generic sg36 type 1
lpfc 0000:09:00.0: 2:1303 Link Up Event x1 received Data: x1 x1 x10 x2
lpfc 0000:09:00.0: 2:(0):0108 No retry ELS command x4 to remote NPORT xfffffe Retried:3 Error:x3/18
scsi 8:0:0:0: Sequential-Access IBM ULTRIUM-TD4 74H4 PQ: 0 ANSI: 3
ACPI: PCI Interrupt 0000:09:00.1[B] -> <5>st 8:0:0:0: Attached scsi tape st3
st 8:0:0:0: st3: try direct i/o: yes (alignment 512 B)
st 8:0:0:0: Attached scsi generic sg37 type 1
GSI 17 (level, low) -> IRQ 17
scsi 8:0:0:1: Medium Changer ADIC Scalar i500 400G PQ: 0 ANSI: 3
scsi 8:0:0:1: Attached scsi generic sg38 type 8
PCI: Setting latency timer of device 0000:09:00.1 to 64

The tape drive is directly attached to the Emulex LightPulse FC controller. It is part of a Quantum (ADIC) i500 library; the library control interface is provided via LUN 1. Changing media works fine.

I use the backup system Amanda (<http://www.amanda.org>). It has worked, and still works, fine with other tape drives: DLT-IV, SDLT-I and SDLT-II (sometimes called SDLT220 and SDLT600), and DLT-S4. Amanda accesses the tape drives via the non-rewinding device (/dev/nst0 etc.). As far as I know it does nothing special. The only reason I mention Amanda is that I was not able to reproduce the following problem with basic tools (dd and mt).

The problem: While writing to the tape, an error is returned.
The kernel reports the following (the message appears in /var/log/messages):

Dec 11 16:26:37 uxrs74 kernel: st3: Sense Key : Illegal Request [current]
Dec 11 16:26:37 uxrs74 kernel: st3: Add. Sense: Invalid message error

I have no idea what this means or which steps I should take.

The Amanda log file (/var/log/amanda/WeeklySet1/amflush) shows this:

taper: r: switching to next holding chunk '/var/spool/amanda/server._.0.144'
taper: r: switching to next holding chunk '/var/spool/amanda/server._.0.145'
taper: reader-side: got label Set1-5-04 filenum 1
driver: state time 1556.183 free kps: 60000 space: 5075004836 taper: writing idle-dumpers: 12 qlen tapeq: 3 runq: 0 roomq: 0 wakeup: 0 driver-idle: not-idle
driver: interface-state time 1556.183 if default: free 60000
driver: hdisk-state time 1556.183 hdisk 0: free 700190684 dumpers 0 hdisk 1: free 950492456 dumpers 0 hdisk 2: free 950492460 dumpers 0 hdisk 3: free 798084516 dumpers 0 hdisk 4: free 530198076 dumpers 0 hdisk 5: free 530870476 dumpers 0 hdisk 6: free 614676168 dumpers 0
driver: result time 1556.183 from taper: DONE 00-00001 Set1-5-04 1 "[sec 1468.628 kb 152807296 kps 104047.6 {wr: writers 4775229 rdwait 444.006 wrwait 1007.537 filemark 4.438}]"
driver: finished-cmd time 1831.009 taper wrote server:/
driver: send-cmd time 1831.009 to taper: FILE-WRITE 00-00002 /var/spool/amanda/anotherserver._.0 anotherserver UNKNOWNFEATURE / 0 20071209 0
driver: startaflush: LARGESTFIT anotherserver / 45242148 615188062
taper: writing end marker. [Set1-5-04 ERR kb 152822528 fm 2]

Some explanation: The data for "server" is written to tape from 1 GB holding-disk chunks and completes. Amanda then starts writing the data for "anotherserver", and that is when the error occurs.

From Amanda's perspective this looks normal, just as if there had been a media error while writing to tape.
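A minimal sketch of the write pattern described above, for anyone trying to reproduce it outside Amanda: keep one descriptor on the non-rewinding device open, write a run of fixed-size blocks, write a file mark, and continue with the next file. This is not Amanda's code, and whether Amanda issues exactly this sequence is not known from the logs. The 32 KiB block size is an assumption (152807296 KB / 32 = 4775228, one less than the "writers" count in the DONE line, if that counter means blocks written). The MTIOCTOP and MTWEOF values are assumed from <linux/mtio.h> on x86 Linux and should be checked against the local headers. The names write_files and write_filemark are hypothetical.

```python
import fcntl
import os
import struct

# Assumed values from <linux/mtio.h> on x86/x86-64 Linux; verify locally.
MTIOCTOP = 0x40086D01  # _IOW('m', 1, struct mtop)
MTWEOF = 5             # mt_op code: write file mark(s)


def write_filemark(fd, count=1):
    """Ask the st driver to write `count` file marks (what `mt weof` does)."""
    # struct mtop { short mt_op; int mt_count; } packed with native alignment.
    fcntl.ioctl(fd, MTIOCTOP, struct.pack("hi", MTWEOF, count))


def write_files(fd, blocks_per_file, block_size=32 * 1024,
                filemark=write_filemark):
    """Write several tape files back to back on one open descriptor.

    blocks_per_file lists how many blocks each tape file gets; a file
    mark is written after every file. Returns the total bytes written.
    """
    block = b"\0" * block_size  # zero-filled payload, highly compressible
    written = 0
    for nblocks in blocks_per_file:
        for _ in range(nblocks):
            written += os.write(fd, block)
        filemark(fd)
    return written
```

With a scratch tape loaded (this overwrites it), a run could look like fd = os.open("/dev/nst3", os.O_WRONLY) followed by write_files(fd, [1000, 1000]); the device name is an assumption based on the st3 messages. If the kb counter in the ERR line is cumulative for the tape (also an assumption), the failure came 152822528 - 152807296 = 15232 KB, or 476 blocks of 32 KiB, into the second file, so the second entry should be at least that large.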
The example above suggests that the problem might be triggered by some control command issued when finishing or starting a data blob (blobs are separated by so-called file marks, the terminology used in the mt manual page). However, I believe I have also seen the same problem in the middle of a data blob, but I cannot find the relevant log file right now.

And an only partially related problem, which may be off-topic on linux-scsi: I tried to strace the taper process that writes to the tape. (I hoped to see some magic control command being sent to the tape.) But strace returned immediately:

~# strace -p 9488 -fF
Process 9488 attached - interrupt to quit
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
restart_syscall(<... resuming interrupted call ...>) = 32768
read(4, ptrace: umoven: Input/output error
0xffffffff, 1690719488) = 0
_exit(64) = ?
Process 9488 detached

An Internet search for "ptrace: umoven" gave some hits, but none explained what it means or what the potential causes are. If you have an idea, please contact me.

So that's everything I have tried. The next step might be to contact Quantum, but I'd like to give them a more specific problem description. Ideas for reproducing this with mt/dd would be helpful too.

Sven