On Thu, May 19, 2005 at 03:26:11PM -0400, Alex Deucher wrote:
> I have Nexsan ATAbeast SAN connected to an AMD64 (sun v20z) and
> SPARC64 (sun 220R) server using lpfc HBAs (using the in kernel lpfc
> driver, kernel 2.6.12-rc4). About once every 4-5 days, the server
> loses its connection to the SAN and I get these messages in my log:
>
> May 19 09:01:08 nutcracker scsi1 (0:0): rejecting I/O to offline device
> May 19 09:01:08 nutcracker metapage_read_end_io: I/O error
> May 19 09:01:08 nutcracker scsi1 (0:0): rejecting I/O to offline device
> May 19 09:01:08 nutcracker metapage_read_end_io: I/O error
> May 19 09:01:08 nutcracker ERROR: (device dm-4): DT_GETPAGE: dtree page corrupt
> May 19 09:01:09 nutcracker scsi1 (0:0): rejecting I/O to offline device
> May 19 09:01:09 nutcracker metapage_read_end_io: I/O error
> May 19 09:01:09 nutcracker ERROR: (device dm-4): DT_GETPAGE: dtree page corrupt
>
> Nothing unusual shows up in the SAN logs. I've already adjusted the
> cache flushing on the SAN and changed the scsi timeouts to 45 seconds.
> I asked emulex about it, but I'm wondering if this is something in
> the scsi layer. Has anyone else had similar problems or know what the
> problem may be?

Yes, could be a timeout, but the device would not go offline unless we
could not talk to it at all after the timeout (TUR failed, or of course
some bug).

There should be earlier errors about the device being offline, look for
and post those.
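
If it helps with digging out those earlier errors, here is a rough sketch of a filter that prints the suspicious lines logged just before the first "rejecting I/O to offline device" message. The function name, the keyword list and the 200-line window are all made up for illustration; nothing here is taken from the kernel or the lpfc driver, so adjust the keywords to whatever your setup actually logs.

```python
import re
import sys

# The symptom line, as it appears in the log excerpt quoted above.
OFFLINE_RE = re.compile(r"rejecting I/O to offline device")

# Hypothetical keyword list (an assumption, not an official set of messages).
SUSPECT_RE = re.compile(r"scsi|lpfc|timeout|offlin|abort|reset", re.IGNORECASE)


def lines_before_first_offline(lines, context=200):
    """Return the suspect lines found among the `context` lines that
    precede the first offline-device message, or [] if there is none."""
    for index, line in enumerate(lines):
        if OFFLINE_RE.search(line):
            start = max(0, index - context)
            return [l for l in lines[start:index] if SUSPECT_RE.search(l)]
    return []


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log path is supplied by the caller; no default location is assumed.
    with open(sys.argv[1], errors="replace") as log:
        for hit in lines_before_first_offline(log.read().splitlines()):
            print(hit)
```

Run it with the path to your system log as the only argument. If it prints nothing, try widening the window or the keyword list before concluding that nothing was logged.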