-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

> In the USB case when it comes back with a new name, as far as I
> know there is no mechanism to handle that anywhere in the kernel.

Is there a mechanism for other devices?

>> Is XFS not stable enough to function without a need to reboot
>> in case of a relatively minor HW failure? Minor meaning affecting
>> only some disks.
>
> It's not a question of XFS stability, IMHO. XFS was talking to
> device A; device A went away and never came back.

Well, it kinda did come back, but that's a different story.

> The issue of being unable to repair it seems to have been a result
> of files still open on the (disappeared) device? Once you resolved
> that, all was well, and no reboot was needed, correct?

Yup, but xfs was still active without a trace in /proc/mounts, which
is what confused me.

> I suggested the reboot as a big-hammer fix to clear the mysterious
> stale mount; turns out that was not required, apparently.

I don't like that particular hammer. Personal opinion, sure, but it
seems to me that a reboot is what you do when you don't know what went
wrong or you know it's totally fubar. In this case, IMHO, not fubar.

> If ustat(device) was reporting that it's mounted, but
> /proc/partitions didn't show it, then the device was in some kind
> of limbo state, I guess, and that sort of umount handling is below
> XFS (or any other filesystem), as far as I know.

I'm confused here. /dev/old was not in /proc/partitions or
/proc/mounts; /dev/new was in /proc/partitions but not in
/proc/mounts. Even after disconnecting and reconnecting the drive,
/dev/new refused to be acted on by xfs_check or xfs_repair. How did
that happen? All right, apparently there was a stale xfs instance in
the kernel, not visible anywhere, but that was attached to /dev/old.
Why did xfs_repair fail to work on /dev/new until the stale xfs
instance in the kernel finished shutting down?
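For reference, the two lookups discussed above (is the device listed in /proc/partitions, and is it listed in /proc/mounts) can be sketched in Python. This is only an illustration, not something used in the thread: the function names are hypothetical, and it assumes the device appears in /proc/mounts under the same path you pass in (mounts made via /dev/disk/by-uuid or /dev/mapper names would not match). It also cannot see a stale in-kernel filesystem instance of the kind described here, since that is exactly the state that shows up in neither file.

```python
import os


def device_names(partitions_text):
    """Return the device names listed in /proc/partitions-style text."""
    names = set()
    for line in partitions_text.splitlines():
        fields = line.split()
        # Data rows have four fields: major, minor, #blocks, name.
        # The header row also has four fields but starts with "major".
        if len(fields) == 4 and fields[0].isdigit():
            names.add(fields[3])
    return names


def mounted_sources(mounts_text):
    """Return the first column (mount source) of /proc/mounts-style text."""
    sources = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields:
            sources.add(fields[0])
    return sources


def device_state(dev, partitions_text, mounts_text):
    """Report whether dev (e.g. "/dev/sdb1") shows up in each listing."""
    name = os.path.basename(dev)
    return {
        "in_partitions": name in device_names(partitions_text),
        "in_mounts": dev in mounted_sources(mounts_text),
    }


def read_text(path):
    """Read a small text file such as /proc/mounts."""
    with open(path) as f:
        return f.read()
```

On a Linux system it could be called as device_state("/dev/sdb1", read_text("/proc/partitions"), read_text("/proc/mounts")). A result of in_partitions=True with in_mounts=False matches what was seen for /dev/new.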

> What initiated the unmount, was it you (after the USB disconnect)
> or some udev magic?

The disconnect of the USB drive. Specifically, the internal hub in the
notebook failed (I don't know how), and I reset it over ssh (the
keyboard is also on the hub); see the log below. I didn't find any
messages from any user-space component, though they might not log
everything. There were kernel messages about the XFS driver detecting
the error, the USB hub being fubar-ed, and the device being offline,
so I'm guessing it was the panic action, or maybe userspace.

I'm not sure; I wasn't able to find out how XFS handles errors.
There's nothing in the manual pages and Google didn't help. Do you
know? I.e., is there an equivalent of errors=remount_ro, or whatever?
One page claimed xfs doesn't recognize this option. My system has the
defaults and it's ubuntu/precise, if that helps.

Martin

May 2 15:49:06 lennie kernel: [344344.325232] sd 11:0:0:0: rejecting I/O to offline device
May 2 15:49:39 lennie kernel: [344377.367220] hub 2-1:1.0: hub_port_status failed (err = -110)
May 2 15:49:44 lennie kernel: [344382.459545] hub 2-1:1.0: hub_port_status failed (err = -110)
May 2 15:49:50 lennie kernel: [344387.551918] hub 2-1:1.0: hub_port_status failed (err = -110)
May 2 15:49:50 lennie kernel: [344388.413611] sd 6:0:0:0: rejecting I/O to offline device
May 2 15:49:50 lennie kernel: [344388.413650] sd 6:0:0:0: rejecting I/O to offline device
May 2 15:49:50 lennie kernel: [344388.413668] sd 6:0:0:0: rejecting I/O to offline device
May 2 15:49:52 lennie kernel: [344390.062780] sd 6:0:0:0: rejecting I/O to offline device
May 2 15:49:52 lennie kernel: [344390.062837] ffff8801034da000: 80 ab 4d 03 01 88 ff ff 00 00 70 b4 f0 7f 00 00 ..M.......p.....
May 2 15:49:52 lennie kernel: [344390.062844] XFS (sdb104): Internal error xfs_dir2_data_reada_verify at line 226 of file /build/buildd/linux-lts-raring-3.8.0/fs/xfs/xfs_dir2_data.c.
Caller 0xffffffffa079e33f
May 2 15:49:52 lennie kernel: [344390.062844]
May 2 15:49:52 lennie kernel: [344390.062852] Pid: 642, comm: kworker/0:1H Tainted: G C 3.8.0-39-generic #57~precise1-Ubuntu
May 2 15:49:52 lennie kernel: [344390.062854] Call Trace:
May 2 15:49:52 lennie kernel: [344390.062902] [<ffffffffa07a018f>] xfs_error_report+0x3f/0x50 [xfs]
May 2 15:49:52 lennie kernel: [344390.062921] [<ffffffffa079e33f>] ? xfs_buf_iodone_work+0x3f/0xa0 [xfs]
May 2 15:49:52 lennie kernel: [344390.062939] [<ffffffffa07a01fe>] xfs_corruption_error+0x5e/0x90 [xfs]
May 2 15:49:52 lennie kernel: [344390.062966] [<ffffffffa07da159>] xfs_dir2_data_reada_verify+0x59/0xa0 [xfs]
May 2 15:49:52 lennie kernel: [344390.062986] [<ffffffffa079e33f>] ? xfs_buf_iodone_work+0x3f/0xa0 [xfs]
May 2 15:49:52 lennie kernel: [344390.062994] [<ffffffff8108e54a>] ? finish_task_switch+0x4a/0xf0
May 2 15:49:52 lennie kernel: [344390.063013] [<ffffffffa079e33f>] xfs_buf_iodone_work+0x3f/0xa0 [xfs]
May 2 15:49:52 lennie kernel: [344390.063019] [<ffffffff81078de1>] process_one_work+0x141/0x4a0
May 2 15:49:52 lennie kernel: [344390.063024] [<ffffffff81079dd8>] worker_thread+0x168/0x410
May 2 15:49:52 lennie kernel: [344390.063029] [<ffffffff81079c70>] ? manage_workers+0x120/0x120
May 2 15:49:52 lennie kernel: [344390.063034] [<ffffffff8107f300>] kthread+0xc0/0xd0
May 2 15:49:52 lennie kernel: [344390.063039] [<ffffffff8107f240>] ? flush_kthread_worker+0xb0/0xb0
May 2 15:49:52 lennie kernel: [344390.063046] [<ffffffff816ff56c>] ret_from_fork+0x7c/0xb0
May 2 15:49:52 lennie kernel: [344390.063050] [<ffffffff8107f240>] ? flush_kthread_worker+0xb0/0xb0
May 2 15:49:52 lennie kernel: [344390.063054] XFS (sdb104): Corruption detected.
Unmount and run xfs_repair
May 2 15:49:52 lennie kernel: [344390.067128] sd 6:0:0:0: rejecting I/O to offline device
May 2 15:49:52 lennie kernel: [344390.067158] XFS (sdb104): metadata I/O error: block 0x8a6ec930 ("xfs_trans_read_buf_map") error 117 numblks 8
May 2 15:49:52 lennie kernel: [344390.067179] ffff8801034da000: 80 ab 4d 03 01 88 ff ff 00 00 70 b4 f0 7f 00 00 ..M.......p.....
May 2 15:49:52 lennie kernel: [344390.067184] XFS (sdb104): Internal error xfs_dir2_block_verify at line 71 of file /build/buildd/linux-lts-raring-3.8.0/fs/xfs/xfs_dir2_block.c. Caller 0xffffffffa07d7f3e
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBCgAGBQJTY9vCAAoJELsEaSRwbVYrIhsQAIDDL7yshllWCBcxSDmfdefh
PMTgMxvzprexd+5xqh14klDySA78FZM44bzMd5mjABQ+GvE0hhbB6kLMQSuySXWi
c+nNtpZXsW7R+o5D0GymWF1PYn3KfbE/aJ3lrLtA6yddwV0KanB4SxD45HoiKGdJ
1a2uLZB4G8ZjvyO6tQYn63R9GMWIX0mK5TovzrXY5JRaTIhYxwwTJjKzQpT+N67m
nWb86Ve3ahDQHBZx1hhf/xRtKYjgPENH57goKyZqdcmUlTgm2AUhsN0tbfm5T1sX
Bb0f4ZOebkfdhXfq5Sk/Eysz7gL+CdPwETJUwr/Z42QFUZfkK1/G1bbJTXZeXi8B
cngPk65VxV4UCGX3nzVpg5wk7scelIFULrmUM8FgiR3+SN6oZ4cWycQLGYr44j4k
UchuHcZpuMvCiHIPXWGk1PASIWUqdy7eroj900pVVGBMRwyiNe3pmbVHOpjK2owi
KaCUiDB86WuKK9V5SSWL3UgVfjy994vZEIvOczaf7+vKfkhW4OX2MJNXDGmWW0/E
3JFbIrD8ETPGhYR2+emRZhOa6op8I5buvkegfMLgWhRxh5jlxxeZ6e2ZdUHc8Ty2
r8xaKnoJArehYzUKxqPCBLwRNljGBMrZ+F1O2Ifemm4cWtocmG56Ae3WvbM+btEH
2po38EG9LNPvuquUJqxy
=+zQ+
-----END PGP SIGNATURE-----

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs