Re: Bug when mounting XFS with external SATA drives in USB enclosures

On Fri, Nov 22, 2019 at 8:21 PM Pedro Ribeiro <pedrib@xxxxxxxxx> wrote:
>
> Hi,
>
> I have been trying to find out the cause of a bug that's affecting all
> my external hard drive backups.
>
> I have three external drives, in different USB enclosures, with the same
> configuration and the same problem.
>
> Drive A: 2TB HDD, USB3 Seagate self enclosed drive
> Drive B: 4TB HDD, USB3 Toshiba self enclosed drive
> Drive C: 512MB SSD, Crucial MX500 with USB-C third party enclosure
>
> All of the drives have dm-crypt / LUKS on top, with an XFS partition
> inside. Drive A is a few months old, Drive B is about 3 years old, Drive
> C about 1.5 years old. They are seldom used (they're backup drives) so
> they are all fine mechanically.
>
> The problem is when I attach any of the drives, enter the LUKS password
> and then try to mount, this happens:
> [   66.039772] XFS (dm-0): Mounting V5 Filesystem
> [   66.060934] XFS (dm-0): log recovery read I/O error at daddr 0x0 len 8 error -5
> [   66.060939] XFS (dm-0): empty log check failed
> [   66.060940] XFS (dm-0): log mount/recovery failed: error -5
> [   66.061064] XFS (dm-0): log mount failed
>
> No matter what I do, using all the recovery tools, etc, it's impossible
> to mount...
>
> The thing is that there is NOTHING wrong with these drives. The above
> happens when running my specific, stripped and locked down kernel config.
>
> If I take Debian's 4.19 kernel config, put it on a 5.3.11 tree, run make
> oldconfig and just answer the defaults on all prompts, all of the drives
> above mount fine:
> [   46.184068] XFS (dm-0): Mounting V5 Filesystem
> [   46.412566] XFS (dm-0): Ending clean mount
>
> I hit this problem recently when I moved from kernel 4.18.20, which I
> was using for a long time, to 5.3.X. In kernel 4.18.20, I did not have
> any problems with my specific stripped down config.
>
> I have asked for help in IRC at #xfs, and one of the guys there (ailiop)
> was very helpful in trying to track down the problem, but we ultimately
> failed, hence why I'm asking for help here.
>
> I'm attaching the kernel configs and the dmesg outputs. There is nothing
> obvious in the kernel config diff that should make this happen... it's a
> very weird bug.
>
> Regards,
> Pedro

What about checking for differences in kernel messages between the
stripped-down and stock kernels during device discovery? That is,
connect no drives, boot the stripped-down kernel that has the problem,
connect one of the problem USB devices, and record the kernel messages
that result. Then repeat that with the stock Debian kernel that doesn't
exhibit the bug.
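
A minimal sketch of that comparison, assuming the two captures have
been saved as lists of dmesg lines. The function names and the
"stripped-kernel" / "stock-kernel" labels are made up for illustration.
It strips the leading timestamps so that identical messages compare
equal, then prints a unified diff:

```python
import difflib
import re

# Matches the leading "[   66.039772] " timestamp that dmesg prints.
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def strip_timestamps(lines):
    """Drop the dmesg timestamp prefix so identical messages compare equal."""
    return [_TIMESTAMP.sub("", line.rstrip("\n")) for line in lines]


def diff_dmesg(stripped_log, stock_log):
    """Return unified-diff lines between two dmesg captures (lists of lines)."""
    return list(difflib.unified_diff(
        strip_timestamps(stripped_log),
        strip_timestamps(stock_log),
        fromfile="stripped-kernel",  # hypothetical label
        tofile="stock-kernel",       # hypothetical label
        lineterm="",
    ))


if __name__ == "__main__":
    # Two lines from each mount attempt quoted above, as sample input.
    stripped = [
        "[   66.039772] XFS (dm-0): Mounting V5 Filesystem\n",
        "[   66.061064] XFS (dm-0): log mount failed\n",
    ]
    stock = [
        "[   46.184068] XFS (dm-0): Mounting V5 Filesystem\n",
        "[   46.412566] XFS (dm-0): Ending clean mount\n",
    ]
    print("\n".join(diff_dmesg(stripped, stock)))
```

In practice the interesting lines are likely to be the usb, scsi and
sd messages logged right after the device is plugged in, rather than
the XFS lines used as sample input here.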

My guess is this is some obscure USB-related bug. There are a ton of
bugs with USB enclosure firmware, controllers, and drivers.

Also, is this USB enclosure directly connected to the computer? Or to
a powered hub? I have inordinate problems with USB enclosures directly
connected to an Intel NUC, but when connected to a Dyconn USB hub with
external power source, the problems all go away. And my understanding
is the hub doesn't just act like a repeater. It pretty much rewrites
the entire stream. So there's something screwy going on either with
the Intel controller I have, or the USB-SATA bridge chip, that causes
confusion that the hub eliminates.

And it may be that your stripped-down kernel has turned off some
obscure USB-related error checking or mode switching that this
particular setup needs.
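
One way to narrow that down is to list only the USB-related options
that differ between the two .config files. The following is a rough
sketch, not a tested tool: the function names are invented, and the
sample option values are made up rather than taken from the attached
configs. The kernel tree also ships scripts/diffconfig, which compares
two configs in full.

```python
import re

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Parse kernel .config text into {option: value}; 'n' means 'is not set'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def config_diff(stripped_text, stock_text, keyword="USB"):
    """Return {option: (stripped_value, stock_value)} for differing options
    whose name contains `keyword`. Options missing from a file are reported
    as 'absent'."""
    stripped = parse_config(stripped_text)
    stock = parse_config(stock_text)
    differences = {}
    for name in sorted(set(stripped) | set(stock)):
        if keyword not in name:
            continue
        a = stripped.get(name, "absent")
        b = stock.get(name, "absent")
        if a != b:
            differences[name] = (a, b)
    return differences


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    stripped_cfg = "CONFIG_USB_STORAGE=y\n# CONFIG_USB_UAS is not set\n"
    stock_cfg = "CONFIG_USB_STORAGE=m\nCONFIG_USB_UAS=m\n"
    for name, (a, b) in config_diff(stripped_cfg, stock_cfg).items():
        print(f"{name}: stripped={a} stock={b}")
```

Changing the keyword to "SCSI" or "XHCI" would give the same kind of
filtered view of the other layers between the enclosure and the
filesystem.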


-- 
Chris Murphy


