On Sun, Jan 3, 2021 at 11:06 PM Andrej Podzimek via users <users@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Are you sure you are opening the right LUKS device in the live environment? Is the LUKS device readable (e.g. just using "cat /dev/mapper/dm_crypt > /dev/null")? (Does its size look right, e.g. in "lsblk -p"?) Do you get any read errors in dmesg (for NVME / SAS / SATA)? If you pipe your direct partition read through "pv -arb" ("pv -arb /dev/mapper/dm_crypt > /dev/null") (or another cat-like tool that shows the data rate), does it look reasonable?

Yes, it is fully readable. I just took a full ddrescue image, and it had 0 bad sectors, so nothing is wrong with my disk.

This is the ddrescue output:

GNU ddrescue 1.25
Press Ctrl-C to interrupt
Initial status (read from mapfile)
rescued: 998575 MB, tried: 0 B, bad-sector: 0 B, bad areas: 0

Current status
     ipos:        0 B, non-trimmed:        0 B,  current rate:       0 B/s
     opos:        0 B, non-scraped:        0 B,  average rate:       0 B/s
non-tried:        0 B,  bad-sector:        0 B,    error rate:       0 B/s
  rescued:   998575 MB,   bad areas:        0,        run time:          0s
pct rescued:  100.00%, read errors:        0,  remaining time:         n/a
                              time since last successful read:         n/a
Finished

As you can see, there are no bad sectors.

$ pv -arb /dev/mapper/dm_crypt > /dev/null
452GiB [92.0MiB/s] [97.5MiB/s]

The data rate is also reasonable.

>
> Saving a binary image of your device would be a good first step, if the device is still readable.
>

Yes, I did that; that's why you are getting a late reply.

> What makes you so sure that this is a Btrfs problem, as opposed to a SSD or hard drive failure or a RAM failure causing data corruption?
> (Were there no other errors before the Btrfs errors in "dmesg"?)

I think it is Btrfs because I recently had to do a lot of snapshot creation and restoration.

Also, I don't think my RAM is to blame, since I have never had a problem with it. Even now, I have been running my live system for about 14 hours, because I had to get all my work done from there.
>
> While data loss of any kind is (understandably) frustrating, claiming that Btrfs is “unstable” is plain wrong and unhelpful and it is unlikely to motivate Btrfs experts to chime in and help.
> :-/
>

I believe it's better to call this out, rather than worry about hurting people's feelings.

> A few suggestions:
> 0. Take a binary backup of your Btrfs device, if it’s still readable.

Done.

> 1. Check your RAM. Does the machine have ECC? You may want to give it a few hours of memtest, no matter what.
>

I don't think my RAM is at fault. What is ECC?

I will run memtest regardless and get back to you, but I think it will be a waste of time.

> 2. Check your SSD / disk whether it’s reading at a reasonable pace and showing nothing suspicious in "smartctl -A" and "dmesg".
>

smartctl output:
https://pastebin.com/raw/B6AdLZXt

I ran the smartctl test a month ago, since I thought there was something wrong with my HDD, but the guys on the mailing list told me I did not have to worry:
https://listi.jpberlin.de/pipermail/smartmontools-support/2020-November/000560.html

> 3. Then there are a few tools (see man btrfs-check, man btrfs-rescue, man btrfs-restore) you might want to try, depending on the situation. Some of them require help from Btrfs experts (at which point you may want to ask on their kernel mailing lists).
>

Yeah, that's the only option I have left.

--
Regards,
Sreyan Chakravarty