If it only gives btrfs errors on 6.9.x, and not on the rescue kernel or 6.8.x, that looks like a potential kernel bug. The best course is probably to keep running 6.8.x and wait for, say, 6.10.
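To make the kernel comparison concrete, one option is to save `dmesg` output from a boot on each kernel and pull out just the BTRFS warnings and errors with their timestamps. The following is only a minimal sketch: the function name `btrfs_events` is hypothetical, the sample log lines are made up for illustration (they are not taken from the affected machine), and it assumes the default `[seconds] message` dmesg layout.

```python
import re

# Matches the default dmesg layout: "[   31.204512] message text"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

# Severity words to look for in a BTRFS line (an assumption about message wording)
SEVERITY_RE = re.compile(r"\b(error|warning|critical)\b", re.IGNORECASE)


def btrfs_events(dmesg_text):
    """Return (seconds_since_boot, message) for BTRFS warning/error lines."""
    events = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        seconds = float(match.group(1))
        message = match.group(2)
        if "BTRFS" in message and SEVERITY_RE.search(message):
            events.append((seconds, message))
    return events


# Made-up sample input, for illustration only
SAMPLE = """\
[    2.101000] BTRFS info (device nvme0n1p3): first mount of filesystem
[   30.512000] BTRFS error (device nvme0n1p3): illustrative error text
[   30.600000] usb 1-1: new high-speed USB device
"""

if __name__ == "__main__":
    for seconds, message in btrfs_events(SAMPLE):
        print(f"{seconds:8.1f}s  {message}")
```

Running the same script over a log saved under 6.8.x and one saved under 6.9.x would show whether the errors appear only on the newer kernel and how soon after boot they start.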
--
On Fri, Jul 26, 2024 at 8:59 AM John Mellor <john.mellor@xxxxxxxxx> wrote:
On 2024-07-26 8:25 a.m., Richard Shaw wrote:
On Thu, Jul 25, 2024 at 6:29 PM Jeffrey Walton <noloader@xxxxxxxxx> wrote:
On Thu, Jul 25, 2024 at 2:15 PM Richard Shaw <hobbes1069@xxxxxxxxx> wrote:
>
> I recently had the Fedora install on my laptop go sideways (Ryzen 5 4500U w/ nvme disk).
>
> The filesystem was going readonly so I installed System Rescue CD to a thumb drive to investigate. Sure enough I had 4 unrecoverable errors.
>
> I don't keep anything critical on it so I decided to just reinstall with Fedora 40. Installation went fine, but I did notice weird dnf output on my first update, though everything SEEMED fine...
>
> I rebooted after the update and tried to log in, but after a minute or two the system froze. I rebooted again and, sure enough, a `dmesg | grep BTRFS` showed an error.
>
> Back in System Rescue CD, neither a `btrfs check --check-data-csum` nor, after mounting, a `btrfs scrub` shows any errors.
>
> So who's right? And if there is an error, what's causing it? I've checked the drive with smartctl and even let the factory HP firmware diag tools run in a loop overnight checking everything without error.
The (1) irrecoverable disk errors from the original install, and (2)
the errors from the current install, and (3) the errors from dnf
indicate (to me) you have a failed NVMe drive. I used to see the
symptoms all the time when using SDcards in ARM dev boards. I would
put a swap file on the dev board (due to lack of resources), and the
drives would fail within about 6 months with the symptoms you
describe.
Now the interesting part (to me) is, (4) lack of errors reported by
some tools. That indicates to me a Chinese drive that misreports drive
size and statistics. They usually show up on thumb drives, but I
experienced one on an SSD years ago. Also see
<https://www.google.com/search?q=counterfeit+drive+misreport+size>.
All in all, I would replace the NVMe drive with a new one from a
trusted source. Not Amazon or eBay.
It's the drive that came with the laptop so unlikely to be a cheap/phony drive but the mystery does get deeper...
1. I was able to see the same results even if I booted to an F40 Live USB. I'm thinking that the system caught the problem quickly enough that the error didn't actually get written to the disk.
2. I consistently see the problem at about 30 seconds (from dmesg) if I boot the 6.9.9 or 6.9.10 kernels that have been installed via updates. If I boot 6.8.5, the kernel that shipped with F40, I can't reproduce the problem.
Of course that's strange because if this was a widespread issue there would be tons of people complaining.
Odds are that you have bad RAM or are running the processor clock higher than it can handle. I also had this kind of issue when I had a bad video card, but the system generally froze or crashed and left the drive in an unrecoverable state. The tools for fixing a btrfs partition are generally lacking in Fedora, and the tools that come with btrfs are also useless when the failing partition is your active root partition. I don't know if SUSE has better tools, but it's a huge problem with Fedora recoverability.
It's an HP Envy Laptop, no ability to overclock. I did upgrade the memory when I first got it over 3 years ago from 8GB to 16GB, but it's plain DDR4-3200. As I previously mentioned, I let the HP diag tools run overnight; they completed 14 cycles without any errors, and I have now just finished letting Memtest86+ run for 5 complete cycles without any errors.
The only common denominator I have found so far is the two 6.9 kernels I have installed.
Thanks,
Richard
_______________________________________________
users mailing list -- users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to users-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue