Re: BTRFS partition corrupted after deleting files in /home

On Wed, 13 Jan 2021 at 05:41, Sreyan Chakravarty <sreyan32@xxxxxxxxx> wrote:
On Tue, Jan 12, 2021 at 9:16 AM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
>
>
> -x has more information that might be relevant including firmware
> revision and some additional logs for recent drive reported errors
> which usually are benign. But might be clues.
>
> These two attributes I'm not familiar with
> 187 Reported_Uncorrect      0x0032   100   096   000    Old_age   Always       -       4294967301
> 188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       98785820672
>
> But the value is well above threshold for both so I'm not worried about it.
>
>

Here is the output of:

# smartctl -Ax /dev/sda

https://pastebin.com/raw/GrgrQrSf

I have no idea what it means.

You are not alone.  Most people stop reading at the line:

SMART overall-health self-assessment test result: PASSED

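
On the two very large raw values quoted above for attributes 187 and 188:
a SMART raw value is a 48-bit field, and some drive firmware packs several
16-bit counters into it, so the decimal number can look alarming when the
individual counters are small.  Whether this particular drive does that is
an assumption on my part, not something the report confirms.  A minimal
sketch of the decoding, with split_raw16 as a made-up helper name:

```python
def split_raw16(raw):
    """Split a 48-bit SMART raw value into three 16-bit words, high to low.

    Assumption: the firmware packs three 16-bit counters into the raw
    field. Not every vendor or attribute uses this layout.
    """
    if not 0 <= raw < (1 << 48):
        raise ValueError("SMART raw values are 48 bits wide")
    return ((raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF)


# Raw values copied from the attribute table quoted earlier in this thread.
for name, raw in [
    ("187 Reported_Uncorrect", 4294967301),
    ("188 Command_Timeout", 98785820672),
]:
    print(name, hex(raw), split_raw16(raw))
# 187 splits into (1, 0, 5) and 188 into (23, 24, 0)
```

Read this way, the raw fields hold a handful of small counts rather than
billions of events.  smartctl can apply the same view itself through its
-v option (for example -v 188,raw16), though what each word means is up to
the drive vendor.
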
Before retiring I worked in remote sensing, which is a data-intensive
activity.  HDD failures were a major issue.  One sure way to kill a
drive was to start a batch job that filled a disk and then kept hammering
the drive over a long weekend when I was off somewhere without network
access.  I could usually get warranty replacements for failed drives by
submitting the smartctl reports.  We used XFS, starting on SGI IRIX and
then on Linux when it became available, with striped arrays for
throughput on I/O-bound processes.  XFS was designed to avoid lengthy
filesystem repair times, so getting a system back after a drive failure
just meant waiting for the tape robot to find and restore the backup tapes.

HDDs are mechanical and so subject to wear.  With heavy use they tend to die
shortly after end-of-warranty.  I started replacing drives at end-of-warranty
which, along with measures to reduce runaway batch jobs, greatly reduced
the number of failures.  Your drive has been used for 1671 hours over
1491 power-on cycles.  Mechanical device wear is often highest at startup,
so this is probably getting close to the design lifetime of a consumer laptop
HDD.

There are workloads (image processing, numerical modelling) where recovering
the work done since the last backup just means restarting a batch job and is
generally easier than trying to repair a filesystem with a bunch of partially written
HDF5 files.  

Given the age of your HDD, I would replace it.   If your laptop came with Windows,
you should be able to install Windows 10 on a small partition in order to upgrade the
BIOS and maybe run the drive vendor's diagnostics.   You may want to revisit your
choices of drive technology, filesystem, backup and recovery strategy, etc. with
your use case in mind.  


This is the problem with SMART tests: they are so esoteric that it is
difficult for a common user to make sense of them.

Let me know what you think, if you see any glaring faults.


You are to be commended for helping the btrfs developers investigate one of the
rare situations that make filesystems such a hard problem.  My experience suggests
your HDD is involved, through either old age or some BIOS or drive firmware glitch, so
your best way forward is to make sure your BIOS is current and replace the drive
with one suited to your use case.


--
George N. White III

_______________________________________________
users mailing list -- users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to users-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@xxxxxxxxxxxxxxxxxxxxxxx