Hello,

> Is your Arch install on a SSD? If so, there's probably an issue
> related to the "flash retention time".
>
> If an SSD is not supplied with power, it loses data relatively quickly
> even in the best condition. You can sometimes read that six months
> could already be too long. I believe that when SSDs become old, the
> period of time in which they can safely retain data without a power
> supply is shortened. Reading about this just happens to be on my
> to-do list, just like booting up my old desktop PC that hasn't been
> connected to the mains for months, maybe even longer than half a year.

Also bear in mind bit rot. HDDs used to rot due to magnetic
interference, hence do not pass your HDDs through a metal detector, but
there is always background EMF; it messes up packets too, so don't go
running power lines next to your data lines :P SSDs store bits as high
and low charge levels, it is very easy for a bit to flip, and holding
that charge steady over time is difficult, which is why SSDs are
considered only useful for short-term storage and why HDDs and tape
can't be replaced.

Point is, there is a possibility that your system has slowly rotted
over the years and the wrong thing rotted.

ext4 has no bit rot detection. I know a lot of people who say ext4 is
old and insecure, but bear in mind the overhead of the newer filesystem
features in, say, btrfs. btrfs does have detection because it does
checksumming at the filesystem level, however this only detects the bit
rot, it can't fix it; fixing it requires redundant storage or backups.
I believe the standard setup for bit rot protection is to run 2 SSDs,
both with btrfs, and RAID 1 them: when a checksum fails, the filesystem
pulls the block from the other SSD, giving decent data integrity (a
rough sketch of checking such a setup follows further down). Correct me
if I am wrong, I am not well protected against bit rot myself as I
don't store any data on my daily driver.

Anyway, it's said all the time for good reason: always back up your
data. Configurations are a must to back up too, there is nothing worse
than losing your firewall configuration and needing to rewrite it all
(as an example).

> As it's on my todo list, I don't have any knowledge yet. It's just a
> guess that an old or somehow damaged SSD might look good, pass write
> and read tests, but the margin in which it can hold data without
> power might be very, very short at some point.

I am not a data expert, but I have done some reading on this in the
past. SSDs, unlike HDDs, give no clear sign of degradation. HDDs sound
funny or begin to tick or crackle when they are dying and their IOPS
drop considerably, but these are only signs and an HDD can still die at
random. SSD cells will die, but the drive is designed to move data to
spare cells; this happens at the firmware level and the drive handles
it itself. S.M.A.R.T checking is a good prevention method, but it is
not flawless: it will flag when a drive isn't behaving as expected, yet
some drives fail S.M.A.R.T and still last for years, so take it with a
pinch of salt. I believe some SSD firmware lets you check the number of
spare cells remaining, but I am not sure. Data corruption, data
disappearing or the SSD disappearing entirely are sure signs, but by
that time your data is already gone...
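To make the RAID 1 idea above a bit more concrete, here is a minimal
sketch in Python that runs a scrub and then reads the per-device error
counters. It assumes btrfs-progs is installed, that it runs as root, and
that a two-device RAID 1 filesystem (for example one created with
"mkfs.btrfs -m raid1 -d raid1 /dev/sdX /dev/sdY") is mounted at
/mnt/data -- the device names and mount point are placeholders, adjust
them for your system:

#!/usr/bin/env python3
# Minimal sketch: run a btrfs scrub, then check per-device error counters.
# On a two-device RAID 1 filesystem a scrub rewrites any block whose
# checksum fails using the copy from the other device, so a nonzero
# corruption_errs count on a scrub that still completes is the
# "it healed itself" case.
# Assumptions: btrfs-progs installed, run as root, filesystem mounted at
# /mnt/data (placeholder -- change for your setup).
import subprocess

MOUNTPOINT = "/mnt/data"  # placeholder mount point

# -B keeps the scrub in the foreground so the stats below are post-scrub
subprocess.run(["btrfs", "scrub", "start", "-B", MOUNTPOINT], check=True)

# Prints counters like "[/dev/sda].corruption_errs   0" for each device
stats = subprocess.run(
    ["btrfs", "device", "stats", MOUNTPOINT],
    capture_output=True, text=True, check=False,
)
for line in stats.stdout.splitlines():
    print(line)
    if "corruption_errs" in line and line.split()[-1] != "0":
        print("  ^ checksum mismatches were seen on this device")

Something like this in a cron job or systemd timer gives you periodic
scrubs, which is the usual advice for btrfs anyway.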
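On the S.M.A.R.T side, the checks are also easy to script. Here is a
minimal sketch using smartctl from smartmontools; it assumes
smartmontools 7.0 or newer (for the --json output), root privileges, and
/dev/sda as a placeholder for whichever drive you want to inspect:

#!/usr/bin/env python3
# Minimal sketch: dump a drive's S.M.A.R.T health and attribute table.
# Assumptions: smartmontools >= 7.0 (for --json), run as root, and
# /dev/sda is the drive to inspect (placeholder -- adjust as needed).
import json
import subprocess

DEVICE = "/dev/sda"  # placeholder device node

# -H = overall health verdict, -A = attribute table; smartctl uses a
# bitmask exit code, so nonzero is not treated as a hard failure here.
out = subprocess.run(
    ["smartctl", "--json", "-H", "-A", DEVICE],
    capture_output=True, text=True, check=False,
)
report = json.loads(out.stdout)

passed = report.get("smart_status", {}).get("passed")
print(f"{DEVICE}: overall S.M.A.R.T health: {'PASSED' if passed else 'not passed / unknown'}")

# ATA/SATA drives expose an attribute table (reallocated sectors, total
# LBAs written, etc.); NVMe drives report a different health log, so the
# key below only applies to ATA devices.
for attr in report.get("ata_smart_attributes", {}).get("table", []):
    print(f"  {attr['id']:>3} {attr['name']:<28} raw={attr['raw']['value']}")

Run periodically, this gives you exactly those slowly creeping numbers
to keep an eye on.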
I recommend reading the S.M.A.R.T ArchWiki page [1]; it never hurts to
check periodically how your drives are functioning. Some drives even
tell you the total reads/writes. Maybe it's just me being obsessive
about numbers, but I love seeing how these numbers gradually increase
over time (along with my battery charge cycles, and how the total
capacity decreases).

As I said, I am not a data expert; what I know here is what I have read
and discussed with others who are experienced in this field, so feel
free to correct me if I made a mistake.

Take care,

--
Polarian
GPG signature: 0770E5312238C760
Website: https://polarian.dev
JID/XMPP: polarian@xxxxxxxxxxxx

[1] https://wiki.archlinux.org/title/S.M.A.R.T.