On 02/01/2021 23:10, Teejay wrote:
On 02/01/2021 19:32, antlists wrote:
On 02/01/2021 12:37, Teejay wrote:
Hi,
Point taken, but it is what it is and I need to fix it. The internet is
full of advice about RAID - most of it contradicts itself as there is
little real consensus (welcome to the 21st Century :-) . As a newbie, I
went with what seemed right at the time. I concede I made a bad call.
Let's move on.
That's why I took on updating the wiki. To try and provide an up-to-date
reference site. Problem is, like you did, people usually find the old
duff stuff first.
To upgrade the array I used the following command:
sudo mdadm --grow /dev/md0 --level=5 --raid-devices=5 --add /dev/sde
/dev/sdf --backup-file=/tmp/grow_md0.bak
To my surprise it returned almost instantly with no errors. So I had
a look at the status:
Did your example tell you to use --backup-file? If it did, I hope it
wasn't the wiki!
If the --grow told you it needed a backup, I'd be surprised but if it
asks for it it needs it.
Once you were trying to fix things, it would be normal for it to ask for
the backup you originally gave it ...
less /proc/mdstat
and it came back as being a raid 5 array and stated that it was
reshape = 0.01% and would take several million minutes to complete!
Somewhat concerned, I left it for half an hour and tried again only
to find that the number of complete blocks was the same and the time
had grown to an even more crazy number. It was clear the process had
stalled.
uname -a ?
Linux lounge 5.4.0-58-generic #64-Ubuntu SMP Wed Dec 9 08:16:25 UTC 2020
x86_64 x86_64 x86_64 GNU/Linux
mdadm --version ?
mdadm - v4.1 - 2018-10-01
This sounds similar to a problem we've regularly seen with raid-5. And
it's noticeable with older Ubuntus.
My install of Ubuntu is not that old is it? - and it has all the
official updates.
lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 20.04.1 LTS
Release: 20.04
Codename: focal
That's new enough. So it's not (I hope) that problem.
Ahhhhh ... you MAY be able to RMA them. What are they? If they're WD
Reds I'd RMA them as a matter of course as "unfit for purpose". If
they're BarraCudas, well, tough luck but you might get away with it.
RMA? - Newbie here!
Well, I don't know what the letters stand for, but the expression is a
pretty standard term for "return to supplier". If you return anything
that is defective, the supplier will usually ask you to "fill in an RMA".
Okay. What are those drives? I'm *guessing* your original three drives
were WD Reds. What is the type number? If they're Reds this ends in
EFAX or EFRX, if I remember correctly. I think EFAX are good and EFRX
are bad. It could be the other way round ...
That means little to me. This is what I know: The four drives that form
the working array are the three original ones and one of the new ones.
Non of them are Reds. They are all the same Seagate drives, though it is
possible they are different internally as they were purchased at
different times, I have not opened them up. The Array is working and
AFAIK it is all there, I can find not evidence to the contrary. It
mounts (I have only tried Read Only), and I can access the data, with
not apparent issues.
Dare I suggest you need to read this page ...
http://www.catb.org/~esr/faqs/smart-questions.html
If you haven't come across ESR he may be a bit of a nutter, but he is an
extremely good psychologist - it is well worth reading!
I gave you a link. If you went there, the very first link in it gave you
some advice - https://raid.wiki.kernel.org/index.php/Asking_for_help
YOU DIDN'T FOLLOW IT.
One of the things it asks for is the smartctl info for your drives - ALL
OF THEM. It'll tell you the model number of your drives. How many
different models do you have? Are they SMR? Google the model numbers and
see if you can find out! If you come back and say you can't make head or
tail of what you've found, that's fine. What's not fine is if you don't try.
I need to somehow get back to a useful state. If I could get back to
level 0 with three drives that would be great. I could then delete
some junk and then backup the data the other two drives using rsync
or something.
So I guess my questions are
1 - Can I safely get back to a three drive level 0 RAID thereby
freeing the two drives I added to allow me to make a backup of the data?
I'll let others comment.
2 - Even if I can revert, should I move my data and no longer even
use RAID 0 until I can get some decent hard drives?
Don't mess with it yet.
3 - Any other cunning ideas, at the moment I think my only option, if
I can't revert, which is to buy many TB's of storage to back up the
read only file system, which I can ill afford to do!
Okay, to throw another option into the mix, get that 12TB BarraCuda, and
copy your data across as your main drive. You're probably better off
with an IronWolf, but I don't know if they come as a 12TB and they'd
cost rather more ...
Then you can combine the other drives with btrfs as a backup volume.
That way, if the 12TB breaks you've got a backup, and if one of the 4TBs
break, btrfs means you only lose what's on that disk (which is "backed
up" on your live disk ...)
Sounds like you misunderstood what I wrote; sarcasm is not a great way
of helping someone, especially when you only half read the email!
Sorry, I don't mean to be sarcastic, and offering to spend more money
usually gets vendors on side ...
again let's move on!
and imho, you NEED to back up your data, which I think means spending
money, whether you can afford it or not :-(
I personally don't have experience playing with broken arrays (unlike
others on this list), so I don't want to advise you to do something that
trashes your array and loses everything ...
I did read the wiki, promise! More than once! - it does not seem to
cover my situation, or if it does, it makes a very good job of hiding
it. Like many such sites is was clearly written by someone that knows
exactly how everything works and forgets that the reader will not
necessary have a similar level of knowledge, in other words, it is not
for written for novices. It uses many abbreviations and has an
assumption of knowledge that is way beyond mine. Asking for stuff if
fine, but not much help if you don't know what it is asking for. While I
really appreciate the help, it is not useful if you to go on the
offensive. I said up front that I am a newbie. The best piece of advice
I could find on the wiki was not to do anything unless I was sure and to
ask for help first, thus the email. Unfortunately, I did not find the
wiki in time to avoid getting in to the mess, but when I did I followed
the advice and asked for help - so all I ask is for some help getting
out of this mess. If you need more info, I will do my best to provide it
but please remember I am at the bottom looking up, and the view from
down here is not as clear as the view from where you stand; I understand
it can be difficult to remember that, I am an engineer too, just not one
that know anything about RAID :)
You describe exactly how I often feel, so we do understand. And yes,
when your raid isn't working properly I can understand the panic. Let's
work out WHAT is wrong. I'm hoping, if your drives were purchased at
different times, the problem might be the new drives only are SMR. If it
looks like something I'm not happy dealing with, I can kick it up to
people who have more experience.
Cheers,
Wol