Nope, I haven't read the code. I only see a low sync speed (fluctuating between 20 and 80 MB/s), while the drives can perform much better at sequential reads and writes (250 MB/s per drive and up to 600 MB/s for all 4 drives in total). During the sync I hear a loud noise caused by the heads flying back and forth, and that smells wrong. The chosen drives have poor seeking performance and small caches, and are probably unable to reorder the operations into something more sequential.

The whole solution is 'economic', since the organisation owning it is poor and cannot afford better hardware. That also means RAID6 is not an option. But we shouldn't look for excuses in the chosen scenario when the code is potentially suboptimal :] We're trying to make Linux better, right? :]

I'm looking for someone who knows the code well and can either confirm my findings or point me at anything I could try in order to increase the rebuild speed. So far I've tried changing the readahead, the minimum resync speed and the stripe cache size, but that increased the resync speed by only a few percent.

I believe I would be able to write my own userspace application that rebuilds the array offline at a much higher speed, simply XORing the bytes at the same offsets. That would prove the current rebuild strategy is suboptimal. Of course it would mean new code if it doesn't work as suggested, and I know that could be difficult and would require a deep knowledge of the linux-raid code that I unfortunately don't have.

Any chance someone here could find time to look at that?

Thank you,
Jaromir Capik

On 09/01/2022 14:21, Jaromír Cápík wrote:
>> In case of huge arrays (48TB in my case) the array rebuild takes a couple of
>> days with the current approach even when the array is idle and during that
>> time any of the drives could fail causing a fatal data loss.
>>
>> Does it make at least a bit of sense or my understanding and assumptions
>> are wrong?
> It does make sense, but have you read the code to see if it already does it?
> And if it doesn't, someone's going to have to write it, in which case it
> doesn't make sense not to have that as the default.
>
> Bear in mind that rebuilding the array with a new drive is completely
> different logic to doing an integrity check, so will need its own code,
> so I expect it already works that way.
>
> I think you've got two choices. Firstly, raid or not, you should have
> backups! Raid is for high-availability, not for keeping your data safe!
> And secondly, go raid-6 which gives you that bit extra redundancy.
>
> Cheers,
> Wol
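P.S. For reference, these are the kinds of knobs I was tuning above. The device name (md0) and the specific values are just examples, not the settings from my box:

```shell
# Raise the md resync speed floor/ceiling (KiB/s, per device).
echo 200000 > /proc/sys/dev/raid/speed_limit_min
echo 800000 > /proc/sys/dev/raid/speed_limit_max

# Enlarge the raid456 stripe cache (in pages per device).
echo 8192 > /sys/block/md0/md/stripe_cache_size

# Bump readahead on the array (in 512-byte sectors).
blockdev --setra 65536 /dev/md0
```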
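P.P.S. A minimal sketch of the offline rebuild idea I described above, just to make it concrete. It relies on the fact that in RAID5 the chunk on the failed member equals the XOR of the chunks at the same offset on all surviving members, whether the missing chunk held data or parity, so the whole rebuild can be one purely sequential pass. The function name and the 1 MiB read size are my assumptions, not anything from the md code:

```python
CHUNK = 1024 * 1024  # large sequential reads to keep the heads from seeking


def rebuild(surviving_paths, replacement_path):
    """XOR the surviving members, offset by offset, onto the replacement.

    Assumes all member devices/files have the same size, as in a real array.
    """
    sources = [open(p, "rb") for p in surviving_paths]
    try:
        with open(replacement_path, "wb") as out:
            while True:
                blocks = [f.read(CHUNK) for f in sources]
                if not blocks[0]:
                    break
                # XOR the blocks as big integers -- fast enough for a sketch.
                acc = int.from_bytes(blocks[0], "little")
                for b in blocks[1:]:
                    acc ^= int.from_bytes(b, "little")
                out.write(acc.to_bytes(len(blocks[0]), "little"))
    finally:
        for f in sources:
            f.close()
```

Of course the real md rebuild also has to cope with the chunk layout, bitmaps and concurrent writes, which is exactly why I'd only use something like this to measure the upper bound on sequential rebuild speed.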