Hello, so it worked as we expected, and it was most probably caused by that bug. Thanks for the assistance.

Patrik

2014-09-22 12:17 GMT+02:00 NeilBrown <neilb@xxxxxxx>:
> On Mon, 22 Sep 2014 09:20:35 +0200 Patrik Horník <patrik@xxxxxx> wrote:
>
>> Well, I browsed through the sources of the latest mdadm version at night
>> instead of sleeping :) and was looking for how it got the clean flag set to
>> 0. I was not sure exactly about that line there and from which device it
>> gets the clean state. So the bug was that it can get it from a non-current
>> device? It makes sense, because the first device identified by mdadm is the
>> old md101 device.
>
> Correct.
>
>>
>> So will it work then if I use 3.3 and somehow don't give it the md101
>> device? By stopping it before the -A call, or by manually specifying the
>> other drives? Or do you really recommend building the latest version of
>> mdadm?
>
> mdadm -A /dev/mdXX list-of-devices-that-are-working
>
> should start the array for you using 3.3.
> Then
>
> mdadm /dev/mdXX --re-add /dev/md101
>
> will re-add the device and hopefully do a quick bitmap-based rebuild.
>
>>
>> What is the expected behaviour with 3.3.1+? Can it be started with all
>> devices, and will it automatically start to recover md101? If so, what is
>> the best way: to start it without md101 and then use --re-add to add it,
>> to start it without md101 and use --add, or to start it with md101 as well?
>
> --add should have the same effect as --re-add.
>
> I think mdadm 3.3.1 will just assemble the array without md101 and you then
> have to --re-add that yourself. To get it re-added automatically you would
> need to have "policy action=re-add" or similar in mdadm.conf, and then use
>
> mdadm -I /dev/devicename
>
> on each device. That should put the whole array together and re-add anything
> that needs it.
>
> I think.
>
> NeilBrown
>
>>
>> Thank you very much.
>>
>> 2014-09-22 8:56 GMT+02:00 NeilBrown <neilb@xxxxxxx>:
>> > On Mon, 22 Sep 2014 08:34:21 +0200 Patrik Horník <patrik@xxxxxx> wrote:
>> >
>> >> - Well, what is the exact meaning of --no-degraded then? I am also using
>> >> it on RAID6 arrays that are missing one drive, and mdadm starts them.
>> >> Until today I thought it guards against assembling, for example, a RAID6
>> >> array missing more than two drives, or, to be more precise, an array with
>> >> fewer drives than it had the last time it ran. (I did not look at the
>> >> code to see exactly what it does. It is mdadm 3.3 on Debian.)
>> >
>> > Sorry, I confused myself.
>> > "--no-degraded" means "Only start the array if all expected devices are
>> > present".
>> > So if the array "knows" that one device is missing, it will start if all
>> > other devices are present. But if it "thinks" that all devices are working,
>> > then it will only start if all the devices are there.
>> >
>> >>
>> >> - Well, the array was shut down cleanly, manually, with mdadm -S. Can't
>> >> the "not clean" classification be a result of the md101 device being among
>> >> the found devices, or a result of the first two assembly attempts?
>> >
>> > If the state still says "Clean" (which it does, thanks), then mdadm should
>> > treat it as 'clean'.
>> >
>> > I think you are probably hitting the bug fixed by
>> >
>> > http://git.neil.brown.name/?p=mdadm.git;a=commitdiff;h=56bbc588f7f0f3bdd3ec23f02109b427c1d3b8f1
>> >
>> > which is in 3.3.1.
>> >
>> > So a new version of mdadm should fix it.
>> >
>> > NeilBrown
>> >
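[A minimal sketch of the incremental assembly with "policy action=re-add" that NeilBrown describes earlier in this thread. The policy line is the one he quotes ("or similar"), and the member device names are taken from the logs further down; both are illustrative and may differ on another system:

    # /etc/mdadm/mdadm.conf
    # allow a member that dropped out to be re-added automatically
    # during incremental assembly; a path= or domain= qualifier can be
    # added to scope this policy more narrowly
    POLICY action=re-add

    # then feed each member to incremental assembly
    mdadm -I /dev/sdg1
    mdadm -I /dev/sdi1
    mdadm -I /dev/sdh1
    mdadm -I /dev/sde1
    mdadm -I /dev/sdk1
    mdadm -I /dev/md101

Per the reply above, with mdadm 3.3.1+ this should put the whole array together and re-add anything that needs it.]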
>> >>
>> >> - Anyway, as I mentioned, the superblock on all five devices has the
>> >> clean state. Example:
>> >>
>> >> /dev/sdk1:
>> >>           Magic : XXXXXXX
>> >>         Version : 1.2
>> >>     Feature Map : 0x1
>> >>      Array UUID : XXXXXXXXXXXXXXXXXXXXX
>> >>            Name :
>> >>   Creation Time : Thu Aug XXXXXXXX
>> >>      Raid Level : raid6
>> >>    Raid Devices : 6
>> >>
>> >>  Avail Dev Size : 5860268943 (2794.39 GiB 3000.46 GB)
>> >>      Array Size : 11720536064 (11177.57 GiB 12001.83 GB)
>> >>   Used Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
>> >>     Data Offset : 262144 sectors
>> >>    Super Offset : 8 sectors
>> >>    Unused Space : before=262056 sectors, after=911 sectors
>> >>           State : clean
>> >>     Device UUID : YYYYYYYYYYYYYYYYYYYY
>> >>
>> >> Internal Bitmap : 8 sectors from superblock
>> >>     Update Time : Mon Sep 22 02:23:45 2014
>> >>   Bad Block Log : 512 entries available at offset 72 sectors
>> >>        Checksum : ZZZZZZZZ - correct
>> >>          Events : EEEEEE
>> >>
>> >>          Layout : left-symmetric
>> >>      Chunk Size : 512K
>> >>
>> >>     Device Role : Active device 4
>> >>     Array State : AAAAA. ('A' == active, '.' == missing, 'R' == replacing)
>> >>
>> >> - md101 has an Events count lower by 16 than the other devices.
>> >>
>> >> - Please, I need a little more assurance about the exact state of the
>> >> array, and an explanation of why it is behaving the way it is, so I can
>> >> be sure what steps are needed and what will happen. The data on the array
>> >> is important.
>> >>
>> >> Patrik Horník
>> >> editor-in-chief, www.DSL.sk
>> >> Tel.: +421 905 385 666
>> >> Email: patrik@xxxxxx
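[For comparison across members, a quick sketch of pulling the fields discussed here (State, Events, Update Time, Device Role) out of mdadm --examine for every member. The device names are the ones from the assembly logs below and may differ on another system:

    for d in /dev/sdg1 /dev/sdi1 /dev/sdh1 /dev/sde1 /dev/sdk1 /dev/md101; do
        echo "=== $d"
        mdadm --examine "$d" | grep -E 'State|Events|Update Time|Device Role'
    done

The member whose Events count lags the others (here md101, by 16) is the stale one that assembly flags as "(possibly out of date)" in the logs below.]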
>> >>
>> >> 2014-09-22 5:19 GMT+02:00 NeilBrown <neilb@xxxxxxx>:
>> >> > On Mon, 22 Sep 2014 04:11:20 +0200 Patrik Horník <patrik@xxxxxx> wrote:
>> >> >
>> >> >> Hello Neil,
>> >> >>
>> >> >> I've got this situation, unfamiliar to me, on the RAID6 array md1 with
>> >> >> important data.
>> >> >>
>> >> >> - It is a RAID6 with 6 devices; 5 are partitions and 1 is another RAID0
>> >> >> array, md101, made from two smaller drives. One of the smaller drives
>> >> >> froze, so md101 got kicked out of md1 and marked as faulty in md1. After
>> >> >> a while I stopped md1 without removing md101 from it first. Then I
>> >> >> rebooted and assembled md101.
>> >> >>
>> >> >> - First I tried mdadm -A --no-degraded -u UUID /dev/md1 but got
>> >> >> "mdadm: /dev/md1 assembled from 5 drives (out of 6), but not started.",
>> >> >> so I stopped md1.
>> >> >>
>> >> >> - The second time I started it with -v and got:
>> >> >>
>> >> >> mdadm: /dev/md101 is identified as a member of /dev/md1, slot 5.
>> >> >> mdadm: /dev/sdk1 is identified as a member of /dev/md1, slot 4.
>> >> >> mdadm: /dev/sdi1 is identified as a member of /dev/md1, slot 1.
>> >> >> mdadm: /dev/sdh1 is identified as a member of /dev/md1, slot 2.
>> >> >> mdadm: /dev/sdg1 is identified as a member of /dev/md1, slot 0.
>> >> >> mdadm: /dev/sde1 is identified as a member of /dev/md1, slot 3.
>> >> >> mdadm: added /dev/sdi1 to /dev/md1 as 1
>> >> >> mdadm: added /dev/sdh1 to /dev/md1 as 2
>> >> >> mdadm: added /dev/sde1 to /dev/md1 as 3
>> >> >> mdadm: added /dev/sdk1 to /dev/md1 as 4
>> >> >> mdadm: added /dev/md101 to /dev/md1 as 5 (possibly out of date)
>> >> >> mdadm: added /dev/sdg1 to /dev/md1 as 0
>> >> >> mdadm: /dev/md1 assembled from 5 drives (out of 6), but not started.
>> >> >>
>> >> >> - The third time I tried without --no-degraded, with mdadm -A -v -u UUID
>> >> >> /dev/md1. This is what I got:
>> >> >>
>> >> >> mdadm: /dev/md101 is identified as a member of /dev/md1, slot 5.
>> >> >> mdadm: /dev/sdk1 is identified as a member of /dev/md1, slot 4.
>> >> >> mdadm: /dev/sdi1 is identified as a member of /dev/md1, slot 1.
>> >> >> mdadm: /dev/sdh1 is identified as a member of /dev/md1, slot 2.
>> >> >> mdadm: /dev/sdg1 is identified as a member of /dev/md1, slot 0.
>> >> >> mdadm: /dev/sde1 is identified as a member of /dev/md1, slot 3.
>> >> >> mdadm: added /dev/sdi1 to /dev/md1 as 1
>> >> >> mdadm: added /dev/sdh1 to /dev/md1 as 2
>> >> >> mdadm: added /dev/sde1 to /dev/md1 as 3
>> >> >> mdadm: added /dev/sdk1 to /dev/md1 as 4
>> >> >> mdadm: added /dev/md101 to /dev/md1 as 5 (possibly out of date)
>> >> >> mdadm: added /dev/sdg1 to /dev/md1 as 0
>> >> >> mdadm: /dev/md1 assembled from 5 drives - not enough to start the
>> >> >> array while not clean - consider --force.
>> >> >>
>> >> >> Array md1 has a bitmap. All the drive devices have the same Events
>> >> >> count, their state is clean, and their Device Role is "Active device".
>> >> >> md101 has an active state and a lower Events count.
>> >> >>
>> >> >> Is this expected behavior? My theory is that it is caused by md101 and
>> >> >> that I should start array md1 without it (for example by stopping md101)
>> >> >> and then re-add it. Is that the case, or is it something else?
>> >> >>
>> >> >> Thanks.
>> >> >>
>> >> >> Best regards,
>> >> >>
>> >> >> Patrik
>> >> >
>> >> >
>> >> > The array is clearly degraded, as one of the devices failed and hasn't
>> >> > been recovered yet, so using --no-degraded is counterproductive, as you
>> >> > discovered.
>> >> >
>> >> > It appears that the array is also marked as 'dirty'. That suggests that
>> >> > it wasn't shut down cleanly.
>> >> > What does "mdadm --examine" of some device show?
>> >> >
>> >> > You probably need to re-assemble the array with --force like it suggests,
>> >> > then add the failed device and let it recover.
>> >> >
>> >> > NeilBrown
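[Pulling the advice in this thread together, a minimal sketch of the recovery sequence, assuming the array and member names shown in the logs above (paths will differ on other systems) and that the stale member is /dev/md101:

    # stop whatever partial assembly is currently sitting around
    mdadm -S /dev/md1

    # assemble from the five up-to-date members, leaving out the stale md101
    mdadm -A /dev/md1 /dev/sdg1 /dev/sdi1 /dev/sdh1 /dev/sde1 /dev/sdk1

    # re-add the stale member; because the array has an internal write-intent
    # bitmap, this should trigger a quick bitmap-based resync rather than a
    # full rebuild
    mdadm /dev/md1 --re-add /dev/md101

    # watch the resync progress
    cat /proc/mdstat

If assembly still refuses because the array is considered not clean, the --force route suggested in the first reply above is the fallback.]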