Hello Neil,

>> However, at least for 1.2 arrays, I believe this is too restrictive,
>> don't you think? If the raid slot (not desc_nr) of the device being
>> re-added is *not occupied* yet, can't we just select a free desc_nr
>> for the new disk on that path?
>> Or perhaps, mdadm on the re-add path can select a free desc_nr
>> (disc.number) for it (just as it does for --add), after ensuring that
>> the slot is not occupied yet? Where is it better to do it?
>> Otherwise, the re-add fails, while it could perfectly well succeed (it
>> only needs to pick a different desc_nr).
>
> I think I see what you are saying.
> However, my question is: is this really an issue?
> Is there a credible sequence of events that results in the current code
> making an undesirable decision? Of course, I do not count deliberately
> editing the metadata as part of a credible sequence of events.

Consider this scenario, in which the code refuses to re-add a drive:

Step 1:
- I created a raid1 array with 3 drives: A, B, C (desc_nr=0,1,2).
- I failed drives B and C, removed them from the array, and totally
  forgot about them for the rest of the scenario.
- I added two new drives to the array, D and E, and waited for the
  resync to complete.

The array now has the following structure:
A: desc_nr=0
D: desc_nr=3 (selected during the "add" path in mdadm, as expected)
E: desc_nr=4 (selected during the "add" path in mdadm, as expected)

Step 2:
- I failed drives D and E and removed them from the array. Drive E is
  not used for the rest of the scenario, so we can forget about it.
- I wrote some data to the array. At this point the array bitmap is
  dirty and will not be cleared, since the array is degraded.

Step 3:
- I added one new drive (the last one, I promise!) to the array -
  drive F - and waited for it to resync.

The array now has the following structure:
A: desc_nr=0
F: desc_nr=3

So F took over D's desc_nr (3). This is expected according to the
mdadm code.

Event counters at this point:
A and F: events=149, events_cleared=0
D: events=109

Step 4:
At this point, mdadm refuses to re-add drive D to the array, because
its desc_nr is already taken (I verified that via gdb).

On the other hand, if we had simply picked a fresh desc_nr for D, I
believe it could be re-added (see the sketches appended below),
because:
- slots are not important for raid1 (D's slot was actually taken by F);
- it should pass the check for a bitmap-based resync (events in D's sb
  >= events_cleared of the array).

Do you agree with this, or perhaps I missed something?

Additional notes:
- Of course, such a scenario is relevant only for arrays with more than
  a single disk of redundancy, so it is not relevant for raid5.
- To simulate such a scenario for raid6, at step 3 we need to add the
  new drive to a slot other than the slot of the drive we are going to
  re-add in step 4 (otherwise it takes D's slot, and then we really
  cannot re-add). This can be done as we discussed earlier.

What do you think?

Thanks,
Alex.
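P.S. For illustration only, here is a minimal sketch (my own code, not
mdadm's; MAX_PROBE is just a placeholder for the real per-metadata
device limit, something like tst->max_devs in mdadm) of how the re-add
path could pick a free desc_nr by probing the array with GET_DISK_INFO,
similar in spirit to what the --add path already does:

#include <string.h>
#include <sys/ioctl.h>
#include <linux/raid/md_u.h>

/* Placeholder bound; the real one would come from the metadata handler. */
#define MAX_PROBE 384

/* Return the first desc_nr whose table entry is unused, or -1. */
static int find_free_desc_nr(int mdfd)
{
        mdu_disk_info_t disc;
        int d;

        for (d = 0; d < MAX_PROBE; d++) {
                memset(&disc, 0, sizeof(disc));
                disc.number = d;
                if (ioctl(mdfd, GET_DISK_INFO, &disc) < 0)
                        return d;       /* no entry at all -> free */
                if (disc.major == 0 && disc.minor == 0)
                        return d;       /* empty table entry -> free */
        }
        return -1;
}

In the scenario above, desc_nr 0 and 3 are held by A and F, so this
would return one of the numbers freed when B, C, D and E left the
array, and the re-add of D could then proceed with that fresh number
instead of insisting on 3.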
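And a hedged sketch of the eligibility rule mentioned in step 4 (the
function name is mine, only the comparison matters): a returning disk
can be brought back with a bitmap-based, partial resync as long as the
events counter in its superblock is not older than the point up to
which the array's bitmap has been cleared:

/* If the disk's events counter is at least the array's events_cleared,
 * every write made since the disk left is still marked dirty in the
 * bitmap, so resyncing only those blocks is enough. */
static int bitmap_readd_possible(unsigned long long dev_events,
                                 unsigned long long events_cleared)
{
        return dev_events >= events_cleared;
}

With the numbers above, D has events=109 and the array's events_cleared
is 0, so the comparison holds and only the regions dirtied since D was
removed would need to be rewritten.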