Greetings,

I am looking into a scenario in which an md raid5/6 array is resyncing (e.g., after a fresh creation) and a drive fails. As written in Neil's blog entry "Closing the RAID5 write hole" (http://neil.brown.name/blog/20110614101708): "if a device fails during the resync, md doesn't take special action - it just allows the array to be used without a resync even though there could be corrupt data". However, I noticed that at this point sb->resync_offset in the superblock is not set to MaxSector.

If a drive is then added/re-added to the array, drive recovery starts, i.e., md assumes that the data/parity on the surviving drives is correct and uses it to rebuild the new drive. Shouldn't this state of the data/parity being treated as correct be reflected as sb->resync_offset == MaxSector?

One issue that I ran into is the following. I reached a situation in which, during array assembly, sb->resync_offset == sb->size. At this point the following code in mdadm assumes that the array is clean:

	info->array.state =
		(__le64_to_cpu(sb->resync_offset) >= __le64_to_cpu(sb->size)) ? 1 : 0;

As a result, mdadm lets the assembly proceed to the kernel, but in the kernel the following code refuses to start the array:

	if (mddev->degraded > dirty_parity_disks &&
	    mddev->recovery_cp != MaxSector) {

At this point, specifying --force to mdadm --assemble doesn't help, because mdadm thinks that the array is clean (clean==1) and therefore doesn't do the "force-array" update, which would reset the sb->resync_offset value. So there is no way to start the array, other than setting the start_dirty_degraded=1 kernel parameter.

So one question is: should mdadm compare sb->resync_offset to MaxSector rather than to sb->size? In the kernel code, resync_offset is always compared to MaxSector.

Another question is: should the kernel set sb->resync_offset to MaxSector as soon as it starts rebuilding a drive? I think this would be consistent with what Neil wrote in the blog entry.

Here is the scenario to reproduce the issue I described:

# Create a raid6 array with 4 drives A, B, C, D. The array starts resyncing.
# Fail drive D. The array aborts the resync and then immediately restarts it (it seems to checkpoint mddev->recovery_cp, but I am not sure that it restarts from that checkpoint).
# Re-add drive D to the array. It is added as a spare, and the array continues resyncing.
# Fail drive C. The array aborts the resync and then starts rebuilding drive D. At this point sb->resync_offset is some valid value (usually 0, not MaxSector and not sb->size).
# Stop the array. At this point sb->resync_offset is sb->size in all the superblocks.

Another question I have: when exactly does md decide to update sb->resync_offset in the superblock? I am playing with similar scenarios with raid5, and sometimes I end up with MaxSector and sometimes with valid values. From the code, it looks like only this logic updates it:

	if (mddev->in_sync)
		sb->resync_offset = cpu_to_le64(mddev->recovery_cp);
	else
		sb->resync_offset = cpu_to_le64(0);

except for resizing and setting through sysfs. But I don't understand how this value should be managed in general.

Thanks!
Alex.
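
P.S. To spell out the mismatch between the two tests above, here is a trivial standalone illustration (not md/mdadm code, just the two comparisons side by side; the numbers are made up):

	#include <stdio.h>
	#include <stdint.h>

	#define MaxSector (~0ULL)	/* "resync finished" marker, as in md/mdadm */

	int main(void)
	{
		/* hypothetical values matching the state I end up with in the
		 * scenario above: resync_offset equals sb->size, not MaxSector */
		uint64_t size = 1048576;	/* sb->size, in sectors */
		uint64_t resync_offset = 1048576;
		int degraded = 1;
		int dirty_parity_disks = 0;

		/* mdadm's test quoted above: the array looks clean */
		int clean = (resync_offset >= size) ? 1 : 0;

		/* the kernel's test: degraded + not fully resynced => refuse to start */
		int refused = (degraded > dirty_parity_disks &&
			       resync_offset != MaxSector);

		printf("mdadm says clean=%d, kernel refuses to start=%d\n",
		       clean, refused);
		return 0;
	}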
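
For the first question, this is roughly the kind of mdadm change I have in mind -- only a sketch, untested, and assuming MaxSector here is the usual ~0ULL constant from mdadm.h:

	/* sketch, untested: treat the array as clean only when the resync
	 * really completed, i.e. resync_offset == MaxSector */
	info->array.state =
		(__le64_to_cpu(sb->resync_offset) == MaxSector) ? 1 : 0;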
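
For the second question, this is the sort of thing I was imagining on the kernel side -- again only an illustration of the idea, not a tested patch, and I am not sure where exactly it would belong (perhaps wherever md commits to a recovery rather than a resync):

	/* illustration only: once we commit to rebuilding a spare from the
	 * surviving data/parity, we are implicitly trusting the existing
	 * stripes, so record that in recovery_cp as well */
	if (test_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
	    !test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
		mddev->recovery_cp = MaxSector;
		set_bit(MD_CHANGE_DEVS, &mddev->flags);
	}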