On Tuesday April 3, newsuser@xxxxxxxxxxxxxxx wrote:
> Hi,
>
> I've got a bugreport [0] from a user trying to use raid and uswsusp. He's
> using initramfs-tools available in debian. I'll describe the problem
> and my analysis, maybe you can comment on what you think. A warning: I only
> have a casual understanding of raid, never looked at any code related to it.
>
> This is a setup where root maybe on raid, but swap isn't. Swap on raid
> will be very difficult to support, I think.

Nah... shouldn't be a problem.... well, maybe raid5.

>
> When s2disk is started, nothing special is done to the array. It may be
> in an unclean state (just like filesystems). Image is written to disk.
>
> After the power cycle the kernel boots, devices are discovered, among
> which the ones holding raid. Then we try to find the device that holds
> swap in case of resume and / in case of a normal boot.
>
> Now comes a crucial point. The script that finds the raid array, finds
> the array in an unclean state and starts syncing.

Uhm, so you are finding the device for the root filesystem before you
have decided which case it will be (resume or normal boot). Can that
be delayed until after the decision? It's probably not important but
it seems neater. Or do you need the root device even when resuming?
(I guess you would if swap is in a file on the root filesystem....)

The trick is to use the 'start_ro' module parameter.

    echo 1 > /sys/module/md_mod/parameters/start_ro

Then md will start arrays assuming read-only. No resync will be
started, no superblock will be written. They stay this way until the
first write, at which point they become normal read-write and any
required resync starts.

So you can start arrays 'readonly', and resume off a raid1 without any
risk of the resync starting when it shouldn't.

It is probably best to 'echo 0 > ....' once you have committed to a
normal boot, but it isn't really critical.
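
As a rough sketch of how an initramfs script might wrap the start_ro
handling described above: the function name set_md_start_ro and the
MD_START_RO override are hypothetical, made up for illustration only,
and are not part of initramfs-tools or mdadm. Only the sysfs path comes
from the text above.

```shell
#!/bin/sh
# Sketch only: set_md_start_ro and MD_START_RO are hypothetical names.

# Location of the md module parameter. It can be overridden so the logic
# can be tried outside a real initramfs.
MD_START_RO="${MD_START_RO:-/sys/module/md_mod/parameters/start_ro}"

# set_md_start_ro VALUE
# Write 0 or 1 to the start_ro parameter; fail loudly on anything else.
set_md_start_ro() {
    case "$1" in
        0|1) ;;
        *)
            echo "set_md_start_ro: expected 0 or 1, got '$1'" >&2
            return 2
            ;;
    esac
    if [ ! -w "$MD_START_RO" ]; then
        echo "set_md_start_ro: $MD_START_RO is not writable (is md_mod loaded?)" >&2
        return 1
    fi
    echo "$1" > "$MD_START_RO"
}

# Assumed ordering in the boot script (my reading, not confirmed by the thread):
#   set_md_start_ro 1    # before any array is assembled
#   ...assemble arrays, look for a suspend image, attempt the resume...
#   set_md_start_ro 0    # only once committed to a normal boot
```

The point of the ordering is that the parameter has to be set before
the arrays are started; setting it afterwards would not undo a resync
that has already begun.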
>
> The debian-maintainer of mdadm thinks that the suspend process should
> have left the array in a clean state, but this is IMHO impossible.

It probably would be best if suspend left the array in a clean state.
It shouldn't be too hard, but it needs to be done in the kernel.
However it isn't critical to all of this working well.

I mentioned above that if swap is on raid5 it might be awkward. This
is because raid5 caches some data that is on disk. If you snapshot
the raid5 memory, then resume raid5 so it can write to disk, when you
come back from suspend you could have old data in the cache. It
should be possible to fix this, but it is currently a potential
problem that might be worth warning people against.

NeilBrown