Hi Marc,

On 10/05/2016 04:43 PM, Marc Smith wrote:
> Hi,
>
> First, I believe this issue may have been reported/solved in this thread
> ("[PATCH 3/3] MD: hold mddev lock for md-cluster receive thread"):
> http://www.spinics.net/lists/raid/msg53121.html
> But I'm not totally sure, and I'm looking for confirmation, or maybe this
> is a new one... I'm trying to hold out for Linux 4.9 in my project, and I
> am hoping to just cherry-pick any patches until then.
>
> I'm testing md-cluster with Linux 4.5.2 (yes, I know it's dated): two
> nodes connected to shared SAS storage, with DM Multipath in front of the
> individual SAS disks (two I/O modules with dual-domain SAS disks).
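Purely as an illustration (assuming the stock multipath-tools userspace and the /dev/dm-4 and /dev/dm-5 maps mentioned above), a quick way to double-check which multipath devices back the array members would be:

    # list the multipath maps and the SAS paths behind each one
    multipath -ll

    # confirm which dm-N kernel names those maps resolve to
    lsblk -o NAME,KNAME,TYPE,SIZE /dev/dm-4 /dev/dm-5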
For cluster-md there have been lots of changes since v4.5, and some fixes were only just merged during the 4.9 merge window (I personally think it should be stable now).
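For a rough picture of what changed in that range, one way (just a sketch, assuming a mainline kernel git checkout; substitute whatever tag or branch tip you are actually testing for v4.9) is to limit the log to the cluster-md sources:

    # list cluster-md related commits between the two releases
    git log --oneline v4.5..v4.9 -- drivers/md/md-cluster.c drivers/md/md-cluster.h drivers/md/bitmap.c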
> On tgtnode2 I created the array like this:
>
>   mdadm --create --verbose --run /dev/md/test4 --name=test4 --level=raid1 \
>       --raid-devices=2 --chunk=64 --bitmap=clustered /dev/dm-4 /dev/dm-5
>
> And then, without waiting for the resync to complete, on the second node
> (tgtnode1) I do this:
>
>   mdadm --assemble --scan
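Only as an illustration of how one might watch the resync while reproducing this (device names taken from the commands above), something like the following on either node shows the resync and clustered-bitmap state:

    # overall array / resync state
    cat /proc/mdstat
    mdadm --detail /dev/md/test4

    # dump the clustered write-intent bitmap of one member device
    mdadm --examine-bitmap /dev/dm-4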
I can't reproduce this with the latest code; please let me know if you still see it with v4.9 (and also send the stacks of the md-related processes that are stuck in 'D' state).
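One way to capture those stacks (only a sketch; any equivalent method works, both variants need root, and the first needs SysRq enabled) is either SysRq-w, which dumps every blocked task to the kernel log, or /proc/<pid>/stack for the specific md threads:

    # dump all tasks in uninterruptible ('D') sleep to the kernel log
    echo w > /proc/sysrq-trigger
    dmesg | tail -n 200

    # or grab the kernel stacks of the md-related 'D' state processes directly
    for pid in $(ps -eo pid=,stat=,comm= | awk '$2 ~ /^D/ && $3 ~ /md/ {print $1}'); do
        echo "=== pid $pid ($(cat /proc/$pid/comm)) ==="
        cat /proc/$pid/stack
    done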
[snip]

> So, again, this may already be fixed; I'm just looking for confirmation on
> whether the aforementioned patch/thread is related to this bug (or whether
> it is another one).
It is not the one you mentioned (that patch was not merged). There are actually lots of changes around resync; you can bisect them to find the relevant commit (a rough sketch of such a bisect follows below).

Thanks,
Guoqing
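A rough sketch of that bisect, assuming a mainline git checkout and that the fix is already in the tree being built (the tag names are only placeholders; use whichever kernel was verified as fixed):

    # reverse bisect: look for the commit that made the hang go away
    git bisect start --term-old=broken --term-new=fixed
    git bisect broken v4.5          # kernel where the hang reproduces
    git bisect fixed v4.9-rc1       # kernel where it no longer reproduces
    # build and boot each commit git suggests, rerun the test, then mark it:
    #   git bisect broken   (hang still there)
    #   git bisect fixed    (hang gone)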