Re: [PATCH 3/3] MD: hold mddev lock for md-cluster receive thread

Guoqing Jiang <gqjiang@xxxxxxxx> · Tue, 2 Aug 2016 17:52:41 +0800

On 08/02/2016 05:45 AM, Shaohua Li wrote:
On Mon, Aug 01, 2016 at 04:38:59PM +0800, Guoqing Jiang wrote:
Hi,

On 07/31/2016 07:54 AM, shli@xxxxxxxxxx wrote:
From: Shaohua Li <shli@xxxxxx>

md-cluster receive thread calls .quiesce too, let it hold mddev lock.
I'd suggest hold on for the patchset, I can find lock problem easily with
the patchset applied. Take a resyncing  clusteed raid1 as example.

md127_raid1 thread held reconfig_mutex then update sb, so it needs dlm
token lock. Meanwhile md127_resync thread got token lock and wants
EX on ack lock but recv_daemon can't release ack lock since recv_daemon
doesn't get reconfig_mutex.
Thansk, I'll drop this one. Other two patches are still safe for md-cluster,
right?

From the latest test, I can't find  lock issues with the first two patches,
but I doubt it would have side effect for the performance of resync.

I really hope to have consistent locking for .quiesce. For the
process_recvd_msg, I'm wondering what's protecting the datas? for example,
md-cluster uses md_find_rdev_nr_rcu, which access the disks list without
locking. Is there a race?

Yes, it should be protected by rcu lock, I will post a patch for it, thanks
for reminder.

Does it work if we move the mddev lock to
process_recvd_msg?

I tried that, but It still have lock issue, eg, when node B and C have 
status
as "resync=PENDING", then try to stop the resyncing array in node A.

Thanks,
Guoqing
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html