Re: iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

On Wed, Mar 14, 2018 at 12:05 PM, Michael Christie <mchristi@xxxxxxxxxx> wrote:
On 03/14/2018 01:27 PM, Michael Christie wrote:
> On 03/14/2018 01:24 PM, Maxim Patlasov wrote:
>> On Wed, Mar 14, 2018 at 11:13 AM, Jason Dillaman <jdillama@xxxxxxxxxx
>> <mailto:jdillama@xxxxxxxxxx>> wrote:
>>
>>     Maxim, can you provide steps for a reproducer?
>>
>>
>> Yes, but it involves adding two artificial delays: one in tcmu-runner
>> and another in kernel iscsi. If you're willing to take pains of
>
> Send the patches for the changes.
>
>> ...
When you send the patches that add your delays, could you also send the
target-side /var/log/tcmu-runner.log with log_level = 4.
...

Mike, please see the patches and /var/log/tcmu-runner.log attached.

The timeline was as follows:

1) 13:56:31 the client (iscsi-initiator) sends a request that leads to "Acquired exclusive lock." on the gateway.
2) 13:56:49 tcmu-runner is suspended by SIGSTOP
3) 13:56:50 the client executes:

dd of=/dev/mapper/mpatha if=/dev/zero oflag=direct bs=1536 count=1 seek=10 &
dd of=/dev/mapper/mpatha if=/dev/zero oflag=direct bs=2560 count=1 seek=10 &
dd of=/dev/mapper/mpatha if=/dev/zero oflag=direct bs=3584 count=1 seek=10

4) 13:56:51 the gateway is cut off from the client (and the neighboring gateways) by "iptables ... -j DROP"
5) 13:57:06 the client switches to another path and completes the three requests above
6) 13:57:07 the client executes (the other path is still active):

dd of=/dev/mapper/mpatha if=/dev/urandom oflag=direct bs=3584 count=1 seek=10

7) 13:57:09 tcmu-runner is resumed by SIGCONT
8) 13:57:15 the resumed tcmu-runner successfully processes the third stale request (zeros, bs=3584), overwriting the newer data.

9) Verify that the newer data was really overwritten:

# dd if=/dev/mapper/mpatha iflag=direct bs=3584 count=1 skip=10 |od -x
1+0 records in
1+0 records out
3584 bytes (3.6 kB) copied, 0.00232227 s, 1.5 MB/s
0000000 0000 0000 0000 0000 0000 0000 0000 0000
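The collision in step 8 follows from how dd computes offsets: with oflag=direct, the write lands at byte offset seek * bs, so the first two zero-writes (bs=1536 and bs=2560, seek=10) target offsets 15360 and 25600, while the stale bs=3584 zero-write and the newer urandom write both target offset 35840. A toy Python model of steps 6-8 (illustrative only, not the gateway code; the 64 KiB image size is arbitrary):

```python
import os

# Stand-in for the RBD image; real size is 2 GiB, 64 KiB is enough here.
image = bytearray(64 * 1024)

def dd_write(buf, data, bs, seek):
    """Model dd of=... oflag=direct bs=<bs> count=1 seek=<seek>:
    a single write of len(data) bytes at byte offset seek * bs."""
    off = bs * seek
    buf[off:off + len(data)] = data

# Step 6: after path failover, the client writes new data via the
# surviving gateway (stand-in for if=/dev/urandom).
newer = os.urandom(3584)
dd_write(image, newer, bs=3584, seek=10)

# Step 8: the suspended gateway resumes (SIGCONT) and completes the
# stale zero-write it accepted before being cut off -- same offset.
dd_write(image, bytes(3584), bs=3584, seek=10)

# Step 9: offset 35840 now reads back as zeros; the newer data is lost.
assert image[35840:35840 + 3584] == bytes(3584)
```

The exclusive lock changed hands between the two writes, but nothing fenced the stale gateway's in-flight request, which is the data-corruption window the delays above expose.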

Thanks,
Maxim

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index 9eb10d3..f48ee2c 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -1291,6 +1291,13 @@ int iscsit_process_scsi_cmd(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
 
 	immed_ret = iscsit_handle_immediate_data(cmd, hdr,
 					cmd->first_burst_len);
+
+	if (be32_to_cpu(hdr->data_length) == 3584) {
+		u64 end_time = ktime_get_ns() + 25ULL * 1000 * 1000 * 1000;
+		while (ktime_get_ns() < end_time)
+			schedule_timeout_uninterruptible(HZ);
+	}
+
 after_immediate_data:
 	if (immed_ret == IMMEDIATE_DATA_NORMAL_OPERATION) {
 		/*
diff --git a/tcmur_cmd_handler.c b/tcmur_cmd_handler.c
index 6d89a5f..1b77f11 100644
--- a/tcmur_cmd_handler.c
+++ b/tcmur_cmd_handler.c
@@ -2292,6 +2292,9 @@ static int tcmur_cmd_handler(struct tcmu_device *dev, struct tcmulib_cmd *cmd)
 	struct tcmur_device *rdev = tcmu_get_daemon_dev_private(dev);
 	uint8_t *cdb = cmd->cdb;
 
+	if (cmd->iovec[0].iov_len == 2560)
+		sleep(1);
+
 	track_aio_request_start(rdev);
 
 	if (tcmu_dev_in_recovery(dev)) {
2018-03-14 13:56:03.042 953 [DEBUG] main:1056: handler path: /usr/lib64/tcmu-runner
2018-03-14 13:56:03.043 953 [INFO] dyn_config_start:410: Inotify is watching "/etc/tcmu/tcmu.conf", wd: 1, mask: IN_ALL_EVENTS
2018-03-14 13:56:03.059 953 [INFO] load_our_module:537: Inserted module 'target_core_user'
2018-03-14 13:56:03.102 953 [DEBUG] main:1069: 2 runner handlers found
2018-03-14 13:56:03.104 953 [DEBUG] dbus_bus_acquired:441: bus org.kernel.TCMUService1 acquired
2018-03-14 13:56:03.105 953 [DEBUG] dbus_name_acquired:457: name org.kernel.TCMUService1 acquired
2018-03-14 13:56:05.474 953 [DEBUG] handle_netlink:207: cmd 1. Got header version 2. Supported 2.
2018-03-14 13:56:05.474 953 [DEBUG] dev_added:763 rbd/rbd.disk_1: Got block_size 512, size in bytes 2147483648
2018-03-14 13:56:05.474 953 [DEBUG] tcmu_rbd_open:739 rbd/rbd.disk_1: tcmu_rbd_open config rbd/rbd/disk_1;osd_op_timeout=30 block size 512 num lbas 4194304.
2018-03-14 13:56:05.482 953 [DEBUG] timer_check_and_set_def:378 rbd/rbd.disk_1: The cluster's default osd op timeout(0.000000), osd heartbeat grace(20) interval(6)
2018-03-14 13:56:05.505 953 [DEBUG] tcmu_rbd_detect_device_class:295 rbd/rbd.disk_1: Pool rbd using crush rule "replicated_rule"
2018-03-14 13:56:05.506 953 [DEBUG] tcmu_rbd_detect_device_class:311 rbd/rbd.disk_1: SSD not a registered device class.
2018-03-14 13:56:11.933 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:11.935 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:11.937 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:11.938 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:12.930 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:12.932 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:12.935 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:12.938 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:31.152 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:31.161 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 0
2018-03-14 13:56:31.161 953 [DEBUG] tcmu_block_device:403 rbd/rbd.disk_1: blocking kernel device
2018-03-14 13:56:31.161 953 [DEBUG] tcmu_block_device:409 rbd/rbd.disk_1: block done
2018-03-14 13:56:31.162 953 [DEBUG] tcmu_acquire_dev_lock:266 rbd/rbd.disk_1: Waiting for outstanding commands to complete
2018-03-14 13:56:31.162 953 [DEBUG] tcmu_flush_device:383 rbd/rbd.disk_1: waiting for ring to clear
2018-03-14 13:56:31.162 953 [DEBUG] tcmu_flush_device:386 rbd/rbd.disk_1: ring clear
2018-03-14 13:56:31.162 953 [DEBUG] tcmu_acquire_dev_lock:281 rbd/rbd.disk_1: lock call state 2 retries 0
2018-03-14 13:56:31.162 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:56:31.164 953 [DEBUG] tcmu_rbd_lock_break:577 rbd/rbd.disk_1: Attempting to break lock from 192.168.122.216:0/3984316315.
2018-03-14 13:56:31.379 953 [WARN] tcmu_rbd_lock:631 rbd/rbd.disk_1: Acquired exclusive lock.
2018-03-14 13:56:31.379 953 [DEBUG] tcmu_acquire_dev_lock:313 rbd/rbd.disk_1: lock call done. lock state 1
2018-03-14 13:56:31.380 953 [DEBUG] tcmu_unblock_device:421 rbd/rbd.disk_1: unblocking kernel device
2018-03-14 13:56:31.380 953 [DEBUG] tcmu_unblock_device:427 rbd/rbd.disk_1: unblock done
2018-03-14 13:56:31.383 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.408 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.410 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.412 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.414 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.415 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.416 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.417 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.418 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.419 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.421 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.422 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.424 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.425 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.426 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.427 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.427 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.428 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.429 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.430 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.431 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.432 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.433 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.433 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.433 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.433 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.434 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.434 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.436 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.436 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.436 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.437 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.438 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.438 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.439 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.439 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.439 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.439 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.440 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.440 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.440 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.442 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.442 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.442 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.442 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.443 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:31.444 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:56:32.948 953 [DEBUG] tcmu_rbd_has_lock:508 rbd/rbd.disk_1: Is owner
2018-03-14 13:56:38.750 953 [DEBUG] tcmu_rbd_has_lock:508 rbd/rbd.disk_1: Is owner
2018-03-14 13:56:38.753 953 [DEBUG] tcmu_rbd_has_lock:508 rbd/rbd.disk_1: Is owner
2018-03-14 13:56:38.755 953 [DEBUG] tcmu_rbd_has_lock:508 rbd/rbd.disk_1: Is owner
2018-03-14 13:56:38.757 953 [DEBUG] tcmu_rbd_has_lock:508 rbd/rbd.disk_1: Is owner
2018-03-14 13:56:38.759 953 [DEBUG] tcmu_rbd_has_lock:508 rbd/rbd.disk_1: Is owner
2018-03-14 13:57:09.644 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
2018-03-14 13:57:09.645 953 [WARN] tcmu_notify_lock_lost:199 rbd/rbd.disk_1: Async lock drop. Old state 1
2018-03-14 13:57:10.645 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 0
2018-03-14 13:57:10.645 953 [DEBUG] tcmu_block_device:403 rbd/rbd.disk_1: blocking kernel device
2018-03-14 13:57:10.645 953 [DEBUG] tcmu_block_device:409 rbd/rbd.disk_1: block done
2018-03-14 13:57:10.645 953 [DEBUG] tcmu_acquire_dev_lock:266 rbd/rbd.disk_1: Waiting for outstanding commands to complete
2018-03-14 13:57:10.645 953 [DEBUG] tcmu_flush_device:383 rbd/rbd.disk_1: waiting for ring to clear
2018-03-14 13:57:10.645 953 [DEBUG] tcmu_flush_device:386 rbd/rbd.disk_1: ring clear
2018-03-14 13:57:10.646 953 [DEBUG] tcmu_acquire_dev_lock:281 rbd/rbd.disk_1: lock call state 2 retries 0
2018-03-14 13:57:10.649 953 [DEBUG] tcmu_acquire_dev_lock:297 rbd/rbd.disk_1: Try to reopen device.
2018-03-14 13:57:10.649 953 [DEBUG] __tcmu_reopen_dev:52 rbd/rbd.disk_1: Waiting for outstanding commands to complete
2018-03-14 13:57:10.649 953 [DEBUG] __tcmu_reopen_dev:83 rbd/rbd.disk_1: Closing device.
2018-03-14 13:57:10.658 953 [DEBUG] __tcmu_reopen_dev:92 rbd/rbd.disk_1: Opening device.
2018-03-14 13:57:10.658 953 [DEBUG] tcmu_rbd_open:739 rbd/rbd.disk_1: tcmu_rbd_open config rbd/rbd/disk_1;osd_op_timeout=30 block size 512 num lbas 4194304.
2018-03-14 13:57:10.667 953 [DEBUG] timer_check_and_set_def:378 rbd/rbd.disk_1: The cluster's default osd op timeout(0.000000), osd heartbeat grace(20) interval(6)
2018-03-14 13:57:10.692 953 [DEBUG] tcmu_rbd_detect_device_class:295 rbd/rbd.disk_1: Pool rbd using crush rule "replicated_rule"
2018-03-14 13:57:10.693 953 [DEBUG] tcmu_rbd_detect_device_class:311 rbd/rbd.disk_1: SSD not a registered device class.
2018-03-14 13:57:10.723 953 [DEBUG] tcmu_acquire_dev_lock:281 rbd/rbd.disk_1: lock call state 2 retries 1
2018-03-14 13:57:10.723 953 [DEBUG] tcmu_rbd_has_lock:511 rbd/rbd.disk_1: Not owner
2018-03-14 13:57:10.723 953 [DEBUG] tcmu_rbd_lock_break:577 rbd/rbd.disk_1: Attempting to break lock from 192.168.122.216:0/1192112454.
2018-03-14 13:57:11.023 953 [WARN] tcmu_rbd_lock:631 rbd/rbd.disk_1: Acquired exclusive lock.
2018-03-14 13:57:11.023 953 [DEBUG] tcmu_acquire_dev_lock:313 rbd/rbd.disk_1: lock call done. lock state 1
2018-03-14 13:57:11.023 953 [DEBUG] tcmu_unblock_device:421 rbd/rbd.disk_1: unblocking kernel device
2018-03-14 13:57:11.023 953 [DEBUG] tcmu_unblock_device:427 rbd/rbd.disk_1: unblock done
2018-03-14 13:57:15.144 953 [DEBUG] alua_implicit_transition:447 rbd/rbd.disk_1: lock state 1
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
