Re: MD: Long delay for container drive removal

On 18.06.2024 16:24, Mateusz Kusiak wrote:
Hi all,
we have an issue submitted for SLES15SP6 that is caused by long delays when trying to remove a drive from a container.

The scenario is as follows:
1. Create a two-drive IMSM container
# mdadm --create --run /dev/md/imsm --metadata=imsm --raid-devices=2 /dev/nvme[0-1]n1
2. Remove a single drive from the container
# mdadm /dev/md127 --remove /dev/nvme0n1

The problem is that drive removal may take up to 7 seconds, which causes timeouts in other components that depend on mdadm.

We narrowed it down to being MD related. We tested this on SP6 with both the inbox mdadm-4.3 and mdadm-4.2, and the delay is about the same in both cases. SP5 is free of this issue.

I also tried RHEL 8.9 and drive removal is almost instant.

Is this the default behavior now, or should we treat it as an issue?

Thanks,
Mateusz


I dug into this more. I retested this on:
- Ubuntu 24.04 with inbox kernel 6.6.0: no reproduction
- RHEL 9.4 with upstream kernel 6.9.5-1: reproduced
(Note that SLES15SP6 comes with 6.8.0-rc4 inbox)

I attached gdb to mdadm and found that the ioctl call in hot_remove_disk() fails, and that is what causes the delay. The function looks as follows:

int hot_remove_disk(int mdfd, unsigned long dev, int force)
{
	int cnt = force ? 500 : 5;
	int ret;

	/* HOT_REMOVE_DISK can fail with EBUSY if there are
	 * outstanding IO requests to the device.
	 * In this case, it can be helpful to wait a little while,
	 * up to 5 seconds if 'force' is set, or 50 msec if not.
	 */
	while ((ret = ioctl(mdfd, HOT_REMOVE_DISK, dev)) == -1 &&
	       errno == EBUSY &&
	       cnt-- > 0)
		sleep_for(0, MSEC_TO_NSEC(10), true);

	return ret;
}
If the ioctl keeps failing, mdadm falls back to removing the drive via a sysfs call.
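
To put numbers on the retry budget of the function above, here is a minimal standalone sketch of the same loop shape. It is not mdadm code: fake_hot_remove() is a hypothetical stand-in for ioctl(mdfd, HOT_REMOVE_DISK, dev) that always fails with EBUSY, and count_retry_sleeps() and worst_case_sleep_msec() are names made up for illustration. It only shows how many 10 ms sleeps the loop performs when the kernel never stops returning EBUSY.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical stand-in for ioctl(mdfd, HOT_REMOVE_DISK, dev):
 * always fails with EBUSY, i.e. the worst case for the retry loop.
 */
static int fake_hot_remove(void)
{
	errno = EBUSY;
	return -1;
}

/* Same loop shape as the quoted hot_remove_disk(). Returns the number
 * of sleeps performed. sleep_nsec is a parameter so the loop can be
 * exercised without actually waiting; the quoted code sleeps 10 msec.
 */
static int count_retry_sleeps(int force, long sleep_nsec)
{
	int cnt = force ? 500 : 5;
	int sleeps = 0;
	struct timespec ts = { 0, sleep_nsec };

	assert(sleep_nsec >= 0 && sleep_nsec < 1000000000L);

	while (fake_hot_remove() == -1 &&
	       errno == EBUSY &&
	       cnt-- > 0) {
		nanosleep(&ts, NULL);
		sleeps++;
	}

	return sleeps;
}

/* Time spent sleeping in the worst case, assuming 10 msec per retry.
 * Time spent inside each failing ioctl is not included.
 */
static long worst_case_sleep_msec(int force)
{
	return (force ? 500 : 5) * 10L;
}
```

With the quoted constants this gives 500 sleeps (5000 msec) when 'force' is set and 5 sleeps (50 msec) otherwise, which matches the comment in the function. Any time spent inside the failing ioctl itself comes on top of that.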

Looks like a kernel ioctl issue...
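
For reference, the sysfs path mentioned above can be exercised directly, without going through the ioctl. The following is a rough sketch and not the mdadm implementation: it assumes the documented md sysfs layout, where writing "remove" to /sys/block/<md>/md/dev-<member>/state detaches a member device. The helpers md_state_path() and sysfs_remove_member() are hypothetical names introduced here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build the path of a member's 'state' attribute, e.g.
 * /sys/block/md127/md/dev-nvme0n1/state.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int md_state_path(char *buf, size_t len,
			 const char *md, const char *member)
{
	int n;

	assert(buf != NULL && md != NULL && member != NULL);

	n = snprintf(buf, len, "/sys/block/%s/md/dev-%s/state", md, member);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "remove" to the member's state attribute.
 * Returns 0 on success, -1 on any failure (errno is set by open/write).
 */
static int sysfs_remove_member(const char *md, const char *member)
{
	char path[256];
	ssize_t written;
	int fd;

	if (md_state_path(path, sizeof(path), md, member) < 0)
		return -1;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	written = write(fd, "remove", 6);
	close(fd);

	return written == 6 ? 0 : -1;
}
```

For the scenario in this thread, a call such as sysfs_remove_member("md127", "nvme0n1") run as root would be the rough equivalent of the fallback. Timing it against the ioctl path could help confirm whether only the ioctl is affected.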

Thanks,
Mateusz



