Re: The dev node can't be released at once after stopping raid

On Wed, Aug 30 2017, Xiao Ni wrote:

> Hi Neil
>
> I have searched the list archives and there are many threads like this. Sorry for raising
> this again, but the situation I encountered looks different. There is a 1 second
> window between stopping the raid device and deleting the node /dev/md0. The /dev/md0 node can be
> removed successfully after 1 second. 

I think you are saying that /dev/md0 gets deleted 1 second after the
device is stopped.  I assume that is a delay in udev processing of
events.

When you say "can be"  I assume you mean "is being".
ie. if you say
   "The node can be removed after 1 second", it seems to imply that if
   you try to remove it earlier, the unlink() will fail.
If you say
  "The node is being removed after 1 second", that suggests that the
  removal happens automatically, but there is a delay between the device
  stopping and the removal happening.
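One way to tell the two cases apart is to time how long the node lingers
without touching it.  The following is only a minimal sketch: the helper
name wait_until_gone and its timeout/interval defaults are made up for
illustration, and it merely polls the path; it does not call unlink().

```python
import os
import time


def wait_until_gone(path, timeout=5.0, interval=0.01):
    """Poll until `path` no longer exists.

    Returns the number of seconds waited, or None if the path is still
    present once `timeout` seconds have passed.  (Hypothetical helper,
    not part of mdadm or udev.)
    """
    start = time.monotonic()
    while True:
        # lexists() also reports dangling symlinks, so a leftover link
        # still counts as "the node is there".
        if not os.path.lexists(path):
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)
```

Calling wait_until_gone("/dev/md0") immediately after "mdadm -S /dev/md0"
returns would report roughly how long the automatic removal takes.  A
result near zero would mean the node was already gone; None would mean it
was still present after the timeout.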

>
> No process opens /dev/md0 after mdadm -S /dev/md0:
>
> mdadm -CR /dev/md0 -l1 -n2 /dev/loop0 /dev/loop1 --assume-clean
> dmesg:
> [36416.860525] Opened by mdadm, pid is 3523
> [36416.984160] md/raid1:md0: active with 2 out of 2 mirrors
> [36416.984181] md0: detected capacity change from 0 to 523239424
> [36416.984219] Released by mdadm, pid is 3523
> [36416.984228] remove_and_add_spares
> [36416.991588] Opened by mdadm, pid is 3541
> [36416.997183] Released by mdadm, pid is 3541
> [36417.001376] Opened by systemd-udevd, pid is 3525
> [36417.007128] Released by systemd-udevd, pid is 3525
>
> udev:
> KERNEL[36419.830817] add      /devices/virtual/bdi/9:0 (bdi)
> KERNEL[36419.831045] add      /devices/virtual/block/md0 (block)
> UDEV  [36419.832911] add      /devices/virtual/bdi/9:0 (bdi)
> UDEV  [36419.836380] add      /devices/virtual/block/md0 (block)
> KERNEL[36419.877705] change   /devices/virtual/block/loop0 (block)
> KERNEL[36419.878057] change   /devices/virtual/block/loop0 (block)
> KERNEL[36419.926761] change   /devices/virtual/block/loop1 (block)
> KERNEL[36419.927015] change   /devices/virtual/block/loop1 (block)
> UDEV  [36419.953112] change   /devices/virtual/block/loop0 (block)
> UDEV  [36419.953141] change   /devices/virtual/block/loop1 (block)
> KERNEL[36419.954765] change   /devices/virtual/block/md0 (block)
> UDEV  [36419.955973] change   /devices/virtual/block/loop0 (block)
> UDEV  [36419.962799] change   /devices/virtual/block/loop1 (block)
> UDEV  [36419.982934] change   /devices/virtual/block/md0 (block)
>
> mdadm -S /dev/md0
> dmesg:
> [36493.068054] Opened by mdadm, pid is 3552
> [36493.072051] Released by mdadm, pid is 3552
> [36493.076123] Opened by mdadm, pid is 3552
> [36493.080073] md0: detected capacity change from 523239424 to 0
> [36493.080077] md: md0 stopped.
> [36493.273011] Released by mdadm, pid is 3552
> udev:
> KERNEL[36496.300219] remove   /devices/virtual/bdi/9:0 (bdi)
> KERNEL[36496.300335] remove   /devices/virtual/block/md0 (block)
> UDEV  [36496.300736] remove   /devices/virtual/bdi/9:0 (bdi)
> UDEV  [36496.301812] remove   /devices/virtual/block/md0 (block)

I don't see any 1 second delay here.
I can see a 3 second delay between "Released by mdadm, pid is 3552" and
the UDEV remove event.  Is that what you are referring to?
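The 3 second figure comes from subtracting the two bracketed timestamps
quoted above.  A small sketch of that arithmetic follows; it assumes the
dmesg and udev timestamps come from the same clock, which nothing in this
thread confirms.  The stamp() helper is made up for illustration.

```python
import re

# Matches the bracketed seconds value, e.g. "[36493.273011]".
STAMP = re.compile(r"\[\s*(\d+\.\d+)\]")


def stamp(line):
    """Return the bracketed timestamp in `line` as a float (hypothetical helper)."""
    match = STAMP.search(line)
    if match is None:
        raise ValueError("no timestamp in line: %r" % line)
    return float(match.group(1))


# Both lines are copied from the logs quoted in this mail.
released = "[36493.273011] Released by mdadm, pid is 3552"
removed = "UDEV  [36496.301812] remove   /devices/virtual/block/md0 (block)"

# Assumption: both timestamps are measured against the same clock.
delay = stamp(removed) - stamp(released)
print("%.3f s" % delay)  # prints "3.029 s"
```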

>
> There are only REMOVE events during command mdadm -S /dev/md0.

The remove events seem to happen *after* "mdadm -S /dev/md0", or did
"mdadm -S /dev/md0" take 3 seconds to run?

>
> I tried creating an LVM logical volume and removing it to check whether LVM has this problem. 
>
> pvcreate /dev/md0 
> vgcreate vg /dev/md0 
> lvcreate -L 100M -n test vg
> lvremove vg/test -y
> ls /dev/mapper/vg-test
> ls /dev/dm-3
>
> The node /dev/mapper/vg-test and /dev/dm-3 can be removed in time. There is no time
> window. So it looks like it's a problem of md. Could you give some suggestions about
> this? What should I do next? 

Maybe lvremove explicitly unlinks the files in /dev; I don't know.

>
> If it's not a bug, why there is a 1 second window?

As I said, probably because udev is slow.
Why do you think this is a problem?  Why do you care about a 1 second
window?  If I don't know why this matters, I cannot help you.

NeilBrown
