mdadm garbage files remain in container environment

Hello all.

I have an issue constructing software RAID with mdadm in a container
environment. When I create a RAID-1 device through mdadm and then stop
it inside a privileged container, garbage files remain under the /dev
and /sys paths. This causes a problem when I restart the same RAID
device.

The commands are as follows.

$ docker run --privileged --cap-add=ALL -it --name testvol -v
/dev:/dev:rshared -v /sys:/sys:rshared -v /lib:/lib:rshared centos:7
/bin/bash

$ capsh --print | grep "Current:" | cut -d' ' -f3
cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,35,36,37+ep
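
To double-check the mount setup from inside the container, the
propagation of the bind-mounted paths can be inspected with findmnt (a
minimal sanity check; it only assumes findmnt from util-linux is
available in the image):

$ findmnt -o TARGET,PROPAGATION /dev   # PROPAGATION should read "shared"
$ findmnt -o TARGET,PROPAGATION /sys   # likewise for /sys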

$ yum install -y nvme-cli nvmetcli mdadm
…

$ nvme connect --transport tcp --traddr 1.2.3.4 --trsvcid 4420 ...

$ nvme connect --transport tcp --traddr 1.3.5.7 --trsvcid 4420 ...
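
The connections can also be confirmed at the NVMe level (assuming the
nvme-cli package installed above):

$ nvme list   # should list both connected namespaces, nvme0n1 and nvme1n1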

$ lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0    0   11G  0 disk
nvme1n1 259:1    0   11G  0 disk
…

$ /sbin/mdadm --create /dev/md/testvol --assume-clean --failfast
--bitmap=internal --level=1 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
Continue creating array? yes
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md/testvol started.
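
The kernel's own view of the new array can be checked through the
usual md interfaces (nothing container-specific here):

$ cat /proc/mdstat                       # md127 should appear as an active raid1
$ /sbin/mdadm --detail /dev/md/testvol   # state, member devices, and array UUID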

$ lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
nvme0n1 259:0    0   11G  0 disk
`-md127   9:127  0   11G  0 raid1
nvme1n1 259:1    0   11G  0 disk
`-md127   9:127  0   11G  0 raid1
…

$ ls /dev/md*
/dev/md127

/dev/md:
testvol


After creating the array, I stopped it, and I found garbage files
still remaining, as shown below.

$ /sbin/mdadm --manage /dev/md/testvol --stop
mdadm: stopped /dev/md/testvol

$ lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0    0   11G  0 disk
nvme1n1 259:1    0   11G  0 disk
…

$ ls /dev/md*
/dev/md127

/dev/md:
testvol

## All remaining garbage files
$ find / -name md127
/dev/md127
/sys/class/block/md127
/sys/devices/virtual/block/md127
/sys/block/md127
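
In case it helps the analysis, these are the probes I would use to
inspect what is left behind (standard tools only; what they report in
this broken state is exactly what is unclear to me):

$ cat /proc/mdstat                                   # does the kernel still list md127?
$ udevadm info --query=all --path=/sys/block/md127   # what does the udev database record?
$ lsof /dev/md127                                    # is any process holding the node open?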


When I ran the same commands on the host, there were no garbage files.
There are some differences in the kernel events between the host and
the privileged container.

First, below are the kernel events when I run mdadm --stop on the
host. For readability, I include only the first line of each event and
omit the properties.

$ udevadm monitor -p
KERNEL[1245.708850] remove   /devices/virtual/bdi/9:127 (bdi)
KERNEL[1245.709330] remove   /devices/virtual/block/md127 (block)
UDEV  [1245.710526] remove   /devices/virtual/bdi/9:127 (bdi)
UDEV  [1245.716664] remove   /devices/virtual/block/md127 (block)


However, in the privileged container, the mdadm --stop command
triggers "add" kernel events in addition to the "remove" events.

$ udevadm monitor -p
KERNEL[92803.756048] remove   /devices/virtual/bdi/9:127 (bdi)
KERNEL[92803.756077] remove   /devices/virtual/block/md127 (block)
UDEV  [92803.757334] remove   /devices/virtual/bdi/9:127 (bdi)
KERNEL[92803.762387] add      /devices/virtual/bdi/9:127 (bdi)
KERNEL[92803.762497] add      /devices/virtual/block/md127 (block)
UDEV  [92803.762865] add      /devices/virtual/bdi/9:127 (bdi)
UDEV  [92803.764697] remove   /devices/virtual/block/md127 (block)
UDEV  [92803.768186] add      /devices/virtual/block/md127 (block)

This may be the direct cause of the garbage files, but I don't
understand why "add" kernel events occur when I stop the RAID device
in the container. (Interestingly, the privileged container does not
produce "add" kernel events when the mdadm --create command runs. If
it would help the analysis, I will share that trace too.)
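
One workaround that might be worth trying is to replay the missed
remove event by hand after stopping the array (a sketch only; it
assumes the leftover md127 is an inactive device that simply lost its
remove uevent):

$ /sbin/mdadm --stop /dev/md127 2>/dev/null               # stop again in case udev re-assembled it
$ udevadm trigger --action=remove --sysname-match=md127   # re-emit the remove uevent
$ udevadm settle                                          # wait for udev to finish processing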

(JFI)
For bidirectional mount propagation between the host and the container
on the bind-mounted paths, I added the shared flag to the docker
daemon's MountFlags parameter.

$ cat /etc/systemd/system/multi-user.target.wants/docker.service
…
[Service]
…
MountFlags=shared
…
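
For completeness, the same setting can be applied through a systemd
drop-in instead of editing the unit file directly (the drop-in file
name below is arbitrary):

$ mkdir -p /etc/systemd/system/docker.service.d
$ cat > /etc/systemd/system/docker.service.d/mountflags.conf <<'EOF'
[Service]
MountFlags=shared
EOF
$ systemctl daemon-reload && systemctl restart docker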


Please give me a hand.

Best regards,
Taeuk Kim



