mdadm/Create wait_for_zero_forks is stuck

Hi Logan

I'm trying to fix the failures in the mdadm regression tests. The
00raid5-zero test fails intermittently, so I added some logging:

In function write_zeroes_fork:
                if (fallocate(fd, FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE,
                              offset_bytes, sz)) {
                        pr_err("zeroing %s failed: %s\n", dv->devname,
                               strerror(errno));
                        ret = 1;
                        break;
                } else
                        printf("zeroing good\n");

In function wait_for_zero_forks:
                if (fdsi.ssi_signo == SIGINT) {
                        printf("\n");
                        pr_info("Interrupting zeroing processes, please wait...\n");
                        interrupted = true;
                        break;
                } else if (fdsi.ssi_signo == SIGCHLD) {
                        printf("one child finishes, wait count %d\n",
                               wait_count);
                        if (!--wait_count)
                                break;
                }

I reproduce it with this loop:

while [ 1 ]; do
  /usr/sbin/mdadm -CfR /dev/md0 -l 5 -n3 /dev/loop0 /dev/loop1 \
    /dev/loop2 --write-zeroes --auto=yes -v
  mdadm --wait /dev/md0
  mdadm -Ss
  sleep 1
done

When it gets stuck, the output is:

zeroing good
zeroing good
zeroing good
one child finishes, wait count 3
one child finishes, wait count 2

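It looks like the parent process misses one SIGCHLD: three children
finished zeroing, but only two notifications were logged. All three
children are defunct, and the parent is still blocked in a signalfd read: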

root      174247  0.0  0.0   3628  2552 pts/0    S+   02:52   0:00  |   \_ /usr/sbin/mdadm -CfR /dev/md0 -l 5 -n3 /dev/loop0 /dev/loop1 /dev/loop2 --write-zeroes --auto=yes -v
root      174248  0.0  0.0      0     0 pts/0    Z+   02:52   0:00  |       \_ [mdadm] <defunct>
root      174249  0.0  0.0      0     0 pts/0    Z+   02:52   0:00  |       \_ [mdadm] <defunct>
root      174250  0.0  0.0      0     0 pts/0    Z+   02:52   0:00  |       \_ [mdadm] <defunct>

]# cat /proc/174247/stack
[<0>] signalfd_dequeue+0x14d/0x170
[<0>] signalfd_read_iter+0x7b/0xd0
[<0>] vfs_read+0x201/0x330
[<0>] ksys_read+0x5f/0xe0
[<0>] do_syscall_64+0x7b/0x160
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

Any ideas for this?

Best Regards
Xiao




