Re: raid5 reshape is stuck

On 05/28/2015 02:49 PM, NeilBrown wrote:
On Thu, 28 May 2015 02:32:51 -0400 (EDT) Xiao Ni <xni@xxxxxxxxxx> wrote:


----- Original Message -----
From: "NeilBrown" <neilb@xxxxxxx>
To: "Xiao Ni" <xni@xxxxxxxxxx>
Cc: linux-raid@xxxxxxxxxxxxxxx
Sent: Thursday, May 28, 2015 6:59:58 AM
Subject: Re: raid5 reshape is stuck

On Wed, 27 May 2015 08:04:24 -0400 (EDT) Xiao Ni <xni@xxxxxxxxxx> wrote:


----- Original Message -----
From: "NeilBrown" <neilb@xxxxxxx>
To: "Xiao Ni" <xni@xxxxxxxxxx>
Cc: linux-raid@xxxxxxxxxxxxxxx
Sent: Wednesday, May 27, 2015 7:34:49 PM
Subject: Re: raid5 reshape is stuck

On Wed, 27 May 2015 07:28:04 -0400 (EDT) Xiao Ni <xni@xxxxxxxxxx> wrote:


[root@intel-waimeabay-hedt-01 mdadm]# cat /usr/lib/systemd/system/mdadm-grow-continue\@.service
#  This file is part of mdadm.
#
#  mdadm is free software; you can redistribute it and/or modify it
#  under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.

[Unit]
Description=Manage MD Reshape on /dev/%I
DefaultDependencies=no

[Service]
ExecStart=/usr/sbin/mdadm --grow --continue /dev/%I --backup-file=/root/tmp0
Please remove the --backup-file=/root/tmp0 for further testing.  The patch I
provided should make that unnecessary.

StandardInput=null
StandardOutput=null
StandardError=null
Could you try removing these - that might allow error messages to appear.
I wonder why I included them - they shouldn't be needed.

Thanks,
NeilBrown


[root@intel-waimeabay-hedt-01 mdadm]# mdadm -CR /dev/md0 -l5 -n4 /dev/loop[0-3] --assume-clean
mdadm: /dev/loop0 appears to be part of a raid array:
        level=raid5 devices=5 ctime=Wed May 27 02:45:08 2015
mdadm: /dev/loop1 appears to be part of a raid array:
        level=raid5 devices=5 ctime=Wed May 27 02:45:08 2015
mdadm: /dev/loop2 appears to be part of a raid array:
        level=raid5 devices=5 ctime=Wed May 27 02:45:08 2015
mdadm: /dev/loop3 appears to be part of a raid array:
        level=raid5 devices=5 ctime=Wed May 27 02:45:08 2015
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
[root@intel-waimeabay-hedt-01 mdadm]# mdadm /dev/md0 -a /dev/loop4
mdadm: added /dev/loop4
[root@intel-waimeabay-hedt-01 mdadm]# mdadm --grow /dev/md0 --raid-devices=5
mdadm: Need to backup 6144K of critical section..
[root@intel-waimeabay-hedt-01 mdadm]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 loop4[4] loop3[3] loop2[2] loop1[1] loop0[0]
       1532928 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
       [>....................]  reshape =  0.0% (0/510976) finish=532.2min speed=0K/sec
unused devices: <none>
[root@intel-waimeabay-hedt-01 mdadm]# cat /usr/lib/systemd/system/mdadm-grow-continue\@.service
#  This file is part of mdadm.
#
#  mdadm is free software; you can redistribute it and/or modify it
#  under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.

[Unit]
Description=Manage MD Reshape on /dev/%I
DefaultDependencies=no

[Service]
ExecStart=/usr/sbin/mdadm --grow --continue /dev/%I
#StandardInput=null
#StandardOutput=null
#StandardError=null
KillMode=none


The problem still exists, and there are messages in /var/log/messages:

May 27 08:03:29 intel-waimeabay-hedt-01 systemd: mdadm-grow-continue@md0.service: main process exited, code=exited, status=1/FAILURE
May 27 08:03:29 intel-waimeabay-hedt-01 systemd: Unit mdadm-grow-continue@md0.service entered failed state.

Does
   systemctl status -l mdadm-grow-continue@md0.service

report anything different?  That was the result I expected from removing the
Standard*=null lines.

I assume the new mdadm is installed in /usr/sbin/mdadm.

Thanks,
NeilBrown

Yes! There are some new messages:
[root@intel-waimeabay-hedt-01 ~]# systemctl status -l mdadm-grow-continue@md0.service
mdadm-grow-continue@md0.service - Manage MD Reshape on /dev/md0
    Loaded: loaded (/usr/lib/systemd/system/mdadm-grow-continue@.service; static)
    Active: failed (Result: exit-code) since Thu 2015-05-28 02:30:50 EDT; 2s ago
   Process: 26618 ExecStart=/usr/sbin/mdadm --grow --continue /dev/%I (code=exited, status=1/FAILURE)
  Main PID: 26618 (code=exited, status=1/FAILURE)

May 28 02:30:50 intel-waimeabay-hedt-01.lab.eng.rdu.redhat.com systemd[1]: Started Manage MD Reshape on /dev/md0.
May 28 02:30:50 intel-waimeabay-hedt-01.lab.eng.rdu.redhat.com mdadm[26618]: mdadm: Need to backup 6144K of critical section..
May 28 02:30:50 intel-waimeabay-hedt-01.lab.eng.rdu.redhat.com mdadm[26618]: mdadm: array: cannot open component /dev/vcs6
May 28 02:30:50 intel-waimeabay-hedt-01.lab.eng.rdu.redhat.com systemd[1]: mdadm-grow-continue@md0.service: main process exited, code=exited, status=1/FAILURE
May 28 02:30:50 intel-waimeabay-hedt-01.lab.eng.rdu.redhat.com systemd[1]: Unit mdadm-grow-continue@md0.service entered failed state.
Any idea why it cannot open it?

The message is probably coming from reshape_prepare_fdlist()
Could you get those "pr_err"s to print out errno as well?
The device really has to exist, because mdadm has managed to find that name
in /dev.  Could this be a 'selinux' related issue?  I can only think that it
might be a permission problem but root shouldn't have those.

Thanks,
NeilBrown

Sorry for the late reply. I learned a lot from this one problem. As you said, it really is a permission problem.

May 29 06:47:41 intel-waimeabay-hedt-01.lab.eng.rdu.redhat.com mdadm[28636]: mdadm: array: cannot open component /dev/vcs6
May 29 06:47:41 intel-waimeabay-hedt-01.lab.eng.rdu.redhat.com mdadm[28636]: mdadm: errno is 13, err is Permission denied
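
For reference, the extra "errno is ..." line in the log above is the kind of output you get by reporting errno alongside the failed open. A minimal sketch of that pattern follows. open_component() is a hypothetical helper written only for illustration, not the actual mdadm code in reshape_prepare_fdlist(), and the message format is merely modelled on the log line.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not mdadm code): try to open a component device
 * read-only. On failure, write a message containing the errno value and
 * its text into msg and return -1. On success, return the descriptor
 * and leave msg empty.
 */
static int open_component(const char *path, char *msg, size_t msglen)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        int saved = errno;  /* save it before another call can change it */

        snprintf(msg, msglen,
                 "cannot open component %s: errno is %d, err is %s",
                 path, saved, strerror(saved));
        return -1;
    }
    if (msglen > 0)
        msg[0] = '\0';
    return fd;
}
```

With SELinux enforcing, a denied open reports EACCES (13 on Linux), which matches the "Permission denied" seen here even though the process runs as root.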

And it really is an SELinux problem. The patch works after running "setenforce 0". The patch you gave is still needed. Is it right
to run "setenforce 0" every time? I'll read the SELinux documentation.
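
One way to confirm which mode SELinux is in before and after "setenforce 0" is to read the enforce flag that selinuxfs exposes, normally at /sys/fs/selinux/enforce. The sketch below assumes that path is passed in by the caller and uses a hypothetical helper name, read_enforce_flag(). It only reports the mode and does not change it.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read an SELinux "enforce" flag file.
 * Returns 1 (enforcing), 0 (permissive), or -1 if the file cannot be
 * opened or parsed, e.g. when selinuxfs is not mounted.
 */
static int read_enforce_flag(const char *path)
{
    FILE *f = fopen(path, "r");
    int mode = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

Calling read_enforce_flag("/sys/fs/selinux/enforce") before starting the reshape would show whether the service is about to run under enforcing mode.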

Best Regards
Xiao
