Re: Need some information and help on mdadm in order to support it on IBM z Systems

Hello Bill,

the scenario actually involves simulating a hardware connection issue for 
a few seconds and then bringing the device back online. But once the 
hardware is back online, the device still does not come back into the 
array and remains marked "faulty spare". Moreover, if you then reboot, the 
mirror comes up and you can mount it, but it is degraded and my "faulty 
spare" is now removed:

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       17        1      active sync   /dev/sdb1

Is there a way, maybe using a udev rule, to mark the device clean so it 
can be re-added to the array automatically?

Best regards / Mit freundlichen Gruessen / Cordialement / Cordiali Saluti 

Jean-Baptiste Joret - Linux on System Z 
Phone: +49 7031 16-3278 / ITN: 39203278 - eMail: joret@xxxxxxxxxx 

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführung: Herbert Kircher
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



From: Bill Davidsen <davidsen@xxxxxxx>
To: Jean-Baptiste Joret/Germany/IBM@IBMDE, Linux RAID <linux-raid@xxxxxxxxxxxxxxx>
Date: 15.04.2008 20:50
Subject: Re: Need some information and help on mdadm in order to support it on IBM z Systems



I have added the list back into the addresses, you can use "reply all" to 
keep the discussion where folks can easily contribute.

Jean-Baptiste Joret wrote: 
Hi Bill,

I have created the array with "mdadm --create /dev/md0 --level=1 
--raid-devices=2 /dev/dasd[ef]1 --metadata=1.2 --bitmap=internal", using, 
as you can see, version 1.2 of the metadata format. The kernel is the SuSE 
standard kernel 2.6.16.60-0.9-default on s390x (SLES 10 SP2 RC1). I have 
this issue with RC2 too.

What I would like to have is more documentation about the metadata and how 
it is used, if you have it or know someone who can provide it.

 
The best (only) description of the metadata is in the md portion of the 
kernel or in the mdadm source code. I am guessing that there is a fix for 
your problem in more recent kernels, since a similar thing was mentioned 
on the mailing list recently. Older versions of the kernel require some 
event to start the rebuild, at which point the spare will be put back into 
the array. Unfortunately I didn't find it quickly, although memory tells 
me that it has been fixed in the latest kernel.

I think you need to look carefully at any hardware or connection issues 
which cause the device to drop out of the array in the first place. The 
fact that it comes in as a faulty spare indicates a problem, but I don't 
quite see what that is. Your remove and reinsert will get it going again. 
Is it possible that the device is not ready at boot time for some reason?

There may be log messages from the time when the drive was kicked from 
that array which will tell you more.
Thank you in advance.




From: Bill Davidsen <davidsen@xxxxxxx>
To: Jean-Baptiste Joret/Germany/IBM@IBMDE
Cc: linux-raid@xxxxxxxxxxxxxxx
Date: 11.04.2008 16:35
Subject: Re: Need some information and help on mdadm in order to support it on IBM z Systems



Jean-Baptiste Joret wrote:
 
Hello,

I am trying to obtain information, such as a design document or anything 
that would describe the content of the metadata. I am evaluating the 
solution to determine whether it is enterprise-ready for use as a mirror 
solution and whether we can support it at IBM. 

Also, I am currently having quite a show-stopper issue, where help would 
be appreciated. I have a RAID1 with 2 hard disks. When I remove one hard 
disk (I put the chpids offline, which is equivalent to telling the system 
that the drive is currently not available), the missing disk is marked as 
"faulty spare" when calling mdadm -D /dev/md0. 

/dev/md0:
        Version : 01.02.03
  Creation Time : Fri Apr 11 11:11:59 2008
     Raid Level : raid1
     Array Size : 2403972 (2.29 GiB 2.46 GB)
  Used Dev Size : 2403972 (2.29 GiB 2.46 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Fri Apr 11 11:23:04 2008
          State : active, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0

           Name : 0
           UUID : 9a0a6e30:4b8bbe7f:bc0cad81:9fd46804
         Events : 8

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1      94       21        1      active sync   /dev/dasdf1

       0      94       17        -      faulty spare   /dev/dasde1
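
The device table at the end of this output is the part that identifies 
the failed member. As a rough sketch only (this is not part of mdadm), 
the table could be parsed as shown below. The regular expression and the 
function names parse_devices and faulty_devices are assumptions made for 
illustration, based solely on the output format shown here; other mdadm 
versions may print the table differently.

```python
import re

# One row of the device table printed by `mdadm -D`, for example:
#    0      94       17        -      faulty spare   /dev/dasde1
# Columns: Number, Major, Minor, RaidDevice, State, optional device path.
# This pattern is an assumption derived from the output quoted in this mail.
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+|-)\s+(.*?)(?:\s+(/dev/\S+))?\s*$"
)


def parse_devices(detail_text):
    """Return one dict per row of the device table in `mdadm -D` output."""
    rows = []
    for line in detail_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header lines, key/value lines, blank lines
        number, major, minor, raid_device, state, path = match.groups()
        rows.append({
            "number": int(number),
            "major": int(major),
            "minor": int(minor),
            # "-" means the device holds no active slot in the array.
            "raid_device": None if raid_device == "-" else int(raid_device),
            "state": state.strip(),
            "path": path,  # None for a "removed" slot
        })
    return rows


def faulty_devices(detail_text):
    """Return the paths of all devices whose state contains "faulty"."""
    return [
        row["path"]
        for row in parse_devices(detail_text)
        if "faulty" in row["state"] and row["path"] is not None
    ]
```

Fed the output above, faulty_devices() should report /dev/dasde1 as the 
only faulty member, while the "removed" slot is listed without a path.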

When I put the disk back online it is not automatically reinserted into 
the array. The only thing that I have tried that worked was to do a hot 
remove followed by a hot add (mdadm /dev/md0 -r /dev/dasde1 and then 
mdadm /dev/md0 -a /dev/dasde1). Is that the correct way, or is there an 
option to tell md that the disk is back and clean? I don't like my 
solution very much, as sometimes I get an error saying the superblock 
cannot be written.
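
For illustration, the remove/add sequence described above could be 
scripted roughly as follows. This is a sketch under stated assumptions, 
not a recommended procedure: the helper names readd_commands and run are 
made up, --remove and --add are the long forms of the -r and -a options 
used above, and --re-add is an mdadm manage-mode option whose behaviour 
on this particular kernel/mdadm combination has not been verified here. 
By default nothing is executed; the commands are only returned as text.

```python
import shlex
import subprocess


def readd_commands(array, device, use_re_add=False):
    """Build the hot-remove / hot-add command pair for one member device.

    use_re_add=True swaps --add for --re-add (an assumption: whether this
    helps depends on the mdadm and kernel versions in use).
    """
    add_flag = "--re-add" if use_re_add else "--add"
    return [
        ["mdadm", array, "--remove", device],
        ["mdadm", array, add_flag, device],
    ]


def run(commands, dry_run=True):
    """Return the commands as shell-quoted strings; execute them only if
    dry_run is False. Execution stops at the first failing command."""
    lines = []
    for argv in commands:
        lines.append(" ".join(shlex.quote(arg) for arg in argv))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit,
            # e.g. if mdadm reports that the superblock cannot be written.
            subprocess.run(argv, check=True)
    return lines


if __name__ == "__main__":
    for line in run(readd_commands("/dev/md0", "/dev/dasde1")):
        print(line)
```

Keeping the dry run as the default makes it possible to review the exact 
commands before they touch the array.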

Thank you very much for any help you can provide.


 
Start by detailing the versions of the kernel, mdadm, which superblock 
you use, and your bitmap configuration (or lack of it).

 
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Bill Davidsen <davidsen@xxxxxxx>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismarck