md raid sync and ext3 formatting on xen hvm guest causing kernel crash and device offline

Hi Phil,

This problem concerns RAID 1 (mirror) resyncing while installing CentOS 6.6 through anaconda as a Xen HVM guest.

Base xen system - xen kernel version - 4.1.18-1.el6xen.x86_64
Guest System - CentOS 6.6 - kernel version -  2.6.32-504.16.2.el6

Drive exposed on host system, for hvm guest = /dev/sdb - 2TB
partitioned as
/dev/sdb1 - primary  - 1024MB    - 262144MB = 256GB
/dev/sdb2 - primary  - 262144MB  - 524288MB = 256GB
/dev/sdb3 - primary  - 524288MB  - 786432MB = 256GB
/dev/sdb4 - extended - 786432MB  - (-1)
/dev/sdb5 - logical  - 786432MB  - 1048576MB = 256GB
/dev/sdb6 - logical  - 1048576MB - (-1)

The above partition layout was exposed to hvm guest as follows
-------------------
builder = "hvm"
name = "centos_md_sync"
memory = 2048
vcpus = 4
vif = ['bridge=xenbr0']
disk = ['phy:/dev/sdb1,sda,w','phy:/dev/sdb2,sdb,w','phy:/dev/sdb3,sdc,w','phy:/dev/sdb5,sdd,w']
vnc = 1
boot="c"
---------------------

When the anaconda installation started, I partitioned the drives mentioned above as follows
Host System  ->  Guest System -> Partition layout
/dev/sdb1    -> /dev/sda      -> /dev/sda1, /dev/sda2 ..... /dev/sda12
/dev/sdb2    -> /dev/sdb      -> /dev/sdb1, /dev/sdb2 ..... /dev/sdb12
/dev/sdb3    -> /dev/sdc      -> /dev/sdc1, /dev/sdc2 ..... /dev/sdc12
/dev/sdb5    -> /dev/sdd      -> /dev/sdd1, /dev/sdd2 ..... /dev/sdd12
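
The host-to-guest mapping in the table above follows directly from the disk = line of the guest config. A minimal sketch of that derivation, assuming every entry has the plain 'phy:<host path>,<guest name>,<mode>' form used here (parse_disk_entry and host_to_guest are hypothetical helper names, not part of any Xen tool):

```python
# Disk entries copied from the guest config above.
DISK = [
    "phy:/dev/sdb1,sda,w",
    "phy:/dev/sdb2,sdb,w",
    "phy:/dev/sdb3,sdc,w",
    "phy:/dev/sdb5,sdd,w",
]


def parse_disk_entry(entry):
    """Split 'phy:<host path>,<guest name>,<mode>' into its three parts."""
    target, guest, mode = entry.split(",")
    backend, _, host_path = target.partition(":")
    if backend != "phy":
        # Assumption: only physical-device backends appear in this config.
        raise ValueError("only phy: backends are handled in this sketch")
    return host_path, "/dev/" + guest, mode


def host_to_guest(disk_entries):
    """Return {host partition: guest whole-disk device}."""
    return {host: guest for host, guest, _ in map(parse_disk_entry, disk_entries)}


if __name__ == "__main__":
    for host, guest in host_to_guest(DISK).items():
        print(f"{host} -> {guest}")
```

Each host partition shows up in the guest as a whole disk, which is why the guest can put its own partition table (sda1 ... sda12) on top of it.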

In the HVM guest OS we are doing RAID 1 mirroring as follows (done during installation itself, from anaconda)
/dev/sd[ab]1 = /dev/md0
/dev/sd[ab]2 = /dev/md1
|.
|.
|.
/dev/sd[cd]1 = /dev/mdX
/dev/sd[cd]2 = /dev/mdY ....etc.

These md devices get created properly, and as soon as creation ends, resyncing starts. While /dev/md0 is resyncing, the other arrays on /dev/sda and /dev/sdb go into the DELAYED state; I understand that this is expected.
The same happens with /dev/sdc and /dev/sdd. However, after some time, the
/dev/sd[abcd] drives start to go offline and eventually the kernel crashes.
I checked /sys/block/sda/device/state on the guest OS while the installation was going on, and it says "offline".
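
For anyone who wants to repeat that check across all four guest disks, here is a minimal sketch that reads the same sysfs file for each device. The helper names and the sys_block parameter are my own assumptions for illustration; only the /sys/block/<dev>/device/state path comes from the check described above.

```python
import os


def read_device_states(devices, sys_block="/sys/block"):
    """Return {device: state} read from <sys_block>/<device>/device/state.

    A device whose state file cannot be read is reported as None.
    """
    states = {}
    for dev in devices:
        path = os.path.join(sys_block, dev, "device", "state")
        try:
            with open(path) as f:
                states[dev] = f.read().strip()
        except OSError:
            states[dev] = None
    return states


def offline_devices(states):
    """Return the sorted names of devices whose state is 'offline'."""
    return sorted(dev for dev, state in states.items() if state == "offline")


if __name__ == "__main__":
    # Guest disk names taken from the mapping above.
    states = read_device_states(["sda", "sdb", "sdc", "sdd"])
    print(states)
    print("offline:", offline_devices(states))
```

Running this in a loop from a second console during the resync should show when each disk changes state.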

I picked up some snapshots and they are kept here:

https://drive.google.com/folderview?id=0B3b5lkAlTOf9eGVFUTVOeWxoTms&usp=sharing

Some important points,
1. I installed CentOS 6.6 without creating these software RAID partitions from within anaconda.
2. When the guest system came up, I created the md RAID arrays from within the running system, and a similar issue was seen. The problem was the same as during installation: the devices went offline, and then the kernel crashed.

Every time a RAID 1 sync starts for a large drive in the guest OS
(say > 20GB), after some time the devices start to go offline and then the kernel crashes, whether during installation or afterwards.
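
To line up the resync progress with the moment the devices go offline, the per-array state can be pulled out of /proc/mdstat. The following is only a sketch: resync_status is a hypothetical helper, and SAMPLE is made-up illustrative text in the usual /proc/mdstat layout, not output captured from the affected guest.

```python
import re

# Illustrative /proc/mdstat content (hypothetical numbers).
SAMPLE = """\
Personalities : [raid1]
md1 : active raid1 sdb2[1] sda2[0]
      1048512 blocks [2/2] [UU]
        resync=DELAYED

md0 : active raid1 sdb1[1] sda1[0]
      1048512 blocks [2/2] [UU]
      [==>..................]  resync = 12.6% (132224/1048512) finish=0.5min speed=26444K/sec

unused devices: <none>
"""


def resync_status(mdstat_text):
    """Return {array: 'idle' | 'delayed' | 'resync <pct>%'} from mdstat text."""
    status = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            # A new array section starts; assume idle until told otherwise.
            current = header.group(1)
            status[current] = "idle"
            continue
        if current is None:
            continue
        if "resync=DELAYED" in line:
            status[current] = "delayed"
        else:
            progress = re.search(r"resync\s*=\s*([\d.]+)%", line)
            if progress:
                status[current] = f"resync {progress.group(1)}%"
    return status


if __name__ == "__main__":
    print(resync_status(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/mdstat instead of SAMPLE.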

Could you please help with this?
If you want more snapshots or error messages, do let me know.

Regards
Anugraha Sinha