Thanks, Neil, for your assistance. We have root-caused the issue: there
was a problem in how we set PhysicalRefNo and Starting Block in the
Config Record. The wrong LBA is no longer seen. Thanks for your
response.

Regards,
Arka

On Tue, Nov 22, 2016 at 3:00 PM, Arka Sharma <arka.sw1988@xxxxxxxxx> wrote:
> I have observed that the following block
>
>         else if (!mddev->bitmap)
>                 j = mddev->recovery_cp;
>
> is getting executed in md_do_sync(). I performed two tests. In case 1
> I filled the entire 32 MB of the physical disks with FF and then wrote
> the metadata; in case 2 we filled the 32 MB with zeros and then wrote
> the metadata. In both cases we receive the "md/raid1:md126: not clean
> -- starting background reconstruction" message from md when LBA
> 1000182866 is accessed. However, when I create the RAID 1 using mdadm
> and reboot the system, there is no access to LBA 1000182866. Also,
> when I read that sector after creating the RAID 1 with mdadm, we see
> that the block contains FF. We have confirmed that mdadm also writes
> the config data at 1000182610. Only a RAID created through our
> application results in an access at that offset.
>
> Regards,
> Arka
>
> On Tue, Nov 22, 2016 at 5:24 AM, NeilBrown <neilb@xxxxxxxx> wrote:
>> On Tue, Nov 22 2016, Arka Sharma wrote:
>>
>>> ---------- Forwarded message ----------
>>> From: "Arka Sharma" <arka.sw1988@xxxxxxxxx>
>>> Date: 21 Nov 2016 12:57 p.m.
>>> Subject: Re: mdadm I/O error with DDF RAID
>>> To: "NeilBrown" <neilb@xxxxxxxx>
>>> Cc: <linux-raid@xxxxxxxxxxxxxxx>
>>>
>>> I have run mdadm --examine on both component devices as well as on
>>> the container. This shows that one of the component disks is marked
>>> offline and its status is failed. When I run mdadm --detail on the
>>> RAID device, it shows the state of component disk 0 as removed.
>>> Since I am very new to md and Linux in general, I am not able to
>>> fully root-cause this issue. I have made a couple of observations,
>>> though: before the invalid sector 18446744073709551615 is sent,
>>> sector 1000182866 is accessed, after which md reports "not clean --
>>> starting background reconstruction". I read LBA 1000182866 and the
>>> block contains FF. So is md expecting something in the metadata that
>>> we are not populating? Please find attached md127.txt, which is the
>>> output of mdadm --examine <container>; blk-core_diff.txt, which
>>> contains the printk's; dmesg.txt; and DDF_Header0.txt and
>>> DDF_Header1.txt, which are dumps of the DDF headers for both disks.
>>
>> Thanks for providing more details.
>>
>> Sector 1000182866 is 256 sectors into the config section. It starts
>> reading the config section at 1000182610 and gets 256 sectors, so it
>> reads the rest from 1000182866 (1000182610 + 256) and then starts the
>> array.
>>
>> My guess is that md is getting confused about resync and recovery.
>> It tries a resync, but as the array appears degraded, this code:
>>
>>         if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery))
>>                 j = mddev->resync_min;
>>         else if (!mddev->bitmap)
>>                 j = mddev->recovery_cp;
>>
>> in md_do_sync() sets 'j' to MaxSector, which is effectively "-1". It
>> then starts the resync from there and goes crazy. You could put a
>> printk in there to confirm.
>>
>> I don't know why. Something about the config makes mdadm think the
>> array is degraded. I might try to find time to dig into it again
>> later.
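>>
>> The printk would look roughly like this (untested sketch; the exact
>> message text and the mdname() helper here are only illustrative),
>> placed right after the block above in md_do_sync():
>>
>>         /*
>>          * 'j' is a sector_t. If the branch above copied recovery_cp
>>          * unchanged, this should print 18446744073709551615, i.e.
>>          * MaxSector (~(sector_t)0), confirming the bogus start point.
>>          */
>>         printk(KERN_INFO "md: %s: sync starts at j=%llu (MaxSector=%llu)\n",
>>                mdname(mddev), (unsigned long long)j,
>>                (unsigned long long)MaxSector);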
>>
>> NeilBrown