On Wed, 5 Dec 2018 at 14:27, Benjamin Smith <lists@xxxxxxxxxxxxxxxxxx> wrote:
>
> My gut feeling is that this is related to a RAID1 issue I'm seeing with 7.6.
> See email thread "CentOS 7.6: Software RAID1 fails the only meaningful test"
>

You might want to point out which list you posted it on, since it doesn't
seem to be this one.

> I suggest trying to boot from an earlier kernel. Good luck!
>
> Ben S
>
>
> On Wednesday, December 5, 2018 9:27:22 AM PST Gordon Messmer wrote:
> > I've started updating systems to CentOS 7.6, and so far I have one failure.
> >
> > This system has two peculiarities which might have triggered the
> > problem. The first is that one of the software RAID arrays on this
> > system is degraded. While troubleshooting the problem, I saw similar
> > error messages mentioned in bug reports indicating that systems
> > would not boot with degraded software RAID arrays. The other
> > peculiar aspect is that the system uses dm-cache.
> >
> > Logs from some of the early failed boots are not available, but before I
> > completely fixed the problem, I was able to bring the system up once
> > and captured logs which look substantially similar to the initial boot.
> > The content of /var/log/messages is here:
> > https://paste.fedoraproject.org/paste/n-E6X76FWIKzIvzPOw97uw
> >
> > The output of lsblk (minus some VM logical volumes) is here:
> > https://paste.fedoraproject.org/paste/OizFvMeGn81vF52VEvUbyg
> >
> > As best I can tell, the LVM tools were treating software RAID component
> > devices as PVs and detecting a conflict between those and the assembled
> > RAID volume. When running "pvs" on the broken system, no RAID volumes
> > were listed, only component devices. At the moment, I don't know whether
> > the LVs that were activated by the initrd were backed by component devices
> > or by the RAID devices, so it's possible that this bug might corrupt
> > software RAID arrays.
> >
> > In order to correct the problem, I had to add a global_filter to
> > /etc/lvm/lvm.conf and rebuild the initrd (dracut -f):
> >
> > global_filter = [ "r|vm_.*_data|", "a|sdd1|", "r|sd..|" ]
> >
> > This filter excludes the LVs that contain VM data, accepts "/dev/sdd1",
> > which is the dm-cache device, and rejects all other partitions on
> > SCSI (SATA) device nodes, as all of those are RAID component devices.
> >
> > I'm still working on the details of the problem, but I wanted to share
> > what I know now in case anyone else might be affected.
> >
> > After updating, look at the output of "pvs" if you use LVM on software RAID.
> >
> > _______________________________________________
> > CentOS mailing list
> > CentOS@xxxxxxxxxx
> > https://lists.centos.org/mailman/listinfo/centos

--
Stephen J Smoogen.
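[Editor's note: as a rough illustration of how LVM evaluates the global_filter quoted above, here is a small Python sketch of the matching rule (first pattern that matches wins; a device matching no pattern is accepted). The device paths other than /dev/sdd1 are hypothetical examples, not taken from the original system.]

```python
import re

# The global_filter from the message above, as (action, regex) pairs:
# "r|...|" rejects, "a|...|" accepts; LVM stops at the first match.
GLOBAL_FILTER = [("r", r"vm_.*_data"), ("a", r"sdd1"), ("r", r"sd..")]

def lvm_filter_accepts(device: str) -> bool:
    """Return True if a device path would pass this filter and be
    scanned by LVM as a potential PV."""
    for action, pattern in GLOBAL_FILTER:
        if re.search(pattern, device):
            return action == "a"
    return True  # no pattern matched: accepted by default

# /dev/sdd1 (the dm-cache device) is explicitly accepted; a RAID
# component partition such as /dev/sda1 hits the "sd.." reject rule;
# an assembled array like /dev/md0 matches nothing and passes through.
for dev in ["/dev/sdd1", "/dev/sda1", "/dev/md0"]:
    print(dev, lvm_filter_accepts(dev))
```

This is why the ordering matters: putting "a|sdd1|" before "r|sd..|" lets the dm-cache partition through while all other sd* partitions are hidden from LVM.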